CN112016507B - Super-resolution-based vehicle detection method, device, equipment and storage medium - Google Patents


Info

Publication number
CN112016507B
CN112016507B
Authority
CN
China
Prior art keywords
layer
target
feature
network
convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010926415.7A
Other languages
Chinese (zh)
Other versions
CN112016507A (en)
Inventor
林春伟
刘莉红
刘玉宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202010926415.7A priority Critical patent/CN112016507B/en
Publication of CN112016507A publication Critical patent/CN112016507A/en
Application granted granted Critical
Publication of CN112016507B publication Critical patent/CN112016507B/en


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/50 - Context or environment of the image
    • G06V20/56 - Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G06V20/584 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads of vehicle lights or traffic lights
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 - Indexing scheme relating to image or video recognition or understanding
    • G06V2201/08 - Detecting or categorising vehicles
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T - CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 - Road transport of goods or passengers
    • Y02T10/10 - Internal combustion engine [ICE] based vehicles
    • Y02T10/40 - Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of artificial intelligence and provides a super-resolution-based vehicle detection method, device, equipment, and storage medium for improving the efficiency and accuracy of vehicle detection on low-resolution images. The super-resolution-based vehicle detection method comprises the following steps: sequentially performing super-resolution reconstruction processing and detection on a preset target vehicle image training set through the super-resolution reconstruction sub-network and the target detection sub-network of an initial neural network to obtain detection information; acquiring a target loss function value according to the target image set and the detection information; iteratively adjusting the parameters of the initial neural network according to the target loss function value to obtain a target neural network; and performing vehicle detection on a preset target vehicle image test set through the target neural network to obtain a target vehicle detection result. The method is suitable for the intelligent transportation field and can further promote the construction of smart cities. In addition, the invention also relates to blockchain technology: the target vehicle detection result can be stored in a blockchain.

Description

Super-resolution-based vehicle detection method, device, equipment and storage medium
Technical Field
The invention relates to the technical field of computer vision and deep learning of artificial intelligence, in particular to a vehicle detection method, device and equipment based on super resolution and a storage medium.
Background
Road monitoring cameras play an important role in traffic monitoring. Computer vision technology can detect vehicles in the monitoring picture using a target detection algorithm, helping traffic and public security departments find specific vehicles. However, because the camera is at some distance from the road, vehicles appear small in part of the monitoring images; and because vehicles usually travel at high speed, vehicles in some images are blurred. Existing target detection algorithms therefore struggle to detect vehicles in low-resolution and/or blurred images. Super-resolution techniques are generally employed to solve these problems.
Super-resolution (super-resolution) refers to a technique of reconstructing a corresponding high-resolution image from an observed low-resolution image. With super-resolution techniques, a high-resolution picture may be obtained from a low-resolution image, or blur in a blurred image may be removed. By first processing the image with super-resolution and then applying an existing target detection algorithm, the accuracy of vehicle detection can be improved to a certain extent.
However, because the super-resolution algorithm and the target detection algorithm differ, using two independent neural networks in sequence means that the front and rear networks lack consistency and cannot fully mine the internal regularities of the data, so the improvement in vehicle detection results is not obvious. Some super-resolution methods even hinder target detection of vehicles in the image. In addition, adding a super-resolution neural network in front of an existing target detection neural network increases the amount of computation and noticeably reduces processing speed. Thus, in the related art, both the efficiency and the accuracy of vehicle detection on low-resolution images are low.
Disclosure of Invention
The invention mainly aims to solve the problems of low efficiency and low accuracy of vehicle detection on a low-resolution image in the prior art.
The first aspect of the present invention provides a vehicle detection method based on super resolution, comprising:
performing super-resolution reconstruction processing on a preset target vehicle image training set through a super-resolution reconstruction sub-network in a preset initial neural network to obtain a target image set, wherein the super-resolution reconstruction sub-network comprises a feature extraction layer, a multi-level feature splitting and multiplexing layer, and a transposed convolution layer, and the resolution of the target image set is higher than that of the target vehicle image training set;
detecting the target image set through a target detection sub-network in the initial neural network to obtain detection information, wherein the target detection sub-network comprises a central point network layer, a region generation network layer, an environment information enhancement layer, and a spatial attention layer;
acquiring a target loss function value of the initial neural network according to the target image set and the detection information;
according to the target loss function value, iteratively adjusting parameters of the initial neural network until the target loss function value converges to obtain a target neural network;
and carrying out vehicle detection on a preset target vehicle image test set through the target neural network to obtain a target vehicle detection result.
Optionally, in a first implementation manner of the first aspect of the present invention, the acquiring the objective loss function value of the initial neural network according to the objective image set and the detection information includes:
calculating a first loss function value of a super-resolution reconstruction sub-network in the initial neural network through the target image set and a preset mean square error loss function;
calculating a second loss function value of a target detection sub-network in the initial neural network through the detection information and a preset regression loss function;
And calculating a weighted sum of the first loss function value and the second loss function value according to a preset weight value to obtain a target loss function value of the initial neural network.
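A minimal numpy sketch of the weighted objective described above. The mean-square-error form follows the "preset mean square error loss function" of the claim; the smooth-L1 regression loss and the 0.5 weight are illustrative assumptions standing in for the unspecified "preset regression loss function" and "preset weight value":

```python
import numpy as np

def mse_loss(sr_images, hr_images):
    """Mean squared error between reconstructed and reference images
    (first loss function value, super-resolution reconstruction sub-network)."""
    return float(np.mean((sr_images - hr_images) ** 2))

def smooth_l1_loss(pred_boxes, gt_boxes, beta=1.0):
    """Smooth-L1 loss, a common regression loss for detection branches
    (second loss function value, target detection sub-network); the exact
    regression loss is not specified by the claim, so this is an assumption."""
    diff = np.abs(pred_boxes - gt_boxes)
    return float(np.mean(np.where(diff < beta,
                                  0.5 * diff ** 2 / beta,
                                  diff - 0.5 * beta)))

def target_loss(sr_images, hr_images, pred_boxes, gt_boxes, weight=0.5):
    """Weighted sum of the two loss function values per a preset weight."""
    l1 = mse_loss(sr_images, hr_images)
    l2 = smooth_l1_loss(pred_boxes, gt_boxes)
    return weight * l1 + (1.0 - weight) * l2
```

Training then iterates parameter updates until this target loss function value converges.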
Optionally, in a second implementation manner of the first aspect of the present invention, the performing, by using a super-resolution reconstruction sub-network in a preset initial neural network, super-resolution reconstruction processing on a preset target vehicle image training set to obtain a target image set, where the super-resolution reconstruction sub-network includes a feature extraction layer, a multi-level feature division multiplexing layer, and a transpose convolution layer, and the resolution of the target image set is higher than that of the target vehicle image training set, including:
performing feature extraction on a preset target vehicle image training set through a feature extraction layer in a preset initial neural network to obtain convolution features, wherein the super-resolution reconstruction sub-network comprises a feature extraction layer, a multi-level feature division multiplexing layer and a transposition convolution layer;
sequentially carrying out multi-level convolution, multi-level feature shunting and dimension reduction on the convolution features through the multi-level feature shunting multiplexing layer to obtain features to be transposed;
and performing convolution kernel matrix calculation and convolution kernel matrix transposition processing on the feature to be transposed through the transposition convolution layer to obtain a target image set, wherein the resolution of the target image set is higher than that of the target vehicle image training set.
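To illustrate how the transposed convolution layer raises resolution, here is a minimal single-channel numpy sketch (loop-based for clarity; the actual layer would use learned multi-channel kernels, and, as the description notes, no activation function is applied):

```python
import numpy as np

def transposed_conv2d(x, kernel, stride=2):
    """Minimal single-channel transposed convolution: each input pixel
    scatters a kernel-weighted copy of itself into the output grid, so
    a stride of 2 with a 2x2 kernel doubles the spatial resolution."""
    h, w = x.shape
    kh, kw = kernel.shape
    out = np.zeros(((h - 1) * stride + kh, (w - 1) * stride + kw))
    for i in range(h):
        for j in range(w):
            out[i * stride:i * stride + kh,
                j * stride:j * stride + kw] += x[i, j] * kernel
    return out
```

With stride 2 and a 2x2 kernel the scattered patches tile the output exactly, so a 4x4 feature map becomes an 8x8 map, which is how the sub-network yields a target image set of higher resolution than its input.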
Optionally, in a third implementation manner of the first aspect of the present invention, the sequentially performing multi-level convolution, multi-level feature splitting and dimension reduction on the convolution feature by using the multi-level feature splitting multiplexing layer to obtain a feature to be transposed includes:
performing convolution processing on the convolution characteristics through a characteristic division multiplexing unit in the multistage characteristic division multiplexing layer to obtain initial characteristics, wherein the multistage characteristic division multiplexing layer comprises a characteristic division multiplexing unit and a dimension reduction unit;
the initial features are subjected to splitting processing along a preset channel dimension to obtain an initial multi-layer feature and an initial one-layer feature;
performing convolution processing and shunt processing of a preset level on the initial multilayer feature to obtain a target multilayer feature and a plurality of candidate one-layer features;
the initial one-layer feature and the plurality of candidate one-layer features are subjected to concatenation and convolution processing of the preset level through the dimension reduction unit to obtain a target one-layer feature;
and carrying out element-by-element addition processing on the target multilayer characteristic and the target one-layer characteristic to obtain a characteristic to be transposed.
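The splitting-and-multiplexing steps above can be sketched in numpy as follows. The pointwise (1x1) channel-mixing stand-in for the learned convolutions, the random weights, and the choice of splitting off exactly one channel per level are illustrative assumptions (the claim only fixes the split/concatenate/reduce/add structure):

```python
import numpy as np

def pointwise_conv(x, w):
    """1x1 convolution as a channel-mixing matrix multiply; stands in
    for the learned convolutions of the splitting/multiplexing unit."""
    return np.tensordot(w, x, axes=([1], [0]))  # (c_out, H, W)

def split_multiplex(x, levels=2):
    """Toy multi-level feature splitting and multiplexing: at each level,
    split off one channel (a candidate one-layer feature) and keep
    convolving the rest (the multi-layer feature); finally concatenate
    the split-off channels, reduce them to one channel, and add them
    element-wise to the multi-layer branch.  Requires x.shape[0] >= levels + 2."""
    c = x.shape[0]
    rng = np.random.default_rng(0)
    feat = pointwise_conv(x, rng.standard_normal((c, c)))   # initial convolution
    one_layer, multi = feat[:1], feat[1:]                   # initial channel split
    candidates = [one_layer]                                # initial one-layer feature
    for _ in range(levels):
        cm = multi.shape[0]
        multi = pointwise_conv(multi, rng.standard_normal((cm, cm)))
        candidates.append(multi[:1])                        # candidate one-layer feature
        multi = multi[1:]                                   # remaining multi-layer feature
    stacked = np.concatenate(candidates, axis=0)            # concatenation
    target_one = pointwise_conv(stacked,
                                rng.standard_normal((1, stacked.shape[0])))  # dim. reduction
    return multi + target_one                               # element-wise addition
```

In this toy setting the multi-layer branch is whittled down to one channel, so the final element-wise addition produces the feature to be transposed with matching shapes.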
Optionally, in a fourth implementation manner of the first aspect of the present invention, the detecting, by using a target detection sub-network in the initial neural network, the target image set to obtain detection information, where the target detection sub-network includes a central point network layer, a region generation network layer, an environmental information enhancement layer, and a spatial attention layer, includes:
Generating a multi-level feature map of the target image set through a preset central point network layer, wherein the target detection sub-network comprises a central point network layer, a region generation network layer, an environment information enhancement layer and a spatial attention layer;
generating feature maps of multiple scales from the multi-level feature maps through the region generation network layer, wherein the region generation network layer comprises a convolution layer with a 5×5 convolution kernel and a convolution layer with 256 output channels and a 1×1 convolution kernel;
sequentially carrying out convolution processing and fusion processing on the multi-level feature map through the environment information enhancement layer to obtain environment information enhancement features;
and carrying out fusion processing and classification processing on the feature graphs of the multiple scales and the environmental information enhancement features through the spatial attention layer and a preset convolutional neural network to obtain detection information.
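The final fusion step above can be sketched as follows. The patent does not give the exact form of its spatial attention layer, so this uses a common spatial-attention formulation (channel-wise average and max pooled into a per-pixel weight map), stated as an assumption:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def spatial_attention(feat, context):
    """Reweight a (C, H, W) detection feature by a per-pixel attention map
    derived from a context feature (e.g. the environment information
    enhancement features).  The attention form is an assumption; the
    broadcast multiply is the fusion step."""
    avg = context.mean(axis=0, keepdims=True)   # (1, H, W) channel average
    mx = context.max(axis=0, keepdims=True)     # (1, H, W) channel max
    attn = sigmoid(avg + mx)                    # per-pixel weights in (0, 1)
    return feat * attn                          # broadcast over channels
```

The reweighted features would then pass through the preset convolutional neural network for classification into the final detection information.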
Optionally, in a fifth implementation manner of the first aspect of the present invention, the performing, by the environmental information enhancement layer, convolution processing and fusion processing on the multi-level feature map sequentially to obtain environmental information enhancement features includes:
sorting the multi-level feature maps in order from small scale to large scale, determining the first feature map as an initial feature map, and determining the feature maps other than the initial feature map among the multi-level feature maps as a plurality of candidate feature maps;
performing 2× up-sampling and global average pooling on the initial feature map, respectively, through the environment information enhancement layer to obtain a sampled feature map and a pooled feature map;
respectively carrying out convolution processing on the candidate feature images to obtain a plurality of target feature images;
and carrying out matrix addition processing on the pooled feature map, the sampling feature map and the plurality of target feature maps to obtain the environment information enhancement feature.
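A minimal numpy sketch of the enhancement steps above. Nearest-neighbour up-sampling and the assumption that the candidate maps have already been convolved to a matching shape are illustrative choices:

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x up-sampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def enhance_environment(initial, candidates):
    """Environment information enhancement sketch: the smallest-scale
    feature map is up-sampled 2x (sampled feature map) and globally
    average-pooled (pooled feature map, broadcast over all positions),
    then added element-wise to the convolved larger-scale candidate maps."""
    sampled = upsample2x(initial)                        # sampled feature map
    pooled = initial.mean(axis=(1, 2), keepdims=True)    # (C, 1, 1) global pool
    out = sampled + pooled                               # matrix (broadcast) addition
    for cand in candidates:                              # candidates assumed pre-convolved
        out = out + cand
    return out
```

The broadcast of the pooled vector injects image-level context into every spatial position, which is the sense in which the layer enhances environment information.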
Optionally, in a sixth implementation manner of the first aspect of the present invention, before performing the super-resolution reconstruction processing on the preset target vehicle image training set through the super-resolution reconstruction sub-network in the preset initial neural network to obtain the target image set, the method further includes:
acquiring a vehicle image data set, and dividing the vehicle image data set into an initial vehicle image training set and an initial vehicle image testing set according to a preset proportion;
respectively carrying out data preprocessing on the initial vehicle image training set and the initial vehicle image testing set to obtain a candidate vehicle image training set and a target vehicle image testing set;
sampling the candidate vehicle image training set according to a preset minimum intersection-over-union (IoU) threshold to obtain a sampled image set;
and performing scale transformation and flipping on the sampled image set to obtain a target vehicle image training set.
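The preprocessing steps above (dataset division, minimum-IoU sampling, scale transformation, and flipping) can be sketched as follows; the 0.8 split ratio, the 0.5 minimum IoU, and the 2x resize are illustrative values for the "preset" quantities:

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def split_dataset(images, ratio=0.8):
    """Divide the image list into training and test sets by a preset ratio."""
    k = int(len(images) * ratio)
    return images[:k], images[k:]

def crop_keeps_box(crop, gt_box, min_iou=0.5):
    """Keep a sampled crop only if it overlaps the ground-truth vehicle
    box by at least the preset minimum IoU."""
    return iou(crop, gt_box) >= min_iou

def augment(image):
    """Scale transformation (here a 2x nearest-neighbour resize) followed
    by a horizontal flip, for a 2-D grayscale array."""
    scaled = image.repeat(2, axis=0).repeat(2, axis=1)
    return scaled[:, ::-1]
```

Minimum-IoU sampling of this kind keeps crops that still contain enough of a vehicle to serve as useful, harder training examples.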
A second aspect of the present invention provides a super-resolution-based vehicle detection apparatus, comprising:
the reconstruction module is used for carrying out super-resolution reconstruction processing on a preset target vehicle image training set through a super-resolution reconstruction sub-network in a preset initial neural network to obtain a target image set, wherein the super-resolution reconstruction sub-network comprises a feature extraction layer, a multi-level feature division multiplexing layer and a transposition convolution layer, and the resolution of the target image set is higher than that of the target vehicle image training set;
the first detection module is used for detecting the target image set through a target detection sub-network in the initial neural network to obtain detection information, wherein the target detection sub-network comprises a central point network layer, a region generation network layer, an environment information enhancement layer and a spatial attention layer;
the acquisition module is used for acquiring a target loss function value of the initial neural network according to the target image set and the detection information;
the iteration adjustment module is used for carrying out iteration adjustment on the parameters of the initial neural network according to the target loss function value until the target loss function value is converged to obtain a target neural network;
and the second detection module is used for performing vehicle detection on a preset target vehicle image test set through the target neural network to obtain a target vehicle detection result.
Optionally, in a first implementation manner of the second aspect of the present invention, the acquiring module is specifically configured to:
calculating a first loss function value of a super-resolution reconstruction sub-network in the initial neural network through the target image set and a preset mean square error loss function;
calculating a second loss function value of a target detection sub-network in the initial neural network through the detection information and a preset regression loss function;
and calculating a weighted sum of the first loss function value and the second loss function value according to a preset weight value to obtain a target loss function value of the initial neural network.
Optionally, in a second implementation manner of the second aspect of the present invention, the reconstruction module includes:
the feature extraction sub-module is used for extracting features of a preset target vehicle image training set through a feature extraction layer in a preset initial neural network to obtain convolution features, and the super-resolution reconstruction sub-network comprises a feature extraction layer, a multi-level feature division multiplexing layer and a transposition convolution layer;
The characteristic shunting submodule is used for sequentially carrying out multi-level convolution, multi-level characteristic shunting and dimension reduction on the convolution characteristic through the multi-level characteristic shunting multiplexing layer to obtain a characteristic to be transposed;
and the convolution transposition sub-module is used for carrying out convolution kernel matrix calculation and convolution kernel matrix transposition processing on the feature to be transposed through the transposition convolution layer to obtain a target image set, wherein the resolution of the target image set is higher than that of the target vehicle image training set.
Optionally, in a third implementation manner of the second aspect of the present invention, the feature shunting submodule is specifically configured to:
performing convolution processing on the convolution characteristics through a characteristic division multiplexing unit in the multistage characteristic division multiplexing layer to obtain initial characteristics, wherein the multistage characteristic division multiplexing layer comprises a characteristic division multiplexing unit and a dimension reduction unit;
the initial features are subjected to splitting processing along a preset channel dimension to obtain an initial multi-layer feature and an initial one-layer feature;
performing convolution processing and shunt processing of a preset level on the initial multilayer feature to obtain a target multilayer feature and a plurality of candidate one-layer features;
the initial one-layer feature and the plurality of candidate one-layer features are subjected to concatenation and convolution processing of the preset level through the dimension reduction unit to obtain a target one-layer feature;
And carrying out element-by-element addition processing on the target multilayer characteristic and the target one-layer characteristic to obtain a characteristic to be transposed.
Optionally, in a fourth implementation manner of the second aspect of the present invention, the first detection module includes:
the first generation sub-module is used for generating a multi-level feature map of the target image set through a preset central point network layer, and the target detection sub-network comprises a central point network layer, a region generation network layer, an environment information enhancement layer and a spatial attention layer;
a second generating sub-module, configured to generate feature maps of multiple scales from the multi-level feature maps through the region generation network layer, where the region generation network layer comprises a convolution layer with a 5×5 convolution kernel and a convolution layer with 256 output channels and a 1×1 convolution kernel;
the convolution fusion sub-module is used for sequentially carrying out convolution processing and fusion processing on the multi-level feature map through the environment information enhancement layer to obtain environment information enhancement features;
and the fusion classification sub-module is used for carrying out fusion processing and classification processing on the characteristic diagrams with various scales and the environment information enhancement characteristics through the spatial attention layer and a preset convolutional neural network to obtain detection information.
Optionally, in a fifth implementation manner of the second aspect of the present invention, the convolution fusion submodule is specifically configured to:
sorting the multi-level feature maps in order from small scale to large scale, determining the first feature map as an initial feature map, and determining the feature maps other than the initial feature map among the multi-level feature maps as a plurality of candidate feature maps;
performing 2× up-sampling and global average pooling on the initial feature map, respectively, through the environment information enhancement layer to obtain a sampled feature map and a pooled feature map;
respectively carrying out convolution processing on the candidate feature images to obtain a plurality of target feature images;
and carrying out matrix addition processing on the pooled feature map, the sampling feature map and the plurality of target feature maps to obtain the environment information enhancement feature.
Optionally, in a sixth implementation manner of the second aspect of the present invention, the super-resolution-based vehicle detection device further includes:
the dividing module is used for acquiring a vehicle image data set and dividing the vehicle image data set into an initial vehicle image training set and an initial vehicle image testing set according to a preset proportion;
The preprocessing module is used for respectively preprocessing data of the initial vehicle image training set and the initial vehicle image testing set to obtain a candidate vehicle image training set and a target vehicle image testing set;
the sampling module is used for sampling the candidate vehicle image training set according to a preset minimum intersection-over-union (IoU) threshold to obtain a sampled image set;
and the transformation and flipping module is used for performing scale transformation and flipping on the sampled image set to obtain a target vehicle image training set.
A third aspect of the present invention provides a super-resolution-based vehicle detection apparatus, comprising: a memory and at least one processor, the memory having instructions stored therein; the at least one processor invokes the instructions in the memory to cause the super-resolution based vehicle detection device to perform the super-resolution based vehicle detection method described above.
A fourth aspect of the present invention provides a computer-readable storage medium comprising a storage data area storing data created according to use of blockchain nodes and a storage program area storing instructions that, when run on a computer, cause the computer to perform the above-described super-resolution-based vehicle detection method.
According to the technical scheme, through a super-resolution reconstruction sub-network in a preset initial neural network, super-resolution reconstruction processing is carried out on a preset target vehicle image training set to obtain a target image set, wherein the super-resolution reconstruction sub-network comprises a feature extraction layer, a multi-level feature division multiplexing layer and a transposition convolution layer, and the resolution of the target image set is higher than that of the target vehicle image training set; detecting a target image set through a target detection sub-network in the initial neural network to obtain detection information, wherein the target detection sub-network comprises a central point network layer, a region generation network layer, an environment information enhancement layer and a spatial attention layer; acquiring a target loss function value of the initial neural network according to the target image set and the detection information; according to the target loss function value, carrying out iterative adjustment on the parameters of the initial neural network until the target loss function value converges to obtain a target neural network; and carrying out vehicle detection on a preset target vehicle image test set through a target neural network to obtain a target vehicle detection result. 
According to the invention, a super-resolution reconstruction sub-network comprising a feature extraction layer, a multi-level feature splitting and multiplexing layer, and a transposed convolution layer is combined with a target detection sub-network comprising a central point network layer, a region generation network layer, an environment information enhancement layer, and a spatial attention layer. The shallow features of the vehicle image, which are rich in position information, can thus be reused, and the deep features, which are rich in semantic information, can be extracted, with little added computation and no obvious reduction in operation speed, so both detection speed and detection precision are higher. Moreover, the super-resolution reconstruction sub-network and the target detection sub-network are trained jointly with a target loss function that combines the loss function value of the super-resolution reconstruction sub-network and that of the target detection sub-network, so the internal regularities of the vehicle images can be mined to the maximum extent. The resulting target neural network has higher vehicle detection precision, recall, and robustness for noisy and blurred low-resolution vehicle images, further improving the efficiency and accuracy of vehicle detection on low-resolution images.
Drawings
FIG. 1 is a schematic diagram of an embodiment of a super-resolution-based vehicle detection method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of another embodiment of a super-resolution-based vehicle detection method according to an embodiment of the present invention;
FIG. 3 is a schematic view of an embodiment of a super-resolution-based vehicle detection apparatus according to an embodiment of the present invention;
FIG. 4 is a schematic view of another embodiment of a super-resolution-based vehicle detection apparatus according to an embodiment of the present invention;
fig. 5 is a schematic diagram of an embodiment of a super-resolution-based vehicle detection apparatus in an embodiment of the present invention.
Detailed Description
The embodiment of the invention provides a vehicle detection method, device, equipment and storage medium based on super-resolution, which solve the problems of low efficiency and low accuracy of vehicle detection on low-resolution images in the prior art.
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments described herein may be implemented in other sequences than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus.
For ease of understanding, a specific flow of an embodiment of the present invention is described below with reference to fig. 1, and an embodiment of a super-resolution-based vehicle detection method in an embodiment of the present invention includes:
101. and performing super-resolution reconstruction processing on a preset target vehicle image training set through a super-resolution reconstruction sub-network in a preset initial neural network to obtain a target image set, wherein the super-resolution reconstruction sub-network comprises a feature extraction layer, a multi-level feature shunt multiplexing layer and a transposition convolution layer, and the resolution of the target image set is higher than that of the target vehicle image training set.
It is to be understood that the execution subject of the present invention may be a super-resolution-based vehicle detection device, and may also be a terminal or a server, which is not limited herein. The embodiment of the invention is described by taking a server as an execution main body as an example.
After obtaining the initial vehicle image training set and the initial vehicle image test set, the server performs data preprocessing on both to obtain a candidate vehicle image training set and a target vehicle image test set. Data preprocessing includes data cleaning, data integration, data transformation, data reduction, and the like. The original pictures of all vehicle images in the candidate vehicle image training set can be used directly as the target vehicle image training set; alternatively, some candidate vehicle images can be randomly selected, then cropped, resized, and flipped to obtain target vehicle images, and those target vehicle images together with the remaining vehicle images in the candidate vehicle image training set are used as the target vehicle image training set.
After the server obtains the target vehicle image training set and the target vehicle image test set, the target vehicle image training set is input into a preset initial neural network, and feature extraction, feature shunting and transposed convolution processing are performed on the target vehicle images through the feature extraction layer, the multi-level feature shunt multiplexing layer and the transposed convolution layer of the super-resolution reconstruction sub-network in the initial neural network to obtain the target image set. The output of the feature extraction layer is the input of the multi-level feature shunt multiplexing layer, and the output of the multi-level feature shunt multiplexing layer is the input of the transposed convolution layer.
It should be noted that the multi-level feature shunt multiplexing layer may perform convolution, shunting and fusion processing on the features extracted by the feature extraction layer; it may include a plurality of convolution layers, and its fusion processing may be performed by matrix addition or matrix multiplication. The transposed convolution layer does not include an activation function.
102. Detect the target image set through a target detection sub-network in the initial neural network to obtain detection information, where the target detection sub-network includes a central point network layer, a region generation network layer, an environment information enhancement layer and a spatial attention layer.
The target detection sub-network is composed of a backbone network and a detection network: the backbone network is the central point network layer (CenterNet), and the detection network is composed of the region generation network layer, the environment information enhancement layer and the spatial attention layer. The central point network layer CenterNet may be a CNet146, CNet49 and/or CNet535 network. To prevent a large number of output channels in the convolution layers from greatly increasing the amount of computation, the convolution layer Conv5 is not used in CNet146 and CNet535. CNet49 and CNet535 offer higher speed and higher detection accuracy respectively, while CNet146 achieves a better balance between speed and accuracy.
The region generation network layer (Region Proposal Network, RPN) includes a convolution layer with a 5×5 convolution kernel and a convolution layer with a 1×1 convolution kernel and 256 channels. The 5×5 kernel is larger than a 3×3 kernel and can capture more environmental information, while the 1×1 kernel reduces the computation cost while maintaining good detection accuracy.
The environment information enhancement layer is responsible for fusing environment information with different scales, encoding more environment information, increasing receptive fields and generating more distinguishing characteristics. The spatial attention layer readjusts the feature distribution of the feature map output by the environmental information enhancement layer in the spatial dimension, thereby obtaining detection information.
103. Acquire a target loss function value of the initial neural network according to the target image set and the detection information.
The server invokes the loss function of the initial neural network and calculates the target loss function value of the initial neural network according to the target image set, the detection information and the loss function of the initial neural network, where the loss function of the initial neural network is: L(x, y; θ_SR) = α·L_rec(x, S(↓(x); θ_SR)) + β·L_task(y, D(S(↓(x); θ_SR))), where x is the target image set and the detection information, y is the vehicle ground-truth value of each target vehicle image in the target vehicle image training set, θ_SR denotes the training parameters of the initial neural network, α and β are the weights of the super-resolution reconstruction sub-network and the target detection sub-network respectively, L_rec is the loss function value of the super-resolution reconstruction sub-network, L_task is the loss function value of the target detection sub-network, S denotes the super-resolution reconstruction sub-network and D denotes the target detection sub-network. Through the loss function of the initial neural network, the super-resolution sub-network can be induced to reconstruct an image that has high visual quality and can be correctly classified by a machine.
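The weighted combination above can be sketched as follows; a simple mean squared error stands in for L_rec, L_task is treated as a precomputed scalar, and the function names are illustrative rather than part of the patent:

```python
def mse(pred, target):
    # simple mean squared error, a stand-in for the reconstruction loss L_rec
    assert len(pred) == len(target)
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def joint_loss(l_rec, l_task, alpha=1.0, beta=0.1):
    # L(x, y; theta_SR) = alpha * L_rec + beta * L_task
    return alpha * l_rec + beta * l_task

l_rec = mse([0.5, 0.25], [0.0, 0.25])   # 0.125
total = joint_loss(l_rec, l_task=2.0)   # 1.0 * 0.125 + 0.1 * 2.0
```

During training, α and β trade reconstruction quality against detection quality; the warm-up schedule for β is described in step 104.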
104. Iteratively adjust the parameters of the initial neural network according to the target loss function value until the target loss function value converges, to obtain the target neural network.
The training strategy of the initial neural network is as follows: the batch size is 6, the initial learning rate is 1e-4 and is reduced by a factor of 10 every 10^5 iterations, and an Adam optimizer with momentum 0.9 is adopted. At the beginning of training, β is set to 0, and after 10^5 iterations β is set to 0.1. The server iteratively adjusts the parameters of the initial neural network according to the target loss function value until the target loss function value converges, to obtain the target neural network, where the parameters may be network structure parameters of the initial neural network or hyperparameters of the initial neural network. The larger the mean average precision (mAP) of the target neural network, the higher the detection accuracy; although the peak signal-to-noise ratio (PSNR) of the reconstructed image is not obviously improved, and is sometimes even reduced, the method can obviously improve the detection mAP.
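The schedule above can be sketched as two small helper functions; the 10^5-iteration decay interval, the 1e-4 base learning rate and the β warm-up follow the text, while the function names are illustrative:

```python
def learning_rate(step, base_lr=1e-4, decay_every=10**5):
    # the learning rate is divided by 10 every `decay_every` iterations
    return base_lr * (0.1 ** (step // decay_every))

def beta_weight(step, warmup=10**5):
    # detection-loss weight beta: 0 during warm-up, 0.1 afterwards
    return 0.0 if step < warmup else 0.1
```

Keeping β at 0 during warm-up lets the super-resolution sub-network stabilise before the detection loss begins to shape the reconstruction.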
105. Perform vehicle detection on a preset target vehicle image test set through the target neural network to obtain a target vehicle detection result.
After the server obtains the target neural network through training, the target neural network is called, and vehicle detection is carried out on a preset target vehicle image test set so as to test the target neural network. The server can calculate the detection accuracy of the target neural network according to the detection result of the target vehicle, and optimize the target neural network according to the detection accuracy.
In the embodiment of the invention, a super-resolution reconstruction sub-network including a feature extraction layer, a multi-level feature shunt multiplexing layer and a transposed convolution layer is adopted together with a target detection sub-network including a central point network layer, a region generation network layer, an environment information enhancement layer and a spatial attention layer. In this way, the shallow features of the vehicle image rich in position information can be reused and the deep features rich in semantic information can be extracted, with less computation and no obvious reduction in operation speed, so that both the detection speed and the detection accuracy are higher. By jointly training the super-resolution reconstruction sub-network and the target detection sub-network with a target loss function combining the loss function value of the super-resolution reconstruction sub-network and that of the target detection sub-network, the internal regularities of the vehicle images can be mined to the greatest extent, so that the target neural network has higher vehicle detection accuracy, recall rate and robustness for noisy and blurred low-resolution vehicle images, further improving the efficiency and accuracy of vehicle detection on low-resolution images.
Referring to fig. 2, another embodiment of a super-resolution-based vehicle detection method according to an embodiment of the present invention includes:
201. Extract features of a preset target vehicle image training set through a feature extraction layer in a preset initial neural network to obtain convolution features, where the super-resolution reconstruction sub-network includes the feature extraction layer, a multi-level feature shunt multiplexing layer and a transposed convolution layer.
The constructed super-resolution reconstruction sub-network includes a feature extraction module, a multi-level feature shunt multiplexing module and a reconstruction module, where the feature extraction module corresponds to the feature extraction layer, the multi-level feature shunt multiplexing module corresponds to the multi-level feature shunt multiplexing layer, and the reconstruction module corresponds to the transposed convolution layer. The preset target vehicle image training set consists of low-resolution original vehicle images. The feature extraction layer includes 2 convolution layers with a 3×3 convolution kernel and an LReLU activation function, and 1 convolution layer with a 1×1 convolution kernel. The process of extracting the convolution features by the feature extraction layer can be represented by the following formula: B_0 = f(x), where B_0 is the convolution features, x is the preset target vehicle image training set, and f(x) is the feature extraction function.
Before the server performs convolution processing on the target vehicle image training set through the feature extraction layer in the preset initial neural network to obtain the convolution features, a vehicle image data set is obtained and divided into an initial vehicle image training set and an initial vehicle image test set according to a preset proportion; data preprocessing is performed on the initial vehicle image training set and the initial vehicle image test set respectively to obtain a candidate vehicle image training set and a target vehicle image test set; the candidate vehicle image training set is sampled according to a preset minimum intersection-over-union ratio to obtain a sampled image set; and the sampled image set is scale-transformed and flipped to obtain the target vehicle image training set.
The server extracts a plurality of vehicle images from a preset database and crawls a plurality of vehicle images from a network platform, thereby obtaining a vehicle image set, where the scales and corresponding scenes of all or some of the vehicle images in the set differ. For example: the size of each vehicle image is between 100 × 100 pixels and 600 × 600 pixels, distributed as uniformly as possible within this size range; the backgrounds of the vehicle images should be varied, with the background of part of the vehicle images being an urban road and the background of the rest being a rural road, an expressway, a square, or the like. The preset proportion is: the training set accounts for 85% and the test set accounts for 15%, i.e., the initial vehicle image training set accounts for 85% of the vehicle image data set, and the initial vehicle image test set accounts for 15%.
Data preprocessing includes data cleaning, data integration, data transformation, data reduction and the like. The preset minimum intersection-over-union ratio may be 0.1, 0.3, 0.5, 0.7 or 0.9. The scale size may be 240 × 240, 320 × 320 and/or 480 × 480. The server may invoke a graphics processing unit (GPU) to process the candidate vehicle image training set to obtain the target vehicle image training set. The server flips the scale-transformed sampled image set horizontally with probability 0.5 to obtain the target vehicle image training set.
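A toy version of the scale-and-flip augmentation might look as follows; the scale list and the 0.5 flip probability come from the text, while the nested-list image representation and function names are illustrative simplifications (the min-IoU sampling and the actual resizing are omitted):

```python
import random

SCALES = [(240, 240), (320, 320), (480, 480)]  # scale sizes named in the text

def hflip(rows):
    # horizontal flip: reverse each pixel row
    return [list(reversed(r)) for r in rows]

def augment(rows, rng):
    # pick one of the preset scales, then flip with probability 0.5
    scale = rng.choice(SCALES)
    out = hflip(rows) if rng.random() < 0.5 else rows
    return scale, out
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible across training runs.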
202. Sequentially perform multi-level convolution, multi-level feature shunting and dimension reduction on the convolution features through the multi-level feature shunt multiplexing layer to obtain features to be transposed.
Specifically, the server performs convolution processing on the convolution features through a feature shunt multiplexing unit in the multi-level feature shunt multiplexing layer to obtain initial features, where the multi-level feature shunt multiplexing layer includes the feature shunt multiplexing unit and a dimension reduction unit; shunts the initial features through a preset channel dimension to obtain initial multilayer features and initial one-layer features; performs convolution processing and shunting processing of a preset level on the initial multilayer features to obtain target multilayer features and a plurality of candidate one-layer features; performs concatenation and convolution processing of a preset level on the initial one-layer features and the plurality of candidate one-layer features through the dimension reduction unit to obtain target one-layer features; and performs element-by-element addition processing on the target multilayer features and the target one-layer features to obtain the features to be transposed.
The multi-level feature shunt multiplexing layer includes a plurality of feature shunt multiplexing components. Each feature shunt multiplexing component is composed of a feature shunt multiplexing unit and a dimension reduction unit; the dimension reduction unit is responsible for compressing input information and reducing the number of feature channels, and is a convolution layer with a 1×1 convolution kernel. The feature shunt multiplexing components are connected in sequence, and the output of each component is the input of the next. The action of an individual feature shunt multiplexing component can be represented by the following formula: B_k = F_k(B_{k-1}), k = 1, …, n, where F_k denotes the function corresponding to the k-th feature shunt multiplexing component, and B_{k-1} and B_k denote the input and output of the k-th feature shunt multiplexing component, respectively.
For example: the multi-level feature shunt multiplexing layer includes 4 feature shunt multiplexing components, namely components 1 to 4. Component 1 has a convolution layer A1, a channel dimension B1 and a dimension reduction unit C1; component 2 has a convolution layer A2, a channel dimension B2 and a dimension reduction unit C2; component 3 has a convolution layer A3, a channel dimension B3 and a dimension reduction unit C3; and component 4 has a convolution layer A4, a channel dimension B4 and a dimension reduction unit C4. The convolution kernels of A1, A2, A3 and A4 are all 3×3;

the convolution features are input into component 1: convolution layer A1 performs convolution processing on them to obtain initial feature D1; channel dimension B1 shunts D1 into Z layers (the value of Z can be adjusted according to actual conditions such as the data set, for example 4) to obtain initial multilayer feature E1 (1-1/Z) and initial one-layer feature F1 (1/Z); and dimension reduction unit C1 performs convolution processing on F1 to obtain convolution one-layer feature H1;

the initial multilayer feature E1 is input into component 2: convolution layer A2 performs convolution processing on E1 to obtain initial feature D2; channel dimension B2 shunts D2 into Z layers to obtain initial multilayer feature E2 (1-1/Z) and candidate one-layer feature F2 (1/Z); dimension reduction unit C2 performs convolution processing on F2 to obtain convolution one-layer feature H2; and H1 and H2 are concatenated to obtain series feature G1;

the initial multilayer feature E2 is input into component 3: convolution layer A3 performs convolution processing on E2 to obtain initial feature D3; channel dimension B3 shunts D3 into Z layers to obtain initial multilayer feature E3 (1-1/Z) and candidate one-layer feature F3 (1/Z); dimension reduction unit C3 performs convolution processing on F3 to obtain convolution one-layer feature H3; and G1 and H3 are concatenated to obtain series feature G2;

the initial multilayer feature E3 is input into component 4: convolution layer A4 performs convolution processing on E3 to obtain initial feature D4; channel dimension B4 shunts D4 into Z layers to obtain target multilayer feature E4 (1-1/Z) and candidate one-layer feature F4 (1/Z); dimension reduction unit C4 performs convolution processing on F4 to obtain convolution one-layer feature H4; and G2 and H4 are concatenated to obtain target one-layer feature G3. Element-by-element addition of the target multilayer feature E4 and the target one-layer feature G3 yields the feature to be transposed.
The splitting operation makes the output of the feature shunt multiplexing unit contain features that have undergone different numbers of convolutions, so that the shallow features rich in position information are reused while the deep features rich in semantic information can still be extracted. The low-frequency information carried by a low-resolution image is similar to that of the target image, and much time would be spent attending to it during training, so many currently popular algorithms learn only the high-frequency residual between the target image and the low-resolution image. This increases the speed of the algorithm, but the difference between the low-frequency information of the target image obtained this way and that of the low-resolution image still compromises the performance of the subsequent detection network. The feature shunt multiplexing module achieves a better balance between shallow and deep features and can improve the subsequent detection result, with little computation and no obvious reduction in operation speed.
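The channel-split bookkeeping of the four components can be sketched with plain Python lists; the per-channel transform is only a stand-in for the real 3×3 convolutions and 1×1 reduction units, and the final element-wise addition is omitted because the toy branches end up with different channel counts (in the real network the 1×1 convolutions match the dimensions):

```python
Z = 4  # channel-split ratio (the text suggests Z may be 4)

def conv_standin(channels):
    # stand-in for a 3x3 convolution / 1x1 dimension-reduction unit:
    # a fixed per-channel transform that keeps the channel count
    return [c * 0.5 + 1.0 for c in channels]

def split(channels):
    # shunt 1/Z of the channels to the one-layer branch, keep the
    # remaining (1 - 1/Z) flowing through further convolutions
    k = len(channels) // Z
    assert k > 0, "too few channels to split"
    return channels[:-k], channels[-k:]

def shunt_multiplex(feature, n_components=4):
    shallow = []      # concatenation of the convolution one-layer features Hk
    deep = feature
    for _ in range(n_components):
        deep = conv_standin(deep)       # convolution layer Ak
        deep, one = split(deep)         # channel dimension Bk
        shallow += conv_standin(one)    # dimension reduction unit Ck + concat
    return deep, shallow                # target multilayer / one-layer features
```

With 16 input channels, successive splits route 4, 3, 2 and 1 channels to the shallow branch, leaving 6 deep channels: the output mixes features that have passed through one to four convolutions, which is the point of the shunt-multiplexing design.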
203. Perform convolution kernel matrix calculation and convolution kernel matrix transposition processing on the features to be transposed through the transposed convolution layer to obtain a target image set, where the resolution of the target image set is higher than that of the target vehicle image training set.
The server implements the transposed convolution layer in a manner analogous to ConvTranspose2d(in_channels, out_channels, kernel_size, stride=1, padding=0, output_padding=0, groups=1, bias=True), and through it performs the convolution kernel matrix calculation and convolution kernel matrix transposition processing on the features to be transposed. Here in_channels (int) is the number of channels of the input signal; out_channels (int) is the number of channels produced by the convolution; kernel_size (int or tuple) is the size of the convolution kernel; stride (int or tuple) is the step size, controlling the stride of the cross-correlation; padding (int or tuple) is the number of zero-padding layers added to each side of the input; output_padding (int or tuple) is the number of zero-padding layers added to each side of the output; and groups (int) is the number of blocked connections from input channels to output channels: with groups=1, the output is a convolution of all inputs; with groups=2, the operation is equivalent to two parallel convolution layers, each computing half of the input channels and producing half of the output channels, whose outputs are then concatenated. If bias=True, a learnable bias is added. The specific size of each parameter needs to be adjusted according to the actual application scenario. The parameters kernel_size, stride and padding may each be an int, in which case the height and width values are the same, or a tuple of two ints, the first representing the height value and the second the width value.
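Assuming the layer follows the standard 2-D transposed-convolution size arithmetic (as documented for PyTorch's ConvTranspose2d), the output spatial size can be computed as:

```python
def conv_transpose_out(size_in, kernel_size, stride=1, padding=0,
                       output_padding=0, dilation=1):
    # output spatial size of a 2-D transposed convolution along one axis
    return ((size_in - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)

# a 4x4 kernel with stride 2 and padding 1 doubles the spatial size,
# which is the usual way a transposed convolution performs 2x upscaling:
doubled = conv_transpose_out(32, kernel_size=4, stride=2, padding=1)  # 64
```

The output_padding term exists because several input sizes can map to the same convolution output size; it disambiguates which upscaled size is produced.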
When the server performs convolution kernel matrix calculation and convolution kernel matrix transposition processing on the features to be transposed in the transposed convolution layer to obtain a first candidate image set, bicubic interpolation is performed on the target vehicle image training set (the low-resolution original vehicle images) to obtain a second candidate image set, and the first candidate image set and the second candidate image set are fused to obtain the target image set, where the fusion may be performed by matrix addition or matrix multiplication.
204. Detect the target image set through a target detection sub-network in the initial neural network to obtain detection information, where the target detection sub-network includes a central point network layer, a region generation network layer, an environment information enhancement layer and a spatial attention layer.
Specifically, the server generates multi-level feature maps of the target image set through a preset central point network layer, where the target detection sub-network includes the central point network layer, a region generation network layer, an environment information enhancement layer and a spatial attention layer; generates multi-scale feature maps from the multi-level feature maps through the region generation network layer, where the region generation network layer includes a convolution layer with a 5×5 convolution kernel and a convolution layer with 256 output channels and a 1×1 convolution kernel; sequentially performs convolution processing and fusion processing on the multi-level feature maps through the environment information enhancement layer to obtain environment information enhancement features; and performs fusion processing and classification processing on the multi-scale feature maps and the environment information enhancement features through the spatial attention layer and a preset convolutional neural network to obtain the detection information.
The server generates, through the region generation network layer, prior frames for the target image set at 5 scales {32², 64², 128², 256², 512²} and 5 aspect ratios {1:2, 3:4, 1:1, 4:3, 2:1}, and generates a feature maps of size p×p for the target image set according to the prior frames (i.e. the feature maps of multiple scales), where a is a hyperparameter that needs to be adjusted according to the actual situation and can be set to 5, and p is the size of the pooling kernel, preferably 7.
Through the spatial attention layer, the server sequentially performs convolution layer processing, batch normalization processing and Sigmoid activation processing on the multi-scale feature maps to obtain a first feature map; re-weights the environment information enhancement features by the values output by the Sigmoid activation function to obtain a second feature map; multiplies the first feature map by the second feature map to obtain a third feature map; and performs vehicle classification on the third feature map through a preset convolutional neural network (regions with CNN features, RCNN) to obtain the detection information, where the convolution kernel of the convolution layer in the convolution layer processing is 1×1 and the number of channels is 245. Through the batch normalization processing, sufficient statistical information can be gathered from more multi-scale feature maps.
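The sigmoid re-weighting performed by the spatial attention layer can be sketched on tiny nested-list feature maps; the 1×1 convolution and batch normalization branch is collapsed into the raw gate values here, so this only illustrates the element-wise gating arithmetic:

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def spatial_attention(gate_map, enhanced_map):
    # gate_map: values standing in for the 1x1-conv + batch-norm branch output
    # enhanced_map: environment-information-enhanced features to re-weight
    return [[sigmoid(g) * e for g, e in zip(grow, erow)]
            for grow, erow in zip(gate_map, enhanced_map)]
```

Because sigmoid outputs lie in (0, 1), the gate suppresses spatial positions with low response while keeping high-response positions nearly unchanged.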
Specifically, the server sorts the multi-level feature maps from small to large, determines the first multi-level feature map as the initial feature map, and determines the remaining multi-level feature maps as candidate feature maps; performs 2× up-sampling processing and global average pooling processing on the initial feature map respectively through the environment information enhancement layer to obtain a sampled feature map and a pooled feature map; performs convolution processing on the candidate feature maps respectively to obtain a plurality of target feature maps; and performs matrix addition processing on the pooled feature map, the sampled feature map and the plurality of target feature maps to obtain the environment information enhancement features.
For example: the multi-level feature maps are W1, W2 and W3, with scales 20×20, 20×20 and 10×10 respectively; W1 and W2 are candidate feature maps and W3 is the initial feature map. Convolution processing with a 1×1 convolution kernel and 245 channels is performed on W1 and W2 to obtain target feature maps R1 and R2. 2× up-sampling processing and convolution processing with a 1×1 kernel and 245 channels are performed on W3 to obtain the sampled feature map, and global average pooling processing and convolution processing with a 1×1 kernel and 245 channels are performed on W3 to obtain the pooled feature map. Matrix addition of R1, R2, the pooled feature map and the sampled feature map yields the environment information enhancement features. The environment information enhancement layer can aggregate multi-scale local environment information and global environment information, improving the characterization capability of thin feature maps with only a small amount of computation.
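The fusion arithmetic of this example can be sketched on small maps (a 2×2 map standing in for the 10×10 W3, 4×4 maps for the 20×20 W1 and W2); the 1×1 convolutions are omitted, nearest-neighbour up-sampling stands in for the layer's 2× up-sampling, and the pooled value is broadcast as a single scalar:

```python
def upsample2x(m):
    # nearest-neighbour 2x up-sampling of a 2-D map (nested lists)
    out = []
    for row in m:
        wide = [v for v in row for _ in (0, 1)]
        out += [wide, list(wide)]
    return out

def global_avg_pool(m):
    # global average pooling, broadcast later as one scalar per position
    flat = [v for row in m for v in row]
    return sum(flat) / len(flat)

def enhance(w1, w2, w3):
    # w1, w2: candidate maps at the large scale; w3: smallest-scale map
    up = upsample2x(w3)
    g = global_avg_pool(w3)
    n = len(w1)
    return [[w1[i][j] + w2[i][j] + up[i][j] + g for j in range(n)]
            for i in range(n)]
```

The scalar term injects global context at every position, while the up-sampled map contributes coarse local context, matching the layer's stated aggregation of global and multi-scale local environment information.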
205. Acquire a target loss function value of the initial neural network according to the target image set and the detection information.
Specifically, the server calculates a first loss function value of a super-resolution reconstruction sub-network in an initial neural network through a target image set and a preset mean square error loss function; calculating a second loss function value of the target detection sub-network in the initial neural network through the detection information and a preset regression loss function; and calculating a weighted sum of the first loss function value and the second loss function value according to a preset weight value to obtain a target loss function value of the initial neural network.
The preset mean square error loss function is: L_rec = (1/N) Σ_{i=1}^{N} (x1_i − x̂1_i)², where x1 is the target image set, N is the number of pixels of the images in the target image set, i indexes the i-th pixel of an image in the target image set, and x̂1 is the predicted value of x1 obtained by the super-resolution reconstruction sub-network. The preset regression loss function is: L_task = (1/B)(L_conf(x2, ŷ) + λ·L_loc(x2, ŷ)), where x2 is the detection information, y is the vehicle ground-truth value of each target vehicle image in the target vehicle image training set, ŷ is the predicted value of y obtained by the target detection sub-network, B is the number of matched prior frames, L_conf is the function punishing misclassification of matched bounding boxes, L_loc is a smooth L1 distance function representing the punishment of the displacement of the bounding box from the vehicle ground-truth value, and λ is the weight of L_loc, a hyperparameter adjusted according to experimental results.
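A hedged sketch of the regression loss above, with the smooth L1 function applied to scalar displacements; the reduction over prior frames and the exact form of L_conf are simplified stand-ins:

```python
def smooth_l1(d):
    # smooth L1 distance on a scalar displacement d:
    # quadratic near zero, linear beyond |d| = 1
    a = abs(d)
    return 0.5 * d * d if a < 1.0 else a - 0.5

def task_loss(conf_loss, displacements, num_boxes, lam=1.0):
    # L_task = (1/B) * (L_conf + lambda * L_loc), where L_loc sums
    # smooth-L1 displacements over the B matched prior frames
    l_loc = sum(smooth_l1(d) for d in displacements)
    return (conf_loss + lam * l_loc) / num_boxes
```

The quadratic region makes small box displacements contribute smooth gradients, while the linear region keeps large outlier displacements from dominating the loss.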
206. Iteratively adjust the parameters of the initial neural network according to the target loss function value until the target loss function value converges, to obtain the target neural network.
The parameters of the initial neural network may be weight values or hyperparameters. The server may also adjust the parameters and network structure of the initial neural network based on the target loss function value. When the server iteratively adjusts the parameters of the initial neural network according to the target loss function value until the target loss function value converges to obtain the target neural network, it can identify and divide regions of interest of the vehicle images in the target vehicle image training set, obtain difficult samples of the vehicle images by sampling according to the regions of interest, sort the difficult samples according to the target loss function value to obtain sorted samples, select from the sorted samples a preset number of samples with the largest target loss function values, and then back-propagate and update the weights of the initial neural network. Non-maximum suppression processing is performed on the difficult samples to obtain suppression samples, and a preset number of regions of interest are selected from the suppression samples for back-propagation, so as to train the initial neural network; this can effectively reduce the processing time. The spatial attention layer specifies a confidence threshold for the detection frames obtained after the feature maps have been processed once, and the detection frames whose final score is greater than the confidence threshold are retained.
207. Perform vehicle detection on a preset target vehicle image test set through the target neural network to obtain a target vehicle detection result.
After the server obtains the target neural network through training, the target neural network is called, and vehicle detection is carried out on a preset target vehicle image test set so as to test the target neural network. The server can calculate the detection accuracy of the target neural network according to the detection result of the target vehicle, and optimize the target neural network according to the detection accuracy.
In the embodiment of the invention, a super-resolution reconstruction sub-network including a feature extraction layer, a multi-level feature shunt multiplexing layer and a transposed convolution layer is adopted together with a target detection sub-network including a central point network layer, a region generation network layer, an environment information enhancement layer and a spatial attention layer. In this way, the shallow features of the vehicle image rich in position information can be reused and the deep features rich in semantic information can be extracted, with less computation and no obvious reduction in operation speed, so that both the detection speed and the detection accuracy are higher. By jointly training the super-resolution reconstruction sub-network and the target detection sub-network with a target loss function combining the loss function value of the super-resolution reconstruction sub-network and that of the target detection sub-network, the internal regularities of the vehicle images can be mined to the greatest extent, so that the target neural network has higher vehicle detection accuracy, recall rate and robustness for noisy and blurred low-resolution vehicle images, further improving the efficiency and accuracy of vehicle detection on low-resolution images. The scheme can be applied to the field of intelligent transportation, thereby promoting the construction of smart cities.
The super-resolution-based vehicle detection method in the embodiment of the present invention has been described above, and the super-resolution-based vehicle detection apparatus in the embodiment of the present invention is described below. Referring to fig. 3, one embodiment of the super-resolution-based vehicle detection apparatus in the embodiment of the present invention includes:
the reconstruction module 301 is configured to perform super-resolution reconstruction processing on a preset target vehicle image training set through a super-resolution reconstruction sub-network in a preset initial neural network, so as to obtain a target image set, where the super-resolution reconstruction sub-network includes a feature extraction layer, a multi-level feature division multiplexing layer and a transposed convolution layer, and the resolution of the target image set is higher than that of the target vehicle image training set;
the first detection module 302 is configured to detect, through a target detection sub-network in the initial neural network, a target image set to obtain detection information, where the target detection sub-network includes a central point network layer, a region generation network layer, an environmental information enhancement layer, and a spatial attention layer;
an obtaining module 303, configured to obtain a target loss function value of the initial neural network according to the target image set and the detection information;
the iteration adjustment module 304 is configured to iteratively adjust the parameters of the initial neural network according to the target loss function value until the target loss function value converges, thereby obtaining the target neural network;
And the second detection module 305 is configured to perform vehicle detection on a preset target vehicle image test set through the target neural network, so as to obtain a target vehicle detection result.
The function implementation of each module in the above super-resolution-based vehicle detection apparatus corresponds to each step in the above embodiments of the super-resolution-based vehicle detection method, and its functions and implementation process are not described in detail here.
In the embodiment of the invention, a super-resolution reconstruction sub-network comprising a feature extraction layer, a multi-level feature division multiplexing layer and a transposed convolution layer is combined with a target detection sub-network comprising a central point network layer, a region generation network layer, an environment information enhancement layer and a spatial attention layer. The shallow features of the vehicle image, which are rich in position information, can thus be reused, and its deep features, which are rich in semantic information, can be extracted, with a small amount of computation, so the running speed is not noticeably reduced and both the detection speed and the detection accuracy are high. The two sub-networks are trained jointly with a target loss function composed of the loss function value of the super-resolution reconstruction sub-network and the loss function value of the target detection sub-network, so the internal regularities of the vehicle images can be mined to the greatest extent, and the target neural network has high vehicle detection precision, recall and robustness for noisy and blurred low-resolution vehicle images, which further improves the efficiency and accuracy of vehicle detection on low-resolution images. The scheme can be applied to the field of intelligent transportation, thereby promoting the construction of smart cities.
Referring to fig. 4, another embodiment of the super-resolution-based vehicle detection apparatus according to the present invention includes:
the reconstruction module 301 is configured to perform super-resolution reconstruction processing on a preset target vehicle image training set through a super-resolution reconstruction sub-network in a preset initial neural network, so as to obtain a target image set, where the super-resolution reconstruction sub-network includes a feature extraction layer, a multi-level feature division multiplexing layer and a transposed convolution layer, and the resolution of the target image set is higher than that of the target vehicle image training set;
the reconstruction module 301 specifically includes:
the feature extraction sub-module 3011 is configured to perform feature extraction on a preset target vehicle image training set through a feature extraction layer in a preset initial neural network to obtain convolution features, where the super-resolution reconstruction sub-network includes a feature extraction layer, a multi-level feature division multiplexing layer and a transposed convolution layer;
the feature shunting submodule 3012 is used for sequentially carrying out multi-level convolution, multi-level feature shunting and dimension reduction on the convolution feature through a multi-level feature shunting multiplexing layer to obtain a feature to be transposed;
the convolution transposition submodule 3013 is used for carrying out convolution kernel matrix calculation and convolution kernel matrix transposition processing on the characteristics to be transposed through the transposition convolution layer to obtain a target image set, wherein the resolution of the target image set is higher than that of the target vehicle image training set;
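The transposed convolution layer enlarges the feature map back to the higher resolution: each input element scatters a kernel-weighted patch into the output, which is the matrix-transpose view of an ordinary strided convolution. A minimal single-channel sketch, assuming a 2×2 kernel and a stride of 2 (neither value is specified in the disclosure):

```python
import numpy as np

def transposed_conv2d(x, kernel, stride=2):
    """Single-channel transposed convolution: every input element scatters
    a kernel-weighted patch into the (larger) output map."""
    h, w = x.shape
    k = kernel.shape[0]
    out = np.zeros((stride * (h - 1) + k, stride * (w - 1) + k))
    for i in range(h):
        for j in range(w):
            out[i * stride:i * stride + k, j * stride:j * stride + k] += x[i, j] * kernel
    return out

feat = np.array([[1.0, 2.0],
                 [3.0, 4.0]])          # 2x2 low-resolution feature to be transposed
up = transposed_conv2d(feat, np.ones((2, 2)))
print(up.shape)  # (4, 4): spatial resolution doubled
```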
The first detection module 302 is configured to detect, through a target detection sub-network in the initial neural network, a target image set to obtain detection information, where the target detection sub-network includes a central point network layer, a region generation network layer, an environmental information enhancement layer, and a spatial attention layer;
an obtaining module 303, configured to obtain a target loss function value of the initial neural network according to the target image set and the detection information;
the iteration adjustment module 304 is configured to iteratively adjust the parameters of the initial neural network according to the target loss function value until the target loss function value converges, thereby obtaining the target neural network;
and the second detection module 305 is configured to perform vehicle detection on a preset target vehicle image test set through the target neural network, so as to obtain a target vehicle detection result.
Optionally, the obtaining module 303 may be further specifically configured to:
calculating a first loss function value of a super-resolution reconstruction sub-network in an initial neural network through a target image set and a preset mean square error loss function;
calculating a second loss function value of the target detection sub-network in the initial neural network through the detection information and a preset regression loss function;
and calculating a weighted sum of the first loss function value and the second loss function value according to a preset weight value to obtain a target loss function value of the initial neural network.
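The weighted combination of the two loss values can be sketched as follows. The mean squared error follows the disclosure; the smooth-L1 loss and the weights `w1`, `w2` are assumed stand-ins for the "preset regression loss function" and the "preset weight value":

```python
import numpy as np

def mse_loss(sr_images, hr_images):
    """First loss value: mean squared error of the super-resolution
    reconstruction sub-network (per the disclosure)."""
    return float(np.mean((sr_images - hr_images) ** 2))

def smooth_l1_loss(pred, target, beta=1.0):
    """Second loss value: smooth-L1, an assumed instance of the preset
    regression loss function of the target detection sub-network."""
    d = np.abs(pred - target)
    return float(np.mean(np.where(d < beta, 0.5 * d ** 2 / beta, d - 0.5 * beta)))

def target_loss(sr, hr, pred, gt, w1=1.0, w2=1.0):
    """Weighted sum of the two loss values; w1 and w2 stand in for the
    preset weight values."""
    return w1 * mse_loss(sr, hr) + w2 * smooth_l1_loss(pred, gt)

loss = target_loss(np.array([1.0, 2.0]), np.array([1.0, 4.0]),
                   np.array([0.5]), np.array([0.0]), w1=1.0, w2=2.0)
print(loss)  # 2.25
```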
Optionally, the feature shunting submodule 3012 may also be specifically configured to:
the convolution characteristic is subjected to convolution processing through a characteristic division multiplexing unit in a multistage characteristic division multiplexing layer to obtain an initial characteristic, wherein the multistage characteristic division multiplexing layer comprises a characteristic division multiplexing unit and a dimension reduction unit;
shunting the initial characteristics through preset channel dimensions to obtain initial multi-layer characteristics and initial one-layer characteristics;
carrying out convolution processing and shunt processing of a preset level on the initial multilayer feature to obtain a target multilayer feature and a plurality of candidate one-layer features;
performing concatenation and convolution processing of a preset level on the initial one-layer feature and the plurality of candidate one-layer features through the dimension reduction unit to obtain a target one-layer feature;
and carrying out element-by-element addition processing on the target multilayer characteristic and the target one-layer characteristic to obtain the characteristic to be transposed.
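A toy numpy sketch of this split-and-reuse idea. The channel counts, the split points, and the channel mean standing in for the dimension-reduction convolution are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 4))    # toy (channels, H, W) convolution feature

# multi-level shunting: at each level, split off some shallow channels
deep, shallow1 = feat[:6], feat[6:]      # level 1: keep 6 deep channels
deep, shallow2 = deep[:4], deep[4:]      # level 2: keep 4 deep channels

# dimension reduction: concatenate the reused shallow branches; a channel
# mean stands in for the reduction convolution of the real network
reused = np.concatenate([shallow1, shallow2], axis=0)   # (4, 4, 4)
one_layer = reused.mean(axis=0, keepdims=True)          # (1, 4, 4) target one-layer feature

# element-wise addition of the deep (multi-layer) path and the reused path
to_transpose = deep + one_layer                         # broadcasts over 4 channels
print(to_transpose.shape)  # (4, 4, 4)
```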
Optionally, the first detection module 302 includes:
a first generation sub-module 3021, configured to generate, through a preset central point network layer, a multi-level feature map of the target image set, where the target detection sub-network includes a central point network layer, a region generation network layer, an environmental information enhancement layer, and a spatial attention layer;
a second generating submodule 3022, configured to generate, through the region generation network layer, feature maps of multiple scales from the multi-level feature map, where the region generation network layer includes a convolution layer with a 5×5 convolution kernel and a convolution layer with 256 output channels and a 1×1 convolution kernel;
The convolution fusion submodule 3023 is used for sequentially carrying out convolution processing and fusion processing on the multi-level feature images through the environment information enhancement layer to obtain environment information enhancement features;
and the fusion classification submodule 3024 is used for carrying out fusion processing and classification processing on the feature images of various scales and the environmental information enhancement features through the spatial attention layer and a preset convolutional neural network to obtain detection information.
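The disclosure does not detail the internals of the spatial attention layer. One common instance (a CBAM-style spatial attention, an assumption here, with an addition standing in for the small fusing convolution) re-weights each spatial position from its channel-wise statistics:

```python
import numpy as np

def spatial_attention(feat):
    """Re-weight every spatial position of a (C, H, W) feature map by a
    sigmoid of its channel-wise max and mean, emphasising salient regions."""
    mx = feat.max(axis=0)
    avg = feat.mean(axis=0)
    attn = 1.0 / (1.0 + np.exp(-(mx + avg)))   # per-pixel weight in (0, 1)
    return feat * attn                          # broadcast over channels

feat = np.random.default_rng(1).standard_normal((8, 4, 4))
out = spatial_attention(feat)
print(out.shape)  # (8, 4, 4)
```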
Optionally, the convolution fusion submodule 3023 may be further specifically configured to:
sorting the multi-level feature maps in ascending order of scale, determining the first-ranked multi-level feature map as an initial feature map, and determining the multi-level feature maps other than the initial feature map as a plurality of candidate feature maps;
respectively performing 2× up-sampling and global average pooling on the initial feature map through the environment information enhancement layer to obtain a sampling feature map and a pooling feature map;
respectively carrying out convolution processing on the candidate feature images to obtain a plurality of target feature images;
and performing matrix addition processing on the pooled feature map, the sampling feature map and the plurality of target feature maps to obtain the environment information enhancement feature.
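A toy numpy sketch of the enhancement step, assuming nearest-neighbour up-sampling and broadcasting the global-average-pooled value so that the matrix addition is shape-compatible (both assumptions; the convolution on the candidate map is omitted):

```python
import numpy as np

def upsample2x(fmap):
    """2x nearest-neighbour up-sampling of an (H, W) map (assumed mode)."""
    return fmap.repeat(2, axis=0).repeat(2, axis=1)

small = np.arange(4.0).reshape(2, 2)   # smallest-scale (initial) feature map
large = np.ones((4, 4))                # a candidate feature map (conv omitted)

sampled = upsample2x(small)                     # sampling feature map
pooled = np.full(sampled.shape, small.mean())   # GAP value broadcast (assumed)
enhanced = pooled + sampled + large             # matrix addition
print(enhanced[0, 0])  # 2.5  (1.5 pooled + 0.0 sampled + 1.0 candidate)
```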
Optionally, the super-resolution-based vehicle detection apparatus further includes:
The dividing module 306 is configured to obtain a vehicle image dataset, and divide the vehicle image dataset into an initial vehicle image training set and an initial vehicle image testing set according to a preset proportion;
the preprocessing module 307 is configured to perform data preprocessing on the initial vehicle image training set and the initial vehicle image testing set, so as to obtain a candidate vehicle image training set and a target vehicle image testing set;
the sampling module 308 is configured to sample the training set of candidate vehicle images according to a preset minimum intersection ratio to obtain a sampling atlas;
the transformation and flipping module 309 is configured to perform scale transformation and flipping on the sampling atlas to obtain a target vehicle image training set.
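A minimal sketch of this data-preparation pipeline. The 8:2 split ratio and the 0.5 minimum intersection ratio are illustrative assumptions, and "minimum intersection ratio" is interpreted here as a minimum IoU threshold for sampled crops:

```python
import random

def split_dataset(images, train_ratio=0.8, seed=42):
    """Divide a vehicle image list into training/test sets by a preset ratio."""
    images = list(images)
    random.Random(seed).shuffle(images)
    cut = int(len(images) * train_ratio)
    return images[:cut], images[cut:]

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

def keep_crop(crop, gt_boxes, min_iou=0.5):
    """Min-intersection-ratio sampling: keep a candidate crop only if it
    overlaps every ground-truth box by at least min_iou."""
    return all(iou(crop, box) >= min_iou for box in gt_boxes)

train, test = split_dataset(range(10))
print(len(train), len(test))  # 8 2
```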
The function implementation of each module and each sub-module in the above super-resolution-based vehicle detection apparatus corresponds to each step in the above embodiments of the super-resolution-based vehicle detection method, and their functions and implementation processes are not described in detail here.
In the embodiment of the invention, a super-resolution reconstruction sub-network comprising a feature extraction layer, a multi-level feature division multiplexing layer and a transposed convolution layer is combined with a target detection sub-network comprising a central point network layer, a region generation network layer, an environment information enhancement layer and a spatial attention layer. The shallow features of the vehicle image, which are rich in position information, can thus be reused, and its deep features, which are rich in semantic information, can be extracted, with a small amount of computation, so the running speed is not noticeably reduced and both the detection speed and the detection accuracy are high. The two sub-networks are trained jointly with a target loss function composed of the loss function value of the super-resolution reconstruction sub-network and the loss function value of the target detection sub-network, so the internal regularities of the vehicle images can be mined to the greatest extent, and the target neural network has high vehicle detection precision, recall and robustness for noisy and blurred low-resolution vehicle images, which further improves the efficiency and accuracy of vehicle detection on low-resolution images. The scheme can be applied to the field of intelligent transportation, thereby promoting the construction of smart cities.
The super-resolution-based vehicle detection apparatus in the embodiment of the present invention is described in detail from the point of view of the modularized functional entity in fig. 3 and 4 above, and the super-resolution-based vehicle detection device in the embodiment of the present invention is described in detail from the point of view of hardware processing below.
Fig. 5 is a schematic structural diagram of a super-resolution vehicle detection device according to an embodiment of the present invention, where the super-resolution vehicle detection device 500 may have relatively large differences due to different configurations or performances, and may include one or more processors (central processing units, CPU) 510 (e.g., one or more processors) and a memory 520, and one or more storage media 530 (e.g., one or more mass storage devices) storing application programs 533 or data 532. Wherein memory 520 and storage medium 530 may be transitory or persistent storage. The program stored in the storage medium 530 may include one or more modules (not shown), each of which may include a series of instruction operations on the super-resolution-based vehicle detection apparatus 500. Still further, the processor 510 may be configured to communicate with the storage medium 530 to execute a series of instruction operations in the storage medium 530 on the super-resolution based vehicle detection device 500.
The super resolution based vehicle detection device 500 may also include one or more power supplies 540, one or more wired or wireless network interfaces 550, one or more input/output interfaces 560, and/or one or more operating systems 531, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, and the like. It will be appreciated by those skilled in the art that the super-resolution based vehicle detection device structure shown in fig. 5 does not constitute a limitation of the super-resolution based vehicle detection device, which may include more or fewer components than those illustrated, may combine certain components, or may have a different arrangement of components.
The present invention also provides a super-resolution-based vehicle detection apparatus, which includes a memory and a processor, wherein instructions are stored in the memory, and when the instructions are executed by the processor, the processor is caused to execute the steps of the super-resolution-based vehicle detection method in the above embodiments.
The present invention also provides a computer readable storage medium, which may be a non-volatile computer readable storage medium, and may also be a volatile computer readable storage medium, in which instructions are stored, which when executed on a computer, cause the computer to perform the steps of the super-resolution-based vehicle detection method.
Further, the computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created from the use of blockchain nodes, and the like.
The blockchain is a novel application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks generated in association by cryptographic methods, where each data block contains a batch of network transaction information used to verify the validity of the information (anti-counterfeiting) and to generate the next block. A blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above embodiments are only intended to illustrate the technical solution of the present invention, not to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical schemes described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A super-resolution-based vehicle detection method, characterized by comprising the following steps:
performing super-resolution reconstruction processing on a preset target vehicle image training set through a super-resolution reconstruction sub-network in a preset initial neural network to obtain a target image set, wherein the super-resolution reconstruction sub-network comprises a feature extraction layer, a multi-level feature division multiplexing layer and a transposition convolution layer, and the resolution of the target image set is higher than that of the target vehicle image training set;
detecting the target image set through a target detection sub-network in the initial neural network to obtain detection information, wherein the target detection sub-network comprises a central point network layer, a region generation network layer, an environment information enhancement layer and a spatial attention layer;
Carrying out weighted summation according to a first loss function value corresponding to the target image set and a second loss function value corresponding to the detection information to obtain a target loss function value of the initial neural network;
according to the target loss function value, iteratively adjusting parameters of the initial neural network until the target loss function value converges to obtain a target neural network;
performing vehicle detection on a preset target vehicle image test set through the target neural network to obtain a target vehicle detection result;
the super-resolution reconstruction sub-network in the preset initial neural network is used for carrying out super-resolution reconstruction processing on a preset target vehicle image training set to obtain a target image set, the super-resolution reconstruction sub-network comprises a feature extraction layer, a multi-level feature shunt multiplexing layer and a transposed convolution layer, the resolution of the target image set is higher than that of the target vehicle image training set, and the super-resolution reconstruction sub-network comprises:
performing feature extraction on a preset target vehicle image training set through a feature extraction layer in a preset initial neural network to obtain convolution features, wherein the super-resolution reconstruction sub-network comprises a feature extraction layer, a multi-level feature division multiplexing layer and a transposition convolution layer;
Sequentially carrying out multi-level convolution, multi-level feature shunting and dimension reduction on the convolution features through the multi-level feature shunting multiplexing layer to obtain features to be transposed;
performing convolution kernel matrix calculation and convolution kernel matrix transposition on the feature to be transposed through the transposition convolution layer to obtain a target image set, wherein the resolution of the target image set is higher than that of the target vehicle image training set;
the detecting the target image set through a target detection sub-network in the initial neural network to obtain detection information, wherein the target detection sub-network comprises a central point network layer, a region generation network layer, an environment information enhancement layer and a spatial attention layer, and the method comprises the following steps:
generating a multi-level feature map of the target image set through a preset central point network layer, wherein the target detection sub-network comprises a central point network layer, a region generation network layer, an environment information enhancement layer and a spatial attention layer;
generating feature maps of multiple scales from the multi-level feature map through the region generation network layer, wherein the region generation network layer comprises a convolution layer with a 5×5 convolution kernel and a convolution layer with 256 output channels and a 1×1 convolution kernel;
Sequentially carrying out convolution processing and fusion processing on the multi-level feature map through the environment information enhancement layer to obtain environment information enhancement features;
and carrying out fusion processing and classification processing on the feature graphs of the multiple scales and the environmental information enhancement features through the spatial attention layer and a preset convolutional neural network to obtain detection information.
2. The super-resolution based vehicle detection method according to claim 1, wherein the weighting and summing according to a first loss function value corresponding to the target image set and a second loss function value corresponding to the detection information, to obtain a target loss function value of the initial neural network, includes:
calculating a first loss function value of a super-resolution reconstruction sub-network in the initial neural network through the target image set and a preset mean square error loss function;
calculating a second loss function value of a target detection sub-network in the initial neural network through the detection information and a preset regression loss function;
and calculating a weighted sum of the first loss function value and the second loss function value according to a preset weight value to obtain a target loss function value of the initial neural network.
3. The super-resolution-based vehicle detection method according to claim 1, wherein the sequentially performing multi-level convolution, multi-level feature splitting and dimension reduction on the convolution feature by the multi-level feature splitting multiplexing layer to obtain a feature to be transposed, includes:
performing convolution processing on the convolution features through a feature division multiplexing unit in the multi-level feature division multiplexing layer to obtain initial features, wherein the multi-level feature division multiplexing layer comprises the feature division multiplexing unit and a dimension reduction unit;
performing shunting processing on the initial features through preset channel dimensions to obtain initial multi-layer features and initial one-layer features;
performing convolution processing and shunt processing of a preset level on the initial multilayer feature to obtain a target multilayer feature and a plurality of candidate one-layer features;
performing, through the dimension reduction unit, concatenation and convolution processing of the preset level on the initial one-layer feature and the plurality of candidate one-layer features to obtain a target one-layer feature;
and carrying out element-by-element addition processing on the target multilayer characteristic and the target one-layer characteristic to obtain a characteristic to be transposed.
4. The super-resolution-based vehicle detection method as claimed in claim 1, wherein the sequentially performing convolution processing and fusion processing on the multi-level feature map by the environmental information enhancement layer to obtain environmental information enhancement features includes:
sorting the multi-level feature maps in ascending order of scale, determining the first-ranked multi-level feature map as an initial feature map, and determining the multi-level feature maps other than the initial feature map as a plurality of candidate feature maps;
respectively performing 2× up-sampling and global average pooling on the initial feature map through the environment information enhancement layer to obtain a sampling feature map and a pooling feature map;
respectively carrying out convolution processing on the candidate feature images to obtain a plurality of target feature images;
and carrying out matrix addition processing on the pooled feature map, the sampling feature map and the plurality of target feature maps to obtain the environment information enhancement feature.
5. The super-resolution based vehicle detection method as claimed in any one of claims 1 to 4, wherein before performing super-resolution reconstruction processing on a preset target vehicle image training set through a super-resolution reconstruction sub-network in a preset initial neural network to obtain a target image set, the method further comprises:
acquiring a vehicle image data set, and dividing the vehicle image data set into an initial vehicle image training set and an initial vehicle image testing set according to a preset proportion;
Respectively carrying out data preprocessing on the initial vehicle image training set and the initial vehicle image testing set to obtain a candidate vehicle image training set and a target vehicle image testing set;
sampling the candidate vehicle image training set according to a preset minimum intersection ratio to obtain a sampling chart set;
and performing scale transformation and flipping on the sampling atlas to obtain a target vehicle image training set.
6. A super-resolution-based vehicle detection apparatus, characterized by comprising:
the reconstruction module is used for carrying out super-resolution reconstruction processing on a preset target vehicle image training set through a super-resolution reconstruction sub-network in a preset initial neural network to obtain a target image set, wherein the super-resolution reconstruction sub-network comprises a feature extraction layer, a multi-level feature division multiplexing layer and a transposition convolution layer, and the resolution of the target image set is higher than that of the target vehicle image training set;
the first detection module is used for detecting the target image set through a target detection sub-network in the initial neural network to obtain detection information, wherein the target detection sub-network comprises a central point network layer, a region generation network layer, an environment information enhancement layer and a spatial attention layer;
The acquisition module is used for carrying out weighted summation according to the first loss function value corresponding to the target image set and the second loss function value corresponding to the detection information to obtain a target loss function value of the initial neural network;
the iteration adjustment module is used for carrying out iteration adjustment on the parameters of the initial neural network according to the target loss function value until the target loss function value is converged to obtain a target neural network;
the second detection module is used for detecting the vehicle of a preset target vehicle image test set through the target neural network to obtain a target vehicle detection result;
the reconstruction module comprises:
the feature extraction sub-module is used for extracting features of a preset target vehicle image training set through a feature extraction layer in a preset initial neural network to obtain convolution features, and the super-resolution reconstruction sub-network comprises a feature extraction layer, a multi-level feature division multiplexing layer and a transposition convolution layer;
the characteristic shunting submodule is used for sequentially carrying out multi-level convolution, multi-level characteristic shunting and dimension reduction on the convolution characteristic through the multi-level characteristic shunting multiplexing layer to obtain a characteristic to be transposed;
The convolution transposition submodule is used for carrying out convolution kernel matrix calculation and convolution kernel matrix transposition on the feature to be transposed through the transposition convolution layer to obtain a target image set, wherein the resolution of the target image set is higher than that of the target vehicle image training set;
the first detection module includes:
the first generation sub-module is used for generating a multi-level feature map of the target image set through a preset central point network layer, and the target detection sub-network comprises a central point network layer, a region generation network layer, an environment information enhancement layer and a spatial attention layer;
a second generating sub-module, configured to generate, through the region generation network layer, feature maps of multiple scales from the multi-level feature map, where the region generation network layer includes a convolution layer with a 5×5 convolution kernel and a convolution layer with 256 output channels and a 1×1 convolution kernel;
the convolution fusion sub-module is used for sequentially carrying out convolution processing and fusion processing on the multi-level feature map through the environment information enhancement layer to obtain environment information enhancement features;
and the fusion classification sub-module is used for carrying out fusion processing and classification processing on the characteristic diagrams with various scales and the environment information enhancement characteristics through the spatial attention layer and a preset convolutional neural network to obtain detection information.
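The shape bookkeeping for the two convolutions claimed in the region generation network layer can be sketched with the standard output-size formula (the padding of 2 for the 5×5 kernel is an assumption; the claim fixes only the kernel sizes and the 256 output channels):

```python
def conv2d_out_size(h, w, k, stride=1, pad=0):
    """Spatial output size of a convolution layer (standard formula)."""
    return (h + 2 * pad - k) // stride + 1, (w + 2 * pad - k) // stride + 1

# A 5x5 convolution with padding 2 (assumed) preserves the feature-map size...
h, w = conv2d_out_size(64, 64, k=5, pad=2)
# ...and the 1x1 convolution only remaps channels (to 256 in the claim),
# leaving height and width untouched.
h, w = conv2d_out_size(h, w, k=1)
```

The 1×1 layer therefore acts purely as a channel projection, which is why the claim can fix its output channels at 256 without constraining spatial resolution.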
7. The super-resolution based vehicle detection apparatus as claimed in claim 6, wherein the acquisition module is specifically configured to:
calculating a first loss function value of the super-resolution reconstruction sub-network in the initial neural network through the target image set and a preset mean square error loss function;
calculating a second loss function value of the target detection sub-network in the initial neural network through the detection information and a preset regression loss function;
and calculating a weighted sum of the first loss function value and the second loss function value according to preset weight values to obtain a target loss function value of the initial neural network.
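The weighted-sum objective of claim 7 reduces to a few lines. In this sketch a plain mean squared error stands in for the reconstruction loss, and the weight values are illustrative placeholders for the claim's "preset" weights:

```python
def mse(pred, target):
    """Mean squared error, used as the reconstruction (first) loss."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def target_loss(sr_loss, det_loss, w_sr=0.5, w_det=0.5):
    """Weighted sum of the two sub-network losses; the weights here are
    placeholder values, since the claim only says they are preset."""
    return w_sr * sr_loss + w_det * det_loss

first = mse([0.2, 0.4], [0.0, 0.4])            # super-resolution branch
loss = target_loss(first, 1.0, w_sr=0.25, w_det=0.75)
```

Because both branches feed one scalar objective, gradient descent on this target loss trains the reconstruction and detection sub-networks jointly rather than in separate stages.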
8. The super-resolution based vehicle detection apparatus as claimed in claim 6, wherein the feature splitting sub-module is specifically configured to:
performing convolution processing on the convolution features through a feature splitting and multiplexing unit in the multi-level feature splitting and multiplexing layer to obtain initial features, wherein the multi-level feature splitting and multiplexing layer comprises the feature splitting and multiplexing unit and a dimension reduction unit;
performing splitting processing on the initial features along a preset channel dimension to obtain initial multi-layer features and initial one-layer features;
performing convolution processing and splitting processing at a preset number of levels on the initial multi-layer features to obtain target multi-layer features and a plurality of candidate one-layer features;
performing concatenation and convolution processing at the preset number of levels on the initial one-layer features and the plurality of candidate one-layer features through the dimension reduction unit to obtain target one-layer features;
and performing element-wise addition on the target multi-layer features and the target one-layer features to obtain the features to be transposed.
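The split-and-reuse flow of claim 8 can be shown with a toy sketch in which plain numbers stand in for channels, a sum stands in for the dimension-reduction convolution, and the level count is an arbitrary choice (all illustrative simplifications):

```python
def split_channels(channels, keep=1):
    """Peel off `keep` channel(s) for later reuse; the remaining channels
    continue through the next convolution level."""
    return channels[keep:], channels[:keep]

feats = list(range(8))        # initial features: 8 toy "channels"
reused = []
for _ in range(3):            # preset number of levels (assumed: 3)
    feats, peeled = split_channels(feats)
    reused.extend(peeled)     # candidate one-layer features

# Dimension reduction: concatenate the reused one-layer features and
# collapse them (a plain sum stands in for the reduction convolution).
reduced = sum(reused)

# Element-wise addition of the deep multi-layer path and the reduced
# one-layer path yields the features to be transposed.
to_transpose = [f + reduced for f in feats]
```

The point of the structure is that the peeled-off shallow features are not discarded: they are multiplexed back in through the dimension reduction unit, so the features handed to the transposed convolution layer carry both deep and shallow information.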
9. A super-resolution-based vehicle detection apparatus, characterized by comprising: a memory and at least one processor, the memory having instructions stored therein;
the at least one processor invokes the instructions in the memory to cause the super-resolution based vehicle detection apparatus to perform the super-resolution based vehicle detection method of any one of claims 1-5.
10. A computer-readable storage medium comprising a data storage area and a program storage area, wherein the data storage area stores data created from the use of blockchain nodes and the program storage area stores instructions which, when executed by a processor, implement the super-resolution based vehicle detection method of any one of claims 1-5.
CN202010926415.7A 2020-09-07 2020-09-07 Super-resolution-based vehicle detection method, device, equipment and storage medium Active CN112016507B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010926415.7A CN112016507B (en) 2020-09-07 2020-09-07 Super-resolution-based vehicle detection method, device, equipment and storage medium


Publications (2)

Publication Number Publication Date
CN112016507A CN112016507A (en) 2020-12-01
CN112016507B true CN112016507B (en) 2023-10-31

Family

ID=73515391

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010926415.7A Active CN112016507B (en) 2020-09-07 2020-09-07 Super-resolution-based vehicle detection method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112016507B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112508787A (en) * 2020-12-14 2021-03-16 磐基技术有限公司 Target detection method based on image super-resolution
CN112562069B (en) * 2020-12-24 2023-10-27 北京百度网讯科技有限公司 Method, device, equipment and storage medium for constructing three-dimensional model
CN112801868B (en) * 2021-01-04 2022-11-11 青岛信芯微电子科技股份有限公司 Method for image super-resolution reconstruction, electronic device and storage medium
CN112990053B (en) * 2021-03-29 2023-07-25 腾讯科技(深圳)有限公司 Image processing method, device, equipment and storage medium
CN113343837B (en) * 2021-06-03 2023-08-22 华南理工大学 Intelligent driving method, system, device and medium based on vehicle lamp language recognition
CN113536971A (en) * 2021-06-28 2021-10-22 中科苏州智能计算技术研究院 Target detection method based on incremental learning
CN113421305B (en) * 2021-06-29 2023-06-02 上海高德威智能交通系统有限公司 Target detection method, device, system, electronic equipment and storage medium
CN113313098B (en) * 2021-07-30 2022-01-04 阿里云计算有限公司 Video processing method, device, system and storage medium
CN113837104B (en) * 2021-09-26 2024-03-15 大连智慧渔业科技有限公司 Underwater fish target detection method and device based on convolutional neural network and storage medium
CN113780476A (en) * 2021-10-09 2021-12-10 中国铁建重工集团股份有限公司 Rock slag characteristic detection model training method, device, equipment and medium
CN114708569B (en) * 2022-02-22 2023-03-24 广州文远知行科技有限公司 Road curve detection method, device, equipment and storage medium
CN114943650A (en) * 2022-04-14 2022-08-26 北京东软医疗设备有限公司 Image deblurring method and device, computer equipment and storage medium
CN116872961B (en) * 2023-09-07 2023-11-21 北京捷升通达信息技术有限公司 Control system for intelligent driving vehicle
CN117315654B (en) * 2023-11-28 2024-03-15 深圳赛陆医疗科技有限公司 End-to-end gene sequencing method and device, gene sequencer and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109544457A (en) * 2018-12-04 2019-03-29 电子科技大学 Image super-resolution method, storage medium and terminal based on fine and close link neural network
CN109671022A (en) * 2019-01-22 2019-04-23 北京理工大学 A kind of picture texture enhancing super-resolution method based on depth characteristic translation network
CN110956126A (en) * 2019-11-27 2020-04-03 云南电网有限责任公司电力科学研究院 Small target detection method combined with super-resolution reconstruction
CN111260552A (en) * 2020-01-09 2020-06-09 复旦大学 Image super-resolution method based on progressive learning
CN111368790A (en) * 2020-03-18 2020-07-03 北京三快在线科技有限公司 Construction method, identification method and construction device of fine-grained face identification model


Also Published As

Publication number Publication date
CN112016507A (en) 2020-12-01

Similar Documents

Publication Publication Date Title
CN112016507B (en) Super-resolution-based vehicle detection method, device, equipment and storage medium
CN110378844B (en) Image blind motion blur removing method based on cyclic multi-scale generation countermeasure network
CN110232394B (en) Multi-scale image semantic segmentation method
CN110211045B (en) Super-resolution face image reconstruction method based on SRGAN network
CN110189255B (en) Face detection method based on two-stage detection
CN109035149B (en) License plate image motion blur removing method based on deep learning
CN111507521B (en) Method and device for predicting power load of transformer area
CN111639692A (en) Shadow detection method based on attention mechanism
CN111028146A (en) Image super-resolution method for generating countermeasure network based on double discriminators
CN113298815A (en) Semi-supervised remote sensing image semantic segmentation method and device and computer equipment
Zhao et al. ADRN: Attention-based deep residual network for hyperspectral image denoising
CN114663440A (en) Fundus image focus segmentation method based on deep learning
CN112149526B (en) Lane line detection method and system based on long-distance information fusion
WO2021205424A2 (en) System and method of feature detection in satellite images using neural networks
CN111899203B (en) Real image generation method based on label graph under unsupervised training and storage medium
CN109948575A (en) Eyeball dividing method in ultrasound image
CN114419381A (en) Semantic segmentation method and road ponding detection method and device applying same
CN116757930A (en) Remote sensing image super-resolution method, system and medium based on residual separation attention mechanism
CN117576402B (en) Deep learning-based multi-scale aggregation transducer remote sensing image semantic segmentation method
CN115995042A (en) Video SAR moving target detection method and device
CN115293966A (en) Face image reconstruction method and device and storage medium
CN115578262A (en) Polarization image super-resolution reconstruction method based on AFAN model
CN114202473A (en) Image restoration method and device based on multi-scale features and attention mechanism
CN117593187A (en) Remote sensing image super-resolution reconstruction method based on meta-learning and transducer
CN113989612A (en) Remote sensing image target detection method based on attention and generation countermeasure network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant