WO2021249523A1 - Image processing method and apparatus (图像处理方法及装置) - Google Patents

Image processing method and apparatus

Info

Publication number: WO2021249523A1
Authority: WIPO (PCT)
Prior art keywords: convolution, image, peripheral, calculation, data
Application number: PCT/CN2021/099579
Other languages: English (en), French (fr)
Inventors: 程捷, 蒋磊, 曹洋, 查正军
Original Assignees: 华为技术有限公司 (Huawei Technologies Co., Ltd.), 中国科学技术大学 (University of Science and Technology of China)
Application filed by 华为技术有限公司 and 中国科学技术大学
Publication of WO2021249523A1
Priority to US18/064,132 (published as US20230104428A1)

Classifications

    • G06V10/7715 Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06T5/00 Image enhancement or restoration
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06T5/20 Image enhancement or restoration using local operators
    • G06T5/60 Image enhancement or restoration using machine learning, e.g. neural networks
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06T2207/10004 Still image; Photographic image
    • G06T2207/10016 Video; Image sequence
    • G06T2207/10024 Color image
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]

Definitions

  • This application relates to the field of computer technology, and in particular to an image processing method and device.
  • For degraded images, a common method is to first use image enhancement algorithms to enhance the degraded image and improve the quality of the low-quality image, and then perform processing such as recognition on the quality-improved image.
  • However, the main purpose of existing image enhancement algorithms is to obtain an enhanced image with a good visual perception effect, and their target is the human observer.
  • Such algorithms cannot guarantee that a computer network can extract complete structural or statistical features from the enhanced image, so the recognition accuracy is low.
  • the image processing method and device provided by the present application can improve the processing effect of low-quality images.
  • In a first aspect, an embodiment of the present invention provides an image processing method. According to this method, after image data of a target image is received, the image data can be processed based on network parameters to obtain enhanced image feature data of the target image, and the target image can then be processed based on the enhanced image feature data.
  • the target image is a low-quality image
  • the network parameter is used to indicate the correspondence between the feature data of the low-quality image and the feature data of the clear image.
  • According to this method, the low-quality image itself is not processed in advance; instead, during image processing, the set network parameters are used to process the image data of the low-quality image to obtain enhanced image feature data of the low-quality image, and the low-quality image is processed based on the enhanced image feature data.
  • Because the network parameters reflect the correspondence between the feature data of low-quality images and the feature data of clear images, the connection between low-quality image features and clear image features is used during processing to process the features of the low-quality target image. Therefore, the feature data of the low-quality image can be enhanced, the network recognizability of the feature data of the target image can be improved, and the processing effect of the low-quality image can be improved; for example, the recognition accuracy of the low-quality image can be improved.
  • In a possible implementation, the feature data of the target image may be obtained according to the image data; neural network calculation may be performed on the feature data and the image data based on the network parameters to obtain residual data; and the enhanced image feature data of the target image may be obtained based on the residual data and the feature data.
  • The feature data is obtained by calculating the image data through N layers of neural network, where N is greater than 0 and less than a preset threshold, and the residual data is used to indicate the deviation between the feature data of the target image and the feature data of a clear image.
  • Because the non-classical receptive field structure of the retina is referred to during processing and the light-sensing principle of retinal bipolar cells is simulated, this method can enhance the high-frequency information in the target image while maintaining the low-frequency information in the target image, which makes the enhanced image feature data easier to recognize or extract and yields a better processing effect; for example, it can improve the recognition accuracy of low-quality images.
  • the robustness (or stability) of the network can be made stronger.
  • Because the embodiment of the present invention uses the feature drift properties of the image to process low-quality images, the processing does not require supervision by semantic signals (used to indicate image content), and fewer network parameters are needed.
  • In a possible implementation, the performing neural network calculations on the feature data and the image data based on the network parameters includes: performing center-peripheral convolution calculation on the feature data and the image data based on the network parameters. According to this method, the light-sensing principle of retinal bipolar cells can be simulated, and the image processing effect is better.
  • In a possible implementation, the performing neural network calculations on the feature data and the image data based on the set network parameters includes: performing at least a first-level center-peripheral convolution calculation, a second-level center-peripheral convolution calculation, and a third-level center-peripheral convolution calculation on the feature data and the image data based on the set network parameters.
  • According to this method, the structure of the non-classical receptive field of the retina and the light-sensing principle of retinal bipolar cells can be simulated, and the accuracy and effect of image recognition can be improved.
  • The input data of the first-level center-peripheral convolution calculation includes the feature data and the image data, the input data of the second-level center-peripheral convolution calculation includes the calculation result of the first-level center-peripheral convolution calculation, and the input data of the third-level center-peripheral convolution calculation includes the calculation result of the second-level center-peripheral convolution calculation.
  • The residual data is obtained based on the calculation result of the first-level center-peripheral convolution calculation, the calculation result of the second-level center-peripheral convolution calculation, and the calculation result of the third-level center-peripheral convolution calculation.
  • The first-level center-peripheral convolution calculation is used to simulate the response of the central region of the retina of the human eye to the target image, the second-level center-peripheral convolution calculation is used to simulate the response of the peripheral area of the human retina to the target image, and the third-level center-peripheral convolution calculation is used to simulate the response of the edge area of the human retina to the target image.
  • In a possible implementation, the first-level center-peripheral convolution calculation includes: performing a first convolution operation on the feature data and the image data based on a first convolution kernel to obtain a first intermediate result, wherein the weight of the central area of the first convolution kernel is 0; performing a second convolution operation on the feature data and the image data based on a second convolution kernel to obtain a second intermediate result, wherein the second convolution kernel only includes the weight of the central area, and the first convolution kernel and the second convolution kernel have the same size; and obtaining the calculation result of the first-level center-peripheral convolution based on the first intermediate result and the second intermediate result.
  • In a possible implementation, the second-level center-peripheral convolution calculation includes: performing a third convolution operation on the calculation result of the first-level center-peripheral convolution based on a third convolution kernel to obtain a third intermediate result, wherein the weight of the central area of the third convolution kernel is 0; performing a fourth convolution operation on the calculation result of the first-level center-peripheral convolution based on a fourth convolution kernel to obtain a fourth intermediate result, wherein the fourth convolution kernel only includes the weight of the central area, and the third convolution kernel and the fourth convolution kernel have the same size; and obtaining the calculation result of the second-level center-peripheral convolution based on the third intermediate result and the fourth intermediate result.
  • In a possible implementation, the third-level center-peripheral convolution calculation includes: performing a fifth convolution operation on the calculation result of the second-level center-peripheral convolution based on a fifth convolution kernel to obtain a fifth intermediate result, wherein the weight of the central area of the fifth convolution kernel is 0; performing a sixth convolution operation on the calculation result of the second-level center-peripheral convolution based on a sixth convolution kernel to obtain a sixth intermediate result, wherein the sixth convolution kernel only includes the weight of the central region, and the fifth convolution kernel and the sixth convolution kernel have the same size; and obtaining the calculation result of the third-level center-peripheral convolution based on the fifth intermediate result and the sixth intermediate result.
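As an illustration of one level of this calculation, the following is a minimal sketch of a single center-peripheral convolution layer written in PyTorch (the patent does not name a framework; class and parameter names are assumptions). The peripheral branch masks the central weight of a k*k kernel to 0, the central branch keeps only the central weight (a 1*1 convolution), and the two intermediate results are superimposed; the sign-dependent combination described later with FIG. 7 is omitted here.

```python
# Illustrative sketch of one center-peripheral convolution layer.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CenterPeripheralConv(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, k: int = 3):
        super().__init__()
        assert k % 2 == 1, "odd kernel size so a single central weight exists"
        self.peripheral = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)
        self.central = nn.Conv2d(in_ch, out_ch, 1)   # center-only weight
        # mask that zeroes the central weight of the peripheral kernel
        mask = torch.ones(1, 1, k, k)
        mask[..., k // 2, k // 2] = 0.0
        self.register_buffer("mask", mask)

    def forward(self, x):
        w = self.peripheral.weight * self.mask        # central weight fixed to 0
        first = F.conv2d(x, w, self.peripheral.bias,
                         padding=self.peripheral.padding)  # peripheral convolution
        second = self.central(x)                      # central convolution
        return first + second                         # superimpose the two results

y = CenterPeripheralConv(4, 8)(torch.randn(1, 4, 16, 16))
```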
  • the method may be executed by a neural network device, and the network parameters are obtained after training.
  • In another aspect, the present application provides an image recognition device, which includes functional modules for implementing the image processing method in the first aspect or any implementation of the first aspect.
  • The present application further provides an image recognition device, including a neural network used to implement the image processing method in the first aspect or any implementation of the first aspect.
  • The present application also provides a computer program product, including program code, where instructions included in the program code are executed by a computer to implement the image processing method in the first aspect or any implementation of the first aspect.
  • The present application also provides a computer-readable storage medium for storing program code, where instructions included in the program code are executed by a computer to implement the image processing method in the first aspect or any implementation of the first aspect.
  • FIG. 1 is a schematic structural diagram of an image processing device provided by an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of a neural network layer in a neural network system provided by an embodiment of the present invention
  • FIG. 3A is a flowchart of an image processing method provided by an embodiment of the present invention.
  • FIG. 3B is a flowchart of another image processing method provided by an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of signals of an image processing method provided by an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a structure of a non-classical receptive field in the retina of a human eye provided by an embodiment of the present invention
  • FIG. 6 is a schematic structural diagram of a feature drift module provided by an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of a center-periphery convolution mechanism provided by an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of training of a neural network system provided by an embodiment of the present invention.
  • FIG. 9 is a schematic diagram of another image processing apparatus provided by an embodiment of the present invention.
  • FIG. 1 is a schematic structural diagram of an image processing device provided by an embodiment of the present invention.
  • the image processing apparatus 100 may include a control module 105 and a neural network circuit 110.
  • the control module 105 may include a processor 1052 and a memory 1054.
  • the processor 1052 is the computing core and control unit of the control module 105.
  • the processor 1052 may include one or more processor cores (cores).
  • the processor 1052 may be a very large-scale integrated circuit.
  • An operating system and other software programs are installed in the processor 1052, so that the processor 1052 can implement access to the memory 1054, cache, disk, and peripheral devices (such as the neural network circuit in FIG. 1).
  • A core in the processor 1052 may be, for example, a central processing unit (CPU) or another application-specific integrated circuit (ASIC).
  • the memory 1054 can be used as a cache of the processor 1052.
  • the memory 1054 may be connected to the processor 1052 through a double data rate (DDR) bus.
  • The memory 1054 is generally used to store various running software of the operating system, input and output data, and information exchanged with external memory. To increase the access speed of the processor 1052, the memory 1054 needs to have a fast access speed.
  • dynamic random access memory (DRAM) is usually used as the memory 1054.
  • the processor 1052 can access the memory 1054 at a high speed through a memory controller (not shown in FIG. 1), and read and write any storage unit in the memory 1054.
  • The neural network circuit 110 is used to perform artificial neural network (ANN) calculations, also referred to as neural network (NN) calculations.
  • Artificial neural networks may include neural networks such as convolutional neural networks (CNN), deep neural networks (DNN), and multilayer perceptrons (MLP). Neural networks are usually used for image recognition, image classification, speech recognition, and so on.
  • the neural network circuit 110 may include one or more neural network chips 115 (may be referred to as chips 115 for short) for performing artificial neural network calculations.
  • the one or more chips 115 are used to perform neural network calculations.
  • the neural network circuit 110 is connected to the control module 105. As shown in FIG. 1, the neural network circuit 110 can be connected to the control module 105 through the connection bus 106.
  • the connection bus 106 may be a PCIE (peripheral component interconnect express) bus, or may be another connection line (such as a network cable, etc.).
  • the connection mode of the neural network circuit 110 and the control module 105 is not limited.
  • the processor 1052 can access the neural network circuit 110 through the connection bus 106.
  • The image data to be processed can be sent to the chip 115 in the neural network circuit 110 through the connection bus 106, and the processing result of the neural network circuit 110 is then received through the connection bus 106.
  • the control module 105 can also monitor the working status of the neural network circuit 110 through the connection bus 106.
  • the neural network system may include multiple neural network layers.
  • The neural network layer is a logical concept: one neural network layer refers to one neural network operation to be performed.
  • the neural network layer may include a convolutional layer, a pooling layer, and so on.
  • the neural network system may include n neural network layers (also referred to as n-layer neural network), where n is an integer greater than or equal to 2.
  • Figure 2 shows part of the neural network layers in the neural network system.
  • For example, the first layer 202 can perform a convolution operation, the second layer 204 can perform a pooling operation on the output data of the first layer 202, the third layer 206 can perform a convolution operation on the output data of the second layer 204, the fourth layer 208 can perform a convolution operation on the output result of the third layer 206, and the fifth layer 210 can perform a summation operation on the output data of the second layer 204 and the output data of the fourth layer 208, and so on.
  • It should be noted that FIG. 2 is only a simple example and description of the neural network layers in the neural network system, and does not limit the specific operations of each layer of the neural network. For example, the fourth layer 208 can also perform a pooling operation, and the fifth layer 210 may also perform other neural network operations such as a convolution operation or a pooling operation.
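As a purely illustrative sketch of the logical layer structure of FIG. 2 (the framework and channel sizes are assumptions, not from the patent), the five operations could be wired as follows:

```python
# Minimal PyTorch sketch of the five logical layers described for FIG. 2:
# conv -> pool -> conv -> conv -> summation.
import torch
import torch.nn as nn

class Fig2Net(nn.Module):
    def __init__(self, ch: int = 8):
        super().__init__()
        self.layer1 = nn.Conv2d(3, ch, 3, padding=1)   # first layer 202: convolution
        self.layer2 = nn.MaxPool2d(2)                  # second layer 204: pooling
        self.layer3 = nn.Conv2d(ch, ch, 3, padding=1)  # third layer 206: convolution
        self.layer4 = nn.Conv2d(ch, ch, 3, padding=1)  # fourth layer 208: convolution

    def forward(self, x):
        y1 = self.layer1(x)
        y2 = self.layer2(y1)
        y3 = self.layer3(y2)
        y4 = self.layer4(y3)
        return y2 + y4                                 # fifth layer 210: summation

out = Fig2Net()(torch.randn(1, 3, 32, 32))
```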
  • FIG. 1 is only a schematic diagram of an image processing device.
  • the image processing apparatus in the embodiment of the present invention may also be a computing device capable of performing neural network calculations, such as a server or a computer.
  • These computing devices may include computing nodes such as a central processing unit (CPU) or a graphics processing unit (GPU) for performing neural network calculations.
  • These computing devices may not include the neural network chip specifically used for performing neural network calculations as shown in FIG. 1.
  • the specific structure of the image processing device is not limited, as long as it includes a neural network capable of implementing the image processing method provided by the embodiment of the present invention.
  • devices including a neural network such as a computer, a server, and the image processing device shown in FIG. 1, may also be referred to as a neural network device or a neural network system.
  • image processing may include image processing methods such as image classification, target recognition, and semantic segmentation.
  • the image processing method provided by the embodiment of the present invention can be applied to scenes such as automatic driving, smart phone photography, and intelligent monitoring system, etc., to improve the processing accuracy of low-quality images by the image processing device.
  • FIG. 3A is a flowchart of an image processing method provided by an embodiment of the present invention
  • FIG. 3B is a flowchart of another image processing method provided by an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of signals of an image processing method provided by an embodiment of the present invention. Both the image processing methods shown in FIGS. 3A and 3B can be executed by the neural network circuit 110 in FIG. 1.
  • the neural network chip 115 in the neural network circuit 110 performs neural network calculation on the input image data.
  • the neural network circuit 110 may perform multiple neural network calculations such as convolution and pooling on the image data to obtain the image processing result.
  • the image processing method provided by the embodiment of the present invention may include the following steps.
  • In step 302, image data of a target image is received, and the target image is a low-quality image.
  • A low-quality image refers to an image whose inherent color and structure information has been destroyed by interference during imaging, caused by factors such as light (for example, low light or overexposure), weather (for example, rain, snow, or fog), or relative movement of the target.
  • a low-quality image is an image whose image quality is lower than a preset threshold.
  • the chip 115 in the neural network circuit 110 can receive the image data of the target image 402 to be processed sent by the processor 1052.
  • The low-quality images may be images in a pre-collected image database, or the image processing device may be connected to an image collection system to process the images collected by the image collection system in real time.
  • the target image may be a picture, a video, or the like.
  • the image data is processed based on network parameters to obtain enhanced image feature data of the target image.
  • the network parameter is used to indicate the correspondence between the feature data of the low-quality image and the feature data of the clear image.
  • the neural network system is provided with network parameters after training.
  • The network parameters are obtained after training based on a plurality of low-quality images and clear images, and can indicate the correspondence between the feature data of the low-quality images and the feature data of the clear images.
  • In step 304, the feature data of the target image can be obtained according to the image data.
  • the feature data is data obtained by calculating the image data through N layers of neural network, and N is greater than 0 and less than a preset threshold.
  • the feature data of the target image may include the shallow feature data of the target image.
  • the shallow feature data may include data used to indicate features such as color, structure, and texture of the image.
  • the feature data of the target image can be obtained by the chip 115 in the neural network circuit 110 performing N-layer neural network calculations.
  • the feature data may be feature data (feature map) calculated and output by the first N layers of neural network performed by the neural network circuit 110.
  • the value of N can be set according to actual needs, for example, N can be less than 5 or N can be less than 10, etc., which is not limited here.
  • the N-layer neural network calculation performed by the chip 115 is referred to as the feature acquisition module 403 in FIG. 4.
  • For example, the feature data output by any one of the first five neural network layers in the neural network calculation performed by the neural network circuit 110 can be used as the feature data of the target image. Taking the neural network circuit 110 that needs to perform the n-layer neural network calculation shown in FIG. 2 as an example, the feature data output by the first layer 202, the second layer 204, the third layer 206, the fourth layer 208, or the fifth layer 210 can be used as the feature data of the target image.
  • the feature acquisition module 403, feature drift module 405, enhancement processing module 407, and the next layer of neural network 409 shown in FIG. 4 are all logical concepts and are used to indicate the neural network circuit 110 in FIG. 1 Neural network calculations performed.
  • In step 306, neural network calculations are performed on the feature data and the image data based on the set network parameters to obtain residual data.
  • the residual data is used to indicate the deviation between the feature data of the target image and the feature data of the clear image.
  • the network parameter is used to indicate the correspondence between the feature data of the low-quality image and the feature data of the clear image.
  • the characteristic data of the clear image includes the characteristic data of the clear image used in the training process.
  • the feature data of the clear image includes the shallow feature data of the clear image.
  • For example, the feature data 404 and the image data of the target image 402 can be calculated by the feature drift module (Feature De-drifting Module) 405 to obtain the residual data 406.
  • the feature drift module 405 is also a neural network calculation performed in the neural network circuit 110.
  • the feature drift module 405 is a neural network calculation module based on the non-classical receptive field of adaptive antagonistic convolution.
  • Research shows that, for image blocks with similar structures, the feature representations of the clear image blocks and the feature representations of the corresponding low-quality image blocks exhibit the same feature drift law, and this law is irrelevant to the image content. Specifically, the shallow features of all clear image blocks with similar structures are gathered together, and similarly, the shallow features of all low-quality image blocks corresponding to those clear image blocks are also gathered together; this clustering effect has nothing to do with the content of the image.
  • Based on this, the embodiment of the present invention establishes the correspondence between the feature data of low-quality images (referred to as low-quality features) and the feature data of clear images (referred to as clear features), and uses this correspondence to improve the processing accuracy of low-quality images.
  • Specifically, the embodiment of the present invention proposes a feature drift network based on the non-classical receptive field (nCRF) mechanism in the human retina, and proposes a "center-peripheral convolution mechanism" based on the light-sensing principle of bipolar cells.
  • the receptive field is the basic structure and functional unit of visual system information processing.
  • The retinal ganglion has a classical receptive field (CRF) of concentric-circle antagonism, and its spatial integration characteristic is to process the brightness contrast information of an image area and extract the edge information of the image.
  • the non-classical receptive field is a large area outside the classical receptive field.
  • The non-classical receptive field of retinal ganglion cells is mainly de-inhibitory, so it can compensate to a certain extent for the loss of low-frequency information caused by the classical receptive field: while maintaining the boundary enhancement function, it transmits the regional brightness gradient information of the image, which reflects the slow change of brightness over a large surface. It can be seen that the non-classical receptive field greatly broadens the scope of visual cell processing and provides a neural basis for the integration and detection of large-scale complex graphics.
  • the non-classical receptive field in the retina of the human eye contains multiple antagonistic sub-regions, and each sub-region cooperates with each other to realize the function of high-frequency enhancement and low-frequency maintenance, thereby helping the human eye to better distinguish external things.
  • antagonism refers to the phenomenon that one substance (or process) is blocked by another substance (or process).
  • FIG. 5 is a schematic diagram of the structure of the non-classical receptive field in the human retina according to an embodiment of the present invention.
  • As shown in FIG. 5, the non-classical receptive field in the human retina may include three areas: a central area 502, a peripheral area 504, and an edge area 506.
  • The mathematical expression of the non-classical receptive field in the human retina can be written as a weighted, mutually antagonistic combination of three Gaussian kernels:

    f(x, y) = A1·G1(x, y) − A2·G2(x, y) + A3·G3(x, y)    (1)

  • G1 to G3 represent three Gaussian convolution kernels with different bandwidths:

    Gi(x, y) = (1 / (2π·σi²)) · exp(−(x² + y²) / (2·σi²)), i = 1, 2, 3    (2)

  • A1 to A3 respectively represent the weighting coefficients of the central area, the peripheral area, and the edge area, and the variances σ1 to σ3 determine the bandwidths of the three Gaussian functions.
  • With the signs absorbed into the coefficients, formula (1) can be written as follows:

    f(x, y) = Σᵢ Ai·Gi(x, y), i = 1, 2, 3    (3)
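For illustration, the three-Gaussian receptive field of formulas (1)–(3) can be constructed numerically as follows; the weights A1–A3 and bandwidths σ1–σ3 used here are assumed example values, not values from the patent:

```python
# Illustrative NumPy construction of the three-Gaussian receptive-field model:
# three Gaussian kernels G1-G3 with increasing bandwidths, weighted by A1-A3
# for the central, peripheral, and edge areas.
import numpy as np

def gaussian_kernel(size: int, sigma: float) -> np.ndarray:
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    return np.exp(-(xx**2 + yy**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)

size = 21
A = [1.0, -0.6, 0.2]        # assumed antagonistic weights A1..A3
sigma = [1.0, 3.0, 6.0]     # assumed bandwidths sigma1..sigma3
f = sum(a * gaussian_kernel(size, s) for a, s in zip(A, sigma))  # formula (3)
```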
  • the present invention proposes a feature drift module.
  • This module contains a multi-level structure, and each level of sub-structure contains one or more convolutional layers. Given a set of low-quality feature maps as input, each sub-network in the feature drift module outputs a set of results, and finally the multiple sets of output results are weighted and merged to obtain the final residual data.
  • FIG. 6 is a schematic structural diagram of a feature drift module 405 according to an embodiment of the present invention.
  • the feature drift module 405 including a three-level convolution module as an example.
  • the feature drift module 405 may include three-level convolution modules: G1 4051, G2 4052, and G3 4053.
  • the feature drift module 405 may also include a processing module 4054.
  • The convolution module G1 4051 is used to simulate the function of the central area 502 of the non-classical receptive field in the human retina shown in FIG. 5, the convolution module G2 4052 is used to simulate the function of the peripheral area 504 shown in FIG. 5, and the convolution module G3 4053 is used to simulate the function of the edge area 506 shown in FIG. 5.
  • each level of convolution module can be realized by one or more convolution layers.
  • In FIG. 6, each level of convolution module including two convolutional layers is taken as an example for illustration.
  • G1 4051 includes convolutional layers G1_1 and G1_2
  • G2 4052 includes convolutional layers G2_1 and G2_2
  • G3 4053 includes convolutional layers G3_1 and G3_2.
  • the processing module 4054 is used to implement weighted superposition processing on the results output by the three convolution modules G1 4051, G2 4052, and G3 4053.
  • A bipolar neuron cell is a neuron that emits one protrusion from each end of the cell body: one protrusion is distributed to the surrounding sensory receptors (also referred to as a peripheral protrusion or dendrite), and the other protrusion enters the central part (also referred to as a central protrusion or axon).
  • Bipolar neuron cells connect optic cells and ganglion cells, acting as a longitudinal connection. Bipolar neuron cells can be divided into two types: on-center and off-center. On-center means the cell is excited when light falls on its central area; off-center means the cell is inhibited when light falls on its central area and excited when the light stimulation in the central area is stopped.
  • Each convolution module in FIG. 6 needs to perform calculations according to the "center-peripheral convolution mechanism". Specifically, each convolution layer in each convolution module performs two types of convolution in parallel: central convolution and peripheral convolution. For example, for the convolution modules G1 4051, G2 4052, and G3 4053 shown in FIG. 6, each convolution module includes two convolution layers, and each convolution layer performs both central convolution and peripheral convolution.
  • Within a convolution module, the input data of a latter convolution layer includes the calculation result of the previous convolution layer, and the calculation result of the last convolution layer in a convolution module is the result of that convolution module.
  • For example, the convolution layer G1_1 performs peripheral convolution and central convolution on the input data to obtain its calculation results, and these calculation results are then input to the convolution layer G1_2.
  • The convolution layer G1_2 continues to perform peripheral convolution and central convolution on the calculation result of the convolution layer G1_1 based on the set network parameters.
  • The calculation result of the convolution layer G1_2 is the calculation result of the convolution module G1.
  • FIG. 7 is a schematic diagram of a center-periphery convolution mechanism provided by an embodiment of the present invention. It should be noted that FIG. 7 illustrates how the convolutional layer G1_1 in the convolution module G1 4051 realizes center-peripheral convolution as an example. In practical applications, the working principles of the convolutional layer G1_2 in the convolution module G1 4051 and of the convolutional layers in the convolution modules G2 4052 and G3 4053 are similar to that of the convolutional layer G1_1, and reference can be made to the description of FIG. 7.
  • The convolutional layer G1_1 is used to perform a peripheral convolution operation on the input data 702 based on the first convolution kernel 7022 set in the convolutional layer G1_1, so as to simulate on-center light-sensitive bipolar neuron cells.
  • The convolutional layer G1_1 is also used to perform a central convolution operation on the input data 702 based on the second convolution kernel 7024, so as to simulate off-center bipolar neuron cells. It is understandable that, in practical applications, a first part of the computing resources (or computing nodes) that execute the convolutional layer G1_1 can perform the peripheral convolution calculation while a second part of the computing resources performs the central convolution calculation in parallel.
  • the convolutional layer G1_1 can perform a first convolution operation on the input data based on the first convolution kernel 7022 to obtain a first intermediate result 704.
  • The weight of the central area of the first convolution kernel is 0, which means that convolution calculation is not performed on the value corresponding to the central-area weight of the first convolution kernel.
  • the first convolution kernel 7022 may be a hollow 3*3 convolution kernel.
  • the first convolution operation may be referred to as a peripheral convolution operation.
  • the convolutional layer G1_1 may also perform a second convolution operation on the input data based on the second convolution kernel 7024 to obtain a second intermediate result 706.
  • the weight of the central area of the second convolution kernel 7024 is valid, and the weight of the peripheral area of the second convolution kernel 7024 may be zero.
  • the second convolution kernel may be a 1*1 convolution kernel at the center.
  • the second convolution operation may be referred to as a central convolution operation.
  • When the convolution layer G1_1 performs the central convolution operation, only the value corresponding to the weight of the central region of the second convolution kernel is subjected to convolution calculation.
  • The first convolution kernel 7022 and the second convolution kernel 7024 have the same size, for example, both are 3*3. It is understandable that 3*3 is only an example of the size of the convolution kernel; in actual applications, the size of the convolution kernel can also be 4*4, 5*5, 9*9, etc., and the size of the convolution kernel is not limited here. In addition, in practical applications, the second convolution kernel may have only one value at the center. In FIG. 7, to illustrate the difference between the first convolution kernel 7022 and the second convolution kernel 7024, the second convolution kernel 7024 with which G1_1 performs the central 1*1 convolution is drawn with a size of 3*3 as an example.
  • the side length of the central area of a convolution kernel needs to be smaller than the side length of the convolution kernel.
  • the central area of the convolution kernel may be a weight located at the center of the convolution kernel.
  • the central area of the convolution kernel may be 4 weights around the center of the convolution kernel.
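The two central-area options just described (a single weight at the exact center of an odd-sized kernel, or the 4 weights around the center) can be expressed as masks; the helper below is an illustrative sketch, not part of the patent:

```python
# Illustrative helper building a central-area mask for a k*k kernel.
import torch

def central_area_mask(k: int) -> torch.Tensor:
    m = torch.zeros(k, k)
    c = k // 2
    if k % 2 == 1:
        m[c, c] = 1.0                          # one weight at the exact center
    else:
        m[c - 1:c + 1, c - 1:c + 1] = 1.0      # 4 weights around the center
    return m

print(central_area_mask(3))   # central area of a 3*3 kernel
print(central_area_mask(4))   # central area of a 4*4 kernel
```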
  • After the first convolution operation is performed, the first intermediate result 704 can be obtained; after the second convolution operation is performed, the second intermediate result 706 can be obtained. The first intermediate result 704 and the second intermediate result 706 can then be superimposed to obtain the calculation result of the convolutional layer G1_1.
  • The calculation result of the convolutional layer G1_1 may be of two types: the first calculation result 708 and the second calculation result 710. The first calculation result 708 is the calculation result that strengthens the center-peripheral effect, while the second calculation result 710 is equivalent to the calculation result of an ordinary 3*3 convolution and does not strengthen the center-peripheral effect. Specifically, when the first intermediate result 704 is greater than 0 and the second intermediate result 706 is less than 0, or when the first intermediate result 704 is less than 0 and the second intermediate result 706 is greater than 0, the first calculation result 708 is output; when both the first intermediate result 704 and the second intermediate result 706 are greater than 0, or both are less than 0, the second calculation result 710 is output.
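The description above specifies when each of the two results is output but not the exact arithmetic producing results 708 and 710. The sketch below is therefore a hedged interpretation: the same-sign case is taken as the plain sum of the two intermediate results (which is equivalent to an ordinary k*k convolution), and the opposite-sign case is assumed, for illustration only, to amplify the antagonism by taking their difference.

```python
# Hedged sketch of the sign-dependent combination described for FIG. 7.
import torch

def combine(first: torch.Tensor, second: torch.Tensor) -> torch.Tensor:
    opposite = (first * second) < 0   # intermediate results 704/706 have opposite signs
    ordinary = first + second         # second calculation result 710: plain k*k conv
    strengthened = first - second     # assumed form of first calculation result 708
    return torch.where(opposite, strengthened, ordinary)

out = combine(torch.randn(1, 8, 16, 16), torch.randn(1, 8, 16, 16))
```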
  • When the convolution module G1 4051 includes two convolutional layers as shown in FIG. 6, after the calculation result of the convolutional layer G1_1 is obtained, it can be sent to the convolutional layer G1_2 to continue the convolution calculation. Similar to the convolutional layer G1_1, after the convolutional layer G1_2 performs peripheral convolution and central convolution in parallel based on the set convolution kernels, the calculation result of the convolutional layer G1_2 can be output. It is understandable that if the convolution module G1 4051 only included the convolutional layer G1_1, the calculation result of the convolutional layer G1_1 would be the calculation result of the convolution module G1 4051.
  • In this embodiment, the calculation result of the convolutional layer G1_2 is the calculation result of the convolution module G1 4051. It is understandable that when the convolution module G1 4051 includes other convolutional layers, the calculation result of the last convolutional layer is the calculation result of the convolution module G1 4051.
  • the input data of the convolution modules G1 4051, G2 4052, and G3 4053 are different.
  • the input data of the convolution module G1 4051 is the feature data 404 obtained in step 304 and the image data of the target image received in step 302.
  • The input data of the convolution module G2 4052 is the output data of the convolution module G1 4051 (that is, the calculation result of the convolution module G1 4051), and the input data of the convolution module G3 4053 is the output data of the convolution module G2 4052 (that is, the calculation result of the convolution module G2 4052).
  • The calculation processes of the convolution modules G2 4052 and G3 4053 are similar to that of the convolution module G1 4051, and for the convolution layers in the convolution modules G2 4052 and G3 4053, reference can be made to the data processing process shown in FIG. 7. Specifically, after the calculation result of the convolution module G1 4051 is obtained, the convolution module G2 4052 may perform the second-level center-peripheral convolution calculation on the calculation result of the convolution module G1 4051 based on the set network parameters: the convolution module G2 4052 may perform a third convolution operation on the calculation result of the first-level center-peripheral convolution based on the third convolution kernel to obtain the third intermediate result, wherein the weight of the central area of the third convolution kernel is 0.
  • While performing the third convolution operation, the convolution module G2 4052 may also perform a fourth convolution operation on the calculation result of the first-level center-peripheral convolution based on the fourth convolution kernel to obtain a fourth intermediate result.
  • the fourth convolution kernel only includes the weight of the central area, and the third convolution kernel and the fourth convolution kernel have the same size.
  • the calculation result of the second-level center-peripheral convolution is obtained based on the third intermediate result and the fourth intermediate result.
  • Similarly, the convolution module G3 4053 may perform the third-level center-peripheral convolution calculation on the calculation result of the convolution module G2 4052 based on the set network parameters. Specifically, the convolution module G3 4053 may perform a fifth convolution operation on the calculation result of the second-level center-peripheral convolution based on the fifth convolution kernel to obtain the fifth intermediate result, wherein the weight of the central area of the fifth convolution kernel is 0. While performing the fifth convolution operation, the convolution module G3 4053 may also perform a sixth convolution operation on the calculation result of the second-level center-peripheral convolution based on the sixth convolution kernel to obtain a sixth intermediate result, wherein the sixth convolution kernel only includes the weight of the central region, and the fifth convolution kernel and the sixth convolution kernel have the same size. The calculation result of the third-level center-peripheral convolution is then obtained based on the fifth intermediate result and the sixth intermediate result.
  • FIG. 6 takes the convolution modules G2 4052 and G3 4053 each including two convolution layers as an example.
  • In practical applications, the convolution modules G2 4052 and G3 4053 can also include one or more convolution layers, and each convolution layer can perform the central convolution and peripheral convolution operations described above based on different convolution kernels.
  • Similar to the convolution module G1 4051, the convolution module G2 4052 obtains its calculation result based on the calculation results of the one or more convolution layers in the convolution module G2 4052, and the convolution module G3 4053 likewise obtains its calculation result based on the calculation results of the one or more convolution layers in the convolution module G3 4053.
  • the convolution kernels set in the convolutional layers in the convolution modules G1 4051, G2 4052, and G3 4053 may also be collectively referred to as the network parameters of the feature drift module 405.
  • the network parameters are obtained after training on multiple low-quality images, and can be used to indicate the correspondence between low-quality image feature data and clear image feature data.
  • The convolution kernels of different convolution layers can be different, but within the same convolution layer, the convolution kernels that perform peripheral convolution and central convolution have the same size.
  • Because the feature drift module 405 provided by the embodiment of the present invention is implemented based on the feature drift law of the image, the correspondence between low-quality image feature data and clear image feature data indicated by the trained network parameters of the feature drift module 405 has nothing to do with the specific content of the image.
  • After the convolution modules G1 4051, G2 4052, and G3 4053 respectively perform the center-peripheral convolution shown in FIG. 7, the output results of the convolution modules G1 4051, G2 4052, and G3 4053 are input to the processing module 4054 for accumulation processing, so as to obtain the residual data 406 corresponding to the target image.
  • The residual data 406 is used to indicate the deviation between the feature data of the target image and the feature data of a clear image.
  • the processing module 4054 may also be a convolutional layer, and may perform a 1*1 convolution operation on the output results of the convolution modules G1 4051, G2 4052, and G3 4053 based on the weights set in the processing module 4054.
  • the weights in the processing module 4054 can be set to the weights A1, A2, and A3 in the above formula (3).
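A minimal sketch of the processing module 4054 as a 1*1 convolution over the concatenated outputs of the three convolution modules follows; initializing its weights from A1–A3 of formula (3) is shown as one option, and the channel count and weight values are assumptions:

```python
# Illustrative sketch of the processing module 4054: a 1*1 convolution that
# weights and accumulates the outputs of the three convolution modules.
import torch
import torch.nn as nn

ch = 8                                            # channels per module output (assumed)
merge = nn.Conv2d(3 * ch, ch, kernel_size=1, bias=False)

# Optionally seed each module's channels with the area weights A1..A3.
with torch.no_grad():
    merge.weight.zero_()
    for i, a in enumerate([1.0, -0.6, 0.2]):      # assumed A1..A3
        for c in range(ch):
            merge.weight[c, i * ch + c, 0, 0] = a

g1, g2, g3 = (torch.randn(1, ch, 16, 16) for _ in range(3))
residual = merge(torch.cat([g1, g2, g3], dim=1))  # residual data 406
```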
  • In step 308, enhanced image feature data is obtained according to the residual data and the shallow feature data.
  • the residual data 406 and the shallow feature data 404 can be superimposed by the enhancement processing module 407, so that the enhanced image feature data 408 can be obtained.
  • the enhancement processing module 407 may be implemented by an adder or a convolutional layer, and the implementation manner of the enhancement processing module 407 is not limited herein.
  • In step 310, the target image is processed based on the enhanced image feature data to obtain a processing result.
  • For example, the enhanced image feature data 408 can be input to the next layer of neural network 409, so that the target image 402 is processed based on the enhanced image feature data 408 to obtain the final processing result of the target image.
  • the target image may be identified, classified, detected, etc. based on the enhanced image feature data 408.
  • It can be seen from the above method that the low-quality image itself is not processed in advance; instead, during image processing, the set network parameters are used to process the image data of the low-quality image to obtain enhanced image feature data of the low-quality image, and the low-quality image is processed based on the enhanced image feature data.
  • Because the network parameters reflect the correspondence between the feature data of low-quality images and the feature data of clear images, the relationship between low-quality image features and clear image features is used during processing to process the features of the low-quality target image; therefore, the recognizability of network features can be improved, and the processing effect of low-quality images can be improved. For example, the recognition accuracy of low-quality images can be improved.
  • The image processing method provided by the embodiment of the present invention utilizes the feature drift properties of the image during image processing, and constructs a center-peripheral convolution mechanism based on the non-classical receptive field structure of the retina and the light-sensing principle of bipolar cells to process the shallow features of low-quality images to obtain enhanced image feature data; the low-quality image is then processed based on the enhanced image feature data.
  • Because the processing refers to the non-classical receptive field structure of the retina and simulates the light-sensing principle of retinal bipolar cells, it can enhance the high-frequency information in the target image while maintaining the low-frequency information in the target image, so that the enhanced image feature data is easier to recognize or extract, and the processing effect is better; for example, the recognition accuracy of low-quality images can be improved.
  • the robustness (or stability) of the network can be made stronger.
  • Because the embodiment of the present invention uses the feature drift properties of the image to process low-quality images, the processing does not require supervision by semantic signals (used to indicate image content), and fewer network parameters are needed.
  • FIG. 8 is a schematic diagram of training of a neural network system provided by an embodiment of the present invention. Similar to FIG. 4, the first feature acquisition module 803, the feature drift module 405, the enhancement processing module 807, the second feature acquisition module 809, and the error calculation module 811 shown in FIG. 8 are all logical concepts and can be neural network calculations performed by a neural network device, where the feature drift module 405 is the neural network to be trained.
  • The training process provided in FIG. 8 can be performed directly in the neural network circuit shown in FIG. 1, or in devices such as a central processing unit (CPU), a graphics processing unit (GPU), or a tensor processing unit (TPU).
  • the training scene is not limited here.
  • Based on a degraded-image imaging model, multiple low-quality images are generated from the selected multiple clear images to obtain the training set.
  • the generated multiple low-quality images may include various types and various low-quality degraded images.
  • For example, 15 degradation types can be considered, and each degradation type can include at least 5 degradation degrees, so that low-quality images of 15 degradation types can be generated for each clear image.
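The patent does not enumerate the 15 degradation types; the sketch below only illustrates the idea of generating training pairs from clear images, using two stand-in degradation types (noise and low light) at 5 degrees each:

```python
# Illustrative generation of (low-quality, clear) training pairs from a
# degraded-image imaging model; the degradation types are assumptions.
import numpy as np

def degrade(img: np.ndarray, kind: str, degree: int) -> np.ndarray:
    rng = np.random.default_rng(0)
    if kind == "noise":   # additive Gaussian noise, heavier with degree
        return np.clip(img + rng.normal(0, 0.05 * degree, img.shape), 0, 1)
    if kind == "dark":    # low light: scale brightness down with degree
        return img * (1.0 - 0.15 * degree)
    raise ValueError(kind)

clear = np.random.rand(64, 64, 3)
train_pairs = [(degrade(clear, k, d), clear)
               for k in ("noise", "dark") for d in range(1, 6)]
```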
  • the low-quality image 802 may be input to the first feature acquisition module 803 to obtain the first feature data 804 of the low-quality image 802.
  • the first feature data 804 may include shallow feature data of the low-quality image 802.
  • the clear image 810 may be input to the second feature acquisition module 809 to obtain the second feature data 812 of the clear image 810.
  • the second feature data 812 may include the shallow feature data of the clear image 810.
  • both the first feature acquisition module 803 and the second feature acquisition module 809 may be the first N layers of neural networks of the neural network system, where N is less than the preset threshold.
  • For example, both the first feature acquisition module 803 and the second feature acquisition module 809 may be the first N layers of a VGG16 or AlexNet network.
  • VGG16 or AlexNet are two network models.
  • For VGG16 and AlexNet, the feature data output by the first pooling layer "pooling1" and the first convolution layer "Conv1", respectively, can be selected as the first feature data 804.
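A hedged sketch of such a feature acquisition module follows, using torchvision's VGG16 up to its first pooling layer; the layer indexing is torchvision's, and whether the patent's "pooling1" maps exactly to this slice is an assumption:

```python
# Illustrative shallow-feature extraction with the first layers of VGG16.
import torch
from torchvision.models import vgg16

# features[:5] covers conv1_1, ReLU, conv1_2, ReLU, and the first max-pool.
backbone = vgg16(weights=None).features[:5]
backbone.eval()

with torch.no_grad():
    shallow = backbone(torch.randn(1, 3, 224, 224))  # first feature data 804
print(shallow.shape)  # torch.Size([1, 64, 112, 112])
```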
  • the type of neural network used to extract feature data of low-quality images is not limited.
  • the size of the input image and the size of the output feature map are not limited in the embodiment of the present invention, and can be set by themselves according to the network and user requirements.
  • the feature drift module 405 can obtain the residual data of the trained image data according to the network structure shown in FIG. 6 and the convolution process shown in FIG. 7.
  • The calculation process of the training process is similar to the calculation process of the aforementioned image processing process, except that the values of the network parameters (or convolution kernels) set in each convolution module shown in FIG. 6 during the training process differ from those set in the application process, and the purpose of training is to obtain appropriate network parameters.
  • the feature drift module 405 may first process the input low-quality image according to the network parameters initially set in each convolution module. It is understandable that the network parameters initially set in the feature drift module 405 can be obtained through Gaussian initialization, or through other initialization methods (for example, Xavier).
  • The obtained residual data and the first feature data 804 can be input to the enhancement processing module 807 for accumulation processing to obtain the enhanced image feature data 808.
  • the enhancement processing module 807 can be implemented by an adder or a convolutional layer.
  • The enhanced image feature data 808 and the second shallow feature data 812 of the clear image 810 can be compared by the error calculation module 811 to obtain the error between the enhanced image feature data 808 and the second shallow feature data 812.
  • the error calculation module 811 may use a Mean Square Error (MSE) function to calculate the error between the enhanced image feature data 808 and the second shallow feature data 812 of the clear image 810.
  • The network parameters in the feature drift module 405 can then be optimized in the manner of gradient return (backpropagation) according to the calculated error.
  • Through iterative training, the error obtained by the error calculation module 811 can be made smaller than the preset threshold, so that the trained network parameters of the feature drift module 405 can be obtained.
  • the network parameters of the feature drift module 405 after error convergence can be used as the network parameters of the feature drift module 405 applied in the image processing process.
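Putting the training pieces together, the following is a minimal, illustrative training loop matching the data flow of FIG. 8; the drift network here is an arbitrary stand-in for module 405, and the dummy feature tensors replace the outputs of the two feature acquisition modules:

```python
# Hedged sketch of the FIG. 8 training loop: minimize the MSE between
# (low-quality features + predicted residual) and the clear image's features.
import torch
import torch.nn as nn

drift = nn.Sequential(nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(64, 64, 3, padding=1))  # stand-in for module 405
opt = torch.optim.SGD(drift.parameters(), lr=1e-3)
mse = nn.MSELoss()

for step in range(100):
    low_feat = torch.randn(4, 64, 56, 56)     # first feature data 804 (dummy)
    clear_feat = torch.randn(4, 64, 56, 56)   # second feature data 812 (dummy)
    residual = drift(low_feat)                # residual data
    enhanced = low_feat + residual            # enhancement processing module 807
    loss = mse(enhanced, clear_feat)          # error calculation module 811
    opt.zero_grad()
    loss.backward()                           # gradient return
    opt.step()
```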
  • The trained feature drift module 405 can be applied to any low-quality image of the same type as the training data.
  • the network parameters obtained after training in the embodiment of the present invention can be embedded into an existing neural network to process input degraded images, without retraining for the actual application scenario.
  • because the connection between low-quality image features and clear image features is used to identify low-quality images, the recognizability of the network features can be improved, and the processing effect on low-quality images can be improved; for example, the recognition accuracy of low-quality images can be improved.
  • FIG. 9 is a schematic structural diagram of another image processing apparatus provided by an embodiment of the present invention.
  • the image processing apparatus 900 may include a receiving module 902, a feature enhancement module 904, and a processing module 906.
  • the receiving module 902 is configured to receive image data of a target image, and the target image is a low-quality image.
  • the feature enhancement module 904 is configured to process the image data based on network parameters to obtain enhanced image feature data of the target image, wherein the network parameters are used to indicate the feature data of the low-quality image and the clear image Correspondence between feature data.
  • the processing module 906 is configured to process the target image based on the enhanced image feature data.
  • the feature enhancement module 904 can obtain feature data of the target image according to the image data; after obtaining the feature data, it can perform neural network calculations on the feature data and the image data based on the network parameters to obtain residual data, and obtain the enhanced image feature data of the target image according to the residual data and the feature data (see the sketch after this list).
  • the feature data is the feature data obtained through N-layer neural network calculation on the image data, where N is greater than 0 and less than a preset threshold; the residual data is used to indicate the deviation between the feature data of the target image and the feature data of a clear image.
  • the feature enhancement module 904 is configured to perform center-surround convolution calculation on the feature data and the image data based on the network parameters.
  • the feature enhancement module 904 may perform at least a first-level center-surround convolution calculation, a second-level center-surround convolution calculation, and a third-level center-surround convolution calculation on the feature data and the image data based on the set network parameters.
  • the input data of the first-level center-surround convolution calculation includes the feature data and the image data; the input data of the second-level center-surround convolution calculation includes the calculation result of the first-level center-surround convolution calculation; and the input data of the third-level center-surround convolution calculation includes the calculation result of the second-level center-surround convolution calculation.
  • the feature enhancement module 904 may obtain the residual data based on the calculation results of the first-level, second-level, and third-level center-surround convolution calculations.
  • the feature enhancement module 904 may perform a first convolution operation on the feature data and the image data based on the first convolution kernel to obtain a first intermediate result, wherein the weights of the central area of the first convolution kernel are 0.
  • the feature enhancement module 904 may perform a second convolution operation on the feature data and the image data based on the second convolution kernel to obtain a second intermediate result, wherein the second convolution kernel only includes the weights of the central area, and the first convolution kernel and the second convolution kernel have the same size.
  • the feature enhancement module 904 may obtain the calculation result of the first-level center-surround convolution based on the first intermediate result and the second intermediate result.
  • the feature enhancement module 904 may also perform a third convolution operation on the calculation result of the first-level center-surround convolution based on the third convolution kernel to obtain a third intermediate result, wherein the values of the central area of the third convolution kernel are 0.
  • a fourth convolution operation is performed on the calculation result of the first-level center-surround convolution based on the fourth convolution kernel to obtain a fourth intermediate result, wherein the fourth convolution kernel only includes the weights of the central area, and the third convolution kernel and the fourth convolution kernel have the same size. The calculation result of the second-level center-surround convolution can then be obtained according to the third intermediate result and the fourth intermediate result.
  • the feature enhancement module 904 may also perform a fifth convolution operation on the calculation result of the second-level center-surround convolution based on the fifth convolution kernel to obtain a fifth intermediate result, wherein the weights of the central area of the fifth convolution kernel are 0.
  • the feature enhancement module 904 may also perform a sixth convolution operation on the calculation result of the second-level center-surround convolution based on the sixth convolution kernel to obtain a sixth intermediate result, wherein the sixth convolution kernel only includes the weights of the central area, and the fifth convolution kernel and the sixth convolution kernel have the same size.
  • the feature enhancement module 904 may obtain the calculation result of the third-level center-surround convolution based on the fifth intermediate result and the sixth intermediate result.
  • the image processing device shown in FIG. 9 does not process the low-quality image itself in advance; instead, during image processing, it uses the set network parameters to process the image data of the low-quality image to obtain enhanced image feature data of the low-quality image, and processes the low-quality image based on the enhanced image feature data. Since the network parameters reflect the correspondence between the feature data of low-quality images and the feature data of clear images, the processing effect on the low-quality target image is better.
  • the image processing device provided by the embodiment of the present invention utilizes the feature drift property of images: the center-surround convolution mechanism, constructed based on the non-classical receptive field structure of the retina and the photosensitivity principle of bipolar cells, processes the shallow features of low-quality images to obtain enhanced image feature data, and the low-quality images are processed based on the enhanced image feature data, so that the image processing effect is better and the recognition accuracy is higher.
  • each module in the image processing apparatus 900 shown in FIG. 9 may be located in one or more devices in the image processing apparatus shown in FIG. 1.
  • some or all of the modules in the embodiment shown in FIG. 9 may be selected according to actual needs to achieve the objectives of the solution of the embodiment.
  • the device embodiments described above are merely illustrative; for example, the division into modules is only a division by logical function, and there may be other division methods in actual implementation. For example, multiple modules or components can be combined or integrated into another system, or some features can be omitted or not implemented.
  • the connections between the modules discussed in the foregoing embodiments may be electrical, mechanical, or other forms.
  • the modules described as separate components may or may not be physically separate.
  • the components displayed as modules may be physical modules or may not be physical modules.
  • each functional module in the embodiments of this application may exist independently, or may be integrated into one processing module.
  • the functional modules shown in FIG. 9 may be integrated in the neural network circuit or processor shown in FIG. 1 and implemented by corresponding devices.
  • An embodiment of the present invention also provides a computer program product for data processing, including a computer-readable storage medium storing program code, and instructions included in the program code are used to execute the method flow described in any one of the foregoing method embodiments.
  • a computer-readable storage medium includes various non-transitory machine-readable media that can store program code, such as a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a random-access memory (RAM), a solid state disk (SSD), or a non-volatile memory.
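To make the data flow through the modules above concrete, the following is a minimal sketch of the enhancement pipeline. It is an illustration under stated assumptions, not the patent's implementation: PyTorch is assumed, all names are invented for illustration, and the shallow extractor, feature drift module, and downstream network are assumed to be supplied elsewhere (sketches of the first two appear in the description below).

```python
# A minimal sketch of the enhancement pipeline of apparatus 900, assuming
# PyTorch. All names here are illustrative; spatial sizes are assumed to
# be compatible between the feature drift module and the extractor.
import torch
import torch.nn as nn

class ImageProcessingApparatus(nn.Module):
    def __init__(self, shallow_extractor: nn.Module,
                 feature_drift: nn.Module, downstream: nn.Module):
        super().__init__()
        self.shallow_extractor = shallow_extractor  # first N layers (feature acquisition)
        self.feature_drift = feature_drift          # predicts residual data from features + image
        self.downstream = downstream                # recognition / classification head

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        feats = self.shallow_extractor(image)        # shallow feature data of the target image
        residual = self.feature_drift(feats, image)  # deviation from clear-image features
        enhanced = feats + residual                  # enhanced image feature data
        return self.downstream(enhanced)             # process the target image
```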

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

An image processing method and apparatus. According to the method, after image data of a target image is received, the image data can be processed based on network parameters to obtain enhanced image feature data of the target image, and the target image is processed based on the enhanced image feature data. The target image is a low-quality image, and the network parameters are used to indicate a correspondence between feature data of low-quality images and feature data of clear images. The method can improve the processing effect on low-quality images.

Description

Image processing method and apparatus. Technical Field
This application relates to the field of computer technologies, and in particular, to an image processing method and apparatus.
Background
In recent years, with the rapid development of deep learning technologies, breakthrough progress has been made in high-level vision research on problems such as image classification, object recognition, and semantic segmentation. This progress is largely attributable to the emergence of large-scale image databases such as ImageNet and PASCAL VOC. However, the images in these databases are usually clear, lossless, high-quality images. In actual imaging, interference factors such as lighting (low illumination, overexposure, and the like), weather (rain, snow, fog, and the like), noise, and motion can damage the structural and statistical information of an image, producing a low-quality image. Therefore, in practical vision applications, what a computer vision system needs to process is very likely a low-quality image.
In low-quality image processing, a common method is to first enhance the degraded image with an image enhancement algorithm to improve its image quality, and then perform recognition or other processing on the quality-improved image. However, the main purpose of existing image enhancement algorithms is to obtain an enhanced image with good visual perception, oriented toward human viewers. Such algorithms cannot guarantee that a computer network can extract complete structural or statistical features from the enhanced image, so the recognition accuracy is low.
Summary
This application provides an image processing method and apparatus, which can improve the processing effect on low-quality images.
According to a first aspect, an embodiment of the present invention provides an image processing method. According to the method, after image data of a target image is received, the image data may be processed based on network parameters to obtain enhanced image feature data of the target image, and the target image is processed based on the enhanced image feature data. The target image is a low-quality image, and the network parameters are used to indicate a correspondence between feature data of low-quality images and feature data of clear images.
From the foregoing description of the image processing method provided by this embodiment of the present invention, it can be seen that the low-quality image itself is not processed in advance; instead, during image processing, the set network parameters are used to process the image data of the low-quality image to obtain enhanced image feature data of the low-quality image, and the low-quality image is processed based on the enhanced image feature data. Because the network parameters reflect the correspondence between the feature data of low-quality images and the feature data of clear images, that is, the connection between low-quality image features and clear image features is used to process the features of the low-quality target image, the feature data of the low-quality image can be enhanced, the network recognizability of the feature data of the target image can be improved, and the processing effect on the low-quality image can be improved; for example, the recognition accuracy of low-quality images can be improved.
With reference to the first aspect, in a first possible implementation, in the process of obtaining the enhanced image feature data of the target image, feature data of the target image may be obtained from the image data, and neural network calculation may be performed on the feature data and the image data based on the network parameters to obtain residual data; the enhanced image feature data of the target image may then be obtained from the residual data and the feature data. The feature data is feature data obtained by performing N layers of neural network calculation on the image data, N is greater than 0 and less than a preset threshold, and the residual data is used to indicate a deviation between the feature data of the target image and the feature data of a clear image.
In this manner, because the processing refers to the non-classical receptive field structure of the retina and simulates the photosensitivity principle of the retina's bipolar cells, it enhances the high-frequency information in the target image while preserving the low-frequency information, so the enhanced image feature data is easier to recognize or extract and the processing effect is better; for example, the recognition accuracy of low-quality images can be improved. In addition, the robustness (or stability) of the network is strong. Further, because this embodiment of the present invention uses the feature drift property of images to process low-quality images, the processing does not require supervision by semantic signals (used to indicate image content), and the network parameters are few.
With reference to the foregoing implementation, in a possible implementation, the performing neural network calculation on the feature data and the image data based on the network parameters includes: performing center-surround convolution calculation on the feature data and the image data based on the network parameters. In this manner, the photosensitivity principle of the retina's bipolar cells can be simulated, yielding a better image processing effect.
With reference to any one of the foregoing implementations, in yet another possible implementation, the performing neural network calculation on the feature data and the image data based on the set network parameters includes: performing at least a first-level center-surround convolution calculation, a second-level center-surround convolution calculation, and a third-level center-surround convolution calculation on the feature data and the image data based on the set network parameters. In this manner, the structure of the retina's non-classical receptive field and the photosensitivity principle of retinal bipolar cells can be simulated, improving image recognition accuracy and effect.
With reference to any one of the foregoing implementations, in yet another possible implementation, the input data of the first-level center-surround convolution calculation includes the feature data and the image data, the input data of the second-level center-surround convolution calculation includes the calculation result of the first-level center-surround convolution calculation, and the input data of the third-level center-surround convolution calculation includes the calculation result of the second-level center-surround convolution calculation.
With reference to any one of the foregoing implementations, in yet another possible implementation, the residual data is obtained based on the calculation result of the first-level center-surround convolution calculation, the calculation result of the second-level center-surround convolution calculation, and the calculation result of the third-level center-surround convolution calculation.
With reference to any one of the foregoing implementations, in yet another possible implementation, the first-level center-surround convolution calculation is used to simulate the response of the central region of the human retina to the target image, the second-level center-surround convolution calculation is used to simulate the response of the surround region of the human retina to the target image, and the third-level center-surround convolution calculation is used to simulate the response of the edge region of the human retina to the target image.
With reference to any one of the foregoing implementations, in yet another possible implementation, the first-level center-surround convolution calculation includes: performing a first convolution operation on the feature data and the image data based on a first convolution kernel to obtain a first intermediate result, where the weights of the central area of the first convolution kernel are 0; performing a second convolution operation on the feature data and the image data based on a second convolution kernel to obtain a second intermediate result, where the second convolution kernel includes only the weights of the central area, and the first convolution kernel and the second convolution kernel have the same size; and obtaining the calculation result of the first-level center-surround convolution based on the first intermediate result and the second intermediate result.
With reference to any one of the foregoing implementations, in yet another possible implementation, the second-level center-surround convolution calculation includes: performing a third convolution operation on the calculation result of the first-level center-surround convolution based on a third convolution kernel to obtain a third intermediate result, where the values of the central area of the third convolution kernel are 0; performing a fourth convolution operation on the calculation result of the first-level center-surround convolution based on a fourth convolution kernel to obtain a fourth intermediate result, where the fourth convolution kernel includes only the weights of the central area, and the third convolution kernel and the fourth convolution kernel have the same size; and obtaining the calculation result of the second-level center-surround convolution based on the third intermediate result and the fourth intermediate result.
With reference to any one of the foregoing implementations, in yet another possible implementation, the third-level center-surround convolution calculation includes: performing a fifth convolution operation on the calculation result of the second-level center-surround convolution based on a fifth convolution kernel to obtain a fifth intermediate result, where the weights of the central area of the fifth convolution kernel are 0; performing a sixth convolution operation on the calculation result of the second-level center-surround convolution based on a sixth convolution kernel to obtain a sixth intermediate result, where the sixth convolution kernel includes only the weights of the central area, and the fifth convolution kernel and the sixth convolution kernel have the same size; and obtaining the calculation result of the third-level center-surround convolution based on the fifth intermediate result and the sixth intermediate result.
With reference to any one of the foregoing implementations, in yet another possible implementation, the method may be performed by a neural network device, and the network parameters are obtained through training.
According to a second aspect, this application provides an image recognition apparatus, including functional modules configured to implement the image processing method in the first aspect or any implementation of the first aspect.
According to a third aspect, this application provides an image recognition apparatus, including a neural network configured to implement the image processing method in the first aspect or any implementation of the first aspect.
According to a fourth aspect, this application further provides a computer program product, including program code, where instructions included in the program code are executed by a computer to implement the image processing method in the first aspect or any implementation of the first aspect.
According to a fifth aspect, this application further provides a computer-readable storage medium, configured to store program code, where instructions included in the program code are executed by a computer to implement the image processing method in the first aspect or any implementation of the first aspect.
Brief Description of Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the following briefly introduces the accompanying drawings required for describing the embodiments. Apparently, the accompanying drawings in the following description show merely some embodiments of the present invention.
FIG. 1 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of neural network layers in a neural network system according to an embodiment of the present invention;
FIG. 3A is a flowchart of an image processing method according to an embodiment of the present invention;
FIG. 3B is a flowchart of another image processing method according to an embodiment of the present invention;
FIG. 4 is a signal schematic diagram of an image processing method according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a non-classical receptive field in the human retina according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a feature drift module according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a center-surround convolution mechanism according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of training of a neural network system according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of another image processing apparatus according to an embodiment of the present invention.
Detailed Description
To help persons skilled in the art better understand the solutions of the present invention, the following clearly describes the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Apparently, the described embodiments are merely some rather than all of the embodiments of the present invention.
FIG. 1 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present invention. As shown in FIG. 1, the image processing apparatus 100 may include a control module 105 and a neural network circuit 110. The control module 105 may include a processor 1052 and a memory 1054. The processor 1052 is the computing core and control unit of the control module 105. The processor 1052 may include one or more processor cores. The processor 1052 may be a very-large-scale integrated circuit. An operating system and other software programs are installed on the processor 1052, so that the processor 1052 can access the memory 1054, caches, disks, and peripheral devices (such as the neural network circuit in FIG. 1). It can be understood that, in this embodiment of the present invention, a core of the processor 1052 may be, for example, a central processing unit (CPU) or another application-specific integrated circuit (ASIC).
The memory 1054 may serve as a cache of the processor 1052. The memory 1054 may be connected to the processor 1052 through a double data rate (DDR) bus. The memory 1054 is usually used to store running software in the operating system, input and output data, information exchanged with external storage, and the like. To increase the access speed of the processor 1052, the memory 1054 needs to have the advantage of fast access. In a conventional computer system architecture, dynamic random access memory (DRAM) is usually used as the memory 1054. Through a memory controller (not shown in FIG. 1), the processor 1052 can access the memory 1054 at high speed and perform read and write operations on any storage unit in the memory 1054.
The neural network circuit 110 is configured to perform artificial neural network calculation. Persons skilled in the art know that an artificial neural network (ANN), also called a neural network (NN) or a neural-like network, is, in the fields of machine learning and cognitive science, a mathematical or computational model that imitates the structure and function of a biological neural network (the central nervous system of animals, especially the brain) and is used to estimate or approximate functions. Artificial neural networks may include convolutional neural networks (CNN), deep neural networks (DNN), multilayer perceptrons (MLP), and other neural networks. Neural networks are usually used for image recognition, image classification, speech recognition, and the like.
In this embodiment of the present invention, the neural network circuit 110 may include one or more neural network chips 115 (chips 115 for short) for performing artificial neural network calculation. The one or more chips 115 are configured to perform neural network calculation. The neural network circuit 110 is connected to the control module 105. As shown in FIG. 1, the neural network circuit 110 may be connected to the control module 105 through a connection bus 106. The connection bus 106 may be a peripheral component interconnect express (PCIE) bus, or another connection line (for example, a network cable). The connection manner between the neural network circuit 110 and the control module 105 is not limited herein. Through the connection between the control module 105 and the neural network circuit 110, the processor 1052 can access the neural network circuit 110 through the connection bus 106. For example, after receiving to-be-processed image data through an interface (not shown in FIG. 1), the processor 1052 may send the to-be-processed image data to the chip 115 in the neural network circuit 110 through the connection bus 106, and receive the processing result of the neural network circuit 110 through the connection bus 106. In addition, the control module 105 may also monitor the working status of the neural network circuit 110 through the connection bus 106.
A neural network system may include multiple neural network layers. In this embodiment of the present invention, a neural network layer is a logical layer concept, and one neural network layer means that one neural network operation is to be performed. Neural network layers may include convolutional layers, pooling layers, and the like. As shown in FIG. 2, the neural network system may include n neural network layers (also called an n-layer neural network), where n is an integer greater than or equal to 2. FIG. 2 shows some of the neural network layers in the neural network system. As shown in FIG. 2, the neural network system may include a first layer 202, a second layer 204, a third layer 206, a fourth layer 208, a fifth layer 210, and so on up to an n-th layer 212. The first layer 202 may perform a convolution operation, the second layer 204 may perform a pooling operation on the output data of the first layer 202, the third layer 206 may perform a convolution operation on the output data of the second layer 204, the fourth layer 208 may perform a convolution operation on the output of the third layer 206, and the fifth layer 210 may perform a summation operation on the output data of the second layer 204 and the output data of the fourth layer 208, and so on. It can be understood that FIG. 2 is merely a simple example and illustration of the neural network layers in the neural network system and does not limit the specific operation of each layer; for example, the fourth layer 208 may alternatively perform a pooling operation, and the fifth layer 210 may perform other neural network operations such as a convolution or pooling operation.
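As an illustration of the layer topology just described for FIG. 2, the following is a small sketch, assuming PyTorch; the channel counts and kernel sizes are arbitrary choices for illustration, not values from the patent.

```python
# Illustrative sketch of the FIG. 2 topology: conv, pool, conv, conv, and a
# fifth "layer" that sums the outputs of layers 2 and 4, so layers 3 and 4
# must preserve the shape of layer 2's output.
import torch
import torch.nn as nn

class FigureTwoNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Conv2d(3, 16, 3, padding=1)   # first layer: convolution
        self.layer2 = nn.MaxPool2d(2)                  # second layer: pooling
        self.layer3 = nn.Conv2d(16, 16, 3, padding=1)  # third layer: convolution
        self.layer4 = nn.Conv2d(16, 16, 3, padding=1)  # fourth layer: convolution

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y2 = self.layer2(self.layer1(x))
        y4 = self.layer4(self.layer3(y2))
        return y2 + y4                                 # fifth layer: summation
```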
In practical applications, when an image is processed by the neural network system, calculations of multiple neural network layers may be performed on the image data to finally obtain the processing result of the image. This embodiment of the present invention does not limit the number of neural network chips that perform neural network calculation. It should be noted that FIG. 1 is merely an illustration of an image processing apparatus. The image processing apparatus in this embodiment of the present invention may also be a computing device capable of performing neural network calculation, such as a server or a computer. Such computing devices may include computing nodes for performing neural network calculation, such as a central processing unit (CPU) or a graphics processing unit (GPU), and may not include the dedicated neural network chip for neural network calculation shown in FIG. 1. This embodiment of the present invention does not limit the specific structure of the image processing apparatus, as long as it includes a neural network capable of implementing the image processing method provided by the embodiments of the present invention. In the embodiments of the present invention, devices that include a neural network, such as computers, servers, and the image processing apparatus shown in FIG. 1, may also be called neural network devices or neural network systems. In the implementation of the present invention, image processing may include image classification, object recognition, semantic segmentation, and other image processing manners. The image processing method provided in the embodiments of the present invention can be applied to scenarios such as autonomous driving, smartphone photography, and intelligent surveillance systems, improving the processing accuracy of low-quality images by the image processing apparatus.
The following describes in detail, with reference to FIG. 3A, FIG. 3B, and FIG. 4, how the image processing apparatus provided by the embodiments of the present invention processes a low-quality image. FIG. 3A is a flowchart of an image processing method according to an embodiment of the present invention, and FIG. 3B is a flowchart of another image processing method according to an embodiment of the present invention. The difference between FIG. 3A and FIG. 3B is that some steps in FIG. 3B are specific illustrations of some steps in FIG. 3A. FIG. 4 is a signal schematic diagram of an image processing method according to an embodiment of the present invention. The image processing methods shown in FIG. 3A and FIG. 3B may both be performed by the neural network circuit 110 in FIG. 1. In the process in which the image processing apparatus 100 processes an image, neural network calculation is performed on the input image data mainly by the neural network chip 115 in the neural network circuit 110. For example, the neural network circuit 110 may perform multiple neural network calculations such as convolution and pooling on the image data to obtain the image processing result. With reference to FIG. 3A, FIG. 3B, and FIG. 4, the image processing method provided by this embodiment of the present invention may include the following steps.
In step 302, image data of a target image is received, where the target image is a low-quality image. In this embodiment of the present invention, a low-quality image is a low-quality image produced when the inherent color and structure information of an image is damaged during imaging by interference factors such as lighting (for example, low illumination or overexposure), weather (for example, rain, snow, or fog), and relative motion of the target. Simply put, a low-quality image is an image whose image quality is lower than a preset threshold. As shown in FIG. 4, when an image needs to be processed, the chip 115 in the neural network circuit 110 may receive the image data of the to-be-processed target image 402 sent by the processor 1052. It can be understood that, in practical applications, the low-quality image may be an image in a pre-collected image database, or the image processing apparatus may be connected to an image collection system to process images collected by the image collection system in real time. It should be noted that, in this embodiment of the present invention, the target image may be a picture, a video, or the like.
In step 303, the image data is processed based on network parameters to obtain enhanced image feature data of the target image. The network parameters are used to indicate the correspondence between feature data of low-quality images and feature data of clear images. In this embodiment of the present invention, trained network parameters are set in the neural network system. The network parameters are obtained by training on multiple low-quality images and clear images, and can indicate the correspondence between the feature data of low-quality images and the feature data of clear images. Therefore, after it is determined that the input image data is image data of a low-quality image, calculation can be performed on the input image data based on the set network parameters, so that the feature data of the target image can be enhanced to obtain enhanced image feature data. How to obtain the enhanced image feature data of the target image is described in detail below with reference to FIG. 3B.
Referring to step 304 in FIG. 3B, after the image data of the target image is received in step 302, the feature data of the target image may be obtained from the image data. The feature data is data obtained by performing N layers of neural network calculation on the image data, where N is greater than 0 and less than a preset threshold. In this embodiment of the present invention, the feature data of the target image may include shallow feature data of the target image, where shallow feature data may include data indicating features such as the color, structure, and texture of the image. Specifically, the feature data of the target image may be obtained by the chip 115 in the neural network circuit 110 performing N layers of neural network calculation. That is, the feature data may be the feature map output by the first N neural network layers executed by the neural network circuit 110. The value of N may be set according to actual needs; for example, N may be less than 5 or less than 10, which is not limited herein. For ease of illustration, the N layers of neural network calculation performed by the chip 115 are referred to as the feature acquisition module 403 in FIG. 4. For example, in practical applications, the feature data output by any one of the first five layers of the neural network calculation performed by the neural network circuit 110 may be used as the feature data of the target image. Taking the n-layer neural network calculation shown in FIG. 2 as an example, the feature data output by the first layer 202, the second layer 204, the third layer 206, the fourth layer 208, or the fifth layer 210 may be used as the feature data of the target image. It should be noted that the feature acquisition module 403, the feature drift module 405, the enhancement processing module 407, and the next neural network layer 409 shown in FIG. 4 are all logical concepts, used to indicate the neural network calculations performed by the neural network circuit 110 in FIG. 1.
Referring to FIG. 3B, in step 306, neural network calculation is performed on the feature data and the image data based on the set network parameters to obtain residual data. The residual data is used to indicate the deviation between the feature data of the target image and the feature data of a clear image. The network parameters are used to indicate the correspondence between the feature data of low-quality images and the feature data of clear images, where the feature data of a clear image includes the feature data of the clear images used during training, including their shallow feature data. As shown in FIG. 4, after the feature data 404 is obtained, the feature drift module (Feature De-drifting Module) 405 may perform calculation on the feature data 404 and the image data of the target image 402 to obtain the residual data 406. It should be noted that the feature drift module 405 is also a neural network calculation performed in the neural network circuit 110. In this embodiment of the present invention, the feature drift module 405 is a non-classical receptive field neural network calculation module based on adaptive antagonistic convolution.
In the process of implementing the present invention, research has found that, across different images, the feature representations of clear image blocks with similar structures and the feature representations of the corresponding low-quality image blocks follow the same feature drift law, and this law is independent of the image content (semantic information). Specifically, the shallow features of all structurally similar clear image blocks cluster together, and likewise, the shallow features of all low-quality image blocks corresponding to the clear image blocks also cluster together; this clustering effect is independent of the image content. Based on this finding, the embodiments of the present invention establish the correspondence between the feature data of low-quality images (low-quality features for short) and the feature data of clear images (clear features for short), and use this correspondence to improve the processing accuracy of low-quality images.
To better learn the correspondence between low-quality features and clear features, the embodiments of the present invention propose a feature drift network based on the non-classical receptive field (nCRF) mechanism in the human retina, and propose a "center-surround convolution mechanism" based on the photosensitivity principle of bipolar cells. The receptive field is the basic structural and functional unit of information processing in the visual system. Retinal ganglion cells have concentric, antagonistic classical receptive fields (CRF), whose spatial integration property is to process brightness contrast information of image regions and extract edge information of the image. The non-classical receptive field is a large region outside the classical receptive field; stimulating this region alone cannot directly cause the cell to respond, but it modulates the response caused by stimuli within the classical receptive field. The non-classical receptive field of retinal ganglion cells is mainly disinhibitory, so it can compensate to some extent for the loss of low-frequency information caused by the classical receptive field; while maintaining the boundary enhancement function, it transmits regional brightness gradient information of the image and shows slow changes of brightness over large surfaces. It can thus be seen that the non-classical receptive field greatly broadens the processing range of visual cells and provides a neural basis for integrating and detecting large-scale complex patterns. The non-classical receptive field in the human retina contains multiple mutually antagonistic subregions, which cooperate with one another to achieve high-frequency enhancement and low-frequency preservation, thereby helping the human eye better distinguish external objects. Antagonism refers to the phenomenon in which one substance (or process) is inhibited by another substance (or process).
FIG. 5 is a schematic structural diagram of a non-classical receptive field in the human retina according to an embodiment of the present invention. As shown in FIG. 5, the non-classical receptive field in the human retina may include three regions: a central region 502, a surround region 504, and an edge region 506. The mathematical expression of the non-classical receptive field in the human retina may be:

$$f = A_1 (I * G(\sigma_1)) + A_2 (I * G(\sigma_2)) + A_3 (I * G(\sigma_3)) \qquad \text{Formula (1)}$$

In Formula (1), G1 to G3 denote Gaussian convolution kernels with three different bandwidths, Gi = G(σi):

$$G(\sigma_i) = \frac{1}{2\pi\sigma_i^{2}} \exp\!\left(-\frac{x^{2} + y^{2}}{2\sigma_i^{2}}\right) \qquad \text{Formula (2)}$$

A1 to A3 denote the weighting coefficients of the central region, the surround region, and the edge region, respectively, and the variances σ1 to σ3 determine the bandwidths of the three Gaussian functions. Formula (1) can be written in the following form:

$$f = A_1 (I * G_1) + A_2 ((I * G_1) * G'_2) + A_3 (((I * G_1) * G'_2) * G'_3) \qquad \text{Formula (3)}$$

where the equivalent kernels G'2 and G'3 satisfy G1 * G'2 = G(σ2) and (G1 * G'2) * G'3 = G(σ3).
From Formula (3), it can be seen that the output of the first convolution term can serve as the input of the second convolution term, and the output of the second convolution term can serve as the input of the third convolution term.
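The cascading in Formula (3) rests on the fact that the convolution of two Gaussians is again a Gaussian. The following identity, offered here as an illustration consistent with Formulas (1) and (3) rather than quoted from the original text, gives one explicit form for the equivalent kernels:

```latex
% Gaussian semigroup property and the resulting equivalent kernels,
% assuming sigma_1 < sigma_2 < sigma_3.
G(\sigma_a) * G(\sigma_b) = G\left(\sqrt{\sigma_a^{2} + \sigma_b^{2}}\right)
\quad\Longrightarrow\quad
G'_2 = G\left(\sqrt{\sigma_2^{2} - \sigma_1^{2}}\right),\qquad
G'_3 = G\left(\sqrt{\sigma_3^{2} - \sigma_2^{2}}\right).
```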
To simulate the antagonistic mechanism in the above non-classical receptive field and enhance low-quality features, the present invention proposes a feature drift module. The module contains a multi-level structure, and each level of substructure contains one or more convolutional layers. Given a set of low-quality feature maps as input, each level of subnetwork in the feature drift module outputs a set of results, and finally the multiple sets of output results are weighted and fused to obtain the final residual data.
FIG. 6 is a schematic structural diagram of a feature drift module 405 according to an embodiment of the present invention, taking a feature drift module 405 that includes three levels of convolution modules as an example. As shown in FIG. 6, the feature drift module 405 may include three levels of convolution modules, G1 4051, G2 4052, and G3 4053, and may further include a processing module 4054. The convolution module G1 4051 is used to simulate the function of the central region 502 of the non-classical receptive field in the human retina shown in FIG. 5, the convolution module G2 4052 is used to simulate the function of the surround region 504 shown in FIG. 5, and the convolution module G3 4053 is used to simulate the function of the edge region 506 shown in FIG. 5. Each level of convolution module may be implemented by one or more convolutional layers. FIG. 6 illustrates the case in which one level of convolution includes two convolutional layers: G1 4051 includes convolutional layers G1_1 and G1_2, G2 4052 includes convolutional layers G2_1 and G2_2, and G3 4053 includes convolutional layers G3_1 and G3_2. The processing module 4054 is used to weight and superimpose the results output by the three convolution modules G1 4051, G2 4052, and G3 4053.
To better enhance the high-frequency information in low-quality images, the embodiments of the present invention propose "center-surround convolution" (Center Surround Convolution) based on the photosensitivity principle of bipolar cells. A bipolar neuron is a neuron cell that sends out one process from each end of the cell body: one process extends to the peripheral sensory receptors (also called the peripheral process or dendrite), and the other enters the central part (also called the central process or axon). In the retina, bipolar neurons connect photoreceptor cells and ganglion cells and play a longitudinal linking role. Bipolar neurons can be divided into two types: on-center and off-center. An on-center cell is excited when light stimulates the center and inhibited when light stimulates the periphery; an off-center cell is excited when stimulation stops in the central region and inhibited when the peripheral light is stopped.
In this embodiment of the present invention, based on the photosensitivity principle of the bipolar cells of the human retina, each convolution module in FIG. 6 performs calculation according to the "center-surround convolution mechanism". Specifically, each convolutional layer in each convolution module performs two convolutions in parallel: a center convolution and a surround convolution. For example, for the convolution modules G1 4051, G2 4052, and G3 4053 shown in FIG. 6, each convolution module includes two convolutional layers, and each convolutional layer performs both the center convolution and the surround convolution. It can be understood that, when a convolution module includes multiple convolutional layers, the input data of a later convolutional layer includes the calculation result of the previous convolutional layer, and the calculation result of the last convolutional layer in a convolution module is the result of that convolution module. For example, taking the convolution module G1 as an example, after the convolutional layer G1_1 performs the surround convolution and the center convolution on the input data and obtains a calculation result, the calculation result may be input into the convolutional layer G1_2, and the convolutional layer G1_2 continues to perform the surround convolution and the center convolution on the calculation result of the convolutional layer G1_1 based on the set network parameters. The calculation result of the convolutional layer G1_2 is the calculation result of the convolution module G1.
The following takes the convolutional layer G1_1 in the convolution module G1 4051 as an example to describe how the convolution modules shown in FIG. 6 implement center-surround convolution. FIG. 7 is a schematic diagram of a center-surround convolution mechanism according to an embodiment of the present invention. It should be noted that FIG. 7 takes how the convolutional layer G1_1 in the convolution module G1 4051 implements center-surround convolution as an example; in practical applications, the convolutional layer G1_2 in the convolution module G1 4051 and the convolutional layers in the convolution modules G2 4052 and G3 4053 work similarly to the convolutional layer G1_1, and reference may be made to the description of FIG. 7.
As shown in FIG. 7, the convolutional layer G1_1 is used to perform a surround convolution operation on the input data 702 based on the first convolution kernel 7022 set in the convolutional layer G1_1, to simulate on-center bipolar neurons. At the same time, the convolutional layer G1_1 is also used to perform a center convolution operation on the input data 702 based on the second convolution kernel 7024, to simulate off-center bipolar neurons. It can be understood that, in practical applications, a first part of the computing resources (or computing nodes) executing the convolutional layer G1_1 may perform the surround convolution calculation while a second part performs the center convolution calculation in parallel. As shown in FIG. 7, the convolutional layer G1_1 may perform a first convolution operation on the input data based on the first convolution kernel 7022 to obtain a first intermediate result 704. The weights of the central area of the first convolution kernel are 0, indicating that no convolution calculation is performed on the values corresponding to the weights of the central area of the first convolution kernel. For example, as shown in FIG. 7, the first convolution kernel 7022 may be a hollow 3*3 convolution kernel. The first convolution operation may be called a surround convolution operation.
While performing the surround convolution, the convolutional layer G1_1 may also perform a second convolution operation on the input data based on the second convolution kernel 7024 to obtain a second intermediate result 706. The weights of the central area of the second convolution kernel 7024 are effective, and the weights of the surrounding area of the second convolution kernel 7024 may be 0. For example, as shown in FIG. 7, the second convolution kernel may be a center 1*1 convolution kernel. The second convolution operation may be called a center convolution operation. When the convolutional layer G1_1 performs the center convolution operation, convolution calculation is performed only on the values corresponding to the weights of the central area of the second convolution kernel. In this embodiment of the present invention, the first convolution kernel 7022 and the second convolution kernel 7024 have the same size, for example, both 3*3. It can be understood that 3*3 is merely one example of the kernel size; in practical applications, the kernel size may also be 4*4, 5*5, 9*9, or the like, and the kernel size is not limited herein. Moreover, in practical applications, the second convolution kernel may also have only a single central value; in FIG. 7, to illustrate the difference between the first convolution kernel 7022 and the second convolution kernel 7024, the second convolution kernel 7024 for the center 1*1 convolution performed by G1_1 is illustrated as 3*3 in size. It can be understood that the side length of the central area of a convolution kernel needs to be smaller than the side length of that kernel. For example, when the kernel is 3*3, its central area may be the single weight located at the center of the kernel; when the kernel is 4*4, its central area may be the four weights around the center of the kernel.
Continuing with FIG. 7, after the convolutional layer G1_1 performs the 3*3 hollow convolution, the first intermediate result 704 can be obtained. At the same time, after the convolutional layer G1_1 performs the 1*1 center convolution in parallel, the second intermediate result 706 can be obtained. Further, the first intermediate result 704 and the second intermediate result 706 may be superimposed to obtain the calculation result of the convolutional layer G1_1. In this implementation of the present invention, the calculation result of the convolutional layer G1_1 may include two types: a first calculation result 708 and a second calculation result 710. The first calculation result 708 is a calculation result that strengthens the center-surround effect. The second calculation result 710 is equivalent to the calculation result of performing an ordinary 3*3 convolution and does not strengthen the center-surround effect. Specifically, when the first intermediate result 704 is greater than 0 and the second intermediate result 706 is less than 0, or when the first intermediate result 704 is less than 0 and the second intermediate result 706 is greater than 0, the first calculation result 708 is output. When the first intermediate result 704 is greater than 0 and the second intermediate result 706 is greater than 0, or when both are less than 0, the second calculation result 710 is output.
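A minimal sketch of one such center-surround convolutional layer is given below, assuming PyTorch and 3*3 kernels, and assuming that the surround and center branch outputs are combined by simple summation as in FIG. 7; the class and parameter names are illustrative, not from the patent.

```python
# One center-surround convolution layer: a hollow (surround) kernel whose
# central weight is masked to 0, plus a same-size kernel that keeps only
# the central weight, applied in parallel and summed.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CenterSurroundConv(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, k: int = 3):
        super().__init__()
        self.weight_surround = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.1)
        self.weight_center = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.1)
        mask = torch.ones(1, 1, k, k)
        mask[..., k // 2, k // 2] = 0.0
        self.register_buffer("surround_mask", mask)      # hollow kernel: center weight is 0
        self.register_buffer("center_mask", 1.0 - mask)  # same-size kernel, center weight only

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        pad = self.surround_mask.shape[-1] // 2
        surround = F.conv2d(x, self.weight_surround * self.surround_mask, padding=pad)
        center = F.conv2d(x, self.weight_center * self.center_mask, padding=pad)
        # Where the two intermediate results have opposite signs, the sum
        # strengthens the antagonistic center-surround effect; where they
        # share a sign, it behaves like an ordinary k*k convolution.
        return surround + center
```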
Referring to FIG. 6, in the case where the convolution module G1 4051 includes two convolutional layers, after the calculation result of the convolutional layer G1_1 is obtained, the calculation result of the convolutional layer G1_1 may be sent to the convolutional layer G1_2 to continue the convolution calculation. Similar to the convolutional layer G1_1, after the convolutional layer G1_2 simultaneously performs the surround convolution and the center convolution based on the set convolution kernels, the calculation result of the convolutional layer G1_2 can be output. It can be understood that, when the convolution module G1 4051 includes only the convolutional layer G1_1 shown in FIG. 6, the calculation result of the convolutional layer G1_1 is the calculation result of the convolution module G1 4051; when the convolution module G1 4051 includes the convolutional layers G1_1 and G1_2 shown in FIG. 6, the calculation result of the convolutional layer G1_2 is the calculation result of the convolution module G1 4051. It can be understood that, when the convolution module G1 4051 further includes other convolutional layers, the calculation result of the last convolutional layer is the calculation result of the convolution module G1 4051.
With reference to FIG. 6, because the convolution modules G1 4051, G2 4052, and G3 4053 are three cascaded levels of convolution, their input data differ. Specifically, the input data of the convolution module G1 4051 is the feature data 404 obtained in step 304 and the image data of the target image received in step 302. The input data of the convolution module G2 4052 is the output data of the convolution module G1 4051 (that is, the calculation result of the convolution module G1 4051), and the input data of the convolution module G3 4053 is the output data of the convolution module G2 4052 (that is, the calculation result of the convolution module G2 4052).
As described above, the working principles of the convolution modules G2 4052 and G3 4053 are similar to that of the convolution module G1 4051, and for the convolutional layers in the convolution modules G2 4052 and G3 4053, reference may be made to the data processing flow shown in FIG. 7. Specifically, after the calculation result of the convolution module G1 4051 is obtained, the convolution module G2 4052 may perform the second-level center-surround convolution calculation on the calculation result of the convolution module G1 4051 based on the set network parameters. Specifically, the convolution module G2 4052 may perform a third convolution operation on the calculation result of the first-level center-surround convolution based on a third convolution kernel to obtain a third intermediate result, where the values of the central area of the third convolution kernel are 0. While performing the third convolution operation, the convolution module G2 4052 may also perform a fourth convolution operation on the calculation result of the first-level center-surround convolution based on a fourth convolution kernel to obtain a fourth intermediate result, where the fourth convolution kernel includes only the weights of the central area, and the third convolution kernel and the fourth convolution kernel have the same size. The calculation result of the second-level center-surround convolution is obtained based on the third intermediate result and the fourth intermediate result.
Similarly, after the calculation result of the convolution module G2 4052 is obtained, the convolution module G3 4053 may perform the third-level center-surround convolution calculation on the calculation result of the convolution module G2 4052 based on the set network parameters. Specifically, the convolution module G3 4053 may perform a fifth convolution operation on the calculation result of the second-level center-surround convolution based on a fifth convolution kernel to obtain a fifth intermediate result, where the weights of the central area of the fifth convolution kernel are 0. While performing the fifth convolution operation, the convolution module G3 4053 may also perform a sixth convolution operation on the calculation result of the second-level center-surround convolution based on a sixth convolution kernel to obtain a sixth intermediate result, where the sixth convolution kernel includes only the weights of the central area, and the fifth convolution kernel and the sixth convolution kernel have the same size. The calculation result of the third-level center-surround convolution is obtained based on the fifth intermediate result and the sixth intermediate result.
FIG. 6 takes the case in which the convolution modules G2 4052 and G3 4053 each include two convolutional layers as an example. As described above, the convolution modules G2 4052 and G3 4053 may each include one or more convolutional layers, and each convolutional layer may perform the center convolution and surround convolution operations described above based on different convolution kernels. Similar to the convolution module G1 4051, the convolution module G2 4052 may obtain its calculation result based on the calculation results of the one or more convolutional layers in the convolution module G2 4052, and the convolution module G3 4053 may likewise obtain its calculation result based on the calculation results of the one or more convolutional layers in the convolution module G3 4053.
In this embodiment of the present invention, the convolution kernels set in the convolutional layers of the convolution modules G1 4051, G2 4052, and G3 4053 may also be collectively referred to as the network parameters of the feature drift module 405. The network parameters are obtained by training on multiple low-quality images and can be used to indicate the correspondence between low-quality image feature data and clear image feature data. The convolution kernels of different convolutional layers may differ; within the same convolutional layer, the kernels performing the surround convolution and the center convolution have the same size. It should be noted that, because the feature drift module 405 provided by this embodiment of the present invention is implemented based on the feature drift law of images, the correspondence between low-quality image feature data and clear image feature data indicated by the trained network parameters of the feature drift module 405 is independent of the specific content of the images.
Referring again to FIG. 6, after the convolutional layers in the convolution modules G1 4051, G2 4052, and G3 4053 respectively perform the center-surround convolution shown in FIG. 7, the output results of the convolution modules G1 4051, G2 4052, and G3 4053 may be input into the processing module 4054 for accumulation processing, thereby obtaining the residual data 406 corresponding to the target image. The residual data 406 is used to indicate the deviation between the feature data of the target image and clear-image feature data. In practical applications, the processing module 4054 may also be a convolutional layer, which may perform a 1*1 convolution operation on the output results of the convolution modules G1 4051, G2 4052, and G3 4053 based on the weights set in the processing module 4054. The weights in the processing module 4054 may be set to the weights A1, A2, and A3 in Formula (3) above.
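Putting the pieces together, the following is a sketch of the feature drift module of FIG. 6, reusing the CenterSurroundConv layer sketched above. The channel widths, the ReLU nonlinearities, and the resizing of the image to the feature map's spatial size are assumptions not spelled out in the text; a learnable 1*1 convolution stands in for the processing module 4054, generalizing the fixed weights A1 to A3 of Formula (3).

```python
# Three cascaded center-surround stages (G1, G2, G3) fused by a 1x1 conv
# into residual data; assumes CenterSurroundConv from the earlier sketch.
import torch
import torch.nn as nn

class FeatureDeDrifting(nn.Module):
    def __init__(self, feat_ch: int, img_ch: int = 3, mid_ch: int = 64):
        super().__init__()
        def stage(in_ch: int) -> nn.Sequential:
            return nn.Sequential(CenterSurroundConv(in_ch, mid_ch), nn.ReLU(),
                                 CenterSurroundConv(mid_ch, mid_ch), nn.ReLU())
        self.g1 = stage(feat_ch + img_ch)  # simulates the central region (first level)
        self.g2 = stage(mid_ch)            # simulates the surround region (second level)
        self.g3 = stage(mid_ch)            # simulates the edge region (third level)
        self.fuse = nn.Conv2d(3 * mid_ch, feat_ch, kernel_size=1)  # processing module 4054

    def forward(self, feats: torch.Tensor, image: torch.Tensor) -> torch.Tensor:
        # Resize the image to the feature map's spatial size before concatenation
        # (an assumption; the text does not spell this step out).
        img = nn.functional.interpolate(image, size=feats.shape[-2:],
                                        mode="bilinear", align_corners=False)
        y1 = self.g1(torch.cat([feats, img], dim=1))
        y2 = self.g2(y1)
        y3 = self.g3(y2)
        return self.fuse(torch.cat([y1, y2, y3], dim=1))  # residual data
```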
In step 308, enhanced image feature data is obtained from the residual data and the shallow feature data. Specifically, the residual data 406 and the shallow feature data 404 may be superimposed by the enhancement processing module 407, thereby obtaining the enhanced image feature data 408. It can be understood that the enhancement processing module 407 may be implemented by an adder or a convolutional layer; the implementation of the enhancement processing module 407 is not limited herein.
In step 310, the target image is processed based on the enhanced image feature data to obtain a processing result. Specifically, after the enhanced image feature data 408 is obtained, the enhanced image feature data 408 may be input into the next neural network layer 409, so that the target image 402 is processed based on the enhanced image feature data 408 to obtain the final processing result of the target image. For example, recognition, classification, detection, and the like may be performed on the target image based on the enhanced image feature data 408.
From the foregoing description of the image processing method provided by this embodiment of the present invention, it can be seen that the low-quality image itself is not processed in advance; instead, during image processing, the set network parameters are used to process the image data of the low-quality image to obtain enhanced image feature data of the low-quality image, and the low-quality image is processed based on the enhanced image feature data. Because the network parameters reflect the correspondence between the feature data of low-quality images and the feature data of clear images, that is, the relationship between low-quality image features and clear image features is used to process the features of the low-quality target image, the recognizability of the network features can be improved and the processing effect on low-quality images can be improved; for example, the recognition accuracy of low-quality images can be improved.
Further, in the image processing process, the image processing method provided by this embodiment of the present invention uses the feature drift property of images, and the center-surround convolution mechanism constructed according to the non-classical receptive field structure of the retina and the photosensitivity principle of bipolar cells processes the shallow features of the low-quality image to obtain enhanced image feature data; the low-quality image is then processed based on the enhanced image feature data. Because the processing refers to the non-classical receptive field structure of the retina and simulates the photosensitivity principle of the retina's bipolar cells, it enhances the high-frequency information in the target image while preserving the low-frequency information, so the enhanced image feature data is easier to recognize or extract and the processing effect is better; for example, the recognition accuracy of low-quality images can be improved. In addition, the robustness (or stability) of the network is strong. Further, because this embodiment of the present invention uses the feature drift property of images to process low-quality images, the processing does not require supervision by semantic signals (used to indicate image content), and the network parameters are few.
As described above, the network parameters of the feature drift module 405 in this embodiment of the present invention are obtained through training. The following briefly introduces the training process of the network parameters of the feature drift module 405. FIG. 8 is a schematic diagram of training of a neural network system according to an embodiment of the present invention. Similar to FIG. 4, the first shallow feature calculation module 803, the feature drift module 405, the enhancement processing module 807, the second shallow feature calculation module 809, and the error calculation module 811 shown in FIG. 8 are all logical concepts and may be neural network calculations performed by a neural network device, where the feature drift module 405 is the neural network to be trained. It should be noted that the training process shown in FIG. 8 may be performed directly in the neural network circuit shown in FIG. 1, or on devices such as a central processing unit (CPU), a graphics processing unit (GPU), or a tensor processing unit (TPU); the training scenario is not limited herein.
In practical applications, multiple clear images may be selected before training according to image resolution; for example, more than 15 content-rich clear images may be selected. Through a degraded-image imaging model, multiple low-quality images are generated from the selected clear images to obtain a training set. The generated low-quality images may include degraded images of various types and various degradation levels. For example, 15 degradation types may be considered, and each degradation type may include at least 5 degradation levels; that is, low-quality images of 15 degradation types may be generated for each clear image, and each degradation type may include at least 5 degradation levels.
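As an illustration of generating such a training set, the sketch below uses NumPy with two stand-in degradation types (additive Gaussian noise and a gamma-curve low-light model) at five severity levels each; the 15 degradation types are not enumerated in the text, so these particular choices are assumptions.

```python
# Build (low-quality, clear) training pairs from clear images.
import numpy as np

def degrade(clean: np.ndarray, kind: str, level: int) -> np.ndarray:
    """clean: float32 image in [0, 1]; level: 1 (mild) to 5 (severe)."""
    if kind == "gaussian_noise":
        noisy = clean + np.random.normal(0.0, 0.02 * level, clean.shape)
        return np.clip(noisy, 0.0, 1.0).astype(np.float32)
    if kind == "low_light":
        return np.power(clean, 1.0 + 0.5 * level).astype(np.float32)  # larger gamma, darker image
    raise ValueError(f"unknown degradation: {kind}")

# Stand-in for the 15+ content-rich clear images selected before training.
clean_images = [np.random.rand(64, 64, 3).astype(np.float32)]

# One training pair per image, degradation type, and severity level.
pairs = [(degrade(img, kind, level), img)
         for img in clean_images
         for kind in ("gaussian_noise", "low_light")
         for level in range(1, 6)]
```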
As shown in FIG. 8, during training, a low-quality image 802 may be input into the first feature acquisition module 803 to obtain first feature data 804 of the low-quality image 802, where the first feature data 804 may include the shallow feature data of the low-quality image 802. In addition, a clear image 810 may be input into the second feature acquisition module 809 to obtain second feature data 812 of the clear image 810, where the second feature data 812 may include the shallow feature data of the clear image 810. Similar to the feature acquisition module 403 shown in FIG. 4, both the first feature acquisition module 803 and the second feature acquisition module 809 may be the first N neural network layers of the neural network system, where N is less than the preset threshold. For example, both may be the first N layers of a VGG16 or AlexNet network, where VGG16 and AlexNet are two network models. For example, in the VGG16 or AlexNet network, the feature data output by the first pooling layer "pooling1" or by the first convolution layer "Conv1", respectively, may be selected as the first feature data 804. In this embodiment of the present invention, the type of neural network used to extract the feature data of the low-quality image is not limited. In addition, it should be noted that the size of the input image and the size of the output feature map are not limited in this embodiment of the present invention and may be set according to the network and user requirements.
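For the shallow extractors, one concrete possibility is torchvision's VGG16, whose first five feature layers (conv1_1, ReLU, conv1_2, ReLU, max-pooling) end exactly at the "pooling1" output mentioned above; the snippet below is a sketch under that assumption, with stand-in input batches.

```python
# A frozen VGG16 front end as both feature acquisition modules.
import torch
from torchvision.models import vgg16

extractor = vgg16(weights="IMAGENET1K_V1").features[:5].eval()
for p in extractor.parameters():
    p.requires_grad_(False)  # extractor weights stay fixed during training

low_quality_batch = torch.rand(1, 3, 224, 224)  # stand-in inputs
clear_batch = torch.rand(1, 3, 224, 224)

with torch.no_grad():
    f_low = extractor(low_quality_batch)   # first feature data (804)
    f_clear = extractor(clear_batch)       # second feature data (812)
```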
After the first shallow feature data 804 is obtained, the first shallow feature data 804 and the image data of the low-quality image 802 may be input into the feature drift module 405. The feature drift module 405 may obtain the residual data of the training image data according to the network structure shown in FIG. 6 and the convolution process shown in FIG. 7; for a description of the feature drift module 405, refer to step 306 and the descriptions of FIG. 6 and FIG. 7. It can be understood that the calculation process during training is similar to that of the foregoing image processing process, except that the values of the network parameters (or convolution kernels) set in the convolution modules shown in FIG. 6 during training differ from the values of the network parameters set during application; the purpose of training is to obtain appropriate network parameters. During training, the feature drift module 405 may first process the input low-quality image according to the network parameters initially set in each convolution module. It can be understood that the initially set network parameters of the feature drift module 405 may be obtained through Gaussian initialization or through other initialization methods (for example, Xavier).
After the residual data 806 of the low-quality image 802 is obtained with reference to the calculation processes shown in FIG. 6 and FIG. 7, the obtained residual data 806 and the first feature data 804 may be input into the enhancement processing module 807 for accumulation processing to obtain the enhanced image feature data 808. The enhancement processing module 807 may be implemented by an adder or a convolutional layer.
Further, the enhanced image feature data 808 may be compared with the second shallow feature data 812 of the clear image 810 by the error calculation module 811 to obtain the error between the enhanced image feature data 808 and the second shallow feature data 812 of the clear image 810. In practical applications, the error calculation module 811 may use a mean square error (MSE) function to calculate this error. After the error is calculated, the network parameters in the feature drift module 405 may be optimized by gradient backpropagation according to the calculated error. It should be noted that, in the process of adjusting the network parameters according to the error, the weights in the first shallow feature calculation module 803 may be kept unchanged, and only the network parameters in the feature drift module 405 are optimized; that is, the weights in the convolution modules of the feature drift module 405 may be adjusted according to the error.
In practical applications, after multiple rounds of training and learning on the multiple low-quality images in the training set, the error obtained by the error calculation module 811 can be made smaller than a preset threshold, so that the trained network parameters of the feature drift module 405 can be obtained. That is, the network parameters of the feature drift module 405 after error convergence can be used as the network parameters of the feature drift module 405 applied in the image processing process.
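The training procedure described above can be sketched as follows, assuming the frozen extractor and the FeatureDeDrifting module from the earlier sketches, and a DataLoader `loader` (assumed, built from the degraded training pairs) that yields (low-quality, clear) image batches; the optimizer choice and learning rate are illustrative.

```python
# Optimize only the feature drift parameters with an MSE feature loss.
import torch
import torch.nn as nn

module = FeatureDeDrifting(feat_ch=64)  # default init plays the role of Gaussian/Xavier init
optimizer = torch.optim.Adam(module.parameters(), lr=1e-4)  # only module 405's parameters
mse = nn.MSELoss()

for low_img, clear_img in loader:
    f_low = extractor(low_img)                 # shallow features of the low-quality image (804)
    with torch.no_grad():
        f_clear = extractor(clear_img)         # target clear-image features (812)
    enhanced = f_low + module(f_low, low_img)  # residual (806) + shallow features -> 808
    loss = mse(enhanced, f_clear)              # error calculation module (811)
    optimizer.zero_grad()
    loss.backward()                            # gradient backpropagation
    optimizer.step()                           # adjust only the feature drift weights
```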
In the training method provided by this embodiment of the present invention, because the feature drift property of images is used during training and no semantic signals are used for supervision, the trained feature drift module 405 can be applied to any low-quality image of the same type as the training data. That is, the network parameters obtained after training in this embodiment of the present invention can be embedded into an existing neural network to process input degraded images, without retraining for the actual application scenario. Moreover, because the connection between low-quality image features and clear image features is used to recognize low-quality images, the recognizability of the network features can be improved, thereby improving the processing effect on low-quality images; for example, the recognition accuracy of low-quality images can be improved.
FIG. 9 is a schematic structural diagram of another image processing apparatus according to an embodiment of the present invention. As shown in FIG. 9, the image processing apparatus 900 may include a receiving module 902, a feature enhancement module 904, and a processing module 906. The receiving module 902 is configured to receive image data of a target image, where the target image is a low-quality image. The feature enhancement module 904 is configured to process the image data based on network parameters to obtain enhanced image feature data of the target image, where the network parameters are used to indicate the correspondence between feature data of low-quality images and feature data of clear images. The processing module 906 is configured to process the target image based on the enhanced image feature data.
Specifically, the feature enhancement module 904 may obtain the feature data of the target image from the image data; after obtaining the feature data, it may perform neural network calculation on the feature data and the image data based on the network parameters to obtain residual data, and obtain the enhanced image feature data of the target image from the residual data and the feature data. The feature data is feature data obtained by performing N layers of neural network calculation on the image data, where N is greater than 0 and less than a preset threshold; the residual data is used to indicate the deviation between the feature data of the target image and the feature data of a clear image.
In the process of obtaining the enhanced image feature data of the target image, the feature enhancement module 904 is configured to perform center-surround convolution calculation on the feature data and the image data based on the network parameters. In an implementation, the feature enhancement module 904 may perform at least a first-level center-surround convolution calculation, a second-level center-surround convolution calculation, and a third-level center-surround convolution calculation on the feature data and the image data based on the set network parameters. The input data of the first-level center-surround convolution calculation includes the feature data and the image data; the input data of the second-level center-surround convolution calculation includes the calculation result of the first-level center-surround convolution calculation; and the input data of the third-level center-surround convolution calculation includes the calculation result of the second-level center-surround convolution calculation. The feature enhancement module 904 may obtain the residual data based on the calculation results of the first-level, second-level, and third-level center-surround convolution calculations.
In an implementation, the feature enhancement module 904 may perform a first convolution operation on the feature data and the image data based on a first convolution kernel to obtain a first intermediate result, where the weights of the central area of the first convolution kernel are 0. In addition, the feature enhancement module 904 may perform a second convolution operation on the feature data and the image data based on a second convolution kernel to obtain a second intermediate result, where the second convolution kernel includes only the weights of the central area, and the first convolution kernel and the second convolution kernel have the same size. Further, the feature enhancement module 904 may obtain the calculation result of the first-level center-surround convolution based on the first intermediate result and the second intermediate result.
In an implementation, the feature enhancement module 904 may also perform a third convolution operation on the calculation result of the first-level center-surround convolution based on a third convolution kernel to obtain a third intermediate result, where the values of the central area of the third convolution kernel are 0, and perform a fourth convolution operation on the calculation result of the first-level center-surround convolution based on a fourth convolution kernel to obtain a fourth intermediate result, where the fourth convolution kernel includes only the weights of the central area, and the third convolution kernel and the fourth convolution kernel have the same size. The calculation result of the second-level center-surround convolution can then be obtained from the third intermediate result and the fourth intermediate result.
In an implementation, the feature enhancement module 904 may also perform a fifth convolution operation on the calculation result of the second-level center-surround convolution based on a fifth convolution kernel to obtain a fifth intermediate result, where the weights of the central area of the fifth convolution kernel are 0, and perform a sixth convolution operation on the calculation result of the second-level center-surround convolution based on a sixth convolution kernel to obtain a sixth intermediate result, where the sixth convolution kernel includes only the weights of the central area, and the fifth convolution kernel and the sixth convolution kernel have the same size. Further, the feature enhancement module 904 may obtain the calculation result of the third-level center-surround convolution based on the fifth intermediate result and the sixth intermediate result.
The image processing apparatus shown in FIG. 9 does not process the low-quality image itself in advance; instead, during image processing, it uses the set network parameters to process the image data of the low-quality image to obtain enhanced image feature data of the low-quality image, and processes the low-quality image based on the enhanced image feature data. Because the network parameters reflect the correspondence between the feature data of low-quality images and the feature data of clear images, the processing effect on the low-quality target image is better. Specifically, the image processing apparatus provided by this embodiment of the present invention uses the feature drift property of images; the center-surround convolution mechanism constructed according to the non-classical receptive field structure of the retina and the photosensitivity principle of bipolar cells processes the shallow features of the low-quality image to obtain enhanced image feature data, and the low-quality image is then processed based on the enhanced image feature data, so that the image processing effect is better and the recognition accuracy is higher.
It can be understood that the modules in the image processing apparatus 900 shown in FIG. 9 may be located in one or more devices of the image processing apparatus shown in FIG. 1. In this embodiment of the present invention, some or all of the modules in the embodiment shown in FIG. 9 may be selected according to actual needs to achieve the objectives of the solution of this embodiment. For parts not described in detail in the embodiment of FIG. 9, refer to the related descriptions of the embodiments shown in FIG. 1 to FIG. 8.
It can be understood that the apparatus embodiments described above are merely illustrative. For example, the division into modules is merely a division by logical function, and other division manners may be used in actual implementation; for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the connections between the modules discussed in the foregoing embodiments may be electrical, mechanical, or in other forms. The modules described as separate components may or may not be physically separate, and components displayed as modules may or may not be physical modules. In addition, the functional modules in the embodiments of this application may exist independently or may be integrated into one processing module. For example, the functional modules shown in FIG. 9 may be integrated into the neural network circuit or processor shown in FIG. 1 and implemented by the corresponding devices.
An embodiment of the present invention further provides a computer program product for data processing, including a computer-readable storage medium storing program code, where instructions included in the program code are used to execute the method procedure described in any one of the foregoing method embodiments. Persons of ordinary skill in the art can understand that the foregoing storage medium includes various non-transitory machine-readable media that can store program code, such as a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a random-access memory (RAM), a solid state disk (SSD), or a non-volatile memory.
It should be noted that the embodiments provided in this application are merely illustrative. Persons skilled in the art can clearly understand that, for convenience and brevity of description, the descriptions of the embodiments each have their own emphasis; for a part not described in detail in one embodiment, refer to the related descriptions of other embodiments. The features disclosed in the embodiments of the present invention, the claims, and the accompanying drawings may exist independently or in combination. Features described in hardware form in the embodiments of the present invention may be implemented by software, and vice versa, which is not limited herein.

Claims (24)

  1. An image processing method, comprising:
    receiving image data of a target image, wherein the target image is a low-quality image;
    processing the image data based on network parameters to obtain enhanced image feature data of the target image, wherein the network parameters are used to indicate a correspondence between feature data of low-quality images and feature data of clear images; and
    processing the target image based on the enhanced image feature data.
  2. The image processing method according to claim 1, wherein the processing the image data based on network parameters to obtain enhanced image feature data of the target image comprises:
    obtaining feature data of the target image from the image data, wherein the feature data is feature data obtained by performing N layers of neural network calculation on the image data, and N is greater than 0 and less than a preset threshold;
    performing neural network calculation on the feature data and the image data based on the network parameters to obtain residual data, wherein the residual data is used to indicate a deviation between the feature data of the target image and feature data of a clear image; and
    obtaining the enhanced image feature data of the target image from the residual data and the feature data.
  3. The image processing method according to claim 2, wherein the performing neural network calculation on the feature data and the image data based on the network parameters comprises:
    performing center-surround convolution calculation on the feature data and the image data based on the network parameters.
  4. The image processing method according to claim 2, wherein the performing neural network calculation on the feature data and the image data based on the set network parameters comprises:
    performing at least a first-level center-surround convolution calculation, a second-level center-surround convolution calculation, and a third-level center-surround convolution calculation on the feature data and the image data based on the set network parameters.
  5. The image processing method according to claim 4, wherein input data of the first-level center-surround convolution calculation comprises the feature data and the image data, input data of the second-level center-surround convolution calculation comprises a calculation result of the first-level center-surround convolution calculation, and input data of the third-level center-surround convolution calculation comprises a calculation result of the second-level center-surround convolution calculation.
  6. The image processing method according to claim 4 or 5, wherein the residual data is obtained based on the calculation result of the first-level center-surround convolution calculation, the calculation result of the second-level center-surround convolution calculation, and a calculation result of the third-level center-surround convolution calculation.
  7. The image processing method according to any one of claims 4 to 6, wherein the first-level center-surround convolution calculation is used to simulate a response of a central region of the human retina to the target image, the second-level center-surround convolution calculation is used to simulate a response of a surround region of the human retina to the target image, and the third-level center-surround convolution calculation is used to simulate a response of an edge region of the human retina to the target image.
  8. The image processing method according to any one of claims 4 to 7, wherein the first-level center-surround convolution calculation comprises:
    performing a first convolution operation on the feature data and the image data based on a first convolution kernel to obtain a first intermediate result, wherein weights of a central area of the first convolution kernel are 0;
    performing a second convolution operation on the feature data and the image data based on a second convolution kernel to obtain a second intermediate result, wherein the second convolution kernel comprises only weights of a central area, and the first convolution kernel and the second convolution kernel have the same size; and
    obtaining the calculation result of the first-level center-surround convolution based on the first intermediate result and the second intermediate result.
  9. The image processing method according to any one of claims 4 to 8, wherein the second-level center-surround convolution calculation comprises:
    performing a third convolution operation on the calculation result of the first-level center-surround convolution based on a third convolution kernel to obtain a third intermediate result, wherein values of a central area of the third convolution kernel are 0;
    performing a fourth convolution operation on the calculation result of the first-level center-surround convolution based on a fourth convolution kernel to obtain a fourth intermediate result, wherein the fourth convolution kernel comprises only weights of a central area, and the third convolution kernel and the fourth convolution kernel have the same size; and
    obtaining the calculation result of the second-level center-surround convolution based on the third intermediate result and the fourth intermediate result.
  10. The image processing method according to any one of claims 4 to 9, wherein the third-level center-surround convolution calculation comprises:
    performing a fifth convolution operation on the calculation result of the second-level center-surround convolution based on a fifth convolution kernel to obtain a fifth intermediate result, wherein weights of a central area of the fifth convolution kernel are 0;
    performing a sixth convolution operation on the calculation result of the second-level center-surround convolution based on a sixth convolution kernel to obtain a sixth intermediate result, wherein the sixth convolution kernel comprises only weights of a central area, and the fifth convolution kernel and the sixth convolution kernel have the same size; and
    obtaining the calculation result of the third-level center-surround convolution based on the fifth intermediate result and the sixth intermediate result.
  11. The image processing method according to any one of claims 1 to 10, wherein the method is performed by a neural network device, and the network parameters are obtained through training.
  12. An image processing apparatus, comprising:
    a receiving module, configured to receive image data of a target image, wherein the target image is a low-quality image;
    a feature enhancement module, configured to process the image data based on network parameters to obtain enhanced image feature data of the target image, wherein the network parameters are used to indicate a correspondence between feature data of low-quality images and feature data of clear images; and
    a processing module, configured to process the target image based on the enhanced image feature data.
  13. The image processing apparatus according to claim 12, wherein the feature enhancement module is configured to:
    obtain feature data of the target image from the image data, wherein the feature data is feature data obtained by performing N layers of neural network calculation on the image data, and N is greater than 0 and less than a preset threshold;
    perform neural network calculation on the feature data and the image data based on the network parameters to obtain residual data, wherein the residual data is used to indicate a deviation between the feature data of the target image and feature data of a clear image; and
    obtain the enhanced image feature data of the target image from the residual data and the feature data.
  14. The image processing apparatus according to claim 13, wherein the feature enhancement module is configured to: perform center-surround convolution calculation on the feature data and the image data based on the network parameters.
  15. The image processing apparatus according to claim 13, wherein the feature enhancement module is configured to: perform at least a first-level center-surround convolution calculation, a second-level center-surround convolution calculation, and a third-level center-surround convolution calculation on the feature data and the image data based on the set network parameters.
  16. The image processing apparatus according to claim 15, wherein input data of the first-level center-surround convolution calculation comprises the feature data and the image data, input data of the second-level center-surround convolution calculation comprises a calculation result of the first-level center-surround convolution calculation, and input data of the third-level center-surround convolution calculation comprises a calculation result of the second-level center-surround convolution calculation.
  17. The image processing apparatus according to claim 15 or 16, wherein the residual data is obtained based on the calculation result of the first-level center-surround convolution calculation, the calculation result of the second-level center-surround convolution calculation, and a calculation result of the third-level center-surround convolution calculation.
  18. The image processing apparatus according to any one of claims 15 to 17, wherein the first-level center-surround convolution calculation is used to simulate a response of a central region of the human retina to the target image, the second-level center-surround convolution calculation is used to simulate a response of a surround region of the human retina to the target image, and the third-level center-surround convolution calculation is used to simulate a response of an edge region of the human retina to the target image.
  19. The image processing apparatus according to any one of claims 15 to 17, wherein the feature enhancement module is configured to:
    perform a first convolution operation on the feature data and the image data based on a first convolution kernel to obtain a first intermediate result, wherein weights of a central area of the first convolution kernel are 0;
    perform a second convolution operation on the feature data and the image data based on a second convolution kernel to obtain a second intermediate result, wherein the second convolution kernel comprises only weights of a central area, and the first convolution kernel and the second convolution kernel have the same size; and
    obtain the calculation result of the first-level center-surround convolution based on the first intermediate result and the second intermediate result.
  20. The image processing apparatus according to any one of claims 13 to 19, wherein the feature enhancement module is configured to:
    perform a third convolution operation on the calculation result of the first-level center-surround convolution based on a third convolution kernel to obtain a third intermediate result, wherein values of a central area of the third convolution kernel are 0;
    perform a fourth convolution operation on the calculation result of the first-level center-surround convolution based on a fourth convolution kernel to obtain a fourth intermediate result, wherein the fourth convolution kernel comprises only weights of a central area, and the third convolution kernel and the fourth convolution kernel have the same size; and
    obtain the calculation result of the second-level center-surround convolution based on the third intermediate result and the fourth intermediate result.
  21. The image processing apparatus according to any one of claims 13 to 20, wherein the feature enhancement module is configured to:
    perform a fifth convolution operation on the calculation result of the second-level center-surround convolution based on a fifth convolution kernel to obtain a fifth intermediate result, wherein weights of a central area of the fifth convolution kernel are 0;
    perform a sixth convolution operation on the calculation result of the second-level center-surround convolution based on a sixth convolution kernel to obtain a sixth intermediate result, wherein the sixth convolution kernel comprises only weights of a central area, and the fifth convolution kernel and the sixth convolution kernel have the same size; and
    obtain the calculation result of the third-level center-surround convolution based on the fifth intermediate result and the sixth intermediate result.
  22. The image processing apparatus according to any one of claims 12 to 21, wherein the method is performed by a neural network device, and the network parameters are obtained through training.
  23. An image processing apparatus, comprising a neural network configured to implement the image processing method according to any one of claims 1 to 11.
  24. A computer-readable storage medium, configured to store program code, wherein instructions included in the program code are executed by a computer to implement the image processing method according to any one of claims 1 to 11.
PCT/CN2021/099579 2020-06-12 2021-06-11 Image processing method and apparatus WO2021249523A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/064,132 US20230104428A1 (en) 2020-06-12 2022-12-09 Image processing method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010538341.XA CN113808026A (zh) 2020-06-12 2020-06-12 图像处理方法及装置
CN202010538341.X 2020-06-12

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/064,132 Continuation US20230104428A1 (en) 2020-06-12 2022-12-09 Image processing method and apparatus

Publications (1)

Publication Number Publication Date
WO2021249523A1 true WO2021249523A1 (zh) 2021-12-16

Family

ID=78846901

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/099579 WO2021249523A1 (zh) 2020-06-12 2021-06-11 Image processing method and apparatus

Country Status (3)

Country Link
US (1) US20230104428A1 (zh)
CN (1) CN113808026A (zh)
WO (1) WO2021249523A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114663714B (zh) * 2022-05-23 2022-11-04 阿里巴巴(中国)有限公司 图像分类、地物分类方法和装置

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100321524A1 (en) * 2009-06-17 2010-12-23 Altek Corporation Sharpness processing method and system for digital image
CN109087258A (zh) * 2018-07-27 2018-12-25 Sun Yat-sen University Image rain removal method and apparatus based on deep learning
CN111223046A (zh) * 2019-11-20 2020-06-02 Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences Image super-resolution reconstruction method and apparatus
CN110942436A (zh) * 2019-11-29 2020-03-31 Fudan University Image deblurring method based on image quality evaluation
CN111192200A (zh) * 2020-01-02 2020-05-22 Nanjing University of Posts and Telecommunications Image super-resolution reconstruction method based on a residual network with fused attention mechanism

Also Published As

Publication number Publication date
US20230104428A1 (en) 2023-04-06
CN113808026A (zh) 2021-12-17

Similar Documents

Publication Publication Date Title
US11402496B2 (en) Method and apparatus for enhancing semantic features of SAR image oriented small set of samples
US11798132B2 (en) Image inpainting method and apparatus, computer device, and storage medium
CN109685819B (zh) 2020-11-13 Three-dimensional medical image segmentation method based on feature enhancement
WO2020108474A1 (zh) 2020-06-04 Picture classification, and method, apparatus, device and medium for generating classification recognition model
WO2019228317A1 (zh) 2019-12-05 Face recognition method and apparatus, and computer-readable medium
CN109241982B (zh) 2021-04-06 Object detection method based on deep and shallow layer convolutional neural networks
KR20170140228A (ko) 2017-12-20 Merging top-down information in deep neural networks via bias terms
CN112529146B (zh) 2023-05-30 Neural network model training method and apparatus
CN113705769A (zh) 2021-11-26 Neural network training method and apparatus
CN108389192A (zh) 2018-08-10 Stereoscopic image comfort evaluation method based on convolutional neural network
US11223782B2 (en) Video processing using a spectral decomposition layer
CN113554084B (zh) 2024-03-01 Vehicle re-identification model compression method and system based on pruning and lightweight convolution
CN115661943A (zh) 2023-01-31 Fall detection method based on lightweight pose estimation network
Raparthi et al. Machine Learning Based Deep Cloud Model to Enhance Robustness and Noise Interference
CN106778910A (zh) 2017-05-31 Deep learning system and method based on local training
CN112766413A (zh) 2021-05-07 Bird classification method and system based on weighted fusion model
WO2021249523A1 (zh) 2021-12-16 Image processing method and apparatus
WO2023072175A1 (zh) 2023-05-04 Point cloud data processing method, neural network training method, and related devices
CN112927209A (zh) 2021-06-04 CNN-based saliency detection system and method
CN107038419A (zh) 2017-08-11 Person behavior semantic recognition method based on video sequence deep learning
CN107239827B (zh) 2020-01-14 Spatial information learning method based on artificial neural network
CN108985442B (zh) 2021-04-09 Handwriting model training method, handwritten character recognition method, apparatus, device, and medium
CN113435234A (zh) 2021-08-24 Driver visual saliency region prediction method based on bimodal video EEG data
CN112101456A (zh) 2020-12-18 Attention feature map acquisition method and apparatus, and object detection method and apparatus
CN116091844A (zh) 2023-05-09 Image data processing method and system based on edge computing

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21821977

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21821977

Country of ref document: EP

Kind code of ref document: A1