CN115205150A - Image deblurring method, apparatus, device, medium and computer program product - Google Patents

Image deblurring method, apparatus, device, medium and computer program product

Info

Publication number
CN115205150A
Authority
CN
China
Prior art keywords
image
sample
processing model
target
predicted
Prior art date
Legal status
Pending
Application number
CN202210855253.1A
Other languages
Chinese (zh)
Inventor
罗文寒 (Luo Wenhan)
Current Assignee
Tencent Technology (Beijing) Co., Ltd.
Original Assignee
Tencent Technology (Beijing) Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Tencent Technology (Beijing) Co., Ltd.
Priority to CN202210855253.1A
Publication of CN115205150A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/73 Deblurring; Sharpening
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses an image deblurring method, apparatus, device, medium and computer program product, relating to the field of image processing. The method comprises: acquiring an image to be processed, i.e., an image on which deblurring is to be performed; and inputting the image to be processed into a target image processing model for sharpness enhancement to obtain a sharpness-enhanced target image. The target image processing model is obtained by training an image processing model to be trained on sample images. During training, the image processing model performs sharpness enhancement on a sample image to obtain a first predicted image and blur enhancement on the sample image to obtain a second predicted image, and the model parameters of the target image processing model are learned by contrastive learning between the observed sharpness ordering among the sample image, the first predicted image and the second predicted image and a constraint ordering. The method can improve the processing effect of the target image processing model obtained by training.

Description

Image deblurring method, apparatus, device, medium and computer program product
Technical Field
The present application relates to the field of image processing, and in particular, to an image deblurring method, apparatus, device, medium, and computer program product.
Background
Image blurring can be caused by instability during image acquisition, or by the loss of part of the information in image data during transmission, storage and compression.
In the related art, when deblurring an image based on deep learning, a model capable of deblurring is trained by feeding it sample images labeled with image sharpness and driving it to learn the blur kernel that turns a sharp image into a blurred one.
However, in such schemes the labels of the sample images are usually produced by manual annotation, and the labeling accuracy directly affects the quality of the model obtained in downstream training.
Disclosure of Invention
The embodiments of the present application provide an image deblurring method, apparatus, device, medium and computer program product, which can improve the processing effect of image deblurring. The technical solution is as follows:
In one aspect, an image deblurring method is provided, the method comprising:
acquiring an image to be processed, where the image to be processed is an image on which deblurring is to be performed;
inputting the image to be processed into a target image processing model for sharpness enhancement to obtain a sharpness-enhanced target image;
where the target image processing model is obtained by training an image processing model to be trained on sample images; the image processing model performs sharpness enhancement on a sample image to obtain a first predicted image and blur enhancement on the sample image to obtain a second predicted image; and the model parameters of the target image processing model are obtained by contrastive learning between the sharpness ordering among the sample image, the first predicted image and the second predicted image and a constraint ordering.
In another aspect, an image deblurring apparatus is provided, the apparatus comprising:
an acquisition module, configured to acquire an image to be processed, where the image to be processed is an image on which deblurring is to be performed;
a processing module, configured to input the image to be processed into a target image processing model for sharpness enhancement to obtain a sharpness-enhanced target image;
where the target image processing model is obtained by training an image processing model to be trained on sample images; the image processing model performs sharpness enhancement on a sample image to obtain a first predicted image and blur enhancement on the sample image to obtain a second predicted image; and the model parameters of the target image processing model are obtained by contrastive learning between the sharpness ordering among the sample image, the first predicted image and the second predicted image and a constraint ordering.
In another aspect, a computer device is provided, the computer device comprising a processor and a memory, the memory storing at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by the processor to implement the image deblurring method according to any of the embodiments of the present application.
In another aspect, a computer readable storage medium is provided, in which at least one program code is stored, the program code being loaded and executed by a processor to implement the method for deblurring an image according to any of the embodiments of the present application.
In another aspect, a computer program product or computer program is provided, the computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions to cause the computer device to execute the method for deblurring an image as described in any of the above embodiments.
The technical solution provided by the present application offers at least the following beneficial effects:
During the training of an image processing model for enhancing image sharpness, sharpness enhancement and blur enhancement are performed on a sample image to obtain a first predicted image and a second predicted image respectively, and the image processing model is trained according to the sharpness ordering among the sample image, the first predicted image and the second predicted image. The model is thus trained unsupervised through contrastive learning, without labeling the sample images. This improves the training efficiency of the model, avoids the problem of label accuracy limiting the training effect, and ensures the processing effect of the trained target image processing model when it is applied to image deblurring.
Drawings
To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present application; those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic illustration of an implementation environment provided by an exemplary embodiment of the present application;
FIG. 2 is a flow chart of a method of deblurring an image provided by an exemplary embodiment of the present application;
FIG. 3 is a flow chart of a method of deblurring an image provided by an exemplary embodiment of the present application;
FIG. 4 is a schematic diagram of an image processing model to be trained provided by an exemplary embodiment of the present application;
FIG. 5 is a flow chart of a method of deblurring an image provided by an exemplary embodiment of the present application;
FIG. 6 is a schematic diagram illustrating determination of sharpness evaluation values provided by an exemplary embodiment of the present application;
FIG. 7 is a flow chart of a method of deblurring an image provided by an exemplary embodiment of the present application;
FIG. 8 is a schematic diagram of an encoder provided in an exemplary embodiment of the present application;
FIG. 9 is a schematic diagram of a decoder provided by an exemplary embodiment of the present application;
FIG. 10 is a block diagram of a method for deblurring an image according to an exemplary embodiment of the present application;
FIG. 11 is a flow chart of a method of deblurring an image provided by an exemplary embodiment of the present application;
FIG. 12 is a block diagram of an apparatus for deblurring an image according to an exemplary embodiment of the present application;
FIG. 13 is a block diagram of an apparatus for deblurring an image according to an exemplary embodiment of the present application;
fig. 14 is a schematic structural diagram of a server according to an exemplary embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
First, terms referred to in the embodiments of the present application are briefly described:
artificial intelligence: the method is a theory, method, technology and application system for simulating, extending and expanding human intelligence by using a digital computer or a machine controlled by the digital computer, sensing the environment, acquiring knowledge and obtaining the best result by using the knowledge. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
The artificial intelligence technology is a comprehensive subject and relates to the field of extensive technology, namely the technology of a hardware level and the technology of a software level. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning, automatic driving, intelligent traffic and the like.
Machine Learning (ML): the method is a multi-field cross discipline and relates to a plurality of disciplines such as probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and the like. The method specially studies how a computer simulates or realizes the learning behavior of human beings so as to acquire new knowledge or skills and reorganize the existing knowledge structure to continuously improve the performance of the computer. Machine learning is the core of artificial intelligence, is the fundamental approach for computers to have intelligence, and is applied to all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and teaching learning.
Computer Vision technology (CV) Computer Vision is a science for researching how to make a machine "see", and further refers to that a camera and a Computer are used to replace human eyes to perform machine Vision such as identification and measurement on a target, and further image processing is performed, so that the Computer processing becomes an image more suitable for human eyes to observe or is transmitted to an instrument to detect. As a scientific discipline, computer vision research-related theories and techniques attempt to build artificial intelligence systems that can capture information from images or multidimensional data. Computer vision techniques typically include image processing, image Recognition, image semantic understanding, image retrieval, optical Character Recognition (OCR), video processing, video semantic understanding, video content/behavior Recognition, three-dimensional object reconstruction, 3D techniques, virtual reality, augmented reality, map construction, autopilot, smart traffic, and the like.
With the above terms explained, the application scenarios of the embodiments of the present application are illustrated schematically; the image deblurring method can be applied to any of the following scenarios:
First, the method may be applied in camera applications. Illustratively, when a camera application captures images, the image deblurring method can perform sharpness enhancement in real time on the images captured by the camera, reducing image blur caused by shaking of the terminal device, motion of the photographed object, and the like.
Second, the method may be applied in image processing applications. Illustratively, the image deblurring functionality is provided by an image processing application; that is, the application provides an image deblurring module. A user uploads an image as the image to be processed and processes it with the module, obtaining a sharpness-enhanced target image. Typical uses include enhancing the sharpness of old photos and low-resolution images, and repairing blur introduced during transmission or compression.
Third, the method may be applied to in-vehicle terminal applications in a vehicle scenario. Illustratively, a vehicle is equipped with an in-vehicle terminal connected to an image acquisition device configured on the vehicle; when the image acquisition device captures an image of the environment and transmits it to the in-vehicle terminal, the terminal can enhance the sharpness of the captured image by the above method. The image may be captured from the vehicle's surroundings or from its interior, for example for the imaging of a rear-view mirror camera.
It should be noted that the above three scenarios are only examples; the method may also be applied to other scenarios requiring image deblurring, which is not limited in detail here.
Referring to fig. 1, a schematic diagram of an implementation environment provided by an exemplary embodiment of the present application is shown. The computer system of the embodiment comprises: terminal device 110, server 120, and communication network 130.
The terminal device 110 includes various types of devices such as a mobile phone, a tablet computer, a desktop computer, a portable notebook computer, an intelligent voice interaction device, an intelligent household appliance, an in-vehicle terminal, and an aircraft. Illustratively, a target application provided with an image deblurring function runs on the terminal device 110. Optionally, the target application may be traditional application software or cloud application software, may be implemented as an applet or an application module within a host application, or may be a web platform, which is not limited here. Optionally, the target application may be a camera application, an image processing application, a map application, or the like, without particular limitation.
The server 120 provides backend services for the target application. Illustratively, the server 120 trains the image processing model to be trained on sample images to obtain the target image processing model. The terminal device 110 sends the server 120 an image processing request containing the image to be processed, requesting the image deblurring service provided by the server 120.
In other embodiments, after the server 120 trains to obtain the target image processing model, the target image processing model may also be sent to the terminal device 110, the terminal device 110 establishes a function module capable of locally implementing image deblurring according to the target image processing model, and when performing deblurring on the image to be processed, the function module is invoked to implement the image deblurring.
After the training of the target image processing model is completed, sharpness enhancement of images can be performed. Illustratively, after receiving the image processing request, the server 120 calls the pre-trained target image processing model to enhance the sharpness of the image to be processed, obtaining a target image, and the server 120 returns the target image to the terminal device 110.
It should be noted that the server 120 may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud security, cloud computing, cloud functions, cloud storage, web services, cloud communication, middleware services, domain name services, security services, Content Delivery Network (CDN), big data, and artificial intelligence platforms.
Cloud technology is a hosting technology that unifies a series of resources such as hardware, software and networks within a wide area network or local area network to realize the computation, storage, processing and sharing of data. It is the general term for the network, information, integration, management-platform and application technologies applied in the cloud computing business model; it can form a resource pool that is used on demand, flexibly and conveniently. Cloud computing technology will become an important backbone: background services of technical network systems, such as video websites, image websites and other portals, require a large amount of computing and storage resources. With the continued development of the internet industry, each item may come to carry its own identification mark that must be transmitted to a background system for logical processing; data of different levels will be processed separately, and all kinds of industry data require strong system background support, which can only be realized through cloud computing.
In some embodiments, the server 120 described above may also be implemented as a node in a blockchain system. The Blockchain (Blockchain) is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism, an encryption algorithm and the like.
Illustratively, the terminal device 110 and the server 120 are connected through a communication network 130, where the communication network 130 may be a wired network or a wireless network, and is not limited herein.
Referring to fig. 2, a method for deblurring an image according to an embodiment of the present application is shown, in which the application process of image deblurring is schematically illustrated. The method is described as applied to the server shown in fig. 1; note that it may also be applied to a terminal device, which is not limited here. The method comprises the following steps:
step 201, acquiring an image to be processed.
Illustratively, the image to be processed is an image to be subjected to deblurring processing.
Optionally, the image to be processed may be uploaded by a terminal device, or may be read from a database by a server.
In some embodiments, when the image to be processed is uploaded by the terminal device, the image to be processed may be carried in an image processing request instructing the server to deblur the image to be processed.
Optionally, the image to be processed may also be an image frame extracted from a designated video. In one example, the server splits the designated video into frames to obtain at least two image frames and determines at least one image to be processed among them, as in the sketch below.
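For illustration, a minimal sketch of this frame-splitting step follows, using OpenCV; the function name, the sampling step and the in-memory frame list are assumptions for this sketch, not part of the patent.

```python
import cv2

def extract_frames(video_path, step=1):
    """Split a designated video into image frames; every `step`-th
    frame is kept as a candidate image to be processed."""
    frames = []
    cap = cv2.VideoCapture(video_path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:  # end of video
            break
        if idx % step == 0:
            frames.append(frame)
        idx += 1
    cap.release()
    return frames
```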
Step 202, inputting the image to be processed into a target image processing model for sharpness enhancement processing, and obtaining a sharpness-enhanced target image.
Illustratively, the server inputs the image to be processed into a functional module provided for image deblurring, and the module performs sharpness enhancement on the image to be processed by calling the target image processing model, outputting the target image.
In some embodiments, the server returns the output target image to the terminal device. In other embodiments, when the image to be processed is an image frame extracted from a designated video, the obtained target image is combined with the unprocessed image frames of the designated video to generate a target video.
Illustratively, the target image processing model is obtained by training an image processing model to be trained on sample images. The image processing model performs sharpness enhancement on a sample image to obtain a first predicted image and blur enhancement to obtain a second predicted image, and the model parameters of the target image processing model are obtained by contrastive learning between the sharpness ordering among the sample image, the first predicted image and the second predicted image and a constraint ordering.
In some embodiments, the image processing model includes a deblurring sub-network and a re-blurring sub-network. The deblurring sub-network performs sharpness enhancement on the sample image to obtain the first predicted image; the re-blurring sub-network performs blur enhancement on the sample image to obtain the second predicted image.
In some embodiments, during training, the network parameters of the deblurring sub-network and the re-blurring sub-network are adjusted by contrastive learning between the sharpness ordering and the constraint ordering among the sample image, the first predicted image and the second predicted image, finally yielding a target image processing model that contains a trained target deblurring sub-network and a trained target re-blurring sub-network.
In some embodiments, since only sharpness enhancement is needed when the target image processing model is applied, only the target deblurring sub-network is used to process the input image to be processed at application time.
In other embodiments, after the target image processing model is obtained through training, it is transmitted to the terminal device; that is, steps 201 to 202 are executed on the terminal device, which applies the target image processing model in a functional module of the target application.
In one example, after the terminal device acquires the target image processing model, the model is applied to an image processing component in a camera application; when the camera application is running and the image processing component is called, an image captured by the camera application is input into the target image processing model as the image to be processed, and a sharpness-improved target image is output.
For example, when the camera captures environmental images in real time, each captured image is input into the target image processing model in real time, and the sharpness-enhanced result is displayed in real time; or, when a selection operation on a designated image in an album is received together with a deblurring instruction for it, the designated image is input into the target image processing model, and the sharpness-enhanced target image is output and displayed. A minimal inference sketch is given below.
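As an illustration of the application stage, a minimal inference sketch follows, assuming a PyTorch implementation; the function name and the tensor convention (a (C, H, W) float image in [0, 1]) are assumptions for this sketch.

```python
import torch

def deblur_image(target_deblur_net, image):
    """Apply only the target deblurring sub-network at inference time.

    image: float tensor of shape (C, H, W) with values in [0, 1].
    Returns the sharpness-enhanced target image of the same shape."""
    target_deblur_net.eval()
    with torch.no_grad():
        target_image = target_deblur_net(image.unsqueeze(0)).squeeze(0)
    return target_image.clamp(0.0, 1.0)  # keep pixel values in range
```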
In summary, in the image deblurring method provided by the embodiment of the present application, during the training of an image processing model for enhancing image sharpness, sharpness enhancement and blur enhancement are performed on a sample image to obtain a first predicted image and a second predicted image respectively, and the image processing model is trained according to the sharpness ordering among the sample image, the first predicted image and the second predicted image. The model is thus trained unsupervised through contrastive learning, without labeling the sample images, which improves training efficiency, avoids label accuracy limiting the training effect, and ensures the processing effect when the trained target image processing model is applied to image deblurring.
Referring to fig. 3, a method for deblurring an image according to an embodiment of the present application is shown, in which a training process of an image processing model to be trained is schematically illustrated, that is, steps 301 to 304 are performed before step 201, and the method includes:
step 301, a sample image is acquired.
Illustratively, the sample image is used for training the image processing model to be trained.
Optionally, the sample image may be uploaded by a terminal device, may be acquired from a database, or may be acquired from a public data set in the internet.
Optionally, the model structure of the image processing model to be trained may be indicated by the terminal device or obtained from a database. Illustratively, before the image processing model to be trained is trained, its model parameters are initialized, yielding the initialization parameters from which model training starts.
In some embodiments, when image deblurring is realized by the image processing model, an unsupervised contrastive learning approach is adopted in the training stage to realize the training process of the model. Illustratively, the image processing model includes a deblurring sub-network for sharpness enhancement and a re-blurring sub-network for blur enhancement.
Step 302, performing sharpness enhancement processing on the sample image through the image processing model to obtain a first predicted image.
In the embodiment of the present application, the sample image is subjected to sharpness enhancement by the deblurring sub-network of the image processing model to obtain the first predicted image; in the overall training of the image processing model, the training target of the deblurring sub-network is that the sharpness of the first predicted image be higher than that of the input sample image.
Optionally, the deblurring sub-network may use at least one network structure among a Convolutional Neural Network (CNN), a Recurrent Neural Network (RNN), a Visual Geometry Group network (VGG), a Residual Network (ResNet), and the like.
In some embodiments, before performing the sharpness enhancement processing on the sample image, the sample image may be preprocessed, and the preprocessed sample image is input into the deblurring subnetwork. Alternatively, the operation of the above preprocessing may include at least one image processing operation of an edge detection process, a scale transform process, a frequency domain process, a histogram contrast enhancement process, and the like.
Step 303, performing blur enhancement processing on the sample image through the image processing model to obtain a second predicted image.
In the embodiment of the present application, the sample image is subjected to blur enhancement by the re-blurring sub-network of the image processing model to obtain the second predicted image; in the overall training of the image processing model, the training target of the re-blurring sub-network is that the sharpness of the second predicted image be lower than that of the input sample image.
Optionally, the re-blurring sub-network may use at least one network structure of CNN, RNN, VGG, ResNet, and the like. In some embodiments, the deblurring sub-network and the re-blurring sub-network may use the same network structure or different network structures, which is not limited here.
In some embodiments, the sample image may be preprocessed before the blur enhancement processing is performed on the sample image, and the preprocessed sample image is input into the re-blurring sub-network. Alternatively, the operation of the above preprocessing may include at least one image processing operation of an edge detection process, a scale transform process, a frequency domain process, a histogram contrast enhancement process, and the like.
Step 304, training the image processing model based on the ranking loss among the sample image, the first predicted image and the second predicted image to obtain the target image processing model.
Illustratively, the ranking loss indicates the difference between the sharpness ordering observed among the sample image, the first predicted image and the second predicted image and a constraint ordering, where the constraint ordering requires that the second sharpness of the first predicted image > the first sharpness of the sample image > the third sharpness of the second predicted image.
After the first and second predicted images are obtained through the deblurring sub-network and the re-blurring sub-network respectively, the sample image, the first predicted image and the second predicted image form a triplet, and contrastive learning is performed according to the constraint relations within the triplet, so that the whole model is trained and the first predicted image obtained by the deblurring sub-network, the second predicted image obtained by the re-blurring sub-network and the sample image come to satisfy the constraint conditions of the training target.
Illustratively, it follows from the training targets of the two sub-networks that the deblurring sub-network should increase the sharpness of the input image as much as possible, while the re-blurring sub-network should decrease it as much as possible. Therefore, for the triplet composed of the sample image, the first predicted image and the second predicted image, the specified constraints during training are that the second sharpness of the first predicted image is greater than the first sharpness of the sample image, and the first sharpness of the sample image is greater than the third sharpness of the second predicted image; that is, second sharpness > first sharpness > third sharpness.
The image processing model is trained toward these constraints. Illustratively, a first evaluation value indicating the sharpness of the sample image, a second evaluation value indicating the sharpness of the first predicted image, and a third evaluation value indicating the sharpness of the second predicted image are obtained; a ranking loss value is determined based on the difference between the comparison result among the first, second and third evaluation values and the specified constraint condition; and the image processing model is trained with the ranking loss value to obtain the target image processing model.
In some embodiments, when determining the difference between the comparison result among the first, second and third evaluation values and the specified constraint condition, the ranking loss value may be determined by pairwise comparison. Illustratively, for the constraint second evaluation value > first evaluation value, the corresponding loss L_1 is given by Formula I, where f_2 denotes the second evaluation value, f_1 denotes the first evaluation value, and m is a margin hyperparameter of the constraint.
Formula I: L_1 = max{0, m - (f_2 - f_1)}
Similarly, the constraints second evaluation value > third evaluation value and first evaluation value > third evaluation value correspond to losses L_2 and L_3 respectively. The ranking loss value of the triplet is determined from these three losses; the image processing model, that is, the network parameters of the deblurring sub-network and the re-blurring sub-network, is adjusted according to this value, and the target image processing model is finally obtained by training. A sketch of this loss follows.
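To make the pairwise construction concrete, a minimal sketch of the ranking loss follows, assuming a PyTorch implementation; the default margin value and the equal-weight sum of L_1, L_2 and L_3 are assumptions for this sketch.

```python
import torch

def ranking_loss(f1, f2, f3, m=0.1):
    """Pairwise margin losses enforcing the ordering f2 > f1 > f3.

    f1, f2, f3: sharpness evaluation values of the sample image, the
    first (deblurred) prediction and the second (re-blurred) prediction.
    m: the margin hyperparameter of the constraint (Formula I)."""
    l1 = torch.clamp(m - (f2 - f1), min=0.0)  # Formula I: second > first
    l2 = torch.clamp(m - (f2 - f3), min=0.0)  # L_2: second > third
    l3 = torch.clamp(m - (f1 - f3), min=0.0)  # L_3: first > third
    return l1 + l2 + l3
```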
Schematically, fig. 4 shows a schematic diagram of the image processing model to be trained provided by an exemplary embodiment of the present application. The image processing model 400 to be trained comprises a deblurring sub-network 410 and a re-blurring sub-network 420. The sample image 401 is input into the deblurring sub-network 410 and the re-blurring sub-network 420 respectively, outputting the first predicted image 411 and the second predicted image 421; the sample image 401, the first predicted image 411 and the second predicted image 421 are input into a loss determination module 430, and the image processing model 400 to be trained is trained according to the ranking loss value determined by the loss determination module 430.
In some embodiments, after the image processing model to be trained has been trained into the target image processing model, the target image processing model includes a trained target deblurring sub-network, which is used as the network for sharpness enhancement.
Schematically, the image to be processed is input into the target deblurring sub-network, and the target image is output, where the sharpness of the target image is higher than that of the image to be processed.
In summary, in the image deblurring method provided by the embodiment of the present application, during the training of an image processing model for enhancing image sharpness, sharpness enhancement and blur enhancement are performed on a sample image to obtain a first predicted image and a second predicted image respectively, and the image processing model is trained according to the sharpness ordering among the sample image, the first predicted image and the second predicted image. The model is thus trained unsupervised through contrastive learning, without labeling the sample images, which improves training efficiency, avoids label accuracy limiting the training effect, and ensures the processing effect when the trained target image processing model is applied to enhancing image sharpness.
Referring to fig. 5, a method for deblurring an image provided by an exemplary embodiment of the present application is shown, in which the determination of the ranking loss is schematically illustrated. The method comprises the following steps:
step 501, a sample image is obtained.
Illustratively, the sample image is used for training the image processing model to be trained. The image processing model to be trained comprises a deblurring sub-network and a re-blurring sub-network, where the deblurring sub-network performs sharpness enhancement and the re-blurring sub-network performs blur enhancement.
Step 502, performing sharpness enhancement processing on the sample image through a deblurring sub-network to obtain a first predicted image.
In the embodiment of the present application, the sample image is subjected to sharpness enhancement by the deblurring sub-network of the image processing model to obtain the first predicted image; in the overall training of the image processing model, the training target of the deblurring sub-network is that the sharpness of the first predicted image be higher than that of the input sample image.
Step 503, performing blur enhancement processing on the sample image through the re-blurring sub-network to obtain a second predicted image.
In the embodiment of the present application, the sample image is subjected to blur enhancement by the re-blurring sub-network of the image processing model to obtain the second predicted image; in the overall training of the image processing model, the training target of the re-blurring sub-network is that the sharpness of the second predicted image be lower than that of the input sample image.
Step 504, acquiring a first evaluation value of the sample image, a second evaluation value of the first predicted image, and a third evaluation value of the second predicted image.
In the embodiment of the present application, for the sample image, the first predicted image and the second predicted image, a first evaluation value indicating the sharpness of the sample image, a second evaluation value indicating the sharpness of the first predicted image, and a third evaluation value indicating the sharpness of the second predicted image are acquired.
Optionally, the evaluation value of image sharpness may be determined in at least one of the following ways:
first, the sharpness of an image is determined by calculating the gradient of the image.
Schematically, prior experience with image sharpness shows that edges in a sharper image are clearer and yield larger computed gradients, while edges in a blurred image are fuzzy and yield smaller gradients. Hence, for two images with the same content, one sharp and one blurred, the gradient information computed from the sharper image is richer than that computed from the blurred image, i.e., the gradients are larger. The sharpness of an image can therefore be indicated by computing its gradient information.
Optionally, when computing the gradient information of an image, the gradient in a single specified direction may be computed and its gradient information taken as the image sharpness, or gradients in multiple directions may be computed and the corresponding gradient information combined to determine the sharpness. Illustratively, a gradient is a vector describing the rate of change of a pixel in some direction, and gradient information refers to image gradients that, after suitable intermediate processing, can serve as data for determining image sharpness. When sharpness is determined from the gradient in a single direction, that gradient can directly serve as the gradient information, since each pixel contains only one directional gradient. When sharpness is determined from gradients in multiple directions, each pixel contains gradients in several directions; the gradient in the i-th direction may undergo intermediate processing to obtain the gradient information for that direction, and the sharpness is then determined from the gradient information of every direction. Optionally, the intermediate processing may include at least one of: taking the absolute value of the gradient in the i-th direction, taking its square, and determining its mapped (normalized) value across the multiple directions.
Taking the determination of the evaluation value from gradient information in two perpendicular directions as an example: first gradient information of the sample image in a first direction and second gradient information in a second direction are determined, the two directions being perpendicular to each other, and the first evaluation value is determined from the first and second gradient information. Likewise, third and fourth gradient information of the first predicted image in the two directions determine the second evaluation value, and fifth and sixth gradient information of the second predicted image in the two directions determine the third evaluation value.
Illustratively, the image gradient is computed as in Formula II and Formula III, where Formula II gives the gradient G_y of image I in the y direction, x is the direction perpendicular to y, and Formula III gives the gradient G_x of image I in the x direction.
Formula II: G_y = I(x, y+1) - I(x, y-1)
Formula III: G_x = I(x+1, y) - I(x-1, y)
Illustratively, when the evaluation value is determined from gradient information in two directions, the overall gradient information can be determined by Formula IV and used as the evaluation value of image sharpness (the published text renders Formula IV only as an embedded image; the standard gradient-magnitude form is assumed here):
Formula IV: G = sqrt(G_x^2 + G_y^2)
In one example, the first direction and the second direction may be the horizontal and vertical directions, respectively. A sketch of this evaluation follows.
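A minimal sketch of the gradient-based evaluation follows, assuming a PyTorch implementation; averaging the gradient magnitude over all pixels to obtain a single evaluation value per image is an assumption for this sketch, since the patent leaves the exact intermediate processing open.

```python
import torch

def sharpness_eval(img):
    """Gradient-based sharpness evaluation value (Formulas II to IV).

    img: float tensor of shape (B, C, H, W).
    Returns one evaluation value per image: the mean gradient
    magnitude sqrt(Gx^2 + Gy^2) over all interior pixels."""
    # Central differences; a one-pixel border is cropped so that the
    # two difference maps align.
    gy = img[:, :, 2:, 1:-1] - img[:, :, :-2, 1:-1]  # I(x, y+1) - I(x, y-1)
    gx = img[:, :, 1:-1, 2:] - img[:, :, 1:-1, :-2]  # I(x+1, y) - I(x-1, y)
    g = torch.sqrt(gx ** 2 + gy ** 2 + 1e-12)        # Formula IV, eps for stability
    return g.mean(dim=(1, 2, 3))
```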
Second, the sharpness of the image is determined by a sharpness evaluation sub-network trained in advance.
Illustratively, the sharpness of the image can also be determined in a data-driven way: the sharpness of the input image is predicted by a pre-trained sharpness evaluation sub-network, which outputs the corresponding evaluation value.
Schematically, fig. 6 shows the determination of sharpness evaluation values provided by an exemplary embodiment of the present application: an image to be evaluated (the sample image, the first predicted image, or the second predicted image) 601 is input to a sharpness evaluation sub-network 610, which outputs the sharpness evaluation value 602 corresponding to the image 601; the sharpness evaluation sub-network 610 is pre-trained on large-scale data 603.
In some embodiments, feature extraction is performed on the sample image, the first predicted image and the second predicted image to obtain a first feature representation, a second feature representation and a third feature representation respectively; the three feature representations are input into the sharpness evaluation sub-network, which predicts the distribution of the feature representations in feature space and outputs the first evaluation value of the sample image, the second evaluation value of the first predicted image, and the third evaluation value of the second predicted image.
Optionally, the sharpness evaluation sub-network may be any network usable for image sharpness estimation, such as a CNN, RNN, ResNet, or pulse-coupled neural network. A small regressor sketch follows.
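For illustration, a minimal sketch of a data-driven sharpness evaluation sub-network follows, assuming a small CNN regressor in PyTorch; the architecture, channel widths and pooling are assumptions, since the patent only requires some network usable for sharpness estimation.

```python
import torch.nn as nn

class SharpnessEvaluator(nn.Module):
    """A small CNN that regresses an image to a scalar sharpness
    evaluation value (hypothetical architecture)."""
    def __init__(self, in_ch=3, ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(ch, 1),
        )

    def forward(self, x):
        # x: (B, C, H, W) -> (B,) evaluation values
        return self.net(x).squeeze(-1)
```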
Step 505, determining a ranking loss value based on the difference between the comparison result among the first, second and third evaluation values and the specified constraint condition.
After the first and second predicted images are obtained through the deblurring sub-network and the re-blurring sub-network respectively, the sample image, the first predicted image and the second predicted image form a triplet, and contrastive learning is performed according to the constraint relations within the triplet, so that the whole model is trained and the first predicted image, the second predicted image and the sample image come to satisfy the constraint conditions of the training target.
Illustratively, the evaluation-value ordering that the specified constraint condition expects the image processing model to reach is second evaluation value > first evaluation value > third evaluation value. To determine the ranking loss, this can be separated into three pairwise relations: between the first and second evaluation values, between the first and third, and between the second and third. The loss values corresponding to these three relations jointly determine the ranking loss value of the whole model, as described in step 304 above, and details are not repeated here.
Step 506, training the network parameters of the deblurring sub-network and the re-blurring sub-network based on the ranking loss value to obtain the target deblurring sub-network and the target re-blurring sub-network.
In some embodiments, in response to convergence of the ranking loss value, training of the deblurring sub-network and the re-blurring sub-network is determined to be complete, and the model parameters at convergence are used as the network parameters of the target deblurring sub-network and the target re-blurring sub-network. A sketch of one training step follows.
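Putting steps 502 to 506 together, a sketch of one training step follows, assuming PyTorch and the sharpness_eval and ranking_loss helpers sketched above; a single optimizer covering the parameters of both sub-networks is an assumption for this sketch.

```python
import torch

def train_step(deblur_net, reblur_net, optimizer, sample, m=0.1):
    """One contrastive training step over a batch of sample images.

    deblur_net / reblur_net: the two sub-networks of the image
    processing model; sharpness_eval and ranking_loss are the
    helper functions sketched above."""
    pred1 = deblur_net(sample)     # first predicted image (sharper)
    pred2 = reblur_net(sample)     # second predicted image (blurrier)
    f1 = sharpness_eval(sample)    # first evaluation value
    f2 = sharpness_eval(pred1)     # second evaluation value
    f3 = sharpness_eval(pred2)     # third evaluation value
    loss = ranking_loss(f1, f2, f3, m).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()  # track until the ranking loss value converges
```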
Step 507, inputting the image to be processed into the target deblurring sub-network for sharpness enhancement to obtain a sharpness-enhanced target image.
Schematically, the image to be processed is input into the target deblurring sub-network, and the target image is output, where the sharpness of the target image is higher than that of the image to be processed.
Optionally, the image to be processed may be uploaded by a terminal device, or may be read from a database by a server.
In summary, in the image deblurring method provided by the embodiment of the present application, during the training of an image processing model for enhancing image sharpness, sharpness enhancement and blur enhancement are performed on a sample image to obtain a first predicted image and a second predicted image respectively, and the image processing model is trained according to the sharpness ordering among the sample image, the first predicted image and the second predicted image. The model is thus trained unsupervised through contrastive learning, without labeling the sample images, which improves training efficiency, avoids label accuracy limiting the training effect, and ensures the processing effect when the trained target image processing model is applied to enhancing image sharpness.
Referring to fig. 7, a method for deblurring an image according to an exemplary embodiment of the present application is shown, in which the image processing model in the training phase includes a deblurring sub-network, a re-blurring sub-network, a prior-based calculation module, a data-driven calculation module and a loss determination module, and the deblurring and re-blurring sub-networks both use an encoder-decoder network structure. The method comprises the following steps:
step 701, obtaining a sample image.
Illustratively, the sample image is used for training the image processing model to be trained.
Illustratively, before the sample image is subjected to image processing through the deblurring subnetwork and the re-blurring subnetwork, the sample image needs to be subjected to feature extraction, and feature representations corresponding to the sample image are input to the deblurring subnetwork and the re-blurring subnetwork for prediction, so that a first prediction image and a second prediction image are obtained.
In some embodiments, in order to achieve better sharpness enhancement or blur enhancement effect when the image processing is performed through the deblurring subnetwork and the re-blur subnetwork, before the sample image is input into the deblurring subnetwork and the re-blur subnetwork, the sharpness enhancement and the blur enhancement are assisted based on the image semantics by extracting the image semantic features of the sample image as the input of the deblurring subnetwork and the re-blur subnetwork.
Schematically, performing semantic classification on image semantics of a sample image in at least two semantic spaces to respectively obtain image semantic features corresponding to the semantic spaces; performing feature connection on the semantic features of the images to obtain sample image features; inputting the sample image characteristics into a deblurring sub-network, and guiding the sharpness enhancement of the sample image characteristics by the deblurring sub-network based on the image semantics of the sample image to obtain a first predicted image; and inputting the sample image features into a double-fuzzy sub-network, and guiding the blurring degree enhancement of the sample image features by the double-fuzzy sub-network based on the image semantics of the sample image to obtain a second predicted image.
In some embodiments, a visual feature map of a sample image may be extracted by using CNN, and semantic features of images corresponding to different semantic spaces are obtained from the visual feature map through convolution operation, so as to implement separation of semantics corresponding to different contents in an image. Optionally, the semantic space may be divided according to statement objects corresponding to the content, and the semantic space may include a subject semantic space and an object semantic space, or the semantic space may also be an object semantic space, a relationship semantic space, or the like.
Schematically, image semantic features corresponding to the sample image are respectively determined according to at least two semantic spaces, so that entity semantic structure information of the sample image with the space is represented. In some embodiments, the sample image features for the input deblurring subnetwork and the re-blurring subnetwork are obtained by combining the image semantic features of different semantic spaces.
Specifically, when the sample image features are obtained from the image semantic features, feature extraction may first be performed on the sample image through a CNN to obtain a visual feature map. The visual feature map is then convolved into the at least two semantic spaces; taking the case where the semantic spaces include a subject semantic space and an object semantic space as an example, convolution feature maps in the subject semantic space and the object semantic space are obtained respectively. The convolution feature maps corresponding to the subject semantic space and the object semantic space are taken as leaf nodes of a feature tree, and the leaf nodes are convolved and merged into the feature map of the corresponding parent node to obtain an overall feature map corresponding to the sample image. Finally, the overall feature map is converted through a nonlinear function, an average pooling operation and a fully connected operation to obtain the sample image features, which may take the form of a matrix or a vector.
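As an aid to understanding, the following is a minimal PyTorch sketch of this semantic feature extraction pipeline. The backbone depth, the channel counts, the choice of tanh as the nonlinear function, and the restriction to a subject space and an object space are illustrative assumptions, not values fixed by this application.

```python
import torch
import torch.nn as nn

class SemanticFeatureExtractor(nn.Module):
    """Sketch of the feature-tree extraction described above (all sizes assumed)."""

    def __init__(self, in_channels=3, feat_channels=64, out_dim=256):
        super().__init__()
        # CNN that produces the visual feature map of the sample image
        self.backbone = nn.Sequential(
            nn.Conv2d(in_channels, feat_channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_channels, feat_channels, 3, padding=1), nn.ReLU(),
        )
        # One convolution per semantic space: the leaf nodes of the feature tree
        self.subject_conv = nn.Conv2d(feat_channels, feat_channels, 1)
        self.object_conv = nn.Conv2d(feat_channels, feat_channels, 1)
        # Convolution that merges the leaf nodes into the parent-node feature map
        self.merge_conv = nn.Conv2d(2 * feat_channels, feat_channels, 1)
        self.pool = nn.AdaptiveAvgPool2d(1)          # average pooling operation
        self.fc = nn.Linear(feat_channels, out_dim)  # fully connected operation

    def forward(self, x):
        vis = self.backbone(x)  # visual feature map
        leaves = torch.cat([self.subject_conv(vis), self.object_conv(vis)], dim=1)
        overall = torch.tanh(self.merge_conv(leaves))  # overall map + nonlinearity
        return self.fc(self.pool(overall).flatten(1))  # sample image features
```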
Illustratively, the process of extracting image semantics as features is implemented by a semantic feature extraction sub-network obtained through pre-training. In the training process of the semantic feature extraction sub-network, supervised training may be performed according to labels corresponding to the feature tree, and the supervised training is implemented by calculating a cross entropy loss over the leaf nodes.
Step 7021, the sample image is input to a first encoder for encoding, so as to obtain a first encoding characteristic.
Illustratively, the deblurring subnetwork in the embodiment of the present application adopts an encoding-decoding structure, that is, the deblurring subnetwork is composed of a first encoder for extracting features of a sample image and encoding the features, and a first decoder for decoding the extracted features and returning the features to an image space to output a first predicted image.
In some embodiments, the structure of the first encoder may include at least one convolution layer, and the sample image is convolved by the first encoder, such that the spatial resolution of the feature map of the sample image is reduced while the number of channels is increased.
In one example, as shown in fig. 8, which shows a schematic diagram of an encoder provided by an exemplary embodiment of the present application, an encoder 800 includes a first convolutional layer 810 and a second convolutional layer 820, after a sample image 801 is input to the first convolutional layer 810, an intermediate feature 802 with reduced resolution is obtained, and the intermediate feature 802 obtains a first encoding feature 803 with an increased number of channels through the second convolutional layer 820.
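A minimal sketch of such a two-layer encoder is given below; the stride-2 downsampling and the concrete channel counts are assumptions chosen only to illustrate the resolution reduction and channel increase.

```python
import torch.nn as nn

class FirstEncoder(nn.Module):
    """Sketch of the encoder 800 in fig. 8 (layer sizes assumed)."""

    def __init__(self, in_channels=3, mid_channels=32, out_channels=64):
        super().__init__()
        # First convolutional layer 810: stride 2 halves the spatial resolution
        self.conv1 = nn.Conv2d(in_channels, mid_channels, 3, stride=2, padding=1)
        # Second convolutional layer 820: raises the number of channels
        self.conv2 = nn.Conv2d(mid_channels, out_channels, 3, padding=1)
        self.act = nn.ReLU()

    def forward(self, sample_image):
        intermediate = self.act(self.conv1(sample_image))  # intermediate feature 802
        return self.act(self.conv2(intermediate))          # first encoding feature 803
```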
Step 7022, the first encoding characteristic is input to a first decoder for decoding to obtain a first predicted image.
In some embodiments, the network structure of the first decoder may be symmetrical to that of the first encoder. Illustratively, the first decoder raises the spatial resolution of the feature map by deconvolution while reducing the number of feature channels, and finally decodes the features back to the image space corresponding to the sample image to obtain the first predicted image.
In an example, as shown in fig. 9, which illustrates a schematic diagram of a decoder provided by an exemplary embodiment of the present application, the decoder 900 includes a third convolutional layer 910 and a fourth convolutional layer 920; after the first encoding feature 901 is input to the third convolutional layer 910, an intermediate feature 902 with raised resolution is obtained, and the intermediate feature 902 is passed through the fourth convolutional layer 920 to obtain the first predicted image 903 with a reduced number of channels.
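Correspondingly, a sketch of the symmetric decoder might look as follows, assuming a transposed convolution implements the deconvolution and the channel counts mirror the encoder sketch above.

```python
import torch.nn as nn

class FirstDecoder(nn.Module):
    """Sketch of the decoder 900 in fig. 9 (layer sizes assumed)."""

    def __init__(self, in_channels=64, mid_channels=32, out_channels=3):
        super().__init__()
        # Third convolutional layer 910: deconvolution doubles the resolution
        self.deconv = nn.ConvTranspose2d(in_channels, mid_channels, 4,
                                         stride=2, padding=1)
        # Fourth convolutional layer 920: reduces channels back to image space
        self.conv = nn.Conv2d(mid_channels, out_channels, 3, padding=1)
        self.act = nn.ReLU()

    def forward(self, encoding_feature):
        intermediate = self.act(self.deconv(encoding_feature))  # feature 902
        return self.conv(intermediate)                # first predicted image 903
```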
Step 7031, the sample image is input to a second encoder for encoding, so as to obtain a second encoding characteristic.

Illustratively, the re-blurring subnetwork in the embodiment of the present application adopts an encoding-decoding structure, that is, the re-blurring subnetwork is composed of a second encoder and a second decoder, where the second encoder is used for extracting and encoding the features of the sample image, and the second decoder is used for decoding the extracted features back to the image space to output the second predicted image.
Alternatively, the second encoder may be identical in structure to the first encoder. Illustratively, the sample image is convolved by the second encoder, so that the spatial resolution of the feature map of the sample image is reduced while the number of channels is increased.
Step 7032, the second encoding characteristic is input to a second decoder for decoding, and a second predicted image is obtained.
Alternatively, the second decoder may have the same structure as the first decoder. Illustratively, the second decoder raises the spatial resolution of the feature map by deconvolution while reducing the number of feature channels, and finally decodes the feature map back to the image space corresponding to the sample image to obtain the second predicted image.
Step 7041, the sample image, the first predicted image, and the second predicted image are input to the prior computing module, and a first evaluation value corresponding to the sample image, a second evaluation value corresponding to the first predicted image, and a third evaluation value corresponding to the second predicted image are obtained, respectively.
Illustratively, the a priori computation module determines the sharpness of the image by computing gradients of the image. Optionally, when calculating gradient information of an image, calculating a gradient in a specified direction may be selected, and the gradient information in the specified direction is taken as the definition of the image; it is also possible to calculate gradients in a plurality of directions and determine an evaluation value corresponding to the sharpness of the image by combining information of the gradients in the plurality of directions.
Taking determination of the evaluation value from gradient information in two perpendicular directions as an example, schematically, first gradient information in a first direction of the sample image and second gradient information in a second direction are determined, wherein the first direction and the second direction are perpendicular to each other; determining a first evaluation value based on the first gradient information and the second gradient information; determining third gradient information of the first prediction image in the first direction and fourth gradient information in the second direction; determining a second evaluation value based on the third gradient information and the fourth gradient information; determining fifth gradient information of the second prediction image in the first direction and sixth gradient information in the second direction; the third evaluation value is determined based on the fifth gradient information and the sixth gradient information.
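A minimal sketch of such a gradient-based evaluation value follows; using horizontal and vertical finite differences and the mean absolute gradient as the statistic is an assumption, since the application only requires that the gradients in the two perpendicular directions be combined.

```python
import torch

def gradient_sharpness(img: torch.Tensor) -> torch.Tensor:
    """Prior evaluation value: sharper images have larger average gradients."""
    # First-direction (horizontal) and second-direction (vertical) gradients
    grad_x = (img[..., :, 1:] - img[..., :, :-1]).abs()
    grad_y = (img[..., 1:, :] - img[..., :-1, :]).abs()
    # Combine the gradient information of the two perpendicular directions
    return grad_x.mean() + grad_y.mean()

# e.g. first/second/third evaluation values for the sample and predicted images:
# e1, e2, e3 = map(gradient_sharpness, (sample, pred_sharp, pred_blur))
```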
Step 7042, the sample image, the first predicted image, and the second predicted image are input to the data-driven computation module, and a fourth evaluation value corresponding to the sample image, a fifth evaluation value corresponding to the first predicted image, and a sixth evaluation value corresponding to the second predicted image are obtained, respectively.
Illustratively, the data-driven computation module determines the sharpness of an image through a sharpness evaluation sub-network obtained by pre-training.
In some embodiments, feature extraction is performed on the sample image, the first prediction image and the second prediction image, so as to obtain a first feature representation corresponding to the sample image, a second feature representation corresponding to the first prediction image and a third feature representation corresponding to the second prediction image respectively; and inputting the first feature representation, the second feature representation and the third feature representation into a definition evaluation sub-network respectively, predicting the distribution condition of the feature representations in the feature space, and outputting a fourth evaluation value of the sample image, a fifth evaluation value of the first predicted image and a sixth evaluation value of the second predicted image.
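A sketch of such a sharpness evaluation sub-network is shown below; since the application does not fix its architecture, the convolutional feature extractor and the linear scoring head are assumptions.

```python
import torch.nn as nn

class SharpnessEvaluator(nn.Module):
    """Sketch of the pre-trained sharpness evaluation sub-network (sizes assumed)."""

    def __init__(self, in_channels=3, feat_dim=128):
        super().__init__()
        # Feature extraction producing the feature representation of an image
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        # Head scoring where the representation falls in the feature space
        self.score = nn.Linear(feat_dim, 1)

    def forward(self, img):
        return self.score(self.features(img))  # scalar evaluation value
```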
Step 705, determining a ranking loss value in conjunction with a specified constraint condition based on the comparison results among the first evaluation value, the second evaluation value and the third evaluation value, and the comparison results among the fourth evaluation value, the fifth evaluation value and the sixth evaluation value.
Illustratively, the ranking results of the evaluation values desired to be reached by the specified constraint condition are the second evaluation value > the first evaluation value > the third evaluation value, and the fifth evaluation value > the fourth evaluation value > the sixth evaluation value, so the ranking loss value can be determined collectively by calculating the loss between the magnitude relationship between each of the first evaluation value, the second evaluation value, and the third evaluation value and the specified constraint condition, and the loss between the magnitude relationship between each of the fourth evaluation value, the fifth evaluation value, and the sixth evaluation value and the specified constraint condition. Alternatively, the above-described ranking loss value may be the sum of the losses corresponding to the two sets of evaluation values, or the average of the losses corresponding to the two sets of evaluation values.
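A minimal sketch of the loss for one group of evaluation values, assuming a hinge (margin) formulation of the ordering constraint, is given below; the two groups would then be combined by sum or average as just described.

```python
import torch

def ordering_loss(e_mid, e_high, e_low, margin=0.1):
    """Penalize violations of e_high > e_mid > e_low (margin value assumed)."""
    # Loss for "predicted sharp image > sample image"
    loss_high = torch.clamp(margin - (e_high - e_mid), min=0.0)
    # Loss for "sample image > predicted blurred image"
    loss_low = torch.clamp(margin - (e_mid - e_low), min=0.0)
    return (loss_high + loss_low).mean()

# Ranking loss over both evaluation groups, here taken as the sum:
# rank_loss = ordering_loss(e1, e2, e3) + ordering_loss(e4, e5, e6)
```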
And step 706, training the image processing model based on the ranking loss value to obtain a target image processing model.

In some embodiments, in response to convergence of the ranking loss value, it is determined that training of the image processing model is completed, and the target image processing model is obtained, where the target image processing model includes a target deblurring subnetwork and a target re-blurring subnetwork.
In one example, as shown in fig. 10, which illustrates a framework of the image deblurring method provided by an exemplary embodiment of the present application, a sample image 1001 is respectively input into a deblurring subnetwork 1010 including a first encoder 1011 and a first decoder 1012, and a re-blurring subnetwork 1020 including a second encoder 1021 and a second decoder 1022, and a first predicted image 1002 and a second predicted image 1003 are output. In addition to the direct connection, skip connections (Skip Connections) are established between the first encoder 1011 and the first decoder 1012, that is, the output of each convolutional layer in the first encoder 1011 is passed by a skip connection to a convolutional layer in the first decoder 1012. Taking the case where the first encoder 1011 includes the first convolutional layer and the second convolutional layer and the first decoder 1012 includes the third convolutional layer and the fourth convolutional layer as an example, the output of the first convolutional layer is fed to the fourth convolutional layer through a skip connection in addition to being passed onward through the direct connection, which reduces the information loss of the features input to the fourth convolutional layer and avoids redundant learning. The sample image 1001, the first predicted image 1002 and the second predicted image 1003 are input to the prior calculation module 1030 and the data-driven calculation module 1040, and the output evaluation values indicating image sharpness are used by the loss determination module 1050 to determine a ranking loss value, so that the entire model is trained according to the ranking loss value.
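Putting the pieces together, one training iteration over this framework might be sketched as follows; it reuses the hypothetical helpers from the sketches above, and the optimizer setup and exact wiring are assumptions.

```python
import torch

def train_step(sample, deblur_net, reblur_net, evaluator, optimizer):
    """One iteration of the fig. 10 framework (helpers are the sketches above)."""
    pred_sharp = deblur_net(sample)  # first predicted image
    pred_blur = reblur_net(sample)   # second predicted image

    # Prior calculation module: first/second/third evaluation values
    e1, e2, e3 = (gradient_sharpness(x) for x in (sample, pred_sharp, pred_blur))
    # Data-driven calculation module: fourth/fifth/sixth evaluation values
    e4, e5, e6 = (evaluator(x).mean() for x in (sample, pred_sharp, pred_blur))

    # Loss determination module: ranking loss over both groups
    loss = ordering_loss(e1, e2, e3) + ordering_loss(e4, e5, e6)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```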
And step 707, inputting the image to be processed into a target deblurring sub-network in the target image processing model for performing definition enhancement processing to obtain a definition-enhanced target image.
Schematically, an image to be processed is input into the target deblurring sub-network, and a target image is obtained through output, wherein the definition of the target image is higher than that of the image to be processed.
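At inference time only the target deblurring sub-network is needed; a minimal usage sketch with placeholder names follows.

```python
import torch

def enhance_sharpness(target_deblur_net: torch.nn.Module,
                      image_to_process: torch.Tensor) -> torch.Tensor:
    """Apply the trained target deblurring sub-network to an image to be processed."""
    target_deblur_net.eval()
    with torch.no_grad():
        return target_deblur_net(image_to_process)  # sharpness-enhanced target image
```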
In summary, according to the image deblurring method provided in the embodiment of the present application, in the training process of an image processing model for enhancing image sharpness, sharpness enhancement processing and blur enhancement processing are performed on a sample image respectively to obtain a first predicted image and a second predicted image, and the image processing model is trained according to the sharpness ordering among the sample image, the first predicted image and the second predicted image. The model is thus trained unsupervised in a contrast learning manner without labeling the sample images, which improves the training efficiency of the model and avoids the problem of labeling accuracy affecting the training effect, thereby improving the processing effect when the trained target image processing model is applied to image sharpness enhancement.
Referring to fig. 11, a method for deblurring an image according to an exemplary embodiment of the present application is shown, and in the embodiment of the present application, the method is schematically illustrated as being applied to a vehicle-mounted scene. The method comprises the following steps:
step 1101, acquiring an image to be processed acquired by the vehicle-mounted equipment.
In the embodiment of the application, a target image processing model obtained through training is applied to an image processing process in a vehicle-mounted scene, wherein a training target of the target image processing model in a training stage comprises deblurring processing on an image with motion blur. Illustratively, in the training phase of the image processing model, the input sample image includes image content in which a moving object exists.
Illustratively, the image to be processed is an image acquired by the vehicle-mounted device from the surrounding environment, and in some embodiments, the image to be processed includes a target area with motion blur.
In an example, the image to be processed may come from a driving scene: when the environment around the vehicle is displayed through an electronic rearview mirror, the image capturing device corresponding to the electronic rearview mirror captures images of the environment around the vehicle to obtain continuous environment image frames, and the environment image frames are used as images to be processed for sharpness enhancement processing.
In another example, the image to be processed may come from a map data generation scene. Illustratively, an information collection vehicle collects the environment along a road and then generates map data for a specified target; for example, electronic eyes along the road are identified and aggregated so as to provide the map data corresponding to the electronic eyes for a map application. Illustratively, the collection device carried on the information collection vehicle captures environment images, and before map data is generated from an environment image, the definition of the captured environment image can be enhanced through the target image processing model, so as to improve the data accuracy during subsequent map data generation.
Step 1102, inputting the image to be processed into a target image processing model, and performing definition enhancement processing on the target area by the target image processing model to obtain a target image.
Optionally, the vehicle-mounted terminal uploads the acquired image to be processed to a server through a network, and the server calls a target image processing model to perform definition enhancement processing on the image to be processed; or, the vehicle-mounted terminal is provided with a component comprising a target image processing model, and the vehicle-mounted terminal carries out definition enhancement processing on the image to be processed by calling the component.
For example, when the electronic rearview mirror displays the vehicle surroundings, the image capturing device corresponding to the electronic rearview mirror inputs the captured image frames of the surroundings to a chip or processing unit equipped with the target image processing model component, sharpness enhancement processing is performed to obtain target image frames with enhanced sharpness, and the vehicle surroundings are displayed on the electronic rearview mirror according to the target image frames.
As another example, when the information collection vehicle collects the along-road environment, the image capturing device corresponding to the information collection vehicle captures the surrounding environment at a specified collection frequency and transmits the captured environment images to the server to generate map data. After receiving an environment image, the server first enhances its definition through the target image processing model to obtain a target image, and the target image is used for generating map data, for example, aggregating the electronic eyes in a plurality of target images to establish a relationship between the electronic eyes in the images and electronic eye entities in the real world.
In summary, according to the image deblurring method provided in the embodiment of the present application, the trained target image processing model is applied to the vehicle-mounted scene, and the definition of the environmental image acquired in the vehicle-mounted scene is enhanced, so that the definition of the environmental image applied in the vehicle-mounted scene is improved, for example, the definition of the display content in the electronic rearview mirror is improved, or the definition of the image used for generating the map data is improved to improve the map data generation accuracy.
It should be noted that the method for deblurring an image provided in the embodiment of the present application can also be applied to camera applications. Illustratively, in a scene of camera application, the camera application may input an acquired image as an image to be processed into a target image processing model for sharpness enhancement, so as to obtain a target image. Optionally, the target image processing model may be set in a server, or may be set in a terminal device corresponding to a camera application.
In some embodiments, in order to improve the effect of the target image processing model when performing sharpness enhancement, model training may be performed according to the shooting habits of a photographer when shooting through a camera application in the terminal device. That is, the training target of the image processing model to be trained includes learning the shooting habits corresponding to the target account and applying the shooting habits to image sharpness enhancement, where the shooting habits are used for indicating the factors that cause image blur when the target account performs image shooting. In one example, it is determined from the historical shot images that shake blur frequently occurs when the target account shoots, so that the shake habit included in the shooting habits of the target account is learned; for example, a photographer often shakes while holding the terminal device during shooting, which causes shake blur in the images captured by the camera. In another example, it is determined from the historical shot images that the target account frequently shoots under low brightness, so that the habit of shooting in low-brightness environments is learned; for example, when a photographer shoots at night, the low brightness of the environment reduces the sharpness of the image. In that case, when the image processing model is trained, the blur kernel corresponding to low-brightness blur is learned through the low-brightness historical shot images, so that the trained deblurring network is better at processing images with low-brightness blur, providing a more targeted image processing function for the target account.
Illustratively, when sufficient authorization is obtained from the target account, the terminal device sends historical shot images to the server, where a historical shot image is an image acquired by the target account through the terminal device during a historical period. The server receives the historical shot images sent by the target account, trains the image processing model to be trained by taking the historical shot images as sample images, and uses the trained target image processing model as the processing model applied when the target account enhances image sharpness through the camera application. Optionally, the target image processing model is sent to the terminal device corresponding to the target account, or the target image processing model is stored in correspondence with the account identifier of the target account.
Optionally, the training frequency for performing model training through the history captured images of the target account may be specified by the target account, or may be preset by the system, for example, the target image processing model is updated at a frequency of training once a month.
Referring to fig. 12, a block diagram of an apparatus for deblurring an image according to an exemplary embodiment of the present application is shown, where the apparatus includes the following modules:
an obtaining module 1210, configured to obtain an image to be processed, where the image to be processed is an image to be deblurred;
the processing module 1220 is configured to input the image to be processed to a target image processing model for performing sharpness enhancement processing, so as to obtain a sharpness-enhanced target image;
the target image processing model is obtained by training an image processing model to be trained through a sample image, the image processing model is used for respectively carrying out definition enhancement processing on the sample image to obtain a first predicted image and carrying out blur enhancement processing to obtain a second predicted image, and model parameters of the target image processing model are obtained by carrying out contrast learning on definition ordering conditions and constraint ordering conditions among the sample image, the first predicted image and the second predicted image.
In some alternative embodiments, as shown in fig. 13, the apparatus further comprises a training module 1230;
the acquiring module 1210 is further configured to acquire the sample image;
the training module 1230 includes:
the first processing sub-module 1231 is configured to perform sharpness enhancement processing on the sample image through the image processing model to obtain the first predicted image;
the second processing sub-module 1232 is configured to perform blur enhancement processing on the sample image through the image processing model to obtain the second predicted image;
a training sub-module 1233, configured to train the image processing model based on a ranking loss between the sample image, the first predicted image, and the second predicted image, so as to obtain a target image processing model, where the ranking loss is used to indicate a difference between a ranking condition of image sharpness and a constraint ranking condition between the sample image, the first predicted image, and the second predicted image.
In some optional embodiments, the training sub-module 1233 further includes:
a determination unit 1234 configured to acquire a first evaluation value of the sample image indicating the degree of sharpness of the sample image, a second evaluation value of the first predicted image indicating the degree of sharpness of the first predicted image, and a third evaluation value of the second predicted image indicating the degree of sharpness of the second predicted image;
a loss determining unit 1235 for determining an ordering loss value based on a difference between a result of comparison among the first evaluation value, the second evaluation value, and the third evaluation value and a specified constraint condition indicating that the second definition of the first predicted image is larger than the first definition of the sample image and the first definition of the sample image is larger than the third definition of the second predicted image;
a training unit 1236, configured to train the image processing model based on the ranking loss value, so as to obtain the target image processing model.
In some optional embodiments, the determining unit 1234 is further configured to determine first gradient information of the sample image in a first direction and second gradient information in a second direction, where the first direction and the second direction are perpendicular to each other;
the determining unit 1234 is further configured to determine the first evaluation value based on the first gradient information and the second gradient information;
the determining unit 1234 is further configured to determine third gradient information of the first prediction image in the first direction and fourth gradient information in the second direction;
the determination unit 1234 is further configured to determine the second evaluation value based on the third gradient information and the fourth gradient information;
the determining unit 1234 is further configured to determine fifth gradient information of the second predicted image in the first direction and sixth gradient information in the second direction;
the determination unit 1234 is further configured to determine the third evaluation value based on the fifth gradient information and the sixth gradient information.
In some optional embodiments, the training sub-module 1233 further includes:
an extracting unit 1237, configured to perform feature extraction on the sample image, the first predicted image, and the second predicted image to obtain a first feature representation corresponding to the sample image, a second feature representation corresponding to the first predicted image, and a third feature representation corresponding to the second predicted image, respectively;
a prediction unit 1238, configured to input the first feature representation, the second feature representation, and the third feature representation to a sharpness evaluation sub-network, respectively, perform prediction with respect to a distribution of feature representations in a feature space, and output the first evaluation value of the sample image, the second evaluation value of the first predicted image, and the third evaluation value of the second predicted image.
In some optional embodiments, the image processing model includes a first encoder, a first decoder, a second encoder and a second decoder, the first encoder and the first decoder are used for performing sharpness enhancement on the sample image, and the second encoder and the second decoder are used for performing blur enhancement on the sample image;
the first processing sub-module 1231 is further configured to input the sample image to the first encoder for encoding, so as to obtain a first encoding characteristic;
the first processing sub-module 1231 is further configured to input the first encoding characteristic to the first decoder for decoding, so as to obtain the first predicted image;
the second processing sub-module 1232 is further configured to input the sample image to the second encoder for encoding, so as to obtain a second encoding characteristic;
the second processing sub-module 1232 is further configured to input the second encoding characteristic to the second decoder for decoding, so as to obtain the second predicted image.
In some optional embodiments, the image processing model includes a deblurring subnetwork for sharpness enhancement processing and a re-blurring subnetwork for blur enhancement processing;
the training module 1230 further comprises:
the extraction submodule 1239 is configured to perform semantic classification on the image semantics of the sample image in at least two semantic spaces, and obtain image semantic features corresponding to the semantic spaces respectively;
the extraction submodule 1239 is further configured to perform feature connection on the image semantic features to obtain sample image features;
the first processing sub-module 1231 is further configured to input the sample image feature to the deblurring sub-network, and the deblurring sub-network guides the sharpness enhancement of the sample image feature based on the image semantic of the sample image to obtain the first predicted image;
the second processing sub-module 1232 is further configured to input the sample image features to the re-blur sub-network, and the re-blur sub-network guides the blur enhancement of the sample image features based on the image semantics of the sample image to obtain the second predicted image.
In some optional embodiments, the training target of the image processing model to be trained comprises deblurring the image with motion blur;
the obtaining module 1210 is further configured to obtain an image to be processed acquired by a vehicle-mounted device, where the image to be processed is an image acquired by the vehicle-mounted device from a surrounding environment, and the image to be processed includes a target area with motion blur;
the processing module 1220 is further configured to input the image to be processed to the target image processing model, and perform sharpness enhancement processing on the target area by using the target image processing model to obtain the target image.
In some optional embodiments, the training target of the image processing model to be trained includes learning a shooting habit corresponding to a target account, and applying the shooting habit to image definition enhancement, where the shooting habit is used to indicate a factor that causes image blur when the target account performs image shooting;
the obtaining module 1210 is further configured to receive a history shot image sent by a target account, where the history shot image is obtained by the target account through terminal equipment in a history period;
the obtaining module 1210 is further configured to use the history shooting image as the sample image.
In summary, in the image deblurring apparatus provided in the embodiment of the present application, in the training process of an image processing model for enhancing image definition, definition enhancement processing and blur enhancement processing are performed on a sample image to obtain a first predicted image and a second predicted image, and the image processing model is trained according to the definition ordering among the sample image, the first predicted image and the second predicted image. The model is thus trained unsupervised in a contrast learning manner without labeling the sample images, which improves the training efficiency of the model and avoids the problem of label accuracy affecting the training effect, thereby improving the processing effect when the trained target image processing model is applied to image definition enhancement.
It should be noted that: the image deblurring apparatus provided in the above embodiment is only illustrated by dividing the above functional modules, and in practical applications, the above function allocation may be completed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules to complete all or part of the above described functions. In addition, the image deblurring device provided in the above embodiment and the image deblurring method embodiment belong to the same concept, and the specific implementation process thereof is described in the method embodiment in detail and is not described herein again.
Fig. 14 shows a schematic structural diagram of a server according to an exemplary embodiment of the present application. Specifically, the server includes the following structure.
The server 1400 includes a Central Processing Unit (CPU) 1401, a system Memory 1404 including a Random Access Memory (RAM) 1402 and a Read Only Memory (ROM) 1403, and a system bus 1405 connecting the system Memory 1404 and the Central Processing Unit 1401. The server 1400 also includes a mass storage device 1406 for storing an operating system 1413, application programs 1414, and other program modules 1415.
The mass storage device 1406 is connected to the central processing unit 1401 through a mass storage controller (not shown) connected to the system bus 1405. The mass storage device 1406 and its associated computer-readable media provide non-volatile storage for the server 1400. That is, the mass storage device 1406 may include a computer-readable medium (not shown) such as a hard disk or Compact disk Read Only Memory (CD-ROM) drive.
Without loss of generality, computer readable media may comprise computer storage media and communication media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes RAM, ROM, Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), flash memory or other solid state memory technology, CD-ROM, Digital Versatile Disc (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices. Of course, those skilled in the art will appreciate that computer storage media are not limited to the foregoing. The system memory 1404 and the mass storage device 1406 described above may be collectively referred to as memory.
According to various embodiments of the present application, the server 1400 may also be operated through a remote computer connected to a network such as the Internet. That is, the server 1400 may be connected to the network 1412 through the network interface unit 1411 connected to the system bus 1405, or the network interface unit 1411 may be used to connect to other types of networks or remote computer systems (not shown).
The memory further includes one or more programs, and the one or more programs are stored in the memory and configured to be executed by the CPU.
Embodiments of the present application further provide a computer device, which includes a processor and a memory, where at least one instruction, at least one program, a set of codes, or a set of instructions is stored in the memory, and the at least one instruction, the at least one program, the set of codes, or the set of instructions is loaded and executed by the processor to implement the method for deblurring an image provided by the above-mentioned method embodiments. Optionally, the computer device may be a terminal or a server.
Embodiments of the present application further provide a computer-readable storage medium, on which at least one instruction, at least one program, a code set, or a set of instructions are stored, and the at least one instruction, the at least one program, the code set, or the set of instructions are loaded and executed by a processor to implement the method for deblurring an image provided by the above-mentioned method embodiments.
Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions to make the computer device execute the method for deblurring the image in any of the above embodiments.
Optionally, the computer-readable storage medium may include: a Read Only Memory (ROM), a Random Access Memory (RAM), a Solid State Drive (SSD), or an optical disc. The Random Access Memory may include a resistive Random Access Memory (ReRAM) and a Dynamic Random Access Memory (DRAM). The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is intended only to illustrate the alternative embodiments of the present application, and should not be construed as limiting the present application, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (13)

1. A method of deblurring an image, the method comprising:
acquiring an image to be processed, wherein the image to be processed is an image to be subjected to deblurring processing;
inputting the image to be processed into a target image processing model for sharpness enhancement processing to obtain a sharpness-enhanced target image;
the target image processing model is obtained by training an image processing model to be trained through a sample image, the image processing model is used for respectively carrying out definition enhancement processing on the sample image to obtain a first predicted image and carrying out blur enhancement processing to obtain a second predicted image, and model parameters of the target image processing model are obtained by carrying out contrast learning on definition ordering conditions and constraint ordering conditions among the sample image, the first predicted image and the second predicted image.
2. The method of claim 1, wherein the training process of the image processing model comprises:
acquiring the sample image;
performing sharpness enhancement processing on the sample image through the image processing model to obtain a first predicted image;
carrying out ambiguity enhancement processing on the sample image through the image processing model to obtain the second predicted image;
training the image processing model based on the sequencing loss among the sample image, the first predicted image and the second predicted image to obtain a target image processing model, wherein the sequencing loss is used for indicating the difference between the sequencing situation of the image definition among the sample image, the first predicted image and the second predicted image and the constraint sequencing situation.
3. The method according to claim 2, wherein the training the image processing model to be trained based on the loss of ordering among the sample image, the first predictive image, and the second predictive image to obtain a target image processing model comprises:
acquiring a first evaluation value of the sample image indicating the degree of sharpness of the sample image, a second evaluation value of the first predicted image indicating the degree of sharpness of the first predicted image, and a third evaluation value of the second predicted image indicating the degree of sharpness of the second predicted image;
determining a value of a loss of ordering based on a difference between a result of comparison among the first evaluation value, the second evaluation value, and the third evaluation value and a specified constraint condition indicating that the second definition of the first predicted image is larger than the first definition of the sample image and the first definition of the sample image is larger than the third definition of the second predicted image;
and training the image processing model based on the sequencing loss value to obtain the target image processing model.
4. The method according to claim 3, wherein the acquiring the first evaluation value of the sample image, the second evaluation value of the first prediction image, and the third evaluation value of the second prediction image includes:
determining first gradient information of the sample image in a first direction and second gradient information in a second direction, wherein the first direction and the second direction are perpendicular to each other;
determining the first evaluation value based on the first gradient information and the second gradient information;
determining third gradient information of the first prediction image in the first direction and fourth gradient information in the second direction;
determining the second evaluation value based on the third gradient information and the fourth gradient information;
determining fifth gradient information of the second prediction image in the first direction and sixth gradient information in the second direction;
determining the third evaluation value based on the fifth gradient information and the sixth gradient information.
5. The method according to claim 3, wherein the acquiring the first evaluation value of the sample image, the second evaluation value of the first prediction image, and the third evaluation value of the second prediction image includes:
performing feature extraction on the sample image, the first prediction image and the second prediction image to obtain a first feature representation corresponding to the sample image, a second feature representation corresponding to the first prediction image and a third feature representation corresponding to the second prediction image respectively;
inputting the first feature representation, the second feature representation and the third feature representation into a sub-network for sharpness evaluation, respectively, predicting a distribution of feature representations in a feature space, and outputting the first evaluation value of the sample image, the second evaluation value of the first predicted image and the third evaluation value of the second predicted image.
6. The method according to any one of claims 2 to 5, wherein the image processing model comprises a first encoder, a first decoder, a second encoder and a second decoder, the first encoder and the first decoder are used for performing sharpness enhancement on the sample image, and the second encoder and the second decoder are used for performing blur enhancement on the sample image;
the processing method of the image processing model includes that the sharpness enhancement processing is performed on the sample image through the image processing model to obtain a first predicted image, and the processing method includes the following steps:
inputting the sample image into the first encoder for encoding to obtain a first encoding characteristic;
inputting the first encoding characteristic into the first decoder for decoding to obtain the first prediction image;
the obtaining a second predicted image by performing the blur enhancement processing on the sample image through the image processing model includes:
inputting the sample image into the second encoder for encoding to obtain a second encoding characteristic;
and inputting the second coding characteristics into the second decoder for decoding to obtain the second predicted image.
7. The method according to any one of claims 2 to 5, wherein the image processing model comprises a deblurring subnetwork and a re-blurring subnetwork, the deblurring subnetwork is used for sharpness enhancement processing, and the re-blurring subnetwork is used for blur enhancement processing;
before the sharpness enhancement processing is performed on the sample image through the image processing model to obtain a first predicted image, the method further includes:
performing semantic classification on the image semantics of the sample image in at least two semantic spaces to respectively obtain image semantic features corresponding to the semantic spaces;
performing feature connection on the image semantic features to obtain sample image features;
the processing method of the image processing model includes that the sharpness enhancement processing is performed on the sample image through the image processing model to obtain a first predicted image, and the processing method includes the following steps:
inputting the sample image features into the deblurring sub-network, and guiding the sharpness enhancement of the sample image features by the deblurring sub-network based on the image semantics of the sample image to obtain the first predicted image;
the obtaining a second prediction image by performing the blur enhancement processing on the sample image through the image processing model includes:
inputting the sample image features into the re-blurring sub-network, and guiding, by the re-blurring sub-network, the blur enhancement of the sample image features based on the image semantics of the sample image to obtain the second predicted image.
8. The method according to any one of claims 1 to 5, wherein the training target of the image processing model to be trained comprises deblurring the image with motion blur;
the inputting the image to be processed into a target image processing model for sharpness enhancement processing to obtain a sharpness-enhanced target image includes:
acquiring an image to be processed acquired by vehicle-mounted equipment, wherein the image to be processed is an image acquired by the vehicle-mounted equipment for the surrounding environment, and the image to be processed comprises a target area with motion blur;
and inputting the image to be processed into the target image processing model, and performing definition enhancement processing on the target area by the target image processing model to obtain the target image.
9. The method according to any one of claims 2 to 5, wherein the training target of the image processing model to be trained comprises learning a shooting habit corresponding to a target account, and applying the shooting habit to image definition enhancement, wherein the shooting habit is used for indicating a factor causing image blurring when the target account performs image shooting;
the acquiring of the sample image comprises:
receiving a historical shot image sent by a target account, wherein the historical shot image is acquired by the target account through terminal equipment in a historical time period;
and taking the historical shooting image as the sample image.
10. An apparatus for deblurring an image, the apparatus comprising:
the device comprises an acquisition module, a processing module and a processing module, wherein the acquisition module is used for acquiring an image to be processed, and the image to be processed is an image to be subjected to deblurring processing;
the processing module is used for inputting the image to be processed into a target image processing model for performing definition enhancement processing to obtain a target image with enhanced definition;
the target image processing model is obtained by training an image processing model to be trained through a sample image, the image processing model is used for respectively performing definition enhancement processing on the sample image to obtain a first predicted image and performing blur enhancement processing on the sample image to obtain a second predicted image, and model parameters of the target image processing model are obtained by performing contrast learning on definition ordering conditions and constraint ordering conditions among the sample image, the first predicted image and the second predicted image.
11. A computer device comprising a processor and a memory, the memory having stored therein at least one program which is loaded and executed by the processor to implement a method of deblurring an image as claimed in any one of claims 1 to 9.
12. A computer-readable storage medium, characterized in that at least one program code is stored therein, which is loaded and executed by a processor to implement a method of deblurring an image as claimed in any one of claims 1 to 9.
13. A computer program product comprising a computer program or instructions which, when executed by a processor, implement a method of deblurring an image as claimed in any one of claims 1 to 9.
CN202210855253.1A 2022-07-19 2022-07-19 Image deblurring method, device, equipment, medium and computer program product Pending CN115205150A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210855253.1A CN115205150A (en) 2022-07-19 2022-07-19 Image deblurring method, device, equipment, medium and computer program product

Publications (1)

Publication Number Publication Date
CN115205150A (en)


Country Status (1)

Country Link
CN (1) CN115205150A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116363660A (en) * 2023-04-10 2023-06-30 湖南三湘银行股份有限公司 OCR (optical character recognition) method and server based on deblurring
CN116363660B (en) * 2023-04-10 2023-12-19 湖南三湘银行股份有限公司 OCR (optical character recognition) method and server based on deblurring
CN116523754A (en) * 2023-05-10 2023-08-01 广州民航职业技术学院 Method and system for enhancing quality of automatically-identified image of aircraft skin damage
CN116721041A (en) * 2023-08-09 2023-09-08 广州医科大学附属第一医院(广州呼吸中心) Image processing method, apparatus, system, and readable storage medium
CN116721041B (en) * 2023-08-09 2023-11-28 广州医科大学附属第一医院(广州呼吸中心) Image processing method, apparatus, system, and readable storage medium
CN117422855A (en) * 2023-12-19 2024-01-19 浙江省北大信息技术高等研究院 Machine vision-oriented image preprocessing method, device, equipment and storage medium
CN117422855B (en) * 2023-12-19 2024-05-03 浙江省北大信息技术高等研究院 Machine vision-oriented image preprocessing method, device, equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40074970)