CN115423691A - Training method of image correction model, image correction method, device and storage medium - Google Patents


Info

Publication number
CN115423691A
CN115423691A
Authority
CN
China
Prior art keywords
image
training
network
sampling
correction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211048861.8A
Other languages
Chinese (zh)
Inventor
叶嘉权
魏新明
王孝宇
肖嵘
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao Yuntian Lifei Technology Co ltd
Shenzhen Intellifusion Technologies Co Ltd
Original Assignee
Qingdao Yuntian Lifei Technology Co ltd
Shenzhen Intellifusion Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao Yuntian Lifei Technology Co ltd, Shenzhen Intellifusion Technologies Co Ltd filed Critical Qingdao Yuntian Lifei Technology Co ltd
Priority to CN202211048861.8A
Publication of CN115423691A
Priority to PCT/CN2022/142238 (published as WO2024045442A1)
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/60 Rotation of whole images or parts thereof
    • G06T3/608 Rotation of whole images or parts thereof by skew deformation, e.g. two-pass or three-pass rotation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/088 Non-supervised learning, e.g. competitive learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/02 Affine transformations

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to the field of image correction, and in particular discloses a training method of an image correction model, an image correction method, computer equipment and a storage medium. The training method of the image correction model comprises the following steps: acquiring training data, wherein the training data comprises a training image and a rotating image corresponding to the training image; inputting the rotating image into a preset correction network to obtain an affine transformation matrix; performing affine transformation on the rotating image based on the affine transformation matrix, and inputting the obtained transformation data into a sampling network for image sampling to obtain a corrected image corresponding to the rotating image; and calculating a loss function value between the corrected image and the training image, performing iterative training on the preset correction network and the sampling network according to the loss function value, and, when the training is finished, taking the preset correction network and the sampling network together as the image correction model. The training method of the image correction model can reduce the training cost of the image correction model.

Description

Training method of image correction model, image correction method, device and storage medium
Technical Field
The present application relates to the field of image correction, and in particular, to a training method for an image correction model, an image correction method, a computer device, and a storage medium.
Background
With the continuous development of science and technology, face recognition technology is widely applied in society, for example in the security field. However, the installation position of the device that captures the face is usually fixed, while the walking track and body orientation of a pedestrian are not, so the captured face image often has a certain inclination or rotation; such a tilted or rotated face image therefore needs to be corrected first to ensure the accuracy of face recognition. In the prior art, when image correction is performed, key points in a face image are usually labeled, and a neural network is then trained with the labeled images. However, this training method requires a large amount of labeled data, the labeling cost is high, and errors are easily introduced, so the accuracy of the finally trained neural network is not high.
Disclosure of Invention
The application provides a training method of an image correction model, an image correction method, computer equipment and a storage medium, so as to reduce the training cost of the image correction model and improve the accuracy of the obtained image correction model.
In a first aspect, the present application provides a method for training an image rectification model, the method including:
acquiring training data, wherein the training data comprises a training image and a rotating image corresponding to the training image;
inputting the rotating image into a preset correction network to obtain an affine transformation matrix corresponding to the rotating image;
performing affine transformation on the rotating image based on the affine transformation matrix, and inputting the obtained transformation data into a sampling network for image sampling to obtain a corrected image corresponding to the rotating image;
and calculating a loss function value between the correction image and the training image, performing iterative training on the preset correction network and the sampling network according to the loss function value, and taking the preset correction network and the sampling network as an image correction model together when the training is finished.
In a second aspect, the present application further provides an image rectification method, including:
acquiring an image to be corrected;
inputting the image to be corrected into a pre-trained image correction model to obtain a corrected image, wherein the pre-trained image correction model is obtained by training with the image correction model training method in the first aspect.
In a third aspect, the present application also provides a computer device comprising a memory and a processor; the memory is used for storing a computer program; the processor is configured to execute the computer program and, when executing the computer program, implement the training method of the image rectification model according to the first aspect and/or the image rectification method according to the second aspect.
In a fourth aspect, the present application further provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program causes the processor to implement the method for training an image rectification model according to the first aspect and/or the method for image rectification according to the second aspect.
The application discloses a training method of an image correction model, an image correction method, computer equipment and a storage medium. A training image and a rotating image corresponding to the training image are used as the training data of the image correction model, which reduces the amount of labeled data needed when training the image correction model; the dependence on labeled data can therefore be reduced, the training cost of the image correction model is lowered, and the accuracy of the trained image correction model can be improved. In addition, a corrected image is obtained based on the affine transformation matrix and the sampling network, and the preset correction network and the sampling network are then iteratively trained according to the corrected image and the training image; after training is finished, the sampling network and the preset correction network are jointly used as the image correction model. Because the sampling network participates in the training as a part of the image correction model, the image correction model is trained in an unsupervised manner using the corrected image, and the accuracy of the trained image correction model can be ensured while the labeling cost is reduced.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a schematic flow chart of a training method of an image rectification model according to an embodiment of the present disclosure;
fig. 2 is a schematic diagram of a step of performing affine transformation on a rotation image according to an embodiment of the present application;
FIG. 3 is a schematic diagram illustrating a step of performing image sampling according to an embodiment of the present application;
fig. 4 is a schematic flowchart of an image rectification method according to an embodiment of the present application;
FIG. 5 is a schematic block diagram of an apparatus for training an image rectification model according to an embodiment of the present disclosure;
FIG. 6 is a schematic block diagram of an image rectification apparatus provided in an embodiment of the present application;
fig. 7 is a schematic block diagram of a structure of a computer device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The flow diagrams depicted in the figures are merely illustrative and do not necessarily include all of the elements and operations/steps, nor do they necessarily have to be performed in the order depicted. For example, some operations/steps may be decomposed, combined or partially combined, so that the actual execution order may be changed according to the actual situation.
It is to be understood that the terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
It is further understood that in the specific implementation of the present application, related data such as human face is involved, when the above embodiments of the present application are applied to specific products or technologies, user permission or consent needs to be obtained, and the collection, use and processing of related data need to comply with relevant laws and regulations and standards of relevant countries and regions.
Embodiments of the present application provide a training method of an image rectification model, an image rectification method, a computer device, and a storage medium. The image correction model obtained by the training method of the image correction model can be used for carrying out image correction on the collected face image, and the accuracy of the obtained corrected image is improved, so that the recognition success rate and the accuracy of face recognition are improved in the subsequent face recognition.
Some embodiments of the present application will be described in detail below with reference to the accompanying drawings. The embodiments and features of the embodiments described below can be combined with each other without conflict.
Referring to fig. 1, fig. 1 is a schematic flowchart of a training method of an image rectification model according to an embodiment of the present disclosure. The training method of the image correction model carries out iterative training on the image correction model in a mode of simulating training data, can reduce training cost and improve the accuracy of the image correction model obtained by training.
As shown in fig. 1, the training method of the image rectification model specifically includes: step S101 to step S105.
Step S101, training data are obtained, wherein the training data comprise training images and rotating images corresponding to the training images.
And acquiring training data, wherein the training data is used for training the image correction model, and the training data comprises training images and rotating images corresponding to the training images. The rotation image is obtained by rotating the training image by a preset angle.
In an embodiment, before step S101, the method may further include the steps of: and acquiring a training image, and rotating the training image to obtain a rotating image corresponding to the training image.
That is, the training image may be rotated by a preset angle in advance to obtain the rotating image, so that the training image, the rotating image corresponding to the training image, and the corresponding rotation angle can be obtained directly when the training data are acquired. In a specific implementation process, a plurality of rotating images can be obtained by rotating one training image by different angles, so as to increase the amount of training data. In addition, for different training images, the rotation angles used may differ from one another, or the same rotation angle may be used.
In the specific implementation process, a large number of pictures of the front face can be screened out from the business scene to serve as training images, wherein the training images comprise images without obvious angle rotation, and then a large number of rotating images for training are generated in a mode of artificially introducing random rotation angles.
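As an illustrative sketch (not part of the patent text), the rotating-image generation described above can be simulated as follows; the random-angle range, the function names, and the nearest-neighbor resampling are all assumptions made for the example:

```python
import numpy as np

def rotate_image(img: np.ndarray, angle_deg: float) -> np.ndarray:
    """Rotate an H x W image about its center (nearest-neighbor resampling).

    Output pixels whose source falls outside the image are filled with zeros.
    """
    h, w = img.shape[:2]
    theta = np.deg2rad(angle_deg)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse mapping: for each output pixel, locate its source pixel.
    src_x = np.cos(theta) * (xs - cx) + np.sin(theta) * (ys - cy) + cx
    src_y = -np.sin(theta) * (xs - cx) + np.cos(theta) * (ys - cy) + cy
    sx = np.rint(src_x).astype(int)
    sy = np.rint(src_y).astype(int)
    out = np.zeros_like(img)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out[ys[valid], xs[valid]] = img[sy[valid], sx[valid]]
    return out

def make_training_pair(img: np.ndarray, rng: np.random.Generator):
    """Build one training sample: (training image, rotating image, angle)."""
    angle = float(rng.uniform(-45.0, 45.0))  # assumed random-angle range
    return img, rotate_image(img, angle), angle
```

Applying `make_training_pair` to each screened front-face picture with a fresh random angle yields the pairs of training images and rotating images that make up the training data.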
When the image correction model is trained, if the used training data are the face image with the label and the corresponding key points, a large amount of labeled data are needed for network training, so that the acquisition cost of the training data is high, and the accuracy of the image correction model also depends on the accuracy of the labeled data.
And S102, inputting the rotating image into a preset correction network to obtain an affine transformation matrix corresponding to the rotating image.
And inputting the rotating image into a preset correction network to obtain an affine transformation matrix corresponding to the rotating image. That is, the affine transformation matrix is the affine transformation matrix corresponding to the rotating image that is output by the preset correction network. The affine transformation matrix may be a two-row, three-column matrix, which includes the information required for correcting the face image, including rotation, translation, scaling, and the like. The affine transformation matrix can be expressed as:

Θ = f_loc(U)

where U is the input picture and Θ is the affine transformation matrix. The shape of the affine transformation matrix is 2×3; the preset correction network can therefore be regarded as a regressor with six output nodes, and Θ can be expressed as:

Θ = [ θ11  θ12  θ13 ]
    [ θ21  θ22  θ23 ]
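For illustration only, the 2×3 affine transformation matrix and its six regressed values can be sketched as follows; the function names and the pure rotation-plus-translation parameterization are assumptions for the example, not the patent's implementation:

```python
import numpy as np

def make_affine(angle_deg: float, tx: float = 0.0, ty: float = 0.0) -> np.ndarray:
    """Build a 2 x 3 affine matrix for a pure rotation plus translation.

    The six entries are what a six-node regressor would output: the left
    2 x 2 block encodes rotation/scale/shear, the last column translation.
    """
    a = np.deg2rad(angle_deg)
    return np.array([[np.cos(a), -np.sin(a), tx],
                     [np.sin(a),  np.cos(a), ty]])

def apply_affine(theta: np.ndarray, x: float, y: float):
    """Map one pixel coordinate through Theta in homogeneous form."""
    xp, yp = theta @ np.array([x, y, 1.0])
    return float(xp), float(yp)
```

In a real model the six entries would come from the correction network's output layer rather than from a known angle; `make_affine` simply shows what the matrix looks like for a pure rotation.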
in addition, in an embodiment, the preset correction network may be a pre-trained correction network trained based on a small amount of labeled data, and the supervised pre-training is performed on the correction network by using a small amount of labeled data, so that the training efficiency of the image correction model can be improved.
In an embodiment, the method may further comprise the steps of: obtaining sample data, wherein the sample data comprises a sample image and key points corresponding to the sample image; inputting the sample image into a convolutional neural network to obtain an output affine matrix; determining a supervised affine matrix based on the key points corresponding to the sample image and the preset positions; and calculating a loss function value between the output affine matrix and the supervision affine matrix, and pre-training the convolutional neural network according to the loss function value to obtain a pre-trained correction network.
And acquiring sample data, wherein the sample data refers to a sample image with a label and comprises the sample image and key points in the labeled sample image. In a specific implementation process, the sample data may be labeled data in the training data or new data other than the training data. And then inputting the sample image into a convolutional neural network to obtain the output of the convolutional neural network, namely the output affine matrix.
And determining a supervision affine matrix based on the key points corresponding to the sample image and the preset positioning points. The preset positioning point refers to a preset point with a determined coordinate position, and each key point can correspond to one preset positioning point. In the implementation process, by setting the preset points, the key points at the same position of different images can be corrected to a fixed position, for example, the position coordinates of eyes and mouth in the face image are always at a certain fixed point.
After the supervised affine matrix is obtained, calculating loss values of the output affine matrix and the supervised affine matrix, and performing iterative training on the convolutional neural network according to the calculated loss function value to obtain a pre-trained correction network.
In particular implementations, a convolutional neural network may be iteratively trained using a Mean Absolute Error (MAE) loss function. For example, because the output affine matrix and the supervised affine matrix both include six numerical values, the numerical values at corresponding positions in the output affine matrix and the supervised affine matrix may be subtracted from each other to calculate absolute values thereof, and the absolute values may be added and averaged to obtain a loss function value between the output affine matrix and the supervised affine matrix.
Parameters of the convolutional neural network are continuously adjusted through loss function values between the output affine matrix and the supervision affine matrix, so that the output affine matrix output by the convolutional neural network can be closer to the supervision affine matrix until the loss value of the loss function of the convolutional neural network reaches a preset value, at the moment, the convolutional neural network is converged, and the converged convolutional neural network can be used as a pre-trained correction network. The loss function value between the output affine matrix and the supervision affine matrix is utilized to conduct supervised training on the convolutional neural network in a matrix mode, a preset correction network can be obtained quickly, and therefore the training speed of the image correction model is improved.
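A minimal sketch of the MAE loss over the six matrix entries described above, with a hypothetical function name:

```python
import numpy as np

def mae_loss(pred: np.ndarray, target: np.ndarray) -> float:
    """Mean absolute error over the six entries of the 2 x 3 affine matrices:
    subtract corresponding entries, take absolute values, and average."""
    return float(np.mean(np.abs(pred - target)))

# Example: a predicted matrix whose two translation entries are off by 0.3.
pred = np.array([[1.0, 0.0, 0.3], [0.0, 1.0, -0.3]])
target = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
loss = mae_loss(pred, target)  # (0.3 + 0.3) / 6 = 0.1
```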
In another embodiment, the method may further comprise the steps of: acquiring training sample data, wherein the training sample data comprises a rotating sample image corresponding to a training sample image and a rotating angle corresponding to the training sample image; inputting the rotating sample image into a convolutional neural network to obtain an output affine matrix corresponding to the rotating sample image; determining a supervision affine matrix according to the rotation sample image and the rotation angle corresponding to the rotation sample image; and calculating a loss function value between the output affine matrix and the supervision affine matrix, and pre-training the convolutional neural network according to the loss function value to obtain a pre-trained correction network.
And acquiring training sample data, wherein the training sample data is used for pre-training the convolutional neural network, and the training sample data comprises a rotating sample image corresponding to the training sample image and a rotating angle corresponding to the training sample image. The rotation sample image is obtained by rotating the training sample image by a preset angle, for example, the rotation sample image is obtained by clockwise rotating the training sample image by 15 °, and then the rotation angle corresponding to the training sample image is 15 °. In a specific implementation process, the training sample data may be training data used for iterative training of a preset correction network and a sampling network, so that the amount of samples required in the training process of the image correction model can be reduced, and the training cost of the image correction model is further reduced.
And inputting the rotation sample image into a convolution neural network to obtain an output affine matrix corresponding to the rotation sample image. Since the rotation sample image is obtained by rotating the training sample image through a certain angle, that is, the rotation angle corresponding to the rotation sample image is known, and since the rotation of the picture is equivalent to multiplying the picture matrix of the picture by an affine matrix, which is related to the rotation angle, in the case that the rotation angle corresponding to the rotation sample image is known, the supervising affine matrix corresponding to the rotation image can be directly calculated based on the rotation angles corresponding to the rotation sample image and the rotation sample image.
And after the supervised affine matrix is obtained, calculating loss values of the output affine matrix and the supervised affine matrix, and performing iterative training on the convolutional neural network according to the calculated loss function value to obtain a pre-trained correction network.
In particular implementations, a convolutional neural network may be iteratively trained using a Mean Absolute Error (MAE) loss function. Parameters of the convolutional neural network are continuously adjusted through loss function values between the output affine matrix and the supervision affine matrix, so that the output affine matrix output by the convolutional neural network can be closer to the supervision affine matrix until the loss value of the loss function of the convolutional neural network reaches a preset value, at the moment, the convolutional neural network converges, and the converged convolutional neural network can be used as a pre-trained correction network.
In the specific implementation process, the finally generated image correction model is used for correcting the face image, which is often a preliminary task before face recognition. Therefore, the image correction model should have as few parameters as possible and run at a high speed, so as to improve the overall speed of face recognition. For example, the convolutional neural network may be a CNN with the MobileNet network structure.
In an embodiment, before the step of inputting the rotated image into a preset correction network to obtain an affine transformation matrix corresponding to the rotated image, the method may further include the following steps: performing image preprocessing on the rotated image, wherein the image preprocessing comprises size adjustment and/or image enhancement.
The image preprocessing is performed on the rotated image, wherein the image preprocessing includes resizing, that is, resizing the rotated image to a fixed size, and the image preprocessing may further include image enhancement, which is to make the image clearer.
In a specific implementation process, the image preprocessing may be performed on the rotation image, or the image preprocessing may be performed on the training image, and after the image preprocessing is performed on the training image, the obtained rotation image is the preprocessed image, that is, the image preprocessing may not be performed on the rotation image.
And S103, carrying out affine transformation on the rotating image based on the affine transformation matrix, and inputting the obtained transformation data into a sampling network for image sampling to obtain a corrected image corresponding to the rotating image.
And carrying out affine transformation on the rotating image based on the affine transformation matrix to obtain transformation data, wherein the transformation data comprises mapping coordinates of each pixel point in the rotating image after affine transformation. And then inputting the transformation data into a sampling network for image sampling, so as to obtain a corrected image corresponding to the rotating image. The image sampling may be differential image sampling, which means that interpolation and rounding are performed on a pixel position value obtained by affine transformation of each pixel point to obtain a corresponding actual sampling coordinate, and finally, the pixel point is sampled in the rotated image according to the actual sampling coordinate, that is, for each pixel position in the corrected image, a pixel value at a corresponding position is searched in the rotated image for filling.
In an embodiment, please refer to fig. 2, which is a schematic diagram illustrating a step of performing affine transformation on a rotation image according to an embodiment of the present application. As shown in fig. 2, the affine transforming the rotated image based on the affine transformation matrix may include: step S1031, obtaining the pixel coordinate of each pixel point in the rotation image; step S1032, the pixel coordinates of each pixel point are mapped respectively based on the affine transformation matrix, and the mapping coordinates of each pixel point are obtained.
Firstly, the pixel coordinate of a pixel point in the rotating image is obtained, which may be recorded as (x_i, y_i). After passing through the affine transformation matrix Θ, the mapping coordinate corresponding to this pixel point becomes (x_i', y_i'). The specific affine transformation process is as follows:

[ x_i' ]   [ θ11  θ12  θ13 ] [ x_i ]
[ y_i' ] = [ θ21  θ22  θ23 ] [ y_i ]
                             [  1  ]

Coordinate mapping is performed on all pixel points in the rotating image in the above manner to obtain the mapping coordinate corresponding to each pixel point.
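The coordinate mapping over all pixel points can be sketched as follows; this is an illustrative example with our own function name, not the patent's implementation:

```python
import numpy as np

def map_pixel_coordinates(theta: np.ndarray, h: int, w: int) -> np.ndarray:
    """Map every pixel coordinate (x, y) of an h x w image through the
    2 x 3 affine matrix theta.

    Returns an h x w x 2 array of mapping coordinates, which are in
    general non-integer (floating-point) values.
    """
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    coords = np.stack([xs, ys, np.ones_like(xs)], axis=-1)  # h x w x 3
    return coords @ theta.T                                 # h x w x 2
```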
In an embodiment, the step of inputting the obtained transformation data into a sampling network for image sampling to obtain a corrected image corresponding to the rotated image may include:
when the mapping coordinates of the pixel points are not integers, inputting the mapping coordinates of the pixel points to a sampling network for image sampling to obtain a corrected image corresponding to the rotating image; and when the mapping coordinates of the pixel points are integers, acquiring the pixel points corresponding to the mapping coordinates in the rotating image according to the mapping coordinates to fill the pixel points so as to obtain a corrected image.
In general, the value of the mapping coordinate corresponding to the pixel point calculated by the affine transformation matrix is a floating point value rather than an integer value, and therefore, in order to obtain the corrected image, interpolation and rounding need to be performed on the coordinates of which the mapping coordinate is not an integer, so that pixel point filling can be performed.
In a specific implementation process, after the mapping coordinate corresponding to each pixel point is obtained, whether the mapping coordinate corresponding to the pixel point is an integer is judged, if the mapping coordinate corresponding to the pixel point is an integer, that is to say, the mapping coordinate corresponding to the pixel point just corresponds to some spatial positions in the rotating image, then the pixel point corresponding to the sampling coordinate can be directly obtained in the rotating image according to the mapping coordinate of the pixel point to perform pixel point filling. However, if the mapping coordinates corresponding to the pixel points are not integers, it is indicated that the mapping coordinates cannot directly correspond to the spatial position in the rotated image, and at this time, the mapping coordinates of the pixel points need to be input to a sampling network for image sampling, and after all the pixel points are filled with the pixel points, a corrected image corresponding to the rotated image is obtained.
In an embodiment, please refer to fig. 3, which is a schematic diagram illustrating a step of performing image sampling according to an embodiment of the present application. As shown in fig. 3, the step of inputting the mapping coordinates of the pixel points to a sampling network for image sampling to obtain a corrected image corresponding to the rotated image may include:
step S1033, carrying out interpolation and rounding on the mapping coordinates of the pixel points to obtain sampling coordinates corresponding to the pixel points; and S1034, obtaining pixel points corresponding to the sampling coordinates in the rotating image according to the sampling coordinates, and filling the pixel points to obtain a corrected image.
The mapped coordinates of the pixel points are interpolated and rounded, and in a specific implementation, the interpolation and rounding can be performed in various ways, such as bilinear interpolation and nearest neighbor interpolation. The bilinear interpolation is to find out four integer points nearest to the coordinate, and the weight is increased when the distance is shorter according to the sum of the weights. The interpolation rounding process may be:
V_i^c = Σ_n Σ_m U_{nm}^c · max(0, 1 − |x_i^s − m|) · max(0, 1 − |y_i^s − n|)

wherein U_{nm}^c denotes the value of the rotated image at position (n, m) in channel c, (x_i^s, y_i^s) denotes the mapping coordinate of the i-th pixel point (the parameters of the sampling function), and V_i^c denotes the value of the corrected image at the i-th pixel point in channel c. Only the four integer neighbors of (x_i^s, y_i^s) contribute non-zero weight.
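A minimal NumPy sketch of the bilinear sampling described above (an illustrative assumption for one image channel, following the standard bilinear sampling kernel):

```python
import numpy as np

def bilinear_sample(U, xs, ys):
    """Evaluate V = sum_n sum_m U[n, m] * max(0, 1-|xs-m|) * max(0, 1-|ys-n|).

    U is one channel of the rotated image; (xs, ys) is the (possibly
    fractional) mapped source coordinate of one target pixel. Only the four
    integer neighbours of (xs, ys) receive non-zero weight.
    """
    H, W = U.shape
    m = np.arange(W)                                 # column indices
    n = np.arange(H)                                 # row indices
    wx = np.maximum(0.0, 1.0 - np.abs(xs - m))       # horizontal weights
    wy = np.maximum(0.0, 1.0 - np.abs(ys - n))       # vertical weights
    return wy @ U @ wx                               # double sum over n, m
```

When (xs, ys) is integral the weights collapse to a single 1, so the formula reduces to a direct pixel copy, which is consistent with the integer-coordinate branch described earlier.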
After the sampling coordinates are obtained, pixel points are sampled at the corresponding positions in the rotated image and filled in, until all pixel points in the rotated image have been traversed; the image obtained by pixel filling is taken as the corrected image.
And step S104, calculating a loss function value between the correction image and the training image, performing iterative training on the preset correction network and the sampling network according to the loss function value, and taking the preset correction network and the sampling network as an image correction model together when the training is finished.
After the corrected image is obtained, a loss function value between the corrected image and the training image is calculated using a loss function, which in a specific implementation may be a mean squared error (MSE) loss function. The preset correction network and the sampling network are then iteratively trained according to the calculated loss function value.
In a specific implementation, because the sampling network is differentiable and satisfies the back-propagation condition, the sampling network and the preset correction network can be trained end to end; that is, the sampling network participates in the training of the preset correction network as a network layer in the image correction model.
When iteratively training the preset correction network and the sampling network, the two can be trained together, or the preset correction network can be iteratively trained with the parameters of the sampling network held fixed.
When the preset correction network and the sampling network are trained together, the loss function value between the corrected image and the training image is calculated based on the loss function, and the sampling parameters in the sampling network and the weight values of the network parameters in the preset correction network are adjusted based on that value, so that the corrected image output by the image correction model comes closer to the training image and the accuracy of image correction is improved. In a specific implementation, the weight values of the network parameters in the preset correction network can be adjusted first, and the sampling parameters in the sampling network adjusted afterwards.
When the parameters of the sampling network are fixed, the loss function value between the corrected image and the training image is calculated based on the loss function, and the weight values of the network parameters in the preset correction network are adjusted according to the calculated value, so that the affine transformation matrix output by the preset correction network becomes more accurate. After training is finished, the trained preset correction network and the sampling network are used together as the image correction model to perform image correction. In a specific implementation, the weight values of the parameters in the preset correction network can be adjusted using, for example, stochastic gradient descent, Newton's method, a quasi-Newton method, or the conjugate gradient method.
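As a toy illustration of such gradient-based parameter adjustment (not the patent's implementation), the sketch below shrinks the correction network to a single rotation-angle parameter and adjusts it by plain gradient descent on the MSE loss, using a finite-difference gradient as a stand-in for back-propagation; the `warp` helper, the synthetic Gaussian image, and all hyperparameters are assumptions made for the example:

```python
import numpy as np

def warp(img, theta):
    """Rotate img by theta radians about its centre (bilinear interpolation)."""
    H, W = img.shape
    cy, cx = (H - 1) / 2.0, (W - 1) / 2.0
    c, s = np.cos(theta), np.sin(theta)
    out = np.zeros_like(img)
    for i in range(H):
        for j in range(W):
            # inverse-map each output pixel back into the source image
            x = c * (j - cx) + s * (i - cy) + cx
            y = -s * (j - cx) + c * (i - cy) + cy
            x0, y0 = int(np.floor(x)), int(np.floor(y))
            if 0 <= x0 < W - 1 and 0 <= y0 < H - 1:
                dx, dy = x - x0, y - y0
                out[i, j] = ((1 - dx) * (1 - dy) * img[y0, x0]
                             + dx * (1 - dy) * img[y0, x0 + 1]
                             + (1 - dx) * dy * img[y0 + 1, x0]
                             + dx * dy * img[y0 + 1, x0 + 1])
    return out

def loss(theta):
    """MSE between the tentatively corrected image and the training image."""
    return np.mean((warp(rotated_image, -theta) - training_image) ** 2)

# a smooth synthetic training image (a Gaussian bump) and its rotated version
ii, jj = np.mgrid[0:21, 0:21]
training_image = np.exp(-((ii - 5.0) ** 2 + (jj - 15.0) ** 2) / 18.0)
rotated_image = warp(training_image, 0.3)    # ground-truth angle, "unknown"

theta, lr, eps = 0.0, 2.0, 1e-3              # single trainable parameter
initial_loss = loss(theta)
for _ in range(100):
    # finite-difference gradient as a stand-in for back-propagation
    grad = (loss(theta + eps) - loss(theta - eps)) / (2 * eps)
    theta -= lr * grad                       # plain gradient descent step
```

After the loop, `theta` has been driven close to the 0.3 rad rotation that created the "rotated" image, mirroring how the real correction network's weights are adjusted until its affine transformation undoes the rotation.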
When the loss function value between the corrected image and the training image is calculated with the mean squared error loss function, the difference at each pixel position between the corrected image and the training image is computed, squared, summed, and averaged to obtain the loss function value. When the loss function value reaches a preset value or a minimum, the preset correction network can be considered fully trained.
By iteratively training the preset correction network and the sampling network with the corrected image (produced through the sampling network) and the training image, the accuracy of the affine transformation matrix output by the preset correction network can be improved, and the accuracy of the corrected image output by the whole image correction model is improved as a result.
In the training method of the image correction model provided in the above embodiment, the training image and its corresponding rotated image are used as the training data, which reduces the amount of labeled data needed for training; the dependency on labeled data and the training cost of the image correction model are therefore reduced, while the accuracy of the trained model is improved. In addition, the corrected image is obtained from the affine transformation matrix and the sampling network, the preset correction network and the sampling network are then iteratively trained with the corrected image and the training image, and after training the two are used together as the image correction model. Because the sampling network participates in training as part of the image correction model and the model is trained in an unsupervised manner with the corrected image, the accuracy of the trained image correction model can be ensured while the labeling cost is reduced.
Referring to fig. 4, fig. 4 is a schematic diagram illustrating a step of an image rectification method according to an embodiment of the present disclosure. As shown in fig. 4, the image rectification method includes steps S201 and S202.
Step S201, acquiring an image to be corrected.
Before image correction, image preprocessing can be performed on the acquired image to obtain the image to be corrected. The image preprocessing includes resizing the image to a preset size, so as to ensure the accuracy of the corrected image obtained by image correction.
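For instance (an illustrative assumption, not the patent's code), resizing to a preset size can be done with simple nearest-neighbour index selection:

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of a 2-D image to (out_h, out_w)."""
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h   # source row for each output row
    cols = np.arange(out_w) * w // out_w   # source column for each output column
    return img[rows][:, cols]
```

In practice a library routine with higher-quality interpolation would normally be preferred; the point is only that every input is brought to the fixed size the correction network expects.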
Step S202, input the image to be corrected into a pre-trained image correction model to obtain a corrected image, where the pre-trained image correction model is obtained with the above training method of the image correction model.
By inputting the image to be corrected into the image correction model trained with the above training method, the corrected image corresponding to the image to be corrected is obtained.
In a specific implementation, after the image to be corrected is input to the pre-trained image correction model, the correction network in the model outputs an affine transformation matrix corresponding to the image, affine transformation is performed on the image with this matrix, and transformation data is obtained, where the transformation data includes the mapping coordinate of each pixel point after the affine transformation. The transformation data is then input to the sampling network for image sampling, and the corrected image generated by image sampling is taken as the output of the pre-trained image correction model, thereby obtaining the corrected image corresponding to the image to be corrected.
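The affine-transformation step in this pipeline amounts to mapping every pixel coordinate through a 2×3 matrix in homogeneous coordinates. A minimal sketch (the function name and array layout are assumptions for illustration):

```python
import numpy as np

def map_coordinates(affine, h, w):
    """Map every pixel coordinate of an h-by-w image through a 2x3 affine matrix.

    Returns an array of shape (2, h, w): the mapped (x, y) coordinate of each
    pixel, which is the transformation data handed to the sampling step.
    """
    ys, xs = np.mgrid[0:h, 0:w]
    homog = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])  # 3 x (h*w)
    return (np.asarray(affine) @ homog).reshape(2, h, w)
```

The resulting coordinate grid is generally fractional, which is exactly why the subsequent sampling (interpolation) step is needed.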
The image correction method provided by this embodiment corrects the image to be corrected with the pre-trained image correction model and achieves high correction accuracy, so the success rate and accuracy of subsequent face recognition or other tasks can be improved.
Referring to fig. 5, fig. 5 is a schematic block diagram of a training apparatus for an image correction model according to an embodiment of the present application. As shown in fig. 5, the training apparatus for the image correction model includes: a data acquisition module 301, a matrix generation module 302, an image generation module 303, and an iterative training module 304. Wherein,
a data obtaining module 301, configured to obtain training data, where the training data includes a training image and a rotation image corresponding to the training image.
The matrix generating module 302 is configured to input the rotated image into a preset correction network, so as to obtain an affine transformation matrix corresponding to the rotated image.
And the image generation module 303 is configured to perform affine transformation on the rotated image based on the affine transformation matrix, and input the obtained transformation data to a sampling network for image sampling to obtain a corrected image corresponding to the rotated image.
And the iterative training module 304 is configured to calculate a loss function value between the correction image and the training image, perform iterative training on the preset correction network and the sampling network according to the loss function value, and use the preset correction network and the sampling network together as an image correction model when the training is completed.
Referring to fig. 6, fig. 6 is a schematic block diagram of an image rectification apparatus according to an embodiment of the present application. As shown in fig. 6, the image rectification apparatus includes: an image acquisition module 401 and an image rectification module 402.
Wherein,
an image obtaining module 401, configured to obtain an image to be corrected.
The image rectification module 402 is configured to input the image to be rectified into a pre-trained image rectification model to obtain a rectified image, where the pre-trained image rectification model is obtained by using the above training method for the image rectification model.
Referring to fig. 7, fig. 7 is a schematic block diagram of a computer device according to an embodiment of the present disclosure. The computer device may be a server or a terminal.
As shown in fig. 7, the computer device includes a processor, a memory, and a network interface connected by a system bus, wherein the memory may include a nonvolatile storage medium and an internal memory.
The non-volatile storage medium may store an operating system and a computer program. The computer program includes program instructions that, when executed, cause a processor to perform any of the methods for training an image rectification model.
The processor provides computing and control capabilities and supports the operation of the entire computer device.
The internal memory provides an environment for execution of a computer program in a non-volatile storage medium, which when executed by the processor, causes the processor to perform any one of the methods for training an image rectification model.
The network interface is used for network communication, such as sending assigned tasks. Those skilled in the art will appreciate that the architecture shown in fig. 7 is merely a block diagram of some of the structures associated with the disclosed aspects and does not limit the computing devices to which the disclosed aspects apply; a particular computing device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
It should be understood that the processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
Wherein, in one embodiment, the processor is configured to execute a computer program stored in the memory to implement the steps of:
acquiring training data, wherein the training data comprises a training image and a rotating image corresponding to the training image;
inputting the rotating image into a preset correction network to obtain an affine transformation matrix corresponding to the rotating image;
performing affine transformation on the rotating image based on the affine transformation matrix, and inputting the obtained transformation data into a sampling network for image sampling to obtain a corrected image corresponding to the rotating image;
and calculating a loss function value between the correction image and the training image, performing iterative training on the preset correction network and the sampling network according to the loss function value, and taking the preset correction network and the sampling network as an image correction model together when the training is finished.
In one embodiment, the preset correction network is a pre-trained correction network; the processor is further configured to implement:
obtaining sample data, wherein the sample data comprises a sample image and key points corresponding to the sample image;
inputting the sample image into a convolutional neural network to obtain an output affine matrix;
determining a supervised affine matrix based on the key points and preset points corresponding to the sample image;
and calculating a loss function value between the output affine matrix and the supervision affine matrix, and pre-training the convolutional neural network according to the loss function value to obtain a pre-trained correction network.
In one embodiment, the preset correction network is a pre-trained correction network; the processor is further configured to implement:
acquiring training sample data, wherein the training sample data comprises a rotating sample image corresponding to a training sample image and a rotating angle corresponding to the training sample image;
inputting the rotating sample image into a convolutional neural network to obtain an output affine matrix corresponding to the rotating sample image;
determining a supervised affine matrix according to the rotating sample image and the rotating angle corresponding to the rotating sample image;
and calculating a loss function value between the output affine matrix and the supervision affine matrix, and pre-training the convolutional neural network according to the loss function value to obtain a pre-trained correction network.
In one embodiment, the processor, when implementing the affine transformation of the rotated image based on the affine transformation matrix, is configured to implement:
acquiring the pixel coordinate of each pixel point in the rotating image;
and respectively mapping the pixel coordinates of each pixel point based on the affine transformation matrix to obtain the mapping coordinates of each pixel point.
In one embodiment, when the processor implements the inputting of the obtained transformation data into a sampling network for image sampling to obtain a corrected image corresponding to the rotated image, the processor is configured to implement:
when the mapping coordinates of the pixel points are not integers, inputting the mapping coordinates of the pixel points to a sampling network for image sampling to obtain a corrected image corresponding to the rotating image;
and when the mapping coordinates of the pixel points are integers, acquiring the pixel points corresponding to the mapping coordinates in the rotating image according to the mapping coordinates to fill the pixel points so as to obtain a corrected image.
In an embodiment, when the processor implements that the mapping coordinates of the pixel points are input to a sampling network for image sampling to obtain a corrected image corresponding to the rotated image, the processor is configured to implement:
carrying out interpolation and rounding on the mapping coordinates of the pixel points to obtain sampling coordinates corresponding to the pixel points;
and acquiring pixel points corresponding to the sampling coordinates in the rotating image according to the sampling coordinates to fill the pixel points to obtain a corrected image.
In one embodiment, before the step of inputting the rotated image into a preset correction network to obtain an affine transformation matrix corresponding to the rotated image, the processor is configured to:
performing image preprocessing on the rotated image, wherein the image preprocessing comprises size adjustment and/or image enhancement.
Wherein, in another embodiment, the processor is configured to run a computer program stored in the memory to implement the steps of:
acquiring an image to be corrected;
and inputting the image to be corrected into a pre-trained image correction model to obtain a corrected image, wherein the pre-trained image correction model is obtained by training by adopting the training method of the image correction model.
Embodiments of the present application further provide a computer-readable storage medium storing a computer program. The computer program includes program instructions which, when executed by a processor, implement the training method of the image correction model and/or the image correction method according to any one of the embodiments of the present application.
The computer-readable storage medium may be an internal storage unit of the computer device described in the foregoing embodiment, for example, a hard disk or a memory of the computer device. The computer readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided on the computer device.
While the invention has been described with reference to specific embodiments, the scope of the invention is not limited thereto, and those skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the invention. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A method for training an image rectification model, the method comprising:
acquiring training data, wherein the training data comprises a training image and a rotating image corresponding to the training image;
inputting the rotating image into a preset correction network to obtain an affine transformation matrix corresponding to the rotating image;
performing affine transformation on the rotating image based on the affine transformation matrix, and inputting the obtained transformation data into a sampling network for image sampling to obtain a corrected image corresponding to the rotating image;
and calculating a loss function value between the correction image and the training image, performing iterative training on the preset correction network and the sampling network according to the loss function value, and taking the preset correction network and the sampling network as an image correction model together when the training is finished.
2. The method for training an image rectification model according to claim 1, wherein the preset rectification network is a pre-trained rectification network; before inputting the rotated image to a preset rectification network, the method further comprises:
obtaining sample data, wherein the sample data comprises a sample image and key points corresponding to the sample image;
inputting the sample image into a convolutional neural network to obtain an output affine matrix;
determining a supervised affine matrix based on the key points and preset points corresponding to the sample image;
and calculating a loss function value between the output affine matrix and the supervision affine matrix, and pre-training the convolutional neural network according to the loss function value to obtain a pre-trained correction network.
3. The method for training an image rectification model according to claim 1, wherein the preset rectification network is a pre-trained rectification network; before inputting the rotated image to a preset rectification network, the method further comprises:
acquiring training sample data, wherein the training sample data comprises a rotating sample image corresponding to a training sample image and a rotating angle corresponding to the training sample image;
inputting the rotating sample image into a convolutional neural network to obtain an output affine matrix corresponding to the rotating sample image;
determining a supervision affine matrix according to the rotation sample image and the rotation angle corresponding to the rotation sample image;
and calculating a loss function value between the output affine matrix and the supervision affine matrix, and pre-training the convolutional neural network according to the loss function value to obtain a pre-trained correction network.
4. The method for training an image rectification model according to claim 1, wherein the affine transforming the rotation image based on the affine transformation matrix comprises:
acquiring the pixel coordinate of each pixel point in the rotating image;
and respectively mapping the pixel coordinates of each pixel point based on the affine transformation matrix to obtain the mapping coordinates of each pixel point.
5. The method for training the image rectification model according to claim 4, wherein the inputting the obtained transformation data into a sampling network for image sampling to obtain the rectified image corresponding to the rotated image comprises:
when the mapping coordinates of the pixel points are not integers, inputting the mapping coordinates of the pixel points to a sampling network for image sampling to obtain a corrected image corresponding to the rotating image;
and when the mapping coordinates of the pixel points are integers, acquiring the pixel points corresponding to the mapping coordinates in the rotating image according to the mapping coordinates to fill the pixel points so as to obtain a corrected image.
6. The method for training the image rectification model according to claim 5, wherein the step of inputting the mapping coordinates of the pixel points into a sampling network for image sampling to obtain the rectified image corresponding to the rotated image comprises:
carrying out interpolation and rounding on the mapping coordinates of the pixel points to obtain sampling coordinates corresponding to the pixel points;
and acquiring pixel points corresponding to the sampling coordinates in the rotating image according to the sampling coordinates to fill the pixel points to obtain a corrected image.
7. The method for training the image rectification model according to any one of claims 1 to 6, wherein before the step of inputting the rotated image into a preset rectification network and obtaining the affine transformation matrix corresponding to the rotated image, the method comprises:
and performing image preprocessing on the rotation image, wherein the image preprocessing comprises size adjustment and/or image enhancement.
8. An image rectification method, characterized in that the method comprises:
acquiring an image to be corrected;
inputting the image to be corrected into a pre-trained image correction model to obtain a corrected image, wherein the pre-trained image correction model is obtained by training by using the training method of the image correction model according to any one of claims 1 to 7.
9. A computer device, wherein the computer device comprises a memory and a processor;
the memory is used for storing a computer program;
the processor is configured to execute the computer program and to implement a training method of an image rectification model according to any one of claims 1 to 7 and/or an image rectification method according to claim 8 when the computer program is executed.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to implement the method of training an image rectification model according to any one of claims 1 to 7 and/or the method of image rectification according to claim 8.
CN202211048861.8A 2022-08-30 2022-08-30 Training method of image correction model, image correction method, device and storage medium Pending CN115423691A (en)

Publications (1)

Publication Number Publication Date
CN115423691A true CN115423691A (en) 2022-12-02





Also Published As

Publication number Publication date
WO2024045442A1 (en) 2024-03-07


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination