CN112215906A - Image processing method and device and electronic equipment - Google Patents

Image processing method and device and electronic equipment

Info

Publication number
CN112215906A
Authority
CN
China
Prior art keywords
image
processed
target object
scale
grid map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010925853.1A
Other languages
Chinese (zh)
Inventor
熊鹏飞
谭婧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Megvii Technology Co Ltd
Original Assignee
Beijing Megvii Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Megvii Technology Co Ltd filed Critical Beijing Megvii Technology Co Ltd
Priority to CN202010925853.1A priority Critical patent/CN112215906A/en
Publication of CN112215906A publication Critical patent/CN112215906A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/90 Determination of colour characteristics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/80 Geometric correction

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Processing (AREA)

Abstract

The invention provides an image processing method, an image processing apparatus and an electronic device. The image processing method includes: acquiring an image to be processed containing a target object and a target object mask corresponding to the image to be processed; inputting the image to be processed and the target object mask into a distortion model to obtain a deformed grid map corresponding to the image to be processed; and performing deformation correction on the image to be processed based on the deformed grid map to obtain a corrected image of the image to be processed. In this method, the distortion model processes the image to be processed and the target object mask and directly outputs the deformed grid map, and deformation correction is then performed on the image to be processed based on that grid map; no parameter optimization is needed in this process, which greatly improves the operating efficiency as well as the correction result. Using the target object mask helps strengthen the model's attention to the region where the target object is located, and because the deformed grid map is used to correct the whole image, the boundary transition between the target object and the background in the corrected image is more natural.

Description

Image processing method and device and electronic equipment
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to an image processing method and apparatus, and an electronic device.
Background
With the development of shooting equipment, wide-angle and ultra-wide-angle lenses are widely used to capture images. When a wide-angle or ultra-wide-angle lens is used to photograph a target object, the imaging point of the target object shifts, producing various kinds of distorted images. As shown in fig. 1, when a target object P is photographed through a wide-angle lens, its imaging point may be shifted in different ways on the imaging plane, yielding imaging points P1 and P2 and, correspondingly, two kinds of distorted images, namely a barrel-distorted image and a pincushion-distorted image. Since distortion reduces the application value of an image, the distorted image needs to be corrected by an image undistortion method to obtain an undistorted image that conforms to human visual characteristics; such undistorted images play an increasingly important role in image and video processing. In general, the foreground (i.e., the target object) and the background of an image are distorted to different degrees, so obtaining an undistorted image that matches human visual characteristics requires different distortion correction for the two.
The conventional distortion correction method is as follows: the distortion parameters of the camera lens are obtained through calibration, and the distorted image is then inversely transformed into a normal image. During this inverse transformation, the four corners of the distorted image are inevitably stretched, so when the target object lies on the boundary of the image, the target object is deformed. The current mainstream algorithms first obtain the position of the target object and then either apply different distortion corrections to the background and to the target object during the correction process, or first apply the same distortion correction to the whole image and then adjust the part containing the target object separately. In this way, both the background and the target object can be corrected to match visual characteristics. Even so, the above image processing methods still have several problems: first, the optimization process is slow, since solving the optimization problem usually requires a very large number of iterations to converge to a correct solution, with a risk of converging to a local minimum; second, parameter adaptability is poor, since correction parameters suited to image A are not necessarily suited to image B, and it is difficult to find correction parameters suited to all or most images; third, the optimization result is poor, since the boundary between the target object and the background often shows unnatural effects such as line distortion.
In conclusion, the conventional image processing method has the technical problems of poor effect and low efficiency.
Disclosure of Invention
In view of the above, the present invention provides an image processing method, an image processing apparatus, and an electronic device, so as to alleviate the technical problems of poor effect and low efficiency of the existing image processing method.
In a first aspect, an embodiment of the present invention provides an image processing method, including: acquiring an image to be processed containing a target object and a target object mask corresponding to the image to be processed; inputting the image to be processed and the target object mask into a distortion model to obtain a deformed grid map corresponding to the image to be processed, wherein the pixel value of each pixel point of the deformed grid map represents the offset of the corresponding pixel point in the image to be processed; and carrying out deformation correction on the image to be processed based on the deformation grid graph to obtain a corrected image of the image to be processed.
Further, the method further comprises: zooming the image to be processed to a first preset scale to obtain the image to be processed with the first preset scale.
Further, acquiring a target object mask corresponding to the image to be processed includes: inputting the image to be processed with the first preset scale into a segmentation model to obtain the target object mask.
Further, the method further comprises: respectively zooming the image to be processed and the target object mask to a second preset scale to obtain the image to be processed of the second preset scale and the target object mask of the second preset scale.
Further, inputting the image to be processed and the target object mask into a distortion model comprises: inputting the image to be processed with the second preset scale and the target object mask with the second preset scale into a distortion model to obtain a deformed grid map corresponding to the image to be processed.
Further, the performing deformation correction on the image to be processed based on the deformed grid map includes: scaling the deformed grid map to a target scale to obtain a deformed grid map of the target scale, wherein the target scale is equal to the scale of the image to be processed; and offsetting the corresponding pixel points in the image to be processed according to the offset represented by the pixel value of each pixel point in the deformed grid graph of the target scale to obtain a corrected image of the image to be processed.
Further, the method further comprises: obtaining a training sample set, wherein the training sample set comprises: training object images, training object masks corresponding to the training object images, and deformed grid graphs corresponding to the training object images; and training an original distortion model through the training sample set to obtain the distortion model.
In a second aspect, an embodiment of the present invention further provides an image processing apparatus, including: the device comprises an acquisition unit, a processing unit and a processing unit, wherein the acquisition unit is used for acquiring an image to be processed containing a target object and a target object mask corresponding to the image to be processed; the processing unit is used for inputting the image to be processed and the target object mask into a distortion model to obtain a deformed grid map corresponding to the image to be processed, wherein the pixel value of each pixel point of the deformed grid map represents the offset of the corresponding pixel point in the image to be processed; and the deformation correction unit is used for carrying out deformation correction on the image to be processed based on the deformation grid graph to obtain a corrected image of the image to be processed.
In a third aspect, an embodiment of the present invention provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the method according to any one of the above first aspects when executing the computer program.
In a fourth aspect, an embodiment of the present invention provides a computer-readable medium having non-volatile program code executable by a processor, where the program code causes the processor to perform the steps of the method according to any one of the first aspect.
In the embodiment of the invention, an image to be processed containing a target object and a target object mask corresponding to the image to be processed are first acquired; the image to be processed and the target object mask are then input into a distortion model to obtain a deformed grid map corresponding to the image to be processed, where the pixel value of each pixel point of the deformed grid map represents the offset of the corresponding pixel point in the image to be processed; finally, deformation correction is performed on the image to be processed based on the deformed grid map to obtain a corrected image of the image to be processed. In this method, the distortion model processes the image to be processed and the target object mask and directly outputs the deformed grid map, and deformation correction is then performed based on that grid map; no parameter optimization is needed in this process, which greatly improves the operating efficiency as well as the correction result. The target object mask is used both when training and when using the distortion model, which strengthens the model's attention to the region where the target object is located. The deformed grid map output by the distortion model is used to correct the whole image, without distinguishing the target object from the background, so the boundary transition between the target object and the background in the corrected image is more natural.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a schematic diagram of a distorted image generated according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an electronic device according to an embodiment of the present invention;
FIG. 3 is a flowchart of an image processing method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of an image to be processed according to an embodiment of the present invention;
FIG. 5 is a target object mask corresponding to FIG. 4 provided by an embodiment of the present invention;
FIG. 6 is a deformed grid diagram corresponding to the head mask of the portrait of FIGS. 4 and 5, according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of the rectified image of FIG. 4 according to an embodiment of the present invention;
FIG. 8 is a process diagram of an image processing method according to an embodiment of the present invention;
FIG. 9(a) is a schematic diagram of an image to be processed according to an embodiment of the present invention;
FIG. 9(b) is a schematic diagram of the rectified image of FIG. 9(a) according to an embodiment of the present invention;
FIG. 10 is a schematic diagram of an image processing apparatus according to an embodiment of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the following embodiments, and it should be understood that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1:
First, an electronic device 100 for implementing an embodiment of the present invention, which can be used to execute the image processing method of embodiments of the present invention, is described with reference to fig. 2.
As shown in FIG. 2, electronic device 100 includes one or more processors 102, one or more memories 104, an input device 106, an output device 108, and a camera 110, which are interconnected via a bus system 112 and/or other form of connection mechanism (not shown). It should be noted that the components and structure of the electronic device 100 shown in fig. 2 are exemplary only, and not limiting, and the electronic device may have other components and structures as desired.
The processor 102 may be implemented in at least one hardware form of a Digital Signal Processor (DSP), a Field-Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), and an ASIC (Application-Specific Integrated Circuit). The processor 102 may be a Central Processing Unit (CPU) or another form of processing unit having data processing capability and/or instruction execution capability, and may control other components in the electronic device 100 to perform desired functions.
The memory 104 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM) and/or cache memory (cache). The non-volatile memory may include, for example, Read-Only Memory (ROM), a hard disk, flash memory, etc. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 102 may execute them to implement the client-side functionality (implemented by the processor) and/or other desired functionality in the embodiments of the invention described below. Various applications and various data, such as data used and/or generated by the applications, may also be stored in the computer-readable storage medium.
The input device 106 may be a device used by a user to input instructions and may include one or more of a keyboard, a mouse, a microphone, a touch screen, and the like.
The output device 108 may output various information (e.g., images or sounds) to the outside (e.g., a user), and may include one or more of a display, a speaker, and the like.
The camera 110 is configured to capture an image to be processed; the image captured by the camera is then processed by the image processing method to obtain a corrected image. For example, the camera may capture an image desired by the user (e.g., a photo or a video), which is then processed by the image processing method to obtain a corrected image; the camera may also store the captured image in the memory 104 for use by other components.
Exemplarily, an electronic device for implementing an image processing method according to an embodiment of the present invention may be implemented as a smart mobile terminal such as a smartphone, a tablet computer, or the like.
Example 2:
According to an embodiment of the present invention, an image processing method is provided. It should be noted that the steps shown in the flowchart of the drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowchart, in some cases the steps shown or described may be executed in an order different from the one given here.
Fig. 3 is a flowchart of an image processing method according to an embodiment of the present invention. As shown in fig. 3, the method includes the following steps:
Step S302: acquiring an image to be processed containing a target object and a target object mask corresponding to the image to be processed;
in the embodiment of the present invention, the target object may be a human being, an animal, or any other real object, and the target object is not particularly limited in the embodiment of the present invention.
The image to be processed may be an image frame containing a target object in a video stream acquired in real time, or an image containing a target object obtained by previous photographing. That is, the method may process the image frames containing the target object in the video stream in real time, and may also perform post-processing on the captured image containing the target object.
In the embodiment of the present invention, the target object mask corresponding to the image to be processed indicates the positions of the pixel points of the target object that the distortion model should pay attention to, and it may cover the entire target object or only a part of the target object. For example, the target object mask may be the outer contour of the target object, or a partial region of the target object, such as the outer contour of the head. When the distortion model only needs to pay attention to a part of the target object, the target object mask only needs to contain the positions of the pixel points corresponding to that part; when distortion correction needs to keep the whole target object accurate, the target object mask needs to contain the positions of all pixel points of the target object. In the target object mask, the positions corresponding to the target object may be set to 1 and all other positions to 0.
When the target object is a person and the image to be processed is as shown in fig. 4, two target object masks corresponding to fig. 4 are shown in fig. 5: one is a mask of the whole portrait, and the other is a mask of the head of the portrait.
Step S304, inputting the image to be processed and the target object mask into a distortion model to obtain a deformed grid map corresponding to the image to be processed, wherein the pixel value of each pixel point of the deformed grid map represents the offset of the corresponding pixel point in the image to be processed;
the distortion model is a distortion model obtained by pre-training, and the deformation grid map can be a two-channel deformation matrix, wherein the deformation matrix of each channel is a two-dimensional matrix comprising two directions of x and y. And the pixel value of each pixel point in the deformation matrix represents the offset of the corresponding pixel point in the image to be processed. For example, the pixel values of the pixel points of (1, 1) in the two channels of the two-channel deformation matrix are (2, 2), so that it can be determined that the pixel points of the image to be processed (1, 1) should be shifted by 2 pixel points to the right and shifted by 2 pixel points to the up, and the corrected image can be obtained. The pixel value of each pixel point in the deformation matrix can also represent the position of the corresponding pixel point in the image to be processed after correction. For example, if the pixel value of the pixel point of (1, 1) in the two-channel deformation matrix is (2, 2) in the two channels, it may be determined that the position of the pixel point of the image to be processed (1, 1) in the corrected image is (2, 2).
In one example, inputting the image to be processed and the target object mask into the distortion model means that the R, G and B layers of the RGB image to be processed and the target object mask are input into the distortion model as a 4-layer image. The visualization effect of the deformed grid map corresponding to the head mask of the portrait in fig. 4 and fig. 5 is shown in fig. 6.
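For illustration only, a minimal sketch of this stacking follows; the function and variable names and the normalisation to [0, 1] are assumptions, since the text above only specifies that the three colour layers and the mask form a 4-layer input:

```python
import numpy as np

def build_distortion_input(image_rgb, target_mask):
    # image_rgb: H x W x 3 array holding the R, G and B layers of the image
    # to be processed; target_mask: H x W array with 1 at target-object pixels.
    rgb = image_rgb.astype(np.float32) / 255.0          # three colour layers
    mask = target_mask.astype(np.float32)[..., None]    # one mask layer
    return np.concatenate([rgb, mask], axis=-1)          # 4-layer model input
```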
Step S306: performing deformation correction on the image to be processed based on the deformed grid map to obtain a corrected image of the image to be processed.
Specifically, for a certain pixel point A in the image to be processed, its position in the corrected image can be calculated based on the deformed grid map, and the pixel value at that position in the corrected image is set to the pixel value of pixel point A in the image to be processed; in this way the offset of the pixel points of the image to be processed is realized.
It can be understood that if, according to the deformation matrix, several pixel points in the image to be processed correspond to the same position B in the corrected image, the pixel values of these pixel points may be fused to obtain the pixel value at position B in the corrected image. If a position C in the corrected image does not correspond to any pixel point of the image to be processed according to the deformation matrix, the pixel value at position C can be obtained by interpolation from the pixel values of the surrounding positions in the corrected image.
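A rough sketch of this correction step is given below. The up/right sign convention for the offsets follows the (2, 2) example above, the averaging used for fusion and the inpainting used to fill empty positions are assumptions, and all names are illustrative:

```python
import cv2
import numpy as np

def warp_by_grid(image, grid):
    # image: H x W x 3 uint8 image to be processed; grid: H x W x 2 float
    # deformed grid map whose channels hold the x and y offsets of each pixel.
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    dst_x = np.clip(np.round(xs + grid[..., 0]).astype(int), 0, w - 1)
    dst_y = np.clip(np.round(ys - grid[..., 1]).astype(int), 0, h - 1)
    acc = np.zeros((h, w, 3), np.float32)
    cnt = np.zeros((h, w), np.float32)
    # scatter each source pixel to its destination; collisions accumulate
    np.add.at(acc, (dst_y, dst_x), image.astype(np.float32))
    np.add.at(cnt, (dst_y, dst_x), 1.0)
    out = (acc / np.maximum(cnt, 1.0)[..., None]).astype(np.uint8)  # fuse by averaging
    holes = (cnt == 0).astype(np.uint8)          # positions with no source pixel
    if holes.any():
        out = cv2.inpaint(out, holes, 3, cv2.INPAINT_TELEA)  # fill from neighbours
    return out
```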
Fig. 7 shows the corrected image obtained by performing deformation correction on the image to be processed in fig. 4 based on the deformed grid map in fig. 6.
In the embodiment of the invention, an image to be processed containing a target object and a target object mask corresponding to the image to be processed are first acquired; the image to be processed and the target object mask are then input into a distortion model to obtain a deformed grid map corresponding to the image to be processed, where the pixel value of each pixel point of the deformed grid map represents the offset of the corresponding pixel point in the image to be processed; finally, deformation correction is performed on the image to be processed based on the deformed grid map to obtain a corrected image of the image to be processed. In this method, the distortion model processes the image to be processed and the target object mask and directly outputs the deformed grid map, and deformation correction is then performed based on that grid map; no parameter optimization is needed in this process, which greatly improves the operating efficiency as well as the correction result. The target object mask is used both when training and when using the distortion model, which strengthens the model's attention to the region where the target object is located. The deformed grid map output by the distortion model is used to correct the whole image, without distinguishing the target object from the background, so the boundary transition between the target object and the background in the corrected image is more natural.
The foregoing briefly introduces the image processing method of the present invention; its details are described below.
In an optional embodiment of the present invention, in step S302, the step of acquiring a target object mask corresponding to the image to be processed includes: inputting the image to be processed with the first preset scale into the segmentation model to obtain the target object mask. The first preset scale is smaller than the scale of the image to be processed.
The image to be processed at the first preset scale is obtained as follows: after the image to be processed containing the target object is acquired, and before it is input into the segmentation model, the image to be processed is scaled to the first preset scale to obtain the image to be processed at the first preset scale, which is then input into the segmentation model.
Generally, the original resolution of the image to be processed is relatively high. Scaling the image to be processed to a first preset scale (for example 640 × 480; the first preset scale is not specifically limited in the embodiments of the present invention) before inputting it into the segmentation model greatly reduces the amount of computation of the model and improves computational efficiency, so that the whole image processing method can run in real time on a terminal such as a mobile phone.
The segmentation model may be any image segmentation model or instance segmentation model: the image to be processed at the first preset scale is input, features at different scales are extracted by a feature network of multiple convolution and down-sampling layers, and these multi-scale features are fused by a decoding network with successive up-sampling to output the target object mask. In the present invention, a real-time portrait segmentation model is adopted; the image to be processed is scaled to the first preset scale and input into the portrait segmentation model, and the segmentation can be completed in about 10 ms on a mobile phone.
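As an illustrative sketch of this segmentation step (the 640 × 480 scale is taken from the example above, while the sigmoid output, the 0.5 threshold and the callable seg_model are assumptions, since any segmentation model may be used):

```python
import cv2
import numpy as np
import torch

def get_target_object_mask(image_bgr, seg_model, first_scale=(640, 480)):
    # Scale the image to be processed to the first preset scale, run the
    # segmentation model, and binarise its output into a target object mask.
    small = cv2.resize(image_bgr, first_scale, interpolation=cv2.INTER_LINEAR)
    x = torch.from_numpy(small).float().permute(2, 0, 1).unsqueeze(0) / 255.0
    with torch.no_grad():
        prob = torch.sigmoid(seg_model(x))[0, 0].cpu().numpy()
    return (prob > 0.5).astype(np.uint8)   # 1 where the target object is, else 0
```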
In an alternative embodiment of the present invention, the step of inputting the image to be processed and the target object mask into the distortion model in step S304 comprises: inputting the image to be processed with the second preset scale and the target object mask with the second preset scale into the distortion model to obtain a deformed grid map corresponding to the image to be processed.
The image to be processed at the second preset scale and the target object mask at the second preset scale are obtained as follows: after the image to be processed and the target object mask are acquired, and before they are input into the distortion model, the image to be processed and the target object mask are each scaled to the second preset scale; the image to be processed at the second preset scale and the target object mask at the second preset scale are then input into the distortion model.
The original resolution of the image to be processed is relatively high and the resolution of the target object mask is the first preset scale; scaling both to the second preset scale before inputting them into the distortion model greatly reduces the amount of computation of the model and improves computational efficiency, so that the whole image processing method can run in real time on a terminal such as a mobile phone.
The second preset scale may be the same as the first preset scale, or may be different from the first preset scale, and the second preset scale is not specifically limited in the embodiment of the present invention.
The distortion model may be any image generation model: the image to be processed at the second preset scale and the target object mask at the second preset scale are input, features at different scales are extracted by a feature network of multiple convolution and down-sampling layers, these multi-scale features are fused by a decoding network with successive up-sampling, and a deformed grid map of the same size as the image to be processed at the second preset scale is output. In the embodiment of the invention, in order to achieve fast deformation correction, a lightweight distortion model is adopted; the image to be processed and the target object mask are scaled to the second preset scale and input into the distortion model, and the deformed grid map can be output in about 10 ms on a terminal such as a mobile phone.
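A minimal PyTorch sketch of such a lightweight model is shown below. The number of layers and the channel widths are assumptions (the text above only requires a multi-layer convolution and down-sampling feature network followed by an up-sampling decoder that outputs a two-channel grid map), and skip connections for feature fusion are omitted for brevity:

```python
import torch
import torch.nn as nn

class LightDistortionNet(nn.Module):
    # Input:  N x 4 x H x W (R, G, B layers plus the target object mask).
    # Output: N x 2 x H x W deformed grid map (x and y offset channels).
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(4, 16, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),
            nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),
            nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),
            nn.Conv2d(16, 2, 3, padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))
```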
In an optional embodiment of the present invention, in step S306, the step of performing distortion correction on the image to be processed based on the distortion mesh map includes: scaling the deformed grid map to a target scale to obtain a deformed grid map of the target scale, wherein the target scale is equal to the scale of the image to be processed; and offsetting the corresponding pixel points in the image to be processed according to the offset represented by the pixel value of each pixel point in the deformed grid graph of the target scale to obtain a corrected image of the image to be processed.
The procedure of the image processing method of the present invention is described as a whole with reference to fig. 8. Referring to fig. 8: an image to be processed is acquired; the image to be processed is scaled to the first preset scale to obtain the image to be processed at the first preset scale; the image to be processed at the first preset scale is input into the segmentation model, which outputs the target object mask at the first preset scale; the image to be processed and the target object mask at the first preset scale are respectively scaled to the second preset scale to obtain the image to be processed at the second preset scale and the target object mask at the second preset scale; the image to be processed at the second preset scale and the target object mask at the second preset scale are input into the distortion model to obtain the deformed grid map corresponding to the image to be processed; the deformed grid map is scaled to the target scale (the target scale being equal to the scale of the image to be processed) to obtain the deformed grid map at the target scale; and finally, the corresponding pixel points in the image to be processed are shifted according to the offsets represented by the pixel values of the pixel points in the deformed grid map at the target scale, to obtain the corrected image of the image to be processed.
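The whole procedure can be sketched roughly as follows. The two model callables and both preset scales are placeholders, the final warp reuses the warp_by_grid helper from the earlier sketch, and rescaling the offset magnitudes together with the grid map is an implementation detail assumed here rather than stated above:

```python
import cv2
import numpy as np

def correct_image(image, seg_model_fn, distort_model_fn,
                  first_scale=(640, 480), second_scale=(256, 256)):
    h, w = image.shape[:2]
    # 1. scale to the first preset scale and segment the target object
    mask1 = seg_model_fn(cv2.resize(image, first_scale))           # values in {0, 1}
    # 2. scale image and mask to the second preset scale
    img2 = cv2.resize(image, second_scale).astype(np.float32)
    mask2 = cv2.resize(mask1.astype(np.float32), second_scale)
    # 3. run the distortion model on the 4-layer input to get the grid map
    grid2 = distort_model_fn(np.dstack([img2, mask2]))             # H2 x W2 x 2
    # 4. scale the deformed grid map to the target scale (the input size),
    #    rescaling the offset values to target-scale pixels as well
    grid = cv2.resize(grid2, (w, h), interpolation=cv2.INTER_LINEAR)
    grid[..., 0] *= w / float(second_scale[0])
    grid[..., 1] *= h / float(second_scale[1])
    # 5. shift the pixels of the image to be processed according to the grid map
    return warp_by_grid(image, grid)
```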
The above description describes the process of processing the image to be processed by the distortion model of the present invention in detail, and the following describes the training process of the distortion model in detail.
In an alternative embodiment of the invention, the training process for the distortion model is as follows:
(1) Obtaining a training sample set, wherein the training sample set comprises: training object images, training object masks corresponding to the training object images, and deformed grid maps corresponding to the training object images;
the training object mask corresponding to the training object image may be specifically generated based on a training object position for labeling a training object in the training object image, or may be obtained by inputting the training object image into a trained segmentation model.
After the training object images are obtained, the deformed grid map required for deformation correction of each training object image can be estimated using a traditional distortion correction method. The training sample set is generated offline, so when the deformed grid maps of the training object images are estimated with the traditional distortion correction method, the procedure is as follows: estimate the deformed grid map corresponding to each training object image with one set of parameters of the traditional method; manually select the erroneous results from all the obtained deformed grid maps; adjust the parameters of the traditional method and re-estimate, with the adjusted parameters, the deformed grid maps of the training object images corresponding to the erroneous results; and repeat until the deformed grid maps corresponding to most training object images are accurate. In this way, the deformed grid map corresponding to each training object image is obtained.
(2) Training the original distortion model through the training sample set to obtain the distortion model.
The training object image and the training object mask corresponding to it are input into the distortion model, the loss is determined from the deformed grid map output by the distortion model and the deformed grid map corresponding to the training object image, and the parameters of the distortion model are adjusted according to the loss. Training ends when the training completion condition is met. The training object mask helps prompt the distortion model to pay attention to the masked region; compared with inputting the training object image into the distortion model without the training object mask, this makes it easier to obtain a deformed grid map closer to the ground truth.
In one example, inputting the training object image and the training object mask corresponding to it into the distortion model means inputting the R, G and B layers of the RGB training object image and the training object mask into the distortion model as a 4-layer image.
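A rough sketch of one training iteration follows; the L1 loss and the Adam optimizer are assumptions, since the text above only states that a loss is determined from the predicted and ground-truth deformed grid maps and that the model parameters are adjusted according to it:

```python
import torch
import torch.nn.functional as F

def train_step(model, optimizer, batch_4ch, gt_grid):
    # batch_4ch: N x 4 x H x W tensor (R, G, B layers + training object mask);
    # gt_grid:   N x 2 x H x W ground-truth deformed grid map estimated offline.
    optimizer.zero_grad()
    pred_grid = model(batch_4ch)          # predicted deformed grid map
    loss = F.l1_loss(pred_grid, gt_grid)  # compare prediction with ground truth
    loss.backward()
    optimizer.step()                      # adjust the distortion model parameters
    return loss.item()

# Hypothetical usage with the LightDistortionNet sketch above:
# model = LightDistortionNet()
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
# loss = train_step(model, optimizer,
#                   torch.randn(2, 4, 256, 256), torch.randn(2, 2, 256, 256))
```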
The image processing method of the present invention can be used to correct distortion in both images and videos; the processing speed is high and the correction effect is good. In an experiment running the method on a mobile phone with a Qualcomm Snapdragon 855 processor, correcting the image in fig. 9(a) took 25 ms; the corresponding corrected image is shown in fig. 9(b).
Example 3:
An embodiment of the present invention further provides an image processing apparatus, which is mainly used to execute the image processing method provided by the foregoing embodiments; the image processing apparatus provided by the embodiment of the present invention is described in detail below.
Fig. 10 is a schematic diagram of an image processing apparatus according to an embodiment of the present invention. As shown in fig. 10, the image processing apparatus mainly includes an acquisition unit 10, a processing unit 20 and a deformation correction unit 30, wherein:
an acquisition unit 10 configured to acquire an image to be processed including a target object and a target object mask corresponding to the image to be processed;
the processing unit 20 is configured to input the image to be processed and the target object mask into a distortion model to obtain a deformed grid map corresponding to the image to be processed, where a pixel value of each pixel point of the deformed grid map indicates an offset of the corresponding pixel point in the image to be processed;
and the deformation correcting unit 30 is configured to perform deformation correction on the image to be processed based on the deformation grid map, so as to obtain a corrected image of the image to be processed.
In the embodiment of the invention, an image to be processed containing a target object and a target object mask corresponding to the image to be processed are first acquired; the image to be processed and the target object mask are then input into a distortion model to obtain a deformed grid map corresponding to the image to be processed, where the pixel value of each pixel point of the deformed grid map represents the offset of the corresponding pixel point in the image to be processed; finally, deformation correction is performed on the image to be processed based on the deformed grid map to obtain a corrected image of the image to be processed. In this method, the distortion model processes the image to be processed and the target object mask and directly outputs the deformed grid map, and deformation correction is then performed based on that grid map; no parameter optimization is needed in this process, which greatly improves the operating efficiency as well as the correction result. The target object mask is used both when training and when using the distortion model, which strengthens the model's attention to the region where the target object is located. The deformed grid map output by the distortion model is used to correct the whole image, without distinguishing the target object from the background, so the boundary transition between the target object and the background in the corrected image is more natural.
Optionally, the apparatus is further configured to: zooming the image to be processed to a first preset scale to obtain the image to be processed with the first preset scale.
Optionally, the acquisition unit is further configured to: inputting the image to be processed with the first preset scale into the segmentation model to obtain the target object mask.
Optionally, the apparatus is further configured to: respectively zooming the image to be processed and the target object mask to a second preset scale to obtain the image to be processed of the second preset scale and the target object mask of the second preset scale.
Optionally, the processing unit is further configured to: inputting the image to be processed with the second preset scale and the target object mask with the second preset scale into the distortion model to obtain a deformed grid map corresponding to the image to be processed.
Optionally, the deformation correcting unit is further configured to: scaling the deformed grid map to a target scale to obtain a deformed grid map of the target scale, wherein the target scale is equal to the scale of the image to be processed; and offsetting the corresponding pixel points in the image to be processed according to the offset represented by the pixel value of each pixel point in the deformed grid graph of the target scale to obtain a corrected image of the image to be processed.
Optionally, the apparatus is further configured to: obtaining a training sample set, wherein the training sample set comprises: training object images, training object masks corresponding to the training object images and deformed grid graphs corresponding to the training object images; and training the original distortion model through the training sample set to obtain the distortion model.
The image processing apparatus provided by the embodiment of the present invention has the same implementation principle and technical effect as the method embodiment in the foregoing embodiment 2, and for the sake of brief description, reference may be made to the corresponding contents in the foregoing method embodiment for the part where the embodiment of the apparatus is not mentioned.
In another embodiment, there is also provided a computer readable medium having non-volatile program code executable by a processor, the program code causing the processor to perform the steps of the method of any of the above embodiments 2.
In addition, in the description of the embodiments of the present invention, unless otherwise explicitly specified or limited, the terms "mounted" and "connected" are to be construed broadly, e.g., as meaning a fixed connection, a removable connection, or an integral connection; the connection may be mechanical or electrical; and it may be a direct connection, an indirect connection through an intermediate medium, or internal communication between two elements. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to the specific circumstances.
In the description of the present invention, it should be noted that the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc., indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, and are only for convenience of description and simplicity of description, but do not indicate or imply that the device or element being referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus, should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, a division of a unit is merely a division of one logic function, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
Finally, it should be noted that: although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that: any person skilled in the art can modify or easily conceive the technical solutions described in the foregoing embodiments or equivalent substitutes for some technical features within the technical scope of the present disclosure; such modifications, changes or substitutions do not depart from the spirit and scope of the embodiments of the present invention, and they should be construed as being included therein. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. An image processing method, comprising:
acquiring an image to be processed containing a target object and a target object mask corresponding to the image to be processed;
inputting the image to be processed and the target object mask into a distortion model to obtain a deformed grid map corresponding to the image to be processed, wherein the pixel value of each pixel point of the deformed grid map represents the offset of the corresponding pixel point in the image to be processed;
and carrying out deformation correction on the image to be processed based on the deformation grid graph to obtain a corrected image of the image to be processed.
2. The method of claim 1, further comprising:
zooming the image to be processed to a first preset scale to obtain the image to be processed with the first preset scale.
3. The method of claim 2, wherein obtaining a target object mask corresponding to the image to be processed comprises:
inputting the image to be processed with the first preset scale into a segmentation model to obtain the target object mask.
4. The method according to any one of claims 1-3, further comprising:
respectively zooming the image to be processed and the target object mask to a second preset scale to obtain the image to be processed of the second preset scale and the target object mask of the second preset scale.
5. The method of claim 4, wherein inputting the image to be processed and the target object mask into a distortion model comprises:
inputting the image to be processed with the second preset scale and the target object mask with the second preset scale into a distortion model to obtain a deformed grid map corresponding to the image to be processed.
6. The method according to any one of claims 1-5, wherein performing a warp rectification on the image to be processed based on the warp grid map comprises:
scaling the deformed grid map to a target scale to obtain a deformed grid map of the target scale, wherein the target scale is equal to the scale of the image to be processed;
and offsetting the corresponding pixel points in the image to be processed according to the offset represented by the pixel value of each pixel point in the deformed grid graph of the target scale to obtain a corrected image of the image to be processed.
7. The method according to any one of claims 1-6, further comprising:
obtaining a training sample set, wherein the training sample set comprises: training object images, training object masks corresponding to the training object images, and deformed grid graphs corresponding to the training object images;
and training an original distortion model through the training sample set to obtain the distortion model.
8. An image processing apparatus characterized by comprising:
the device comprises an acquisition unit, a processing unit and a processing unit, wherein the acquisition unit is used for acquiring an image to be processed containing a target object and a target object mask corresponding to the image to be processed;
the processing unit is used for inputting the image to be processed and the target object mask into a distortion model to obtain a deformed grid map corresponding to the image to be processed, wherein the pixel value of each pixel point of the deformed grid map represents the offset of the corresponding pixel point in the image to be processed;
and the deformation correction unit is used for carrying out deformation correction on the image to be processed based on the deformation grid graph to obtain a corrected image of the image to be processed.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the steps of the method of any of the preceding claims 1 to 7 are implemented when the computer program is executed by the processor.
10. A computer-readable medium having non-volatile program code executable by a processor, characterized in that the program code causes the processor to perform the steps of the method of any of the preceding claims 1 to 7.
CN202010925853.1A 2020-09-04 2020-09-04 Image processing method and device and electronic equipment Pending CN112215906A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010925853.1A CN112215906A (en) 2020-09-04 2020-09-04 Image processing method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN112215906A true CN112215906A (en) 2021-01-12

Family

ID=74049042

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010925853.1A Pending CN112215906A (en) 2020-09-04 2020-09-04 Image processing method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN112215906A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112954198A (en) * 2021-01-27 2021-06-11 北京有竹居网络技术有限公司 Image processing method and device and electronic equipment
CN115471846A (en) * 2022-09-22 2022-12-13 中电金信软件有限公司 Image correction method and device, electronic equipment and readable storage medium
WO2024041108A1 (en) * 2022-08-25 2024-02-29 腾讯科技(深圳)有限公司 Image correction model training method and apparatus, image correction method and apparatus, and computer device
CN118247186A (en) * 2024-05-23 2024-06-25 荣耀终端有限公司 Image distortion correction method, electronic device, storage medium and chip

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000222554A (en) * 1999-02-04 2000-08-11 Telecommunication Advancement Organization Of Japan Picture processor and picture processing method
US20110096183A1 (en) * 2009-09-15 2011-04-28 Metail Limited System and method for image processing and generating a body model
CN104034514A (en) * 2014-06-12 2014-09-10 中国科学院上海技术物理研究所 Large visual field camera nonlinear distortion correction device and method
CN108898043A (en) * 2018-02-09 2018-11-27 迈格威科技有限公司 Image processing method, image processing apparatus and storage medium
CN109993696A (en) * 2019-03-15 2019-07-09 广州愿托科技有限公司 The apparent panorama sketch of works based on multi-view image corrects joining method
CN111598903A (en) * 2020-05-21 2020-08-28 Oppo广东移动通信有限公司 Portrait segmentation method, portrait segmentation device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination