WO2023202570A1 - Image processing method and processing apparatus, electronic device and readable storage medium - Google Patents

Image processing method and processing apparatus, electronic device and readable storage medium Download PDF

Info

Publication number
WO2023202570A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
area
target area
information
processing
Prior art date
Application number
PCT/CN2023/088947
Other languages
French (fr)
Chinese (zh)
Inventor
任帅
Original Assignee
维沃移动通信有限公司 (Vivo Mobile Communication Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 维沃移动通信有限公司 (Vivo Mobile Communication Co., Ltd.)
Publication of WO2023202570A1 publication Critical patent/WO2023202570A1/en

Links

Classifications

    • G06T3/04
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • G06T5/77
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Definitions

  • This application belongs to the field of image processing technology, and specifically relates to an image processing method and processing device, electronic equipment and readable storage media.
  • Processing methods in the related art, such as erasing a specific area through image inpainting, use only the texture information of the erased area, resulting in a poor erasure effect and obvious repair traces left on the picture. If deep learning is used to process the picture, repair traces can be reduced, but other areas of the picture whose content is similar to that of the specific area are easily erased as well, so the processing effect is still poor.
  • The purpose of the embodiments of the present application is to provide an image processing method and processing apparatus, an electronic device and a readable storage medium, which can solve the problem in the related art of the poor processing effect when erasing image content.
  • Embodiments of the present application provide an image processing method, including:
  • locating a target area in a first image to obtain position information of the target area;
  • erasing, through an image processing model, the image information of a preset area in the first image to obtain an erased second image, where the size of the second image is the same as that of the first image, and the preset area includes the target area;
  • intercepting, according to the position information, an area image corresponding to the target area from the second image; and
  • generating a processed target image based on the area image and the first image.
  • an image processing device including:
  • a positioning module used to locate the target area in the first image and obtain the position information of the target area
  • the erasing module is used to erase, through the image processing model, the image information of the preset area in the first image to obtain an erased second image, where the size of the second image is the same as that of the first image, and the preset area includes the target area;
  • An interception module configured to intercept an area image corresponding to the target area in the second image based on the location information
  • a processing module configured to generate a processed target image based on the area image and the first image.
  • embodiments of the present application provide an electronic device, including a processor and a memory.
  • the memory stores programs or instructions that can be run on the processor.
  • when the program or instructions are executed by the processor, the steps of the method of the first aspect are implemented.
  • embodiments of the present application provide a readable storage medium that stores a program or instructions, and when the program or instructions are executed by a processor, the steps of the method in the first aspect are implemented.
  • Embodiments of the present application provide a chip.
  • the chip includes a processor and a communication interface.
  • the communication interface is coupled to the processor.
  • the processor is used to run programs or instructions to implement the steps of the method in the first aspect.
  • embodiments of the present application provide a computer program product, the program product is stored in a storage medium, and the program product is executed by at least one processor to implement the method as described in the first aspect.
  • an embodiment of the present application provides an image processing device, including the device being configured to perform the method described in the first aspect.
  • In the embodiments of the present application, the image area that needs to be processed is first located and its position information is recorded. Then, the first image as a whole is processed through an adversarial model.
  • This image processing step is based on image recognition technology and deep learning technology.
  • The model automatically identifies the parts that need to be processed and, based on the global information of the first image, erases the preset area while keeping the image of the erased preset area consistent with the overall image.
  • After the erasure is completed, according to the recorded position information of the marked target area, the area image at the same position is intercepted from the second image erased by the image processing model, and the area image is combined with the original first image; that is, only the part of the second image that the user needs to eliminate is used to process the original image. The final target image therefore retains the consistency of the overall image while the content of the target area that needs to be processed is erased, and other similar areas in the image are not erased by mistake, which improves the processing efficiency when erasing specific content in an image.
  • Figure 1 shows one of the flowcharts of an image processing method according to an embodiment of the present application
  • Figure 2 shows one of the schematic diagrams of an image processing method according to an embodiment of the present application
  • Figure 3 shows the second schematic diagram of the image processing method according to the embodiment of the present application.
  • Figure 4 shows the third schematic diagram of the image processing method according to the embodiment of the present application.
  • Figure 5 shows a schematic structural diagram of an image processing model according to an embodiment of the present application
  • Figure 6 shows the second flowchart of the image processing method according to the embodiment of the present application.
  • Figure 7 shows a structural block diagram of an image processing device according to an embodiment of the present application.
  • Figure 8 shows a structural block diagram of an electronic device according to an embodiment of the present application.
  • FIG. 9 is a schematic diagram of the hardware structure of an electronic device implementing an embodiment of the present application.
  • The terms "first", "second", etc. in the description and claims of this application are used to distinguish similar objects and are not used to describe a specific order or sequence. It should be understood that the terms so used are interchangeable under appropriate circumstances, so that the embodiments of the present application can be practiced in orders other than those illustrated or described herein.
  • Objects distinguished by "first", "second", etc. are usually of one category, and the number of objects is not limited; for example, the first object can be one or multiple.
  • "and/or" in the description and claims indicates at least one of the connected objects, and the character "/" generally indicates that the related objects are in an "or" relationship.
  • Figure 1 shows one of the flowcharts of the image processing method according to the embodiment of the present application. As shown in Figure 1, the method includes:
  • Step 102 locate the target area in the first image and obtain the location information of the target area
  • Step 104 Erase the image information of the preset area in the first image through the image processing model to obtain the erased second image;
  • step 104 the size of the second image is the same as the first image, and the preset area includes the target area;
  • Step 106 According to the position information, intercept the area image corresponding to the target area in the second image;
  • Step 108 Generate a processed target image based on the area image and the first image.
  • When the user wants to hide or erase specific content in an image, for example the text content, the text area in the image can first be located, for instance through a text recognition algorithm, and the location information corresponding to that area is recorded.
  • the location information may be coordinate information.
  • After the position information of the target area that needs to be erased is obtained, the whole original first image is input into a preset image processing model.
  • Specifically, the image processing model can, according to the type of information that the user needs to erase, automatically identify the target information to be erased in the first image, and process the entire first image based on the global information of the image.
  • Figure 2 shows one of the schematic diagrams of an image processing method according to an embodiment of the present application.
  • the first image 200 includes text information 202, and the user needs to erase the text information 202.
  • First, by means such as optical character recognition (OCR), the area where the text information 202 is located is marked in the first image 200, that is, the target area 204.
  • The coordinate information of the target area 204 is recorded at the same time, thereby recording the location information of the target area 204.
  • the first image is input to a preset image processing model.
  • the image processing model can automatically identify the image information content that the user wants to erase, such as text information, and erase it based on the global information of the target image, including color information, pixel information, etc.
  • FIG 3 shows the second schematic diagram of the image processing method according to the embodiment of the present application.
  • The image processing model processes the entire first image 300 and erases all of the several preset areas that are identified as containing text.
  • As shown in Figure 3, the image processing model performs erasure processing on both preset areas: the first preset area 302 contains the text information that the user needs to erase, while the second preset area 304 contains QR code information; because the characteristics of QR code information and text information are close, the QR code information is mistakenly recognized as text and erased.
  • Further, after the image processing model outputs the second image, the corresponding area image is intercepted from the second image at the same coordinates according to the identified position information of the target area, and the area image is superimposed on the original first image according to the coordinates of the target area, thereby covering the target area of the first image and generating the target image.
  • Figure 4 shows the third schematic diagram of the image processing method according to the embodiment of the present application. As shown in Figure 4, on the target image 400, the text information in the target area 402 is erased, while the QR code 404 is retained.
  • According to the position information of the marked target area, the embodiment of the present application intercepts the area image at the same position from the second image erased by the image processing model and combines the area image with the original first image; that is, only the part of the second image that the user needs to eliminate is used to process the original image. The final target image therefore retains the coordination of the overall image while the content of the target area that needs to be processed is erased, and other similar areas in the image are not erased by mistake, which improves the processing efficiency when erasing specific content in the image.
  • generating a processed target image based on the area image and the first image includes: covering the area image on the target area according to the position information to obtain the target image.
  • In this embodiment, after the second image is obtained through the image processing model, the area image corresponding to the position of the target area is intercepted from the second image according to the position information of the target area, such as coordinate information. The intercepted area image processed by the image processing model is then overlaid on the original first image according to the same coordinate information, so that the image content in the target area is completely replaced by the area image. In the resulting target image, only the area that needs to be erased is replaced by the processed image: the content of the target area is erased while the consistency of the overall image is retained, other similar areas in the image are not erased by mistake, and the processing efficiency when erasing specific content in the image is improved.
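  • As an illustration of this crop-and-paste step, the following is a minimal sketch in Python, assuming the target area is an axis-aligned rectangle and that the first and second images are same-sized NumPy arrays; the function and variable names are hypothetical, not part of the application.

```python
import numpy as np

def paste_back_rect(original: np.ndarray, erased: np.ndarray, box: tuple) -> np.ndarray:
    """Copy the erased content of a rectangular target area back onto the original image.

    original, erased: H x W x C arrays of the same size (first image and second image).
    box: (x1, y1, x2, y2) coordinates of the target area recorded during localization.
    """
    assert original.shape == erased.shape, "second image must match the first image in size"
    x1, y1, x2, y2 = box
    target = original.copy()
    # Only the target area is taken from the erased result; the rest of the
    # original image (e.g. a QR code that merely looks like text) stays untouched.
    target[y1:y2, x1:x2] = erased[y1:y2, x1:x2]
    return target
```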
  • In some embodiments of the present application, before the image information of the preset area in the first image is erased, the method further includes:
  • acquiring a first training image and a second training image, where the second training image is an image obtained by removing preset image information from the first training image; and
  • training a preset model through the first training image and the second training image to obtain the trained image processing model, where the image processing model includes a first network and a second network. Erasing the image information of the preset area in the first image then includes: erasing the first image through the first network to obtain a processed third image; down-sampling the third image to obtain a fourth image; erasing the fourth image through the second network to obtain a processed fifth image; and up-sampling the fifth image to obtain the second image.
  • In this embodiment, a preset adversarial model is trained to obtain the trained image processing model. Specifically, a first image is first collected and manually processed to generate a second image: the first image is an original image, and the second image is obtained by erasing preset areas such as text content through image editing or retouching software.
  • a preset generative adversarial network is trained to perform end-to-end erasing of text in the entire image.
  • the specific method is to build a network model.
  • The unet network structure is used.
  • The image is first down-sampled to obtain the image semantic information, and then the image is up-sampled and restored to the original size to obtain the image output.
  • The unet is a U-shaped network for dense-prediction segmentation.
  • After the unet structure, a layer of lightweight unet is added for further processing to obtain the final processing result.
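  • The generator structure described above (down-sample to capture semantics, up-sample back to the original size, then a lightweight refinement unet) might be sketched roughly as below in PyTorch; the channel widths, depths, and the `TinyUNet` name are assumptions made only for this example.

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """A minimal U-shaped encoder-decoder: down-sample, then up-sample with a skip connection."""

    def __init__(self, in_ch: int = 3, base: int = 32):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(in_ch, base, 3, padding=1), nn.ReLU(inplace=True))
        self.down = nn.Conv2d(base, base * 2, 3, stride=2, padding=1)   # down-sample by 2
        self.bottleneck = nn.Sequential(nn.Conv2d(base * 2, base * 2, 3, padding=1), nn.ReLU(inplace=True))
        self.up = nn.ConvTranspose2d(base * 2, base, 2, stride=2)       # restore the original size
        self.dec1 = nn.Sequential(nn.Conv2d(base * 2, base, 3, padding=1), nn.ReLU(inplace=True))
        self.out = nn.Conv2d(base, in_ch, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Assumes even spatial dimensions so the up-sampled map matches the encoder map.
        e1 = self.enc1(x)
        b = self.bottleneck(self.down(e1))
        d1 = self.dec1(torch.cat([self.up(b), e1], dim=1))  # skip connection
        return self.out(d1)

# A lightweight refinement unet appended after the first one, as the text describes.
generator = nn.Sequential(TinyUNet(base=32), TinyUNet(base=16))
```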
  • the discriminant network includes a dual-scale network.
  • The first-scale network includes several cascaded convolutions, in which the stride (step size) of the convolutions can be set to 1.
  • No pooling layer is introduced, to ensure that the image resolution is not reduced.
  • The input of the second-layer scale network is obtained by down-sampling the output of the first-layer network by a factor of two.
  • The ground truth of the dual-scale network is obtained from the erased image and from the erased image down-sampled by a factor of two.
  • The output of the generation network is input to the discriminant network, which judges the difference between the currently generated erased image and the originally annotated erased image (i.e., the second image); this difference is used as the loss and is back-propagated to optimize the network parameters, finally yielding the optimized network structure.
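  • A hedged sketch of how the dual-scale discriminant network and the adversarial training step might be wired together is given below; the layer counts, the added L1 reconstruction term, and the optimizer handling are assumptions rather than details given by the application, and `generator` stands for the unet cascade sketched earlier.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_stack(in_ch: int = 3, width: int = 32, depth: int = 3) -> nn.Sequential:
    """Several cascaded stride-1 convolutions; no pooling, so the resolution is preserved."""
    layers, ch = [], in_ch
    for _ in range(depth):
        layers += [nn.Conv2d(ch, width, 3, stride=1, padding=1), nn.LeakyReLU(0.2, inplace=True)]
        ch = width
    layers += [nn.Conv2d(ch, 1, 3, padding=1)]  # per-pixel real/fake score map
    return nn.Sequential(*layers)

class DualScaleDiscriminator(nn.Module):
    """Scale 1 sees the full-resolution image; scale 2 sees the same image down-sampled by 2."""

    def __init__(self):
        super().__init__()
        self.scale1 = conv_stack()
        self.scale2 = conv_stack()

    def forward(self, x):
        return self.scale1(x), self.scale2(F.avg_pool2d(x, kernel_size=2))

def train_step(generator, disc, g_opt, d_opt, original, erased_gt):
    # --- discriminator: annotated erased images vs. generated ones, at both scales ---
    with torch.no_grad():
        fake = generator(original)
    d_loss = 0.0
    for real_s, fake_s in zip(disc(erased_gt), disc(fake)):
        d_loss = d_loss + F.binary_cross_entropy_with_logits(real_s, torch.ones_like(real_s))
        d_loss = d_loss + F.binary_cross_entropy_with_logits(fake_s, torch.zeros_like(fake_s))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # --- generator: fool the discriminator and stay close to the annotated erased image ---
    fake = generator(original)
    g_loss = F.l1_loss(fake, erased_gt)  # assumed reconstruction term
    for fake_s in disc(fake):
        g_loss = g_loss + F.binary_cross_entropy_with_logits(fake_s, torch.ones_like(fake_s))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()
```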
  • FIG. 5 shows a schematic structural diagram of an image processing model according to an embodiment of the present application.
  • the image processing model 500 includes a 2-layer unet network structure, that is, a first network 502 and a second network 504.
  • The first image 506 to be processed is input into the first network 502, and the specific information in it is erased through the first network 502 to obtain the processed third image 508.
  • The third image 508 is down-sampled to obtain a fourth image 510 with reduced resolution, thereby obtaining the image semantic information.
  • The fourth image 510 obtained by down-sampling is used as the input of the second-layer network, that is, the second network 504.
  • the fifth image 512 output by the second network 504 is upsampled and restored to its original size to obtain the final second image 514.
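  • The two-stage flow of Figure 5 could then be expressed, under the same assumptions, as a simple composition of the two networks; the pooling and interpolation choices below are illustrative only.

```python
import torch.nn.functional as F

def two_stage_erase(first_net, second_net, first_image):
    """Mirror the data flow of Figure 5: first network -> down-sample -> second network -> up-sample."""
    third_image = first_net(first_image)                      # erase with the first network
    fourth_image = F.avg_pool2d(third_image, kernel_size=2)   # down-sample to reduced resolution
    fifth_image = second_net(fourth_image)                    # erase again with the second network
    second_image = F.interpolate(                             # up-sample back to the original size
        fifth_image, size=first_image.shape[-2:], mode="bilinear", align_corners=False)
    return second_image
```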
  • The content contained in the first training image is the content that the user needs to erase: if the user needs to erase the text information in an image, the first training image contains text information; if the user needs to erase faces, the first training image contains face information.
  • This application trains the image recognition model so that, according to the image content in the training set, the image recognition model can erase the corresponding content area in the first image, which keeps the erased image consistent with the overall image and improves image processing efficiency.
  • the target area is a character image area
  • locating the target area in the first image includes:
  • the target area specifically includes a character image area, that is to say, the user needs to erase the character area in the first image.
  • In this embodiment, optical character recognition (OCR) is performed on the first image to obtain a character detection frame in the first image.
  • the first image is first preprocessed.
  • a denoising algorithm is used to remove noise on the first image.
  • the text or characters are obtained through the OCR detection algorithm, and the coordinate information is located.
  • the OCR detection algorithm can obtain character detection frames in various scenarios such as horizontal, vertical, and curved.
  • For horizontal and vertical character detection frames, a four-point coordinate frame can be used to represent them, while for curved, irregular character detection frames, an eight-point coordinate frame can be used. If there is no text or character information in the current picture, an empty character coordinate box is returned.
  • the character detection frame is marked with coordinate information, and the coordinate information refers to the coordinates of the character detection frame in the first image.
  • The target area is located in the first image through the coordinate information of the character detection frame, so that after the second image is obtained through the image processing model, the processed area image is overlaid on the target area according to the coordinate information. In the generated target image, the coordination of the overall image is ensured and the content of the target area that needs to be processed is effectively erased, while other similar areas in the image are not mistakenly erased, which improves the processing efficiency when erasing specific content in the image.
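  • A minimal sketch of this localization step is shown below; `detect_characters` is a hypothetical stand-in for whatever OCR detection model is used, assumed to return 4-point or 8-point coordinate frames (or an empty list when no characters are found).

```python
from typing import List, Tuple

Point = Tuple[int, int]

def locate_target_areas(first_image, detect_characters) -> List[List[Point]]:
    """Run OCR detection and record the coordinate frames of the character areas.

    detect_characters(image) is assumed to return a list of coordinate frames:
    4 points for horizontal/vertical text, 8 points for curved text, [] if none.
    """
    target_areas = []
    for frame in detect_characters(first_image):
        if len(frame) not in (4, 8):
            continue  # ignore malformed detections
        target_areas.append(list(frame))  # keep the coordinates as the position information
    return target_areas

def bounding_rect(frame: List[Point]) -> Tuple[int, int, int, int]:
    """Axis-aligned rectangle (x1, y1, x2, y2) enclosing a 4- or 8-point frame, for later cropping."""
    xs = [p[0] for p in frame]
    ys = [p[1] for p in frame]
    return min(xs), min(ys), max(xs), max(ys)
```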
  • Figure 6 shows the second flowchart of the image processing method according to the embodiment of the present application. As shown in Figure 6, the method includes:
  • Step 602 Perform text positioning on the original image and record the text coordinate information in the original image
  • In step 602, the image is first preprocessed and a denoising algorithm is used to remove noise from the image. Then an OCR detection model is trained, and text positioning information is obtained through the OCR detection algorithm.
  • the OCR detection algorithm can obtain text detection frames in various scenarios such as horizontal, vertical, and curved. Horizontal and vertical text boxes are represented by a four-point coordinate frame, while curved text boxes are represented by an eight-point coordinate frame. If there is no text information in the current picture, an empty text coordinate box will be returned.
  • Step 604 Collect paired data and train a generative adversarial network to perform text erasure
  • step 604 pairs of ⁇ original pictures, text-erased pictures> are collected, and model training is performed using a generative adversarial method to obtain an image processing model.
  • Specifically, the original image is first collected, and the corresponding text-erased image is obtained through image editing software such as Photoshop (PS), thereby obtaining an <original image, text-erased image> pair.
  • a generative adversarial network is trained to perform end-to-end erasure of entire image text.
  • the specific method is to build a network model.
  • the unet network structure is used.
  • The image is first down-sampled to obtain image semantic information, and then up-sampled and restored to the original size to obtain the image output.
  • a layer of lightweight unet is added after the unet structure for further processing to obtain the final processing result.
  • the discriminant network consists of a dual-scale network.
  • the first scale network consists of several convolution cascades.
  • the step size of the convolution is set to 1.
  • No pooling layer is introduced, to ensure that the resolution of the image does not decrease; a scale network of the same structure is then added after the first-scale network, but the input of this second-layer scale network is obtained by down-sampling the output of the first-layer network by a factor of two.
  • The ground truth of the dual-scale network is obtained from the erased image and from the erased image down-sampled by a factor of two.
  • the output result of the generation network is input into the discriminant network to determine the difference between the currently generated erased image and the original annotated erased image, and is used as loss backpropagation to optimize the network parameters, and finally the optimized network structure is obtained.
  • Step 606 Use the optimized model to infer the original image to obtain the erased image result
  • step 606 the optimized generative adversarial model is used to infer the original image to obtain the erased image result.
  • the specific method is to input the original image into the trained model.
  • During inference, the discriminator is removed and only the generator is retained; that is, the output image obtained after the original image passes through the generator is the erased image result.
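  • Inference with only the generator retained might look roughly as follows; the value range and tensor conversions are assumptions made for the sketch.

```python
import numpy as np
import torch

@torch.no_grad()
def infer_erased(generator: torch.nn.Module, original: np.ndarray) -> np.ndarray:
    """Run the trained generator (the discriminator is discarded) on an original image."""
    generator.eval()
    x = torch.from_numpy(original).float().permute(2, 0, 1).unsqueeze(0) / 255.0  # HWC -> 1CHW
    y = generator(x).clamp(0.0, 1.0)                                              # assumed output in [0, 1]
    erased = (y.squeeze(0).permute(1, 2, 0).numpy() * 255.0).astype(np.uint8)      # back to HWC
    return erased
```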
  • Step 608 Map the obtained text coordinate information to the erased picture, and crop out the erased area where the text coordinates are located;
  • In step 608, the text coordinate information obtained in step 602 is mapped onto the erased picture result, and the erased area where the text coordinates are located is cropped out. Some areas that are very similar to text, such as fences, textures, flowers and grass, are easily erased as if they were text; therefore, to ensure that this information is not mistakenly wiped, the text coordinate information recorded from the original image must be mapped onto the acquired erased image. Because the coordinate boxes may be non-rectangular, a mask image as large as the input image is set up: the mask image is initially pure black, and the area where each coordinate box is located is set to white, which gives the erased area, located by the text coordinates, that needs to be cropped.
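  • The mask construction described in this step might be sketched as follows with OpenCV; the use of `cv2.fillPoly` and the frame format are assumptions, not details fixed by the application.

```python
import numpy as np
import cv2

def build_text_mask(image_shape, frames) -> np.ndarray:
    """Start from a pure-black mask and paint each text coordinate box white.

    image_shape: (height, width[, channels]) of the input image.
    frames: list of 4-point or 8-point coordinate frames, each a list of (x, y) points.
    """
    mask = np.zeros(image_shape[:2], dtype=np.uint8)           # pure black
    for frame in frames:
        polygon = np.array(frame, dtype=np.int32).reshape(-1, 1, 2)
        cv2.fillPoly(mask, [polygon], 255)                      # coordinate-box area becomes white
    return mask
```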
  • Step 610 Paste the erased area back to the original image to obtain the final text-erased image.
  • step 610 the cropped erasure area is pasted back to the original image to obtain the final text erasure image.
  • The specific method is to take the mask image corresponding to the text area that needs to be erased; then only the erased area corresponding to the pure-white region of the mask image needs to be pasted back onto the original image to obtain the final result.
  • In this way, the required text is erased from the picture, and accidental wiping of areas that are very similar to text is avoided.
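  • The paste-back of step 610 then only needs to copy the erased pixels wherever the mask is white, which leaves text-like areas such as fences or flowers untouched. A minimal sketch, assuming `original`, `erased`, and `mask` are same-sized NumPy arrays:

```python
import numpy as np

def paste_erased_area(original: np.ndarray, erased: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Take the erased result only where the mask is pure white; keep the original elsewhere."""
    white = mask > 0
    result = original.copy()
    result[white] = erased[white]   # boolean indexing broadcasts over the channel axis
    return result
```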
  • an image processing device is provided.
  • Figure 7 shows a structural block diagram of the image processing device according to an embodiment of the present application. As shown in Figure 7, the image processing device 700 includes:
  • Positioning module 702 is used to locate the target area in the first image and obtain the location information of the target area
  • the erasing module 704 is used to erase, through the image processing model, the image information of the preset area in the first image to obtain an erased second image, where the size of the second image is the same as that of the first image, and the preset area includes the target area;
  • the interception module 706 is used to intercept the area image corresponding to the target area in the second image according to the location information
  • the processing module 708 is used to generate a processed target image according to the area image and the first image.
  • According to the position information of the marked target area, the embodiment of the present application intercepts the area image at the same position from the second image erased by the image processing model and combines the area image with the original first image; that is, only the part of the second image that the user needs to eliminate is used to process the original image. The final target image therefore retains the coordination of the overall image while the content of the target area that needs to be processed is erased, and other similar areas in the image are not erased by mistake, which improves the processing efficiency when erasing specific content in the image.
  • the image processing device further includes: a covering module, configured to overlay the area image on the target area according to the location information to obtain the target image.
  • The embodiment of the present application also overlays the intercepted area image processed by the image processing model onto the original first image according to the coordinate information of the target area, so that the image content in the target area is completely replaced by the area image. In the replaced target image, only the area that needs to be erased is replaced by the processed image: the content of the target area is erased while the consistency of the overall image is retained, other similar areas in the image are not accidentally erased, and the processing efficiency when erasing specific content in the image is improved.
  • the processing device further includes:
  • An acquisition module configured to acquire a first training image and a second training image, where the second training image is an image obtained by removing preset image information from the first training image;
  • a training module used to train a preset model through the first training image and the second training image to obtain a trained image processing model, where the image processing model includes a first network and a second network;
  • the erasing module is also used to erase the first image through the first network to obtain the processed third image;
  • the sampling module is used to downsample the third image to obtain the fourth image
  • the erasing module is also used to erase the fourth image through the second network to obtain the processed fifth image
  • the sampling module is also used to upsample the fifth image to obtain the second image.
  • This application trains the image recognition model so that, according to the image content in the training set, the image recognition model can erase the corresponding content area in the first image, which keeps the erased image consistent with the overall image and improves image processing efficiency.
  • the target area is a character image area
  • the processing device further includes:
  • a recognition module configured to perform optical character recognition on the first image and obtain a character detection frame in the first image
  • the positioning module is also used to locate the target area in the first image based on the coordinate information of the character detection frame.
  • The embodiment of the present application locates the target area in the first image, so that after the second image is obtained through the image processing model, the processed area image is overlaid on the target area according to the coordinate information. In the generated target image, the coordination of the overall image is ensured and the content of the target area that needs to be processed is effectively erased, while other similar areas in the image are not mistakenly erased, which improves the processing efficiency when erasing specific content in the image.
  • the image processing device in the embodiment of the present application may be an electronic device or a component in the electronic device, such as an integrated circuit or a chip.
  • the electronic device may be a terminal or other devices other than the terminal.
  • the electronic device can be a mobile phone, a tablet computer, a notebook computer, a handheld computer, a vehicle-mounted electronic device, a mobile internet device (Mobile Internet Device, MID), or an augmented reality (AR)/virtual reality (VR) device, etc.
  • the image processing device in the embodiment of the present application may be a device with an operating system.
  • the operating system can be an Android operating system, an iOS operating system, or other possible operating systems, which are not specifically limited in the embodiments of this application.
  • the image processing device provided by the embodiments of the present application can implement various processes implemented by the above method embodiments. To avoid duplication, they will not be described again here.
  • the embodiment of the present application also provides an electronic device.
  • Figure 8 shows a structural block diagram of the electronic device according to the embodiment of the present application.
  • the electronic device 800 includes a processor 802, a memory 804, and a program or instructions stored in the memory 804 and executable on the processor 802.
  • the electronic devices in the embodiments of the present application include the above-mentioned mobile electronic devices and non-mobile electronic devices.
  • FIG. 9 is a schematic diagram of the hardware structure of an electronic device implementing an embodiment of the present application.
  • the electronic device 900 includes, but is not limited to, a radio frequency unit 901, a network module 902, an audio output unit 903, an input unit 904, a sensor 905, a display unit 906, a user input unit 907, an interface unit 908, a memory 909, a processor 910, and other components.
  • The electronic device 900 may also include a power supply (such as a battery) that supplies power to the various components; the power supply may be logically connected to the processor 910 through a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented through the power management system.
  • the structure of the electronic device shown in Figure 9 does not constitute a limitation on the electronic device.
  • the electronic device may include more or fewer components than shown in the figure, combine certain components, or arrange the components differently, which will not be described again here.
  • the processor 910 is used to locate the target area in the first image and obtain the location information of the target area;
  • the image information of the preset area in the first image is erased to obtain an erased second image, where the size of the second image is the same as the first image, and the preset area includes the target area;
  • according to the position information, an area image corresponding to the target area is intercepted from the second image, and a processed target image is generated based on the area image and the first image.
  • According to the position information of the marked target area, the embodiment of the present application intercepts the area image at the same position from the second image erased by the image processing model and combines the area image with the original first image; that is, only the part of the second image that the user needs to eliminate is used to process the original image. The final target image therefore retains the consistency of the overall image while the content of the target area that needs to be processed is erased, and other similar areas in the image are not erased by mistake, which improves the processing efficiency when erasing specific content in the image.
  • the processor 910 is also configured to overlay the area image on the target area according to the location information to obtain the target image.
  • The embodiment of the present application also overlays the intercepted area image processed by the image processing model onto the original first image according to the coordinate information of the target area, so that the image content in the target area is completely replaced by the area image. In the replaced target image, only the area that needs to be erased is replaced by the processed image: the content of the target area is erased while the consistency of the overall image is retained, other similar areas in the image are not accidentally erased, and the processing efficiency when erasing specific content in the image is improved.
  • the processor 910 is also configured to obtain a first training image and a second training image, where the second training image is image data obtained after erasing a preset area in the first training image;
  • Erasing the image information of the preset area in the first image includes: erasing the first image through the first network to obtain a processed third image; down-sampling the third image to obtain a fourth image; erasing the fourth image through the second network to obtain a processed fifth image; and up-sampling the fifth image to obtain the second image.
  • This application trains the image recognition model so that the image recognition model can erase the corresponding content area in the first image according to the image content in the training set, which keeps the erased image consistent with the overall image and improves image processing efficiency.
  • the target area is a character image area
  • the processor 910 is further configured to perform optical character recognition on the first image, and obtain a character detection frame in the first image;
  • The embodiment of the present application locates the target area in the first image so that, after the second image is obtained through the image processing model, the processed area image is overlaid on the target area based on the coordinate information. The generated target image thus ensures the coordination of the overall image while the content of the target area that needs to be processed is effectively erased, other similar areas in the image are not accidentally erased, and the processing efficiency when erasing specific content in the image is improved.
  • the input unit 904 may include a graphics processor (Graphics Processing Unit, GPU) 9041 and a microphone 9042.
  • The graphics processor 9041 processes the image data of still pictures or videos obtained by an image capture device (such as a camera) in video capture mode or image capture mode.
  • the display unit 906 may include a display panel 9061, which may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like.
  • the user input unit 907 includes a touch panel 9071 and at least one of other input devices 9072 .
  • Touch panel 9071 also known as touch screen.
  • the touch panel 9071 may include two parts: a touch detection device and a touch controller.
  • Other input devices 9072 may include but are not limited to physical keyboards, function keys (such as volume control keys, switch keys, etc.), trackballs, mice, and joysticks, which will not be described again here.
  • Memory 909 can be used to store software programs as well as various data.
  • the memory 909 may mainly include a first storage area for storing programs or instructions and a second storage area for storing data, wherein the first storage area may store an operating system and an application program or instructions required for at least one function (such as a sound playback function, an image playback function, etc.).
  • memory 909 may include volatile memory or nonvolatile memory, or memory 909 may include both volatile and nonvolatile memory.
  • the non-volatile memory can be read-only memory (Read-Only Memory, ROM), programmable read-only memory (Programmable ROM, PROM), erasable programmable read-only memory (Erasable PROM, EPROM), electrically erasable programmable read-only memory (Electrically EPROM, EEPROM), or flash memory.
  • Volatile memory can be random access memory (Random Access Memory, RAM), static random access memory (Static RAM, SRAM), dynamic random access memory (Dynamic RAM, DRAM), synchronous dynamic random access memory (Synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDRSDRAM), enhanced synchronous dynamic random access memory (Enhanced SDRAM, ESDRAM), synchronous link dynamic random access memory (Synch link DRAM) , SLDRAM) and Direct Rambus RAM (DRRAM).
  • Memory 909 in embodiments of the present application includes, but is not limited to, these and any other suitable types of memory.
  • The processor 910 may include one or more processing units; optionally, the processor 910 integrates an application processor and a modem processor, where the application processor mainly handles operations related to the operating system, user interface, application programs, etc., and the modem processor mainly processes wireless communication signals, such as a baseband processor. It can be understood that the modem processor may alternatively not be integrated into the processor 910.
  • Embodiments of the present application also provide a readable storage medium.
  • Programs or instructions are stored on the readable storage medium.
  • When the program or instructions are executed by a processor, each process of the above method embodiments is implemented, and the same technical effect can be achieved; to avoid repetition, details are not described here again.
  • the processor is the processor in the electronic device described in the above embodiment.
  • the readable storage media includes computer-readable storage media, such as computer read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), magnetic disks or optical disks, etc.
  • An embodiment of the present application further provides a chip.
  • the chip includes a processor and a communication interface.
  • the communication interface is coupled to the processor.
  • The processor is used to run programs or instructions to implement the various processes of the above method embodiments, and the same technical effect can be achieved; to avoid repetition, details are not described again here.
  • The chip mentioned in the embodiments of this application may also be called a system-on-chip, a chip system, or a system-on-a-chip, etc.
  • Embodiments of the present application provide a computer program product.
  • the program product is stored in a storage medium.
  • The program product is executed by at least one processor to implement the various processes of the above method embodiments, and the same technical effect can be achieved; to avoid repetition, details are not described here again.
  • The methods of the above embodiments can be implemented by means of software plus a necessary general hardware platform; of course, they can also be implemented by hardware, but in many cases the former is the better implementation.
  • The technical solution of the present application, in essence or in the part that contributes to the existing technology, can be embodied in the form of a computer software product.
  • The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions to cause a terminal (which can be a mobile phone, a computer, a server, a network device, etc.) to execute the methods described in the various embodiments of this application.

Abstract

An image processing method and processing apparatus, an electronic device, and a readable storage medium. The image processing method comprises: positioning a target region in a first image to obtain position information of the target region (102); erasing image information of a preset region in the first image to obtain a second image after the erasing (104), the size of the second image being the same as that of the first image, and the preset region comprising the target region; according to the position information, clipping from the second image a region image corresponding to the target region (106); and, according to the region image and the first image, generating a target image after the processing (108).

Description

Image processing method and processing apparatus, electronic device and readable storage medium
Cross-reference to related applications
This application claims priority to the Chinese patent application filed with the China Patent Office on April 21, 2022, with application number 202210420012.4 and titled "Image processing method and processing apparatus, electronic device and readable storage medium", the entire contents of which are incorporated herein by reference.
Technical field
This application belongs to the field of image processing technology, and specifically relates to an image processing method and processing apparatus, an electronic device, and a readable storage medium.
Background
In the related art, users sometimes need to erase or hide specific content in a picture, such as erasing the text in a picture.
Processing methods in the related art, such as erasing a specific area through image inpainting, use only the texture information of the erased area, resulting in a poor erasure effect and obvious repair traces left on the picture. If deep learning is used to process the picture, repair traces can be reduced, but other areas of the picture whose content is similar to that of the specific area are easily erased as well, so the processing effect is still poor.
Summary of the invention
The purpose of the embodiments of the present application is to provide an image processing method and processing apparatus, an electronic device and a readable storage medium, which can solve the problem in the related art of the poor processing effect when erasing image content.
In a first aspect, embodiments of the present application provide an image processing method, including:
locating a target area in a first image to obtain position information of the target area;
erasing, through an image processing model, image information of a preset area in the first image to obtain an erased second image, where the size of the second image is the same as that of the first image, and the preset area includes the target area;
intercepting, according to the position information, an area image corresponding to the target area from the second image; and
generating a processed target image based on the area image and the first image.
In a second aspect, embodiments of the present application provide an image processing apparatus, including:
a positioning module, configured to locate a target area in a first image and obtain position information of the target area;
an erasing module, configured to erase, through an image processing model, image information of a preset area in the first image to obtain an erased second image, where the size of the second image is the same as that of the first image, and the preset area includes the target area;
an interception module, configured to intercept, according to the position information, an area image corresponding to the target area from the second image; and
a processing module, configured to generate a processed target image based on the area image and the first image.
In a third aspect, embodiments of the present application provide an electronic device, including a processor and a memory, where the memory stores a program or instructions that can be run on the processor, and the program or instructions, when executed by the processor, implement the steps of the method according to the first aspect.
In a fourth aspect, embodiments of the present application provide a readable storage medium, where a program or instructions are stored on the readable storage medium, and the program or instructions, when executed by a processor, implement the steps of the method according to the first aspect.
In a fifth aspect, embodiments of the present application provide a chip, where the chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is used to run a program or instructions to implement the steps of the method according to the first aspect.
In a sixth aspect, embodiments of the present application provide a computer program product, where the program product is stored in a storage medium and is executed by at least one processor to implement the method according to the first aspect.
In a seventh aspect, embodiments of the present application provide an image processing apparatus, where the apparatus is configured to perform the method according to the first aspect.
In the embodiments of the present application, the image area that needs to be processed is first located and its position information is recorded. Then, the first image as a whole is processed through an adversarial model. This image processing step is based on image recognition technology and deep learning technology: the model automatically identifies the parts that need to be processed and, based on the global information of the first image, erases the preset area while keeping the image of the erased preset area consistent with the overall image.
After the erasure is completed, according to the recorded position information of the marked target area, the area image at the same position is intercepted from the second image erased by the image processing model, and the area image is combined with the original first image; that is, only the part of the second image that the user needs to eliminate is used to process the original image. The final target image therefore retains the consistency of the overall image while the content of the target area that needs to be processed is erased, and other similar areas in the image are not erased by mistake, which improves the processing efficiency when erasing specific content in an image.
Brief description of the drawings
Figure 1 shows the first flowchart of an image processing method according to an embodiment of the present application;
Figure 2 shows the first schematic diagram of an image processing method according to an embodiment of the present application;
Figure 3 shows the second schematic diagram of an image processing method according to an embodiment of the present application;
Figure 4 shows the third schematic diagram of an image processing method according to an embodiment of the present application;
Figure 5 shows a schematic structural diagram of an image processing model according to an embodiment of the present application;
Figure 6 shows the second flowchart of an image processing method according to an embodiment of the present application;
Figure 7 shows a structural block diagram of an image processing apparatus according to an embodiment of the present application;
Figure 8 shows a structural block diagram of an electronic device according to an embodiment of the present application;
Figure 9 is a schematic diagram of the hardware structure of an electronic device implementing an embodiment of the present application.
Specific embodiments
The technical solutions in the embodiments of the present application will be described clearly below with reference to the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art fall within the protection scope of this application.
The terms "first", "second", etc. in the description and claims of this application are used to distinguish similar objects and are not used to describe a specific order or sequence. It should be understood that the terms so used are interchangeable under appropriate circumstances, so that the embodiments of the present application can be practiced in orders other than those illustrated or described herein. Objects distinguished by "first", "second", etc. are usually of one category, and the number of objects is not limited; for example, the first object can be one or multiple. In addition, "and/or" in the description and claims indicates at least one of the connected objects, and the character "/" generally indicates that the related objects are in an "or" relationship.
The image processing method and processing apparatus, electronic device and readable storage medium provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings, through specific embodiments and their application scenarios.
In some embodiments of the present application, an image processing method is provided. Figure 1 shows the first flowchart of the image processing method according to an embodiment of the present application. As shown in Figure 1, the method includes:
Step 102: locate a target area in a first image and obtain position information of the target area;
Step 104: erase, through an image processing model, image information of a preset area in the first image to obtain an erased second image;
In step 104, the size of the second image is the same as that of the first image, and the preset area includes the target area;
Step 106: according to the position information, intercept an area image corresponding to the target area from the second image;
Step 108: generate a processed target image based on the area image and the first image.
In the embodiment of the present application, when the user wants to hide or erase specific content in an image, for example the text content in the image, the text area in the image can first be located, for instance through a text recognition algorithm, and the location information corresponding to that area is recorded. The location information may be coordinate information.
After the position information of the target area that needs to be erased is obtained, the whole original first image is input into a preset image processing model. Specifically, the image processing model can, according to the type of information that the user needs to erase, automatically identify the target information to be erased in the first image and process the entire first image based on the global information of the image.
Specifically, Figure 2 shows the first schematic diagram of the image processing method according to an embodiment of the present application. As shown in Figure 2, the first image 200 includes text information 202, and the user needs to erase the text information 202. First, by means such as optical character recognition (OCR), the area where the text information 202 is located is marked in the first image 200, that is, the target area 204; the coordinate information of the target area 204 is recorded at the same time, thereby recording the location information of the target area 204.
After the position information of the target area is obtained, the first image is input into the preset image processing model. The image processing model can automatically identify the image information content that the user wants to erase, such as text information, and erase it based on the global information of the target image, including color information, pixel information, and so on.
进一步地,在图像处理模型输出第二图像之后,根据识别出的目标区域的位置信息,在第二图像中,根据相同的坐标,截取出对应的区域图像,将该区域图像按照目标区域的坐标,叠加至第一图像的原图上,从而覆盖第一图像的目标区域,从而生成目标图像。Further, after the image processing model outputs the second image, according to the identified position information of the target area, in the second image, the corresponding area image is intercepted according to the same coordinates, and the area image is intercepted according to the coordinates of the target area. , superimposed on the original image of the first image, thereby covering the target area of the first image, thereby generating the target image.
图4示出了根据本申请实施例的图像处理方法的示意图之三,如图4所示,目标图像400上,目标区域402中的文字信息被抹除,而二维码404得以保留。Figure 4 shows the third schematic diagram of the image processing method according to the embodiment of the present application. As shown in Figure 4, on the target image 400, the text information in the target area 402 is erased, while the QR code 404 is retained.
本申请实施例根据标记好的目标区域的位置信息,在通过图像处理模型擦除后的第二图像中,截取相同位置的区域图像,并将该区域图像与原始的第一图像进行结合,即只选取第二图像中,用户需要消除的部分,来对原始图像进行处理,因此,最终得到的目标图像,在保留了整体图像的协调一致的情况下,对需要处理的目标区域的内容进行了擦除,同时保证图像中的其他相似区域不会被误擦除,提高了抹除图像中特定内容时的处理效率。According to the position information of the marked target area, the embodiment of the present application intercepts the area image at the same position in the second image erased by the image processing model, and combines the area image with the original first image, that is, Only the parts of the second image that the user needs to eliminate are selected to process the original image. Therefore, the final target image has the content of the target area that needs to be processed while retaining the coordination of the overall image. Erase, while ensuring that other similar areas in the image will not be erased by mistake, improving the processing efficiency when erasing specific content in the image.
在本申请的一些实施例中,根据区域图像和第一图像,生成处理后的目标图像,包括:根据位置信息,将区域图像覆盖于目标区域,得到目标图像。In some embodiments of the present application, generating a processed target image based on the area image and the first image includes: covering the area image on the target area according to the position information to obtain the target image.
在本申请实施例中,在通过图像处理模型,得到第二图像后,根据目标 区域的位置信息,如坐标信息,在第二图像中截取出于目标区域位置相对应的区域图像。In the embodiment of this application, after obtaining the second image through the image processing model, according to the target The location information of the area, such as the coordinate information, is intercepted from the second image, and the area image corresponding to the position of the target area is intercepted.
在得到区域图像后,同样根据目标区域的坐标信息,将截取得到的,通过图像处理模型处理后的区域图像,覆盖到原始的第一图像上,从而使目标区域内的图像内容,完全替换为区域图像,从而使替换后的目标图像中,仅有需要擦除的区域被处理后的图像所替代,在保留了整体图像的协调一致的情况下,对需要处理的目标区域的内容进行了擦除,同时保证图像中的其他相似区域不会被误擦除,提高了抹除图像中特定内容时的处理效率。After obtaining the regional image, the intercepted regional image processed by the image processing model is overlaid on the original first image according to the coordinate information of the target area, so that the image content in the target area is completely replaced by area image, so that in the replaced target image, only the area that needs to be erased is replaced by the processed image, and the content of the target area that needs to be processed is erased while retaining the consistency of the overall image. It also ensures that other similar areas in the image will not be erased by mistake, which improves the processing efficiency when erasing specific content in the image.
在本申请的一些实施例中,在擦除第一图像中预设区域的图像信息之前,方法还包括:In some embodiments of the present application, before erasing the image information of the preset area in the first image, the method further includes:
获取第一训练图像和第二训练图像,其中,第二训练图像是在第一训练图像中,去除预设图像信息后得到的图像;Obtaining a first training image and a second training image, wherein the second training image is an image obtained by removing preset image information from the first training image;
通过第一训练图像和第二训练图像,训练预设模型,得到训练后的图像处理模型,图像处理模型包括第一网络和第二网络;Train the preset model through the first training image and the second training image to obtain a trained image processing model, where the image processing model includes a first network and a second network;
擦除第一图像中预设区域的图像信息,包括:Erase the image information of the preset area in the first image, including:
通过第一网络对第一图像进行擦除处理,得到处理后的第三图像;Perform erasure processing on the first image through the first network to obtain a processed third image;
对第三图像进行下采样处理,得到第四图像;Perform downsampling on the third image to obtain the fourth image;
通过第二网络对第四图像进行擦除处理,得到处理后的第五图像;Perform erasure processing on the fourth image through the second network to obtain the processed fifth image;
对第五图像进行上采样处理,得到第二图像。Perform upsampling processing on the fifth image to obtain the second image.
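The four erasure steps just listed can be sketched as a two-stage inference pass. In the sketch below the two networks are stood in for by generic PyTorch modules, and the factor-of-two bilinear resizing between them is an assumption; the embodiment only specifies that downsampling and upsampling take place.

```python
import torch
import torch.nn.functional as F

def two_stage_erase(first_image: torch.Tensor,
                    first_net: torch.nn.Module,
                    second_net: torch.nn.Module) -> torch.Tensor:
    """first_image: N x C x H x W tensor; returns the second image at the same size."""
    with torch.no_grad():
        # The first network erases the preset area, producing the third image
        third_image = first_net(first_image)
        # Downsample the third image to obtain the fourth image
        fourth_image = F.interpolate(third_image, scale_factor=0.5,
                                     mode="bilinear", align_corners=False)
        # The second network refines the erasure, producing the fifth image
        fifth_image = second_net(fourth_image)
        # Upsample back to the original size to obtain the second image
        second_image = F.interpolate(fifth_image, size=first_image.shape[-2:],
                                     mode="bilinear", align_corners=False)
    return second_image
```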
In the embodiments of the present application, a preset adversarial model is trained to obtain the trained image processing model. Specifically, first images are collected and processed manually to generate second images, where a first image is an original image and a second image is an image obtained by erasing preset areas such as text content through image editing or image modification software.
The first images and the second images are used as a training set to train a preset generative adversarial network for end-to-end text erasure on the whole image. The specific approach is to build a network model: the generation network adopts a UNet structure (a U-shaped network for dense prediction and segmentation), which first downsamples the image to obtain image semantic information and then upsamples it back to the original size to obtain the image output. Considering that a single pass of the model cannot produce good results for complex scenes, a lightweight UNet is connected after the UNet structure for further processing to obtain the final processing result.
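A minimal sketch of such a coarse-to-fine generator is given below. The layer widths, depths and the single skip connection are illustrative assumptions (the embodiment does not fix them), and the sketch assumes input height and width divisible by four.

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """Downsample to gather semantic information, then upsample back to the input size."""
    def __init__(self, channels: int = 32):
        super().__init__()
        self.down1 = nn.Sequential(nn.Conv2d(3, channels, 3, stride=2, padding=1), nn.ReLU())
        self.down2 = nn.Sequential(nn.Conv2d(channels, channels * 2, 3, stride=2, padding=1), nn.ReLU())
        self.up1 = nn.Sequential(nn.ConvTranspose2d(channels * 2, channels, 4, stride=2, padding=1), nn.ReLU())
        self.up2 = nn.ConvTranspose2d(channels * 2, 3, 4, stride=2, padding=1)

    def forward(self, x):
        d1 = self.down1(x)                           # 1/2 resolution
        d2 = self.down2(d1)                          # 1/4 resolution, semantic features
        u1 = self.up1(d2)                            # back to 1/2 resolution
        return self.up2(torch.cat([u1, d1], dim=1))  # skip connection, original size

class CoarseToFineGenerator(nn.Module):
    """A UNet followed by a lightweight UNet for further processing."""
    def __init__(self):
        super().__init__()
        self.coarse = TinyUNet(channels=32)
        self.refine = TinyUNet(channels=16)          # lightweight second stage

    def forward(self, x):
        return self.refine(self.coarse(x))
```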
Further, a discriminant network is constructed. The discriminant network includes a dual-scale network, where the first scale network includes several cascaded convolutions. The stride of the convolutions may be set to 1, and no pooling layer is introduced, which ensures that the resolution of the picture does not decrease.
After the first scale network, a second, identical scale network is added, but the input of the second scale network is obtained by downsampling the output of the first scale network by a factor of two. The ground truths of the dual-scale network are obtained from the erased picture and from the erased picture downsampled by a factor of two, respectively.
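A compact sketch of such a dual-scale discriminant network follows. Three stride-1 convolutions per scale, the channel widths, and bilinear downsampling between the scales are assumptions; each branch is given a three-channel output so that it can be compared against the corresponding erased ground-truth picture as described above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def scale_branch() -> nn.Sequential:
    """Cascaded stride-1 convolutions with no pooling, so the resolution does not decrease."""
    return nn.Sequential(
        nn.Conv2d(3, 64, 3, stride=1, padding=1), nn.LeakyReLU(0.2),
        nn.Conv2d(64, 64, 3, stride=1, padding=1), nn.LeakyReLU(0.2),
        nn.Conv2d(64, 3, 3, stride=1, padding=1),
    )

class DualScaleDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.scale1 = scale_branch()   # full-resolution branch
        self.scale2 = scale_branch()   # half-resolution branch

    def forward(self, image: torch.Tensor):
        # The first scale network works on the full-resolution input
        out1 = self.scale1(image)
        # The second scale network takes the first branch's output, downsampled by a factor of two
        out2 = self.scale2(F.interpolate(out1, scale_factor=0.5,
                                         mode="bilinear", align_corners=False))
        return out1, out2
```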
The output of the generation network is input into the discriminant network, which judges the difference between the currently generated erased picture and the originally annotated erased picture (that is, the second image). The difference is back-propagated as a loss to optimize the network parameters, finally yielding the optimized network structure.
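One possible reading of this training step is sketched below: the two branch outputs are compared against the erased ground truth at the two scales, and the resulting loss is back-propagated. The use of an L1 distance, a single Adam optimizer over both networks, and this particular loss formulation are assumptions made for the sketch; a conventional real/fake adversarial objective could equally be layered on top.

```python
import torch
import torch.nn.functional as F

def training_step(generator, discriminator, optimizer,
                  original: torch.Tensor, erased_gt: torch.Tensor) -> float:
    """One optimization step: generate, judge against the annotated erased picture, back-propagate."""
    generated = generator(original)                  # candidate erased picture
    out_full, out_half = discriminator(generated)    # dual-scale judgments
    # Ground truths: the annotated erased picture and its twofold-downsampled version
    gt_half = F.interpolate(erased_gt, scale_factor=0.5,
                            mode="bilinear", align_corners=False)
    loss = F.l1_loss(out_full, erased_gt) + F.l1_loss(out_half, gt_half)
    optimizer.zero_grad()
    loss.backward()                                  # back-propagate the loss
    optimizer.step()                                 # update the network parameters
    return loss.item()

# Example wiring (hypothetical hyper-parameters):
# generator, discriminator = CoarseToFineGenerator(), DualScaleDiscriminator()
# optimizer = torch.optim.Adam(list(generator.parameters()) +
#                              list(discriminator.parameters()), lr=1e-4)
```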
In the optimized network structure, the model obtained after removing the discriminant network is the above-mentioned image processing model. Figure 5 shows a schematic structural diagram of the image processing model according to an embodiment of the present application. As shown in Figure 5, the image processing model 500 includes a two-layer UNet structure, namely a first network 502 and a second network 504.
When specific content in the first image 506 is erased through the image processing model 500, the first image 506 to be processed is first input into the first network 502, and the specific information in it is erased through the first network 502.
After erasure, a processed third image 508 is obtained. The third image 508 is downsampled to obtain a fourth image 510 with reduced resolution, so as to obtain the semantic information of the image, and the downsampled fourth image 510 is used as the input of the second-layer network, that is, the second network 504.
Finally, the fifth image 512 output by the second network 504 is upsampled back to the original size to obtain the final second image 514.
It can be understood that the content contained in the first image is the content that the user needs to erase. If the user needs to erase text information in the image, the first image contains text information; if the user needs to erase a face, the first image contains face information.
By training the image recognition model, the present application enables the image recognition model to erase the corresponding content area in the first image according to the image content in the training set, and enables the erased image to remain consistent with the overall image, improving image processing efficiency.
In some embodiments of the present application, the target area is a character image area, and locating the target area in the first image includes:
performing optical character recognition on the first image to obtain a character detection frame in the first image;
locating the target area in the first image according to the coordinate information of the character detection frame.
In the embodiments of the present application, the target area specifically includes a character image area; that is, the user needs to erase the character area in the first image. Specifically, optical character recognition (OCR) is first performed on the first image to detect the positions of characters in the first image, and character detection frames are formed according to the positions of these characters.
Specifically, the first image is first preprocessed. In some embodiments, a denoising algorithm is used to remove noise from the first image. Then, based on a trained OCR detection model, text or characters are obtained through an OCR detection algorithm and their coordinate information is located. The OCR detection algorithm can obtain character detection frames in various scenarios such as horizontal, vertical, and curved text.
Horizontal and vertical regular character detection frames can be represented by four-point coordinate frames, while curved irregular character detection frames can be represented by eight-point coordinate frames. If the current picture contains no text or character information, an empty character coordinate frame is returned.
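Because the detection frames may be four-point rectangles or eight-point polygons, one convenient way to locate the target area is to rasterize the frames into a mask by polygon filling, for example with OpenCV as sketched below. The data layout and the use of cv2.fillPoly are implementation assumptions for illustration.

```python
import numpy as np
import cv2

def boxes_to_mask(image_shape: tuple, boxes: list) -> np.ndarray:
    """Build a target-area mask from OCR character detection frames.

    image_shape -- (height, width) of the first image
    boxes       -- list of frames; each frame is a list of (x, y) points,
                   4 points for horizontal/vertical text, 8 points for curved text
    """
    mask = np.zeros(image_shape, dtype=np.uint8)       # originally pure black
    if not boxes:                                      # no text detected: empty frame
        return mask
    polygons = [np.array(box, dtype=np.int32) for box in boxes]
    cv2.fillPoly(mask, polygons, 255)                  # frame areas set to white
    return mask
```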
The character detection frame is annotated with coordinate information, which refers to the coordinates of the character detection frame in the first image. The target area is located in the first image through the coordinate information of the character detection frame, so that after the second image is obtained through the image processing model, the processed area image is overlaid onto the target area according to that coordinate information. In the generated target image, the overall image remains consistent, the content of the target area to be processed is effectively erased, and other similar areas in the image are not erased by mistake, which improves the processing efficiency when erasing specific content in an image.
In some embodiments of the present application, Figure 6 shows a second flowchart of the image processing method according to an embodiment of the present application. As shown in Figure 6, the method includes the following steps.
Step 602: perform text positioning on the original picture and record the text coordinate information in the original picture.
In step 602, the image is first preprocessed, and a denoising algorithm is used to remove noise from the image. An OCR detection model is then trained, and text positioning information is obtained through the OCR detection algorithm, which can obtain text detection frames in various scenarios such as horizontal, vertical, and curved text. Horizontal and vertical text frames are represented by four-point coordinate frames, while curved text frames are represented by eight-point coordinate frames. If the current picture contains no text information, an empty text coordinate frame is returned.
Step 604: collect paired data and train a generative adversarial network for text erasure.
In step 604, pairs of <original picture, text-erased picture> are collected, and model training is performed in a generative adversarial manner to obtain the image processing model.
Specifically, original pictures are collected first, and the corresponding text-erased pictures are obtained through image editing software such as PS (Photoshop), thereby obtaining pairs of <original picture, text-erased picture>.
A generative adversarial network is then trained for end-to-end text erasure on the whole image. The specific approach is to build a network model: the generation network adopts the UNet structure, which first downsamples the image to obtain image semantic information and then upsamples it back to the original size to obtain the image output. Considering that one pass of model optimization cannot produce good results for complex scenes, a lightweight UNet is connected after the UNet structure for further processing to obtain the final processing result. A discriminant network is constructed, consisting of a dual-scale network, where the first scale network consists of several cascaded convolutions with a stride of 1 and no pooling layer, ensuring that the resolution of the picture does not decrease; a second, identical scale network is then added after the first scale network, but the input of the second scale network is obtained by downsampling the output of the first scale network by a factor of two. The ground truths of the dual-scale network are obtained from the erased picture and from the erased picture downsampled by a factor of two, respectively.
The output of the generation network is input into the discriminant network to judge the difference between the currently generated erased picture and the originally annotated erased picture, and the difference is back-propagated as a loss to optimize the network parameters, finally yielding the optimized network structure.
Step 606: use the optimized model to perform inference on the original picture to obtain the erased picture.
In step 606, the optimized generative adversarial model is used to perform inference on the original picture to obtain the erased picture. Specifically, the original picture is input into the trained model; at this point the discriminator is removed and only the generator is retained, and the output obtained from the generator is the erased picture.
Step 608: map the obtained text coordinate information onto the erased picture, and crop out the erased area where the text coordinates are located.
In step 608, the text coordinate information obtained in step 602 is mapped onto the erased picture, and the erased area where the text coordinates are located is cropped out of it. Considering that areas very similar to text, such as fences, textures, flowers and plants, are easily erased as if they were text, the text coordinate information recorded from the original picture needs to be mapped onto the obtained erased picture to ensure that such information is not smeared by mistake. Because the coordinate frames are not rectangular, a mask picture as large as the input picture is set up: the mask picture is originally pure black, and the area where the coordinate frames are located is set to white, which yields the erased area where the text coordinates to be cropped are located.
Step 610: paste the erased area back onto the original picture to obtain the final text-erased picture.
In step 610, the cropped erased area is pasted back onto the original picture to obtain the final text-erased picture.
Specifically, since the mask picture corresponding to the text area to be erased has already been obtained, only the erased area corresponding to the pure white area of the mask picture needs to be pasted back onto the original picture. In this way, the final text-erased picture is obtained, and mistaken smearing of areas that closely resemble text is avoided.
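Steps 608 and 610 together amount to a mask-based composite: pixels inside the white mask area come from the erased picture, and everything else comes from the original. A minimal sketch, reusing a mask such as the one built from the text coordinate frames above, is given below; the np.where-based blending is an implementation assumption.

```python
import numpy as np

def paste_back(original: np.ndarray, erased: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """original, erased: H x W x 3 pictures; mask: H x W, white (255) where text was detected."""
    # Keep the erased pixels only inside the white mask area, the original pixels elsewhere
    inside = mask[..., None] > 0
    return np.where(inside, erased, original)
```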
In some embodiments of the present application, an image processing apparatus is provided. Figure 7 shows a structural block diagram of the image processing apparatus according to an embodiment of the present application. As shown in Figure 7, the image processing apparatus 700 includes:
a positioning module 702, configured to locate a target area in a first image and obtain position information of the target area;
an erasing module 704, configured to erase, through an image processing model, image information of a preset area in the first image to obtain an erased second image, where the size of the second image is the same as that of the first image, and the preset area includes the target area;
an interception module 706, configured to intercept, from the second image, an area image corresponding to the target area according to the position information;
a processing module 708, configured to generate a processed target image according to the area image and the first image.
According to the embodiments of the present application, based on the position information of the marked target area, the area image at the same position is cut out of the second image erased by the image processing model, and the area image is combined with the original first image; that is, only the part of the second image that the user needs to eliminate is selected to process the original image. Therefore, the finally obtained target image erases the content of the target area to be processed while keeping the overall image consistent, ensures that other similar areas in the image are not erased by mistake, and improves the processing efficiency when erasing specific content in an image.
In some embodiments of the present application, the image processing apparatus further includes: a covering module, configured to cover the target area with the area image according to the position information to obtain the target image.
In the embodiments of the present application, after the area image is obtained, the cut-out area image processed by the image processing model is overlaid onto the original first image according to the coordinate information of the target area, so that the image content in the target area is completely replaced by the area image. In the replaced target image, only the area that needs to be erased is replaced by the processed image; the content of the target area to be processed is erased while the overall image remains consistent, other similar areas in the image are not erased by mistake, and the processing efficiency when erasing specific content in an image is improved.
In some embodiments of the present application, the processing apparatus further includes:
an acquisition module, configured to acquire a first training image and a second training image, where the second training image is an image obtained by removing preset image information from the first training image;
a training module, configured to train a preset model through the first training image and the second training image to obtain a trained image processing model, where the image processing model includes a first network and a second network;
the erasing module is further configured to perform erasure processing on the first image through the first network to obtain a processed third image;
a sampling module, configured to perform downsampling processing on the third image to obtain a fourth image;
the erasing module is further configured to perform erasure processing on the fourth image through the second network to obtain a processed fifth image;
the sampling module is further configured to perform upsampling processing on the fifth image to obtain the second image.
By training the image recognition model, the present application enables the image recognition model to erase the corresponding content area in the first image according to the image content in the training set, and enables the erased image to remain consistent with the overall image, improving image processing efficiency.
In some embodiments of the present application, the target area is a character image area, and the processing apparatus further includes:
a recognition module, configured to perform optical character recognition on the first image and obtain a character detection frame in the first image;
the positioning module is further configured to locate the target area in the first image according to the coordinate information of the character detection frame.
In the embodiments of the present application, the target area is located in the first image, so that after the second image is obtained through the image processing model, the processed area image is overlaid onto the target area according to the coordinate information. In the generated target image, the overall image remains consistent, the content of the target area to be processed is effectively erased, and other similar areas in the image are not erased by mistake, which improves the processing efficiency when erasing specific content in an image.
The image processing apparatus in the embodiments of the present application may be an electronic device, or may be a component of an electronic device, such as an integrated circuit or a chip. The electronic device may be a terminal, or may be a device other than a terminal. For example, the electronic device may be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a vehicle-mounted electronic device, a mobile Internet device (MID), an augmented reality (AR)/virtual reality (VR) device, a robot, a wearable device, an ultra-mobile personal computer (UMPC), a netbook or a personal digital assistant (PDA), or may be a server, a network attached storage (NAS), a personal computer (PC), a television (TV), a teller machine or a self-service machine, which is not specifically limited in the embodiments of the present application.
The image processing apparatus in the embodiments of the present application may be an apparatus with an operating system. The operating system may be an Android operating system, an iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of the present application.
The image processing apparatus provided in the embodiments of the present application can implement each process implemented by the foregoing method embodiments; to avoid repetition, details are not described here again.
Optionally, an embodiment of the present application further provides an electronic device. Figure 8 shows a structural block diagram of the electronic device according to an embodiment of the present application. As shown in Figure 8, the electronic device 800 includes a processor 802, a memory 804, and a program or instruction stored in the memory 804 and executable on the processor 802. When the program or instruction is executed by the processor 802, each process of the foregoing method embodiments is implemented and the same technical effect can be achieved; to avoid repetition, details are not described here again.
It should be noted that the electronic devices in the embodiments of the present application include the above-mentioned mobile electronic devices and non-mobile electronic devices.
Figure 9 is a schematic diagram of the hardware structure of an electronic device implementing an embodiment of the present application.
The electronic device 900 includes, but is not limited to, a radio frequency unit 901, a network module 902, an audio output unit 903, an input unit 904, a sensor 905, a display unit 906, a user input unit 907, an interface unit 908, a memory 909, a processor 910, and other components.
Those skilled in the art can understand that the electronic device 900 may further include a power supply (such as a battery) that supplies power to each component. The power supply may be logically connected to the processor 910 through a power management system, so that functions such as charging management, discharging management and power consumption management are implemented through the power management system. The structure of the electronic device shown in Figure 9 does not constitute a limitation on the electronic device; the electronic device may include more or fewer components than shown, combine certain components, or use a different arrangement of components, which is not described here again.
The processor 910 is configured to locate a target area in a first image and obtain position information of the target area;
erase, through an image processing model, image information of a preset area in the first image to obtain an erased second image, where the size of the second image is the same as that of the first image, and the preset area includes the target area;
intercept, from the second image, an area image corresponding to the target area according to the position information;
generate a processed target image according to the area image and the first image.
According to the embodiments of the present application, based on the position information of the marked target area, the area image at the same position is cut out of the second image erased by the image processing model, and the area image is combined with the original first image; that is, only the part of the second image that the user needs to eliminate is selected to process the original image. Therefore, the finally obtained target image erases the content of the target area to be processed while keeping the overall image consistent, ensures that other similar areas in the image are not erased by mistake, and improves the processing efficiency when erasing specific content in an image.
Optionally, the processor 910 is further configured to cover the target area with the area image according to the position information to obtain the target image.
In the embodiments of the present application, after the area image is obtained, the cut-out area image processed by the image processing model is overlaid onto the original first image according to the coordinate information of the target area, so that the image content in the target area is completely replaced by the area image. In the replaced target image, only the area that needs to be erased is replaced by the processed image; the content of the target area to be processed is erased while the overall image remains consistent, other similar areas in the image are not erased by mistake, and the processing efficiency when erasing specific content in an image is improved.
Optionally, the processor 910 is further configured to acquire a first training image and a second training image, where the second training image is image data obtained after erasing the preset area in the first training image;
train a preset model through the first training image and the second training image to obtain a trained image processing model, where the image processing model includes a first network and a second network;
erasing the image information of the preset area in the first image includes:
performing erasure processing on the first image through the first network to obtain a processed third image;
performing downsampling processing on the third image to obtain a fourth image;
performing erasure processing on the fourth image through the second network to obtain a processed fifth image;
performing upsampling processing on the fifth image to obtain the second image.
By training the image recognition model, the present application enables the image recognition model to erase the corresponding content area in the first image according to the image content in the training set, and enables the erased image to remain consistent with the overall image, improving image processing efficiency.
Optionally, the target area is a character image area, and the processor 910 is further configured to perform optical character recognition on the first image and obtain a character detection frame in the first image;
and locate the target area in the first image according to the coordinate information of the character detection frame.
In the embodiments of the present application, the target area is located in the first image, so that after the second image is obtained through the image processing model, the processed area image is overlaid onto the target area according to the coordinate information. In the generated target image, the overall image remains consistent, the content of the target area to be processed is effectively erased, and other similar areas in the image are not erased by mistake, which improves the processing efficiency when erasing specific content in an image.
It should be understood that, in the embodiments of the present application, the input unit 904 may include a graphics processing unit (GPU) 9041 and a microphone 9042. The graphics processing unit 9041 processes image data of still pictures or videos obtained by an image capture device (such as a camera) in a video capture mode or an image capture mode. The display unit 906 may include a display panel 9061, which may be configured in the form of a liquid crystal display, an organic light-emitting diode, or the like. The user input unit 907 includes a touch panel 9071 and at least one of other input devices 9072. The touch panel 9071 is also called a touch screen and may include two parts: a touch detection device and a touch controller. The other input devices 9072 may include, but are not limited to, a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse and a joystick, which are not described here again.
The memory 909 may be used to store software programs and various data. The memory 909 may mainly include a first storage area for storing programs or instructions and a second storage area for storing data, where the first storage area may store an operating system, and application programs or instructions required by at least one function (such as a sound playback function and an image playback function). In addition, the memory 909 may include a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memories. The non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM) or a flash memory. The volatile memory may be a random access memory (RAM), a static random access memory (SRAM), a dynamic random access memory (DRAM), a synchronous dynamic random access memory (SDRAM), a double data rate synchronous dynamic random access memory (DDR SDRAM), an enhanced synchronous dynamic random access memory (ESDRAM), a synch-link dynamic random access memory (SLDRAM) or a direct rambus random access memory (DRRAM). The memory 909 in the embodiments of the present application includes, but is not limited to, these and any other suitable types of memory.
The processor 910 may include one or more processing units. Optionally, the processor 910 integrates an application processor and a modem processor, where the application processor mainly handles operations involving the operating system, the user interface, application programs and the like, and the modem processor mainly handles wireless communication signals, such as a baseband processor. It can be understood that the modem processor may alternatively not be integrated into the processor 910.
An embodiment of the present application further provides a readable storage medium on which a program or instruction is stored. When the program or instruction is executed by a processor, each process of the foregoing method embodiments is implemented and the same technical effect can be achieved; to avoid repetition, details are not described here again.
The processor is the processor in the electronic device described in the foregoing embodiment. The readable storage medium includes a computer-readable storage medium, such as a computer read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.
An embodiment of the present application further provides a chip. The chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to run a program or instruction to implement each process of the foregoing method embodiments and achieve the same technical effect; to avoid repetition, details are not described here again.
It should be understood that the chip mentioned in the embodiments of the present application may also be referred to as a system-on-chip, a system chip, a chip system or a system-on-a-chip.
An embodiment of the present application provides a computer program product. The program product is stored in a storage medium, and the program product is executed by at least one processor to implement each process of the foregoing method embodiments and achieve the same technical effect; to avoid repetition, details are not described here again.
It should be noted that, in this document, the terms "comprise", "include" or any other variations thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or apparatus that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or apparatus. Without further limitation, an element defined by the statement "including a ..." does not exclude the presence of other identical elements in the process, method, article or apparatus that includes the element. In addition, it should be pointed out that the scope of the methods and apparatuses in the embodiments of the present application is not limited to performing functions in the order shown or discussed, and may also include performing functions in a substantially simultaneous manner or in a reverse order according to the functions involved. For example, the described methods may be performed in an order different from that described, and steps may also be added, omitted or combined. In addition, features described with reference to certain examples may be combined in other examples.
Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, or of course by hardware, but in many cases the former is the better implementation. Based on such an understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, can be embodied in the form of a computer software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk or an optical disk) and includes several instructions to enable a terminal (which may be a mobile phone, a computer, a server, a network device or the like) to execute the methods described in the embodiments of the present application.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the above specific implementations, which are merely illustrative rather than restrictive. Inspired by the present application, those of ordinary skill in the art can devise many other forms without departing from the purpose of the present application and the scope protected by the claims, all of which fall within the protection of the present application.

Claims (13)

  1. An image processing method, including:
    locating a target area in a first image to obtain position information of the target area;
    erasing image information of a preset area in the first image to obtain an erased second image, where the size of the second image is the same as the size of the first image, and the preset area includes the target area;
    intercepting, from the second image, an area image corresponding to the target area according to the position information;
    generating a processed target image according to the area image and the first image.
  2. The image processing method according to claim 1, wherein the generating a processed target image according to the area image and the first image includes:
    covering the target area with the area image according to the position information to obtain the target image.
  3. The image processing method according to claim 1, wherein before the erasing image information of a preset area in the first image, the method further includes:
    acquiring a first training image and a second training image, where the first training image contains preset image information, and the second training image is an image obtained by removing the preset image information from the first training image;
    training a preset model through the first training image and the second training image to obtain a trained image processing model, where the image processing model includes a first network and a second network;
    the erasing image information of a preset area in the first image includes:
    performing erasure processing on the first image through the first network to obtain a processed third image;
    performing downsampling processing on the third image to obtain a fourth image;
    performing erasure processing on the fourth image through the second network to obtain a processed fifth image;
    performing upsampling processing on the fifth image to obtain the second image.
  4. The image processing method according to any one of claims 1 to 3, wherein the target area is a character image area, and the locating a target area in a first image includes:
    performing optical character recognition on the first image to obtain a character detection frame in the first image;
    locating the target area in the first image according to coordinate information of the character detection frame.
  5. An image processing apparatus, including:
    a positioning module, configured to locate a target area in a first image and obtain position information of the target area;
    an erasing module, configured to erase image information of a preset area in the first image to obtain an erased second image, where the size of the second image is the same as the size of the first image, and the preset area includes the target area;
    an interception module, configured to intercept, from the second image, an area image corresponding to the target area according to the position information;
    a processing module, configured to generate a processed target image according to the area image and the first image.
  6. The image processing apparatus according to claim 5, further including:
    a covering module, configured to cover the target area with the area image according to the position information to obtain the target image.
  7. The image processing apparatus according to claim 5, further including:
    an acquisition module, configured to acquire a first training image and a second training image, where the second training image is image data obtained after erasing the preset area in the first training image;
    a training module, configured to train a preset model through the first training image and the second training image to obtain the trained image processing model, where the image processing model includes a first network and a second network;
    the erasing module is further configured to perform erasure processing on the first image through the first network to obtain a processed third image;
    a sampling module, configured to perform downsampling processing on the third image to obtain a fourth image;
    the erasing module is further configured to perform erasure processing on the fourth image through the second network to obtain a processed fifth image;
    the sampling module is further configured to perform upsampling processing on the fifth image to obtain the second image.
  8. The image processing apparatus according to any one of claims 5 to 7, wherein the target area is a character image area, and the processing apparatus further includes:
    a recognition module, configured to perform optical character recognition on the first image and obtain a character detection frame in the first image;
    the positioning module is further configured to locate the target area in the first image according to coordinate information of the character detection frame.
  9. An electronic device, including a processor and a memory, where the memory stores a program or instruction executable on the processor, and when the program or instruction is executed by the processor, the steps of the image processing method according to any one of claims 1 to 4 are implemented.
  10. A readable storage medium, on which a program or instruction is stored, where the steps of the image processing method according to any one of claims 1 to 4 are implemented when the program or instruction is executed by a processor.
  11. A chip, including a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to run a program or instruction to implement the steps of the image processing method according to any one of claims 1 to 4.
  12. A computer program product, where the program product is stored in a non-volatile storage medium, and the program product is executed by at least one processor to implement the steps of the image processing method according to any one of claims 1 to 4.
  13. An image processing apparatus, where the apparatus is configured to perform the image processing method according to any one of claims 1 to 4.
PCT/CN2023/088947 2022-04-21 2023-04-18 Image processing method and processing apparatus, electronic device and readable storage medium WO2023202570A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210420012.4 2022-04-21
CN202210420012.4A CN114792285A (en) 2022-04-21 2022-04-21 Image processing method and processing device, electronic device and readable storage medium

Publications (1)

Publication Number Publication Date
WO2023202570A1 true WO2023202570A1 (en) 2023-10-26

Family

ID=82461312

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/088947 WO2023202570A1 (en) 2022-04-21 2023-04-18 Image processing method and processing apparatus, electronic device and readable storage medium

Country Status (2)

Country Link
CN (1) CN114792285A (en)
WO (1) WO2023202570A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114792285A (en) * 2022-04-21 2022-07-26 维沃移动通信有限公司 Image processing method and processing device, electronic device and readable storage medium
CN115797783A (en) * 2023-02-01 2023-03-14 北京有竹居网络技术有限公司 Method and device for generating barrier-free information, electronic equipment and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108805789A (en) * 2018-05-29 2018-11-13 厦门市美亚柏科信息股份有限公司 A kind of method, apparatus, equipment and readable medium removing watermark based on confrontation neural network
CN109035147A (en) * 2018-08-10 2018-12-18 Oppo广东移动通信有限公司 Image processing method and device, electronic device, storage medium and computer equipment
CN110163194A (en) * 2019-05-08 2019-08-23 腾讯科技(深圳)有限公司 A kind of image processing method, device and storage medium
CN111260757A (en) * 2018-12-03 2020-06-09 马上消费金融股份有限公司 Image processing method and device and terminal equipment
WO2021129466A1 (en) * 2019-12-26 2021-07-01 Oppo广东移动通信有限公司 Watermark detection method, device, terminal and storage medium
CN113538273A (en) * 2021-07-13 2021-10-22 荣耀终端有限公司 Image processing method and image processing apparatus
CN114049280A (en) * 2021-11-25 2022-02-15 广州华多网络科技有限公司 Image erasing and repairing method and device, equipment, medium and product thereof
CN114792285A (en) * 2022-04-21 2022-07-26 维沃移动通信有限公司 Image processing method and processing device, electronic device and readable storage medium


Also Published As

Publication number Publication date
CN114792285A (en) 2022-07-26

