CN114757840A - Image processing method, image processing device, readable medium and electronic equipment

Info

Publication number: CN114757840A
Application number: CN202210303740.7A
Authority: CN
Inventors: 王杰; 赵珉怿; 周水庚
Original assignee: Fudan University; Beijing Zitiao Network Technology Co Ltd
Current assignee: Fudan University; Beijing Zitiao Network Technology Co Ltd
Priority date: 2022-03-24
Filing date: 2022-03-24
Publication date: 2022-07-15

Abstract

The disclosure relates to an image processing method, an image processing apparatus, a readable medium, and an electronic device in the technical field of image processing. The method comprises: acquiring a text image to be processed, the text image to be processed containing a text to be repaired; recognizing the text to be repaired in the text image to be processed; determining text clue characteristics corresponding to the text to be repaired according to the text recognition result of the text to be repaired; and performing image restoration according to the text image to be processed and the text clue characteristics to obtain a restored target text image. Because the text clue characteristics used for restoration are determined from the recognition result of the text to be repaired, and the restoration is guided by these characteristics, a high-quality restored target text image is obtained.

Description

Image processing method, image processing device, readable medium and electronic equipment

Technical Field

The present disclosure relates to the field of image processing technologies, and in particular, to an image processing method, an image processing apparatus, a readable medium, and an electronic device.

Background

A text image is text presented in image form; as an important way of representing information, it is widely used in various fields. At present, most text images are high-quality images captured with accurate focus, so the text in them is easy to recognize. In some scenarios, however, the quality of the captured text image cannot be guaranteed; low image quality makes the text harder to recognize and lowers the accuracy of text recognition. Restoring low-quality text images is therefore an important problem that urgently needs to be solved.

Disclosure of Invention

This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.

In a first aspect, the present disclosure provides a method of image processing, the method comprising:

acquiring a text image to be processed; the text image to be processed comprises a text to be repaired;

identifying the text to be repaired in the text image to be processed according to the text image to be processed, and determining text clue characteristics corresponding to the text to be repaired according to a text identification result of the text to be repaired;

and performing image restoration according to the text image to be processed and the text clue characteristics to obtain a restored target text image.

In a second aspect, the present disclosure provides an image processing apparatus, the apparatus comprising:

the acquisition module is used for acquiring a text image to be processed; the text image to be processed comprises a text to be repaired;

the processing module is used for identifying the text to be repaired in the text image to be processed according to the text image to be processed and determining text clue characteristics corresponding to the text to be repaired according to a text identification result of the text to be repaired;

and the processing module is also used for carrying out image restoration according to the text image to be processed and the text clue characteristics to obtain a restored target text image.

In a third aspect, the present disclosure provides a computer readable medium having stored thereon a computer program which, when executed by a processing apparatus, performs the steps of the method of the first aspect of the present disclosure.

In a fourth aspect, the present disclosure provides an electronic device comprising:

a storage device having a computer program stored thereon;

processing means for executing the computer program in the storage means to implement the steps of the method of the first aspect of the present disclosure.

According to the above technical solution, the present disclosure first acquires a text image to be processed that contains a text to be repaired, then recognizes the text to be repaired in that image, determines the text clue characteristics corresponding to the text to be repaired according to the text recognition result, and finally performs image restoration according to the text image to be processed and the text clue characteristics to obtain a restored target text image. Because the text clue characteristics used for restoration are determined from the recognition result of the text to be repaired, and the restoration is guided by these characteristics, a high-quality restored target text image can be obtained.

Additional features and advantages of the present disclosure will be set forth in the detailed description which follows.

Drawings

The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that elements and features are not necessarily drawn to scale. In the drawings:

FIG. 1 is a flow diagram illustrating a method of image processing according to an exemplary embodiment;

FIG. 2 is a flow diagram illustrating a method of training an image processing model according to an exemplary embodiment;

FIG. 3 is a block diagram of an image processing apparatus shown in accordance with an exemplary embodiment;

FIG. 4 is a block diagram of a processing module shown in accordance with the embodiment shown in FIG. 3;

FIG. 5 is a block diagram illustrating an electronic device in accordance with an example embodiment.

Detailed Description

It should be understood that, before the technical solutions disclosed in the embodiments of the present disclosure are used, the user should be informed, in an appropriate manner and in accordance with relevant laws and regulations, of the type, scope of use, and usage scenarios of the personal information involved in the present disclosure, and the user's authorization should be obtained.

For example, in response to receiving an active request from a user, prompt information is sent to the user to explicitly indicate that the requested operation will require acquiring and using the user's personal information. The user can then decide autonomously, according to the prompt information, whether to provide personal information to the software or hardware (such as an electronic device, an application program, a server, or a storage medium) that performs the operations of the technical solution of the present disclosure.

As an optional but non-limiting implementation, the prompt information may be sent to the user in response to the active request by means of, for example, a pop-up window, in which the prompt information may be presented as text. In addition, the pop-up window may carry a selection control with which the user chooses "agree" or "disagree" to decide whether to provide personal information to the electronic device.

It is understood that the above notification and user authorization process is only illustrative and not limiting, and other ways of satisfying relevant laws and regulations may be applied to the implementation of the present disclosure.

It will be appreciated that the data involved in the subject technology, including but not limited to the data itself, the acquisition or use of the data, should comply with the requirements of the corresponding laws and regulations and related regulations.

Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but rather are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.

It should be understood that the various steps recited in method embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.

The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.

It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.

It should be noted that the modifiers "a", "an", and "the" used in this disclosure are illustrative rather than limiting, and those skilled in the art will understand that they should be read as "one or more" unless the context clearly dictates otherwise.

The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.

All actions of acquiring signals, information or data in the present disclosure are performed under the premise of complying with the corresponding data protection regulation policy of the country of the location and obtaining the authorization given by the owner of the corresponding device.

FIG. 1 is a flow diagram illustrating a method of image processing according to an exemplary embodiment. As shown in fig. 1, the method may include the steps of:

Step 101, acquiring a text image to be processed. The text image to be processed comprises a text to be repaired.

For example, when a text in a text image is recognized, if the quality of the text image is low, the text in the text image may be blurred, which increases the difficulty of recognizing the text in the text image, thereby affecting the accuracy of text recognition. In order to accurately recognize the text in the low-quality text image, the low-quality text image needs to be repaired, so that the blurred text in the low-quality text image is clearer, and the text recognition is facilitated. Since the text image has text characteristics (e.g., strokes, characters, etc.) specific to the text, the text image of low quality can be repaired using the text characteristics of the text image.

Specifically, a text image to be processed that contains a text to be repaired may be acquired first. The text image to be processed is a low-quality text image, and the text to be repaired is the blurred text in it. For example, in an autonomous driving scenario, an image acquisition device is generally used to photograph road traffic signs, and the road traffic text is recognized from the captured image to obtain road information (e.g., the road name or the speed limit). Because of camera shake and the limited performance of the image acquisition device, the captured image may contain severe motion blur and noise, which blurs the road traffic text and reduces the accuracy of text recognition. The captured image with severe motion blur and noise can then be used as the text image to be processed, and the blurred road traffic text in it as the text to be repaired.

Step 102, recognizing the text to be repaired in the text image to be processed, and determining text clue characteristics corresponding to the text to be repaired according to the text recognition result of the text to be repaired.

Step 103, performing image restoration according to the text image to be processed and the text clue characteristics to obtain a restored target text image.

For example, after the text image to be processed is obtained, the text to be repaired in it may first be recognized to obtain a text recognition result. The text recognition result can be understood as all the texts that, as recognized by the image processing model, the text to be repaired might correspond to if it were clear. Then, text clue characteristics capable of representing the text characteristics of the text to be repaired are determined according to the text recognition result, and the text image to be processed is restored by using the text clue characteristics to obtain the final target text image (that is, the text image to be processed after image restoration).

The target text image can be obtained in several ways. One way is to restore the text image to be processed through a pre-trained image processing model capable of extracting text clue characteristics. Specifically, after the text image to be processed is obtained, image restoration may be performed on it through the pre-trained image processing model to obtain the restored target text image. The image processing model is used for recognizing the text to be repaired in the text image to be processed, determining the text clue characteristics corresponding to the text to be repaired according to the text recognition result, and performing image restoration according to the text image to be processed and the text clue characteristics to obtain the target text image.

Further, text recognition can be performed on the target text image to obtain the accurate text in it, and the accurate text can be applied to tasks such as text recognition and understanding. It should be noted that, if the text in the target text image is still blurred, the target text image may be used as a new text image to be processed and the above steps repeated until a target text image in which the clarity of the text meets the user's requirement is obtained.
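The repeat-until-clear loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: `restore` stands in for steps 101 to 103 as a whole, `clarity` is a hypothetical scoring function (the description only says that the result must meet the user's requirement), and `max_rounds` is an added safeguard against endless repetition.

```python
from typing import Callable, TypeVar

Image = TypeVar("Image")


def restore_until_clear(
    image: Image,
    restore: Callable[[Image], Image],
    clarity: Callable[[Image], float],
    required_clarity: float,
    max_rounds: int = 3,
) -> Image:
    """Repeat restoration until the clarity score meets the requirement."""
    current = image
    for _ in range(max_rounds):
        if clarity(current) >= required_clarity:
            break
        # Feed the previous output back in as the new text image to be processed.
        current = restore(current)
    return current


# Toy stand-ins (assumptions): an "image" is just an integer clarity level,
# and each restoration pass raises it by 3.
result = restore_until_clear(2, lambda x: x + 3, lambda x: x, required_clarity=7)
capped = restore_until_clear(0, lambda x: x + 1, lambda x: x, required_clarity=100)
```

In the toy run, two passes are needed before the requirement is met, while the second call stops at the round limit even though the requirement is never reached.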

In summary, in the present disclosure, a to-be-processed text image including a to-be-restored text is first obtained, then the to-be-restored text in the to-be-processed text image is identified according to the to-be-processed text image, a text cue feature corresponding to the to-be-restored text is determined according to a text identification result of the to-be-restored text, and then image restoration is performed according to the to-be-processed text image and the text cue feature, so as to obtain a restored target text image. The method and the device can determine the text clue characteristics for repairing the text image to be processed according to the recognized text recognition result of the text to be repaired, and carry out image repairing by using the text clue characteristics, thereby obtaining the high-quality target text image after the text image to be processed is repaired.

Optionally, the image processing model includes a text cue generator and a repair network, and performing image restoration on the text image to be processed through the pre-trained image processing model to obtain the restored target text image may be implemented as follows:

Step 1), recognizing the text to be repaired in the text image to be processed through the text cue generator to obtain a text recognition result, generating a target outline image corresponding to the outline of the text to be repaired according to the text recognition result, and determining the text clue characteristics according to the text recognition result and the target outline image.

Illustratively, the image processing model may include a text cue generator and a repair network. The text cue generator can recognize the text to be repaired in the text image to be processed to obtain a text recognition result, and draw the outline of the text to be repaired to obtain a target outline image. The outline of the text to be repaired is in fact the outline of all the texts that, as recognized by the text cue generator, the text to be repaired might correspond to if it were clear. The text cue generator can then generate the text cue features according to the text recognition result and the target outline image.

In one scenario, the text cue generator can include a text recognizer, a language model, a painter, and a fusion network. After the text image to be processed is obtained, it can be input into the text recognizer, which recognizes the text to be repaired to obtain a text recognition result and sends the result to both the language model and the painter. The text recognizer may be any neural network capable of recognizing text in an image. For example, when the text recognizer is a CRNN network, the text recognition result may include a probability distribution for each character in the text to be repaired, where the distribution for a character gives the probability that the character is each character in a preset character set.
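As a rough illustration of such a recognition result, the sketch below turns per-position scores into probability distributions over a preset character set and picks the most likely character at each position. The character set, the score values, and the function names are assumptions made for the example; a CRNN is typically trained with CTC and would also need blank and repeat handling, which is omitted here.

```python
import math
from typing import List, Sequence

CHARSET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"  # hypothetical preset character set


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw recognizer scores into a probability distribution."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_decode(distributions: Sequence[Sequence[float]], charset: str = CHARSET) -> str:
    """Pick the most probable character at every position."""
    return "".join(
        charset[max(range(len(dist)), key=lambda i: dist[i])] for dist in distributions
    )


def toy_scores(expected: str) -> List[float]:
    """Made-up scores that favour one character, standing in for network output."""
    return [5.0 if ch == expected else 0.0 for ch in CHARSET]


distributions = [softmax(toy_scores(ch)) for ch in "STOP"]
decoded = greedy_decode(distributions)
```

Each entry of `distributions` has one probability per character in `CHARSET`, which is the form of recognition result that the language model and the painter consume in this scenario.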

Next, the text recognition result can be adjusted through the language model to obtain an adjusted text recognition result, which the language model sends to the fusion network. Adjusting the text recognition result through the language model amounts to correcting it linguistically. When the text recognition result includes a probability distribution for each character in the text to be repaired, the adjusted text recognition result is the corrected probability distribution for each character. The language model may be any natural language processing model capable of correcting the text recognition result (for example, a character-level language correction model), or a model that corrects by edit distance.
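One way to picture correction by edit distance is the sketch below, which replaces a recognized string with the closest entry of a word list. It is a simplification under stated assumptions: the lexicon and function names are hypothetical, and the example works on decoded strings rather than on the per-character probability distributions mentioned above.

```python
from typing import Iterable


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row table."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


def correct_with_lexicon(recognized: str, lexicon: Iterable[str]) -> str:
    """Return the lexicon entry closest to the recognized string."""
    return min(lexicon, key=lambda word: edit_distance(recognized, word))


LEXICON = ["STOP", "SLOW", "EXIT"]  # hypothetical word list
corrected = correct_with_lexicon("ST0P", LEXICON)
```

Here the digit `0` misread in place of the letter `O` costs a single substitution, so `STOP` is the nearest entry.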

Then, the painter draws the outline of the text to be repaired according to the text recognition result to obtain a target outline image and sends the target outline image to the fusion network. The painter can be any tool capable of drawing the outline of the text recognition result, and can also be implemented with a generative adversarial network. For example, when the painter uses PIL (Python Imaging Library), its output may be a black-and-white image corresponding to the text image to be processed that clearly shows the stroke details of the text to be repaired. Finally, the text recognition result, the adjusted text recognition result, and the target outline image are fused through the fusion network to obtain the text clue characteristics. These three inputs represent three cues carrying text characteristics: the recognition result, the linguistic correction result, and the visual result, respectively (that is, the text clue characteristics reflect a cross-modal cue). Fusing the three cues guides the repair network in restoring the text image to be processed to obtain the target text image. The fusion network may be any neural network capable of fusing different cues, for example, one using techniques such as Gated-Fusion, deformable convolution, or Transformer.
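To make the idea of gating concrete, the sketch below fuses several cue vectors with per-element gates. It is only a simplified stand-in for a fusion network: the gates come from a softmax over scaled cue values, the `weights` would be learned in a real network, and the three short vectors are made-up features rather than real recognizer, language model, or painter outputs.

```python
import math
from typing import List, Sequence


def gated_fusion(cues: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Fuse same-length cue vectors with per-element softmax gates."""
    fused = []
    for values in zip(*cues):  # one value per cue at this feature index
        scores = [w * v for w, v in zip(weights, values)]
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        gates = [e / total for e in exps]  # non-negative and summing to 1
        fused.append(sum(g * v for g, v in zip(gates, values)))
    return fused


# Hypothetical two-dimensional features for the three cues.
recognition_cue = [0.9, 0.1]
adjusted_cue = [0.8, 0.2]
outline_cue = [1.0, 0.0]
cue_features = gated_fusion([recognition_cue, adjusted_cue, outline_cue], [1.0, 1.0, 1.0])
```

Because the gates at each position sum to one, every fused value lies between the smallest and largest cue value at that position, so no single cue can be amplified beyond what the inputs contain.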

In another scenario, the text cue generator may include a text recognizer, a painter, and a fusion network, and after the to-be-processed text image is obtained, the to-be-processed text image may be input into the text recognizer, the to-be-repaired text in the to-be-processed text image is recognized by the text recognizer to obtain a text recognition result, and the text recognition result is sent to the painter by the text recognizer. And then, drawing the outline of the text to be repaired by the painter according to the text recognition result to obtain a target outline image, and sending the target outline image to the fusion network by the painter. And finally, fusing the text recognition result and the target contour image through a fusion network to obtain text clue characteristics. The text recognition result and the target contour image actually represent two clues with text characteristics, namely the recognition result and the visual result, respectively, and the restoration network can be guided to restore the image to be processed by fusing the two clues to obtain the target image.
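The painter used in both scenarios can be pictured with PIL, which the description names as one possible drawing tool. The sketch below assumes the Pillow package is installed and renders a decoded string as white strokes on a black canvas; the canvas size, text position, default font, and function name are illustrative choices, not details from the disclosure.

```python
from PIL import Image, ImageDraw, ImageFont


def draw_text_outline(text: str, size=(128, 32)) -> Image.Image:
    """Render `text` as white strokes on a black single-channel canvas."""
    canvas = Image.new("L", size, color=0)  # "L" is 8-bit grayscale, 0 is black
    font = ImageFont.load_default()         # assumption: any legible font would do
    ImageDraw.Draw(canvas).text((4, 4), text, fill=255, font=font)
    return canvas


outline = draw_text_outline("STOP")
```

The resulting image carries only stroke shapes, which is what makes it usable as a visual cue alongside the recognition result.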

Step 2), performing image restoration on the text image to be processed through the repair network according to the text clue characteristics to obtain the target text image.

For example, after the text cue features are obtained, the repair network can use them to perform image restoration on the text image to be processed, recovering the details, textures, and contours of the text to be repaired and thereby improving its image quality, perceptibility, and clarity, so that the target text image is obtained. The repair network may be any deep neural network capable of image restoration, including but not limited to deep convolutional neural networks and Transformer-based repair networks.
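The data flow of steps 1) and 2) can be summarized in a short structural sketch. The class and field names are hypothetical, every component is passed in as a plain callable, and the stub lambdas at the end exist only to show the wiring; none of this is the actual network code.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class TextCueGenerator:
    recognizer: Callable[[Any], Any]  # image -> text recognition result
    painter: Callable[[Any], Any]     # recognition result -> target outline image
    fusion: Callable[..., Any]        # cues -> text cue features
    language_model: Optional[Callable[[Any], Any]] = None  # optional correction

    def __call__(self, image: Any) -> Any:
        result = self.recognizer(image)
        outline = self.painter(result)
        if self.language_model is None:
            # Scenario without a language model: fuse two cues.
            return self.fusion(result, outline)
        # Scenario with a language model: fuse three cues.
        adjusted = self.language_model(result)
        return self.fusion(result, adjusted, outline)


@dataclass
class ImageProcessingModel:
    cue_generator: TextCueGenerator
    repair_network: Callable[[Any, Any], Any]  # (image, cue features) -> restored image

    def __call__(self, image: Any) -> Any:
        cues = self.cue_generator(image)
        return self.repair_network(image, cues)


# Stub components (assumptions) that only record what they receive.
model = ImageProcessingModel(
    cue_generator=TextCueGenerator(
        recognizer=lambda img: "ST0P",
        painter=lambda res: f"outline({res})",
        fusion=lambda *cues: tuple(cues),
        language_model=lambda res: "STOP",
    ),
    repair_network=lambda img, cues: (img, cues),
)
restored = model("blurry.png")
```

Keeping the language model optional lets the same wrapper cover both the three-cue and the two-cue variants of the text cue generator.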

Optionally, drawing, by the painter, the outline of the text to be repaired according to the text recognition result to obtain the target outline image can be implemented through the following steps:

Step a), drawing, by the painter, the outline of the text to be repaired according to the text recognition result to obtain a plurality of candidate outline images.

Step b), determining the similarity between each candidate outline image and the text image to be processed, and taking the candidate outline image with the highest similarity as the target outline image.

For example, when the text recognition result includes a probability distribution for each character in the text to be repaired, the painter needs to decode the probability distributions before drawing, to obtain all the character strings that, as recognized by the text recognizer, the text to be repaired might correspond to if it were clear. When the text to be repaired corresponds to only one character string, the painter can directly draw the outline of that character string and take the drawn image as the target outline image. When the text to be repaired corresponds to a plurality of character strings, the painter can draw the outline of each character string and take each drawn image as the candidate outline image for that character string. The painter can then compare the features of each candidate outline image with those of the text image to be processed to determine their similarity, and take the candidate outline image with the highest similarity as the target outline image.
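Steps a) and b) can be sketched as follows. The decoding rule (keep the top characters above a probability threshold at each position), the cosine similarity on raw pixels, and the tiny 3x3 glyphs are all assumptions made for the example; the description only requires some form of decoding and feature comparison.

```python
import itertools
import math
from typing import Callable, List, Sequence, Tuple


def candidate_strings(distributions: Sequence[Sequence[float]], charset: str,
                      top_k: int = 2, min_prob: float = 0.2) -> List[str]:
    """Enumerate strings built from the plausible characters at each position."""
    per_position = []
    for dist in distributions:
        ranked = sorted(range(len(dist)), key=lambda i: dist[i], reverse=True)[:top_k]
        keep = [charset[i] for i in ranked if dist[i] >= min_prob]
        per_position.append(keep or [charset[ranked[0]]])
    return ["".join(chars) for chars in itertools.product(*per_position)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_target_outline(candidates: Sequence[str],
                          render: Callable[[str], List[float]],
                          degraded: Sequence[float]) -> Tuple[str, List[float]]:
    """Draw every candidate and keep the one most similar to the degraded image."""
    best = max(candidates, key=lambda text: cosine_similarity(render(text), degraded))
    return best, render(best)


# Toy painter: flattened 3x3 glyphs for two look-alike characters.
GLYPHS = {"O": [1, 1, 1, 1, 0, 1, 1, 1, 1], "C": [1, 1, 1, 1, 0, 0, 1, 1, 1]}
render = lambda text: [float(p) for ch in text for p in GLYPHS[ch]]

degraded_pixels = [0.8, 0.9, 0.7, 0.9, 0.1, 0.2, 0.8, 0.9, 0.8]  # made-up blurry "C"
candidates = candidate_strings([[0.45, 0.55]], charset="CO")
target_text, target_outline = select_target_outline(candidates, render, degraded_pixels)
```

In this toy case the recognizer slightly prefers `O`, but the comparison with the degraded pixels selects the `C` outline, which is the kind of disambiguation step b) is meant to provide.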

FIG. 2 is a flowchart illustrating a method of training an image processing model according to an exemplary embodiment. As shown in fig. 2, the image processing model is trained through the following steps:

Step 201, obtaining a training sample set. The training sample set comprises training text images and training repair images corresponding to the training text images.

Step 202, training a preset model according to the training sample set to obtain the image processing model.

For example, a large number of text images containing clear text may be obtained from a network or a database, and the text in each obtained text image may be blurred (e.g., by adding random noise). Each processed text image can then be used as a training text image, and the original image corresponding to it as the training repair image. A training sample set can then be constructed from all the training text images and their corresponding training repair images, and the preset model is trained on the training sample set to obtain the final image processing model.
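The construction of the training sample set can be sketched as below. Images are represented as nested lists of grayscale values in [0, 1], and the Gaussian noise model, the `noise_std` value, and the function names are assumptions for illustration; the description mentions random noise only as one example of degradation.

```python
import random
from typing import List, Sequence, Tuple

Image = List[List[float]]  # grayscale pixels in [0, 1]


def add_random_noise(clear: Image, noise_std: float, rng: random.Random) -> Image:
    """Degrade a clear image with additive Gaussian noise, clamped to [0, 1]."""
    return [
        [min(1.0, max(0.0, pixel + rng.gauss(0.0, noise_std))) for pixel in row]
        for row in clear
    ]


def build_training_set(clear_images: Sequence[Image], noise_std: float = 0.1,
                       seed: int = 0) -> List[Tuple[Image, Image]]:
    """Pair each degraded image (training text image) with its original
    (training repair image)."""
    rng = random.Random(seed)
    return [(add_random_noise(img, noise_std, rng), img) for img in clear_images]


clear_image = [[0.0, 1.0], [1.0, 0.0]]  # toy 2x2 "text image"
training_pairs = build_training_set([clear_image])
```

Each pair supplies the preset model with a degraded input and the clear image it should learn to reproduce.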

In summary, in the present disclosure, a to-be-processed text image including a to-be-repaired text is first acquired, the to-be-repaired text in the to-be-processed text image is identified according to the to-be-processed text image, a text cue feature corresponding to the to-be-repaired text is determined according to a text identification result of the to-be-repaired text, and then image repairing is performed according to the to-be-processed text image and the text cue feature, so as to obtain a repaired target text image. The method and the device can determine the text clue characteristics for repairing the text image to be repaired according to the recognized text recognition result of the text to be repaired, and carry out image repairing by utilizing the text clue characteristics, so that the high-quality target text image after the text image to be repaired is obtained.

Fig. 3 is a block diagram illustrating an image processing apparatus according to an exemplary embodiment. As shown in fig. 3, the apparatus 300 includes:

an obtaining module 301, configured to obtain a text image to be processed. The text image to be processed comprises a text to be repaired.

The processing module 302 is configured to identify a text to be repaired in the text image to be processed according to the text image to be processed, and determine a text cue feature corresponding to the text to be repaired according to a text identification result of the text to be repaired.

The processing module 302 is further configured to perform image inpainting according to the text image to be processed and the text cue features, so as to obtain an inpainted target text image.

Optionally, the processing module 302 is configured to perform image restoration on the to-be-processed text image through a pre-trained image processing model according to the to-be-processed text image, so as to obtain a restored target text image.

The image processing model is used for recognizing the text to be repaired in the text image to be processed, determining the text clue characteristics corresponding to the text to be repaired according to the text recognition result of the text to be repaired, and performing image restoration according to the text image to be processed and the text clue characteristics to obtain the target text image.

Fig. 4 is a block diagram of a processing module shown in accordance with the embodiment shown in fig. 3. The image processing model includes a text cue generator and a repair network. As shown in fig. 4, the processing module 302 includes:

the processing submodule 3021 is configured to identify, by using the text cue generator, the text to be repaired in the text image to be processed, obtain a text identification result, generate a target outline image corresponding to an outline of the text to be repaired according to the text identification result, and determine a text cue feature according to the text identification result and the target outline image.

The repairing submodule 3022 is configured to perform image repairing on the text image to be processed through the repairing network according to the text cue features, so as to obtain a target text image.

Optionally, the text cue generator includes a text recognizer, a language model, a painter, and a fusion network. The processing submodule 3021 is configured to:

recognize the text to be repaired in the text image to be processed through the text recognizer to obtain a text recognition result;

adjust the text recognition result through the language model to obtain an adjusted text recognition result;

draw the outline of the text to be repaired through the painter according to the text recognition result to obtain a target outline image; and

fuse the text recognition result, the adjusted text recognition result, and the target outline image through the fusion network to obtain the text clue characteristics.

Optionally, the text cue generator comprises a text recognizer, a painter, and a fusion network. The processing submodule 3021 is configured to:

fuse the text recognition result and the target outline image through the fusion network to obtain the text clue characteristics.

Optionally, the processing sub-module 3021 is configured to:

draw the outline of the text to be repaired through the painter according to the text recognition result to obtain a plurality of candidate outline images; and

determine the similarity between each candidate outline image and the text image to be processed, and take the candidate outline image with the highest similarity as the target outline image.

Optionally, the image processing model is trained by:

obtaining a training sample set, where the training sample set comprises training text images and training repair images corresponding to the training text images; and

training a preset model according to the training sample set to obtain the image processing model.

In summary, in the present disclosure, a to-be-processed text image including a to-be-repaired text is first acquired, the to-be-repaired text in the to-be-processed text image is identified according to the to-be-processed text image, a text cue feature corresponding to the to-be-repaired text is determined according to a text identification result of the to-be-repaired text, and then image repairing is performed according to the to-be-processed text image and the text cue feature, so as to obtain a repaired target text image. The method and the device can determine the text clue characteristics for repairing the text image to be processed according to the recognized text recognition result of the text to be repaired, and carry out image repairing by using the text clue characteristics, thereby obtaining the high-quality target text image after the text image to be processed is repaired.

Referring now to fig. 5, a schematic diagram of an electronic device 600 (e.g., the terminal device or the server in fig. 1) suitable for implementing embodiments of the present disclosure is shown. The terminal device in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), and a vehicle-mounted terminal (e.g., a car navigation terminal), and stationary terminals such as a digital TV and a desktop computer. The electronic device shown in fig. 5 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.

As shown in fig. 5, the electronic device 600 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 601 that may perform various appropriate actions and processes in accordance with a program stored in a read-only memory (ROM) 602 or a program loaded from a storage means 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data necessary for the operation of the electronic device 600. The processing device 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.

Generally, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 607 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, and the like; storage means 608 including, for example, magnetic tape, hard disk, etc.; and communication means 609. The communication means 609 may allow the electronic device 600 to communicate with other devices wirelessly or by wire to exchange data. While fig. 5 illustrates an electronic device 600 having various devices, it is to be understood that not all illustrated devices are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.

In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 609, or may be installed from the storage means 608, or may be installed from the ROM 602. The computer program, when executed by the processing device 601, performs the above-described functions defined in the methods of the embodiments of the present disclosure.

It should be noted that the computer readable medium of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.

In some embodiments, clients and servers may communicate using any currently known or future-developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed network.

The computer readable medium may be embodied in the electronic device; or may be separate and not incorporated into the electronic device.

The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring a text image to be processed; the text image to be processed comprises a text to be repaired; identifying the text to be repaired in the text image to be processed according to the text image to be processed, and determining text clue characteristics corresponding to the text to be repaired according to a text identification result of the text to be repaired; and performing image restoration according to the text image to be processed and the text clue characteristics to obtain a restored target text image.

Computer program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The modules described in the embodiments of the present disclosure may be implemented by software or hardware. The name of the module does not constitute a limitation to the module itself in some cases, and for example, the acquiring module may also be described as a "module that acquires a text image to be processed".

The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on a Chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

In accordance with one or more embodiments of the present disclosure, example 1 provides an image processing method, the method comprising: acquiring a text image to be processed; the text image to be processed comprises a text to be repaired; identifying the text to be repaired in the text image to be processed according to the text image to be processed, and determining text clue characteristics corresponding to the text to be repaired according to a text identification result of the text to be repaired; and image restoration is carried out according to the text image to be processed and the text clue characteristics to obtain a restored target text image.

According to one or more embodiments of the present disclosure, example 2 provides the method of example 1, wherein the identifying the text to be repaired in the text image to be processed according to the text image to be processed, determining text clue characteristics corresponding to the text to be repaired according to a text identification result of the text to be repaired, and performing image restoration according to the text image to be processed and the text clue characteristics to obtain a restored target text image, includes: according to the text image to be processed, performing image restoration on the text image to be processed through a pre-trained image processing model to obtain the restored target text image; the image processing model is used for identifying the text to be repaired in the text image to be processed, determining text clue characteristics corresponding to the text to be repaired according to a text identification result of the text to be repaired, and performing image restoration according to the text image to be processed and the text clue characteristics to obtain the target text image.

According to one or more embodiments of the present disclosure, example 3 provides the method of example 2, wherein the image processing model includes a text cue generator and a restoration network; the performing image restoration on the text image to be processed through a pre-trained image processing model according to the text image to be processed to obtain the restored target text image includes: recognizing the text to be repaired in the text image to be processed through the text cue generator to obtain a text recognition result, generating a target contour image corresponding to the contour of the text to be repaired according to the text recognition result, and determining the text clue characteristics according to the text recognition result and the target contour image; and performing image restoration on the text image to be processed through the restoration network according to the text clue characteristics to obtain the target text image.

Example 4 provides the method of example 3, the text cue generator comprising a text recognizer, a language model, a painter, and a fusion network; the recognizing the text to be repaired in the text image to be processed through the text cue generator to obtain the text recognition result, generating a target contour image corresponding to the contour of the text to be repaired according to the text recognition result, and determining the text clue characteristics according to the text recognition result and the target contour image, includes: recognizing the text to be repaired in the text image to be processed through the text recognizer to obtain the text recognition result; adjusting the text recognition result through the language model to obtain an adjusted text recognition result; drawing the contour of the text to be repaired through the painter according to the text recognition result to obtain the target contour image; and fusing the text recognition result, the adjusted text recognition result and the target contour image through the fusion network to obtain the text clue characteristics.
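The data flow of example 4 can be sketched as a composition of four components followed by the restoration network. The sketch only wires the steps together: each component is a caller-supplied function, and the names `TextCueGenerator`, `restore` and the type aliases are hypothetical, not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Callable, List

Image = List[List[float]]  # assumed grayscale representation
Features = List[float]     # assumed representation of the text clue characteristics


@dataclass
class TextCueGenerator:
    recognize: Callable[[Image], str]            # text recognizer
    adjust: Callable[[str], str]                 # language model
    paint: Callable[[str], Image]                # painter
    fuse: Callable[[str, str, Image], Features]  # fusion network

    def __call__(self, degraded: Image) -> Features:
        recognized = self.recognize(degraded)
        adjusted = self.adjust(recognized)
        # Per example 4, the painter draws from the text recognition result.
        contour = self.paint(recognized)
        return self.fuse(recognized, adjusted, contour)


def restore(
    degraded: Image,
    generator: TextCueGenerator,
    restoration_network: Callable[[Image, Features], Image],
) -> Image:
    """Compute the text clue characteristics, then restore the image with them."""
    cue = generator(degraded)
    return restoration_network(degraded, cue)
```

The variant without a language model (example 5) would drop `adjust` and pass only the recognition result and the contour image to `fuse`.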

Example 5 provides the method of example 3, the text cue generator comprising a text recognizer, a painter, and a fusion network; the recognizing the text to be repaired in the text image to be processed through the text cue generator to obtain the text recognition result, generating a target contour image corresponding to the contour of the text to be repaired according to the text recognition result, and determining the text clue characteristics according to the text recognition result and the target contour image, includes: recognizing the text to be repaired in the text image to be processed through the text recognizer to obtain the text recognition result; drawing the contour of the text to be repaired through the painter according to the text recognition result to obtain the target contour image; and fusing the text recognition result and the target contour image through the fusion network to obtain the text clue characteristics.

According to one or more embodiments of the present disclosure, example 6 provides the method of example 4 or example 5, wherein the drawing, by the painter, the outline of the text to be repaired according to the text recognition result to obtain the target outline image includes: drawing the outlines of the texts to be repaired through the painter according to the text recognition result to obtain a plurality of outline images to be selected; and determining the similarity between each contour image to be selected and the text image to be processed, and taking the contour image with the highest similarity as the target contour image.

Example 7 provides the method of any one of examples 2-5, the image processing model being trained by: acquiring a training sample set; the training sample set comprises training text images and training repair images corresponding to the training text images; and training a preset model according to the training sample set to obtain the image processing model.

Example 8 provides an image processing apparatus according to one or more embodiments of the present disclosure, the apparatus including: the acquisition module is used for acquiring a text image to be processed; the text image to be processed comprises a text to be repaired; the processing module is used for identifying the text to be repaired in the text image to be processed according to the text image to be processed and determining text clue characteristics corresponding to the text to be repaired according to a text identification result of the text to be repaired; and the processing module is also used for carrying out image restoration according to the text image to be processed and the text clue characteristics to obtain a restored target text image.

According to one or more embodiments of the present disclosure, example 9 provides the apparatus of example 8, where the processing module is configured to perform image restoration on the to-be-processed text image according to the to-be-processed text image through a pre-trained image processing model, so as to obtain the restored target text image; the image processing model is used for identifying the text to be restored in the text image to be processed, determining text clue characteristics corresponding to the text to be restored according to a text identification result of the text to be restored, and performing image restoration according to the text image to be processed and the text clue characteristics to obtain the target text image.

According to one or more embodiments of the present disclosure, example 10 provides the apparatus of example 9, wherein the image processing model comprises a text cue generator and a restoration network; the processing module comprises: a processing submodule, configured to recognize the text to be repaired in the text image to be processed through the text cue generator to obtain a text recognition result, generate a target contour image corresponding to the contour of the text to be repaired according to the text recognition result, and determine the text clue characteristics according to the text recognition result and the target contour image; and a restoration submodule, configured to perform image restoration on the text image to be processed through the restoration network according to the text clue characteristics to obtain the target text image.

Example 11 provides the apparatus of example 10, the text cue generator comprising a text recognizer, a language model, a painter, and a fusion network; the processing submodule is configured to: recognize the text to be repaired in the text image to be processed through the text recognizer to obtain a text recognition result; adjust the text recognition result through the language model to obtain an adjusted text recognition result; draw the contour of the text to be repaired through the painter according to the text recognition result to obtain the target contour image; and fuse the text recognition result, the adjusted text recognition result and the target contour image through the fusion network to obtain the text clue characteristics.

According to one or more embodiments of the present disclosure, example 12 provides the apparatus of example 10, the text cue generator comprising a text recognizer, a painter, and a fusion network; the processing submodule is configured to: recognize the text to be repaired in the text image to be processed through the text recognizer to obtain a text recognition result; draw the contour of the text to be repaired through the painter according to the text recognition result to obtain the target contour image; and fuse the text recognition result and the target contour image through the fusion network to obtain the text clue characteristics.

Example 13 provides the apparatus of example 11 or example 12, the processing submodule to: drawing the outline of the text to be repaired through the painter according to the text recognition result to obtain a plurality of outline images to be selected; and determining the similarity between each contour image to be selected and the text image to be processed, and taking the contour image to be selected with the highest similarity as the target contour image.

Example 14 provides the apparatus of any one of examples 9-12, the image processing model being trained by: acquiring a training sample set; the training sample set comprises training text images and training repair images corresponding to the training text images; and training a preset model according to the training sample set to obtain the image processing model.

Example 15 provides, in accordance with one or more embodiments of the present disclosure, a computer-readable medium having stored thereon a computer program that, when executed by a processing device, performs the steps of the methods of examples 1-7.

In accordance with one or more embodiments of the present disclosure, example 16 provides an electronic device, comprising: a storage device having a computer program stored thereon; processing means for executing the computer program in the storage means to implement the steps of the methods of examples 1-7.

The foregoing description is only a description of the preferred embodiments of the disclosure and of the technical principles employed. It will be appreciated by those skilled in the art that the scope of the disclosure is not limited to technical solutions formed by the particular combination of features described above, but also encompasses other technical solutions formed by any combination of the features described above or their equivalents without departing from the spirit of the disclosure. For example, a technical solution may be formed by replacing the above features with (but not limited to) features having similar functions disclosed in this disclosure.

Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limitations on the scope of the disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.

Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims. With regard to the apparatus in the above embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be described in detail here.

Claims

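1. An image processing method, characterized in that the method comprises:

acquiring a text image to be processed; the text image to be processed comprises a text to be repaired;

identifying the text to be repaired in the text image to be processed according to the text image to be processed, and determining text clue characteristics corresponding to the text to be repaired according to a text identification result of the text to be repaired;

and performing image restoration according to the text image to be processed and the text clue characteristics to obtain a restored target text image.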

2. The method according to claim 1, wherein the text to be repaired in the text image to be processed is identified according to the text image to be processed, and text cue features corresponding to the text to be repaired are determined according to a text identification result of the text to be repaired; performing image restoration according to the text image to be processed and the text clue characteristics to obtain a restored target text image, including:

according to the text image to be processed, performing image restoration on the text image to be processed through a pre-trained image processing model to obtain the restored target text image;

the image processing model is used for identifying the text to be repaired in the text image to be processed, determining text clue characteristics corresponding to the text to be repaired according to a text identification result of the text to be repaired, and performing image restoration according to the text image to be processed and the text clue characteristics to obtain the target text image.

3. The method of claim 2, wherein the image processing model comprises a text cue generator and a restoration network; the performing image restoration on the text image to be processed through a pre-trained image processing model according to the text image to be processed to obtain the restored target text image comprises:

identifying the text to be repaired in the text image to be processed through the text clue generator to obtain a text identification result, generating a target outline image corresponding to the outline of the text to be repaired according to the text identification result, and determining the text clue characteristics according to the text identification result and the target outline image;

and performing image restoration on the text image to be processed through the restoration network according to the text clue characteristics to obtain the target text image.

4. The method of claim 3, wherein the text cue generator comprises a text recognizer, a language model, a painter, and a fusion network; the recognizing the text to be repaired in the text image to be processed through the text cue generator to obtain the text recognition result, generating a target contour image corresponding to the contour of the text to be repaired according to the text recognition result, and determining the text clue characteristics according to the text recognition result and the target contour image, includes:

recognizing the text to be repaired in the text image to be processed through the text recognizer to obtain the text recognition result;

adjusting the text recognition result through the language model to obtain an adjusted text recognition result;

drawing the outline of the text to be repaired by the painter according to the text recognition result to obtain the target outline image;

and fusing the text recognition result, the adjusted text recognition result and the target contour image through the fusion network to obtain the text clue characteristics.

5. The method of claim 3, wherein the text cue generator comprises a text recognizer, a painter, and a fusion network; the recognizing the text to be repaired in the text image to be processed through the text cue generator to obtain the text recognition result, generating a target contour image corresponding to the contour of the text to be repaired according to the text recognition result, and determining the text clue characteristics according to the text recognition result and the target contour image, includes:

recognizing the text to be repaired in the text image to be processed through the text recognizer to obtain the text recognition result;

drawing the outline of the text to be repaired through the painter according to the text recognition result to obtain the target outline image;

and fusing the text recognition result and the target contour image through the fusion network to obtain the text clue characteristics.

6. The method according to claim 4 or 5, wherein the drawing the outline of the text to be repaired by the painter according to the text recognition result to obtain the target outline image comprises:

drawing the outline of the text to be repaired through the painter according to the text recognition result to obtain a plurality of outline images to be selected;

and determining the similarity between each contour image to be selected and the text image to be processed, and taking the contour image with the highest similarity as the target contour image.

7. The method of any of claims 2-5, wherein the image processing model is trained by:

acquiring a training sample set; the training sample set comprises training text images and training repair images corresponding to the training text images;

and training a preset model according to the training sample set to obtain the image processing model.

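8. An image processing apparatus, characterized in that the apparatus comprises:

an acquisition module, configured to acquire a text image to be processed; the text image to be processed comprises a text to be repaired;

a processing module, configured to identify the text to be repaired in the text image to be processed according to the text image to be processed, and determine text clue characteristics corresponding to the text to be repaired according to a text identification result of the text to be repaired;

the processing module being further configured to perform image restoration according to the text image to be processed and the text clue characteristics to obtain a restored target text image.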

9. A computer-readable medium, on which a computer program is stored, characterized in that the program, when being executed by processing means, carries out the steps of the method of any one of claims 1 to 7.

10. An electronic device, comprising:

a storage device having at least one computer program stored thereon;

at least one processing device for executing the at least one computer program in the storage device to carry out the steps of the method according to any one of claims 1 to 7.