CN112101368B - Character image processing method, device, equipment and medium

Character image processing method, device, equipment and medium

Info

Publication number
CN112101368B
CN112101368B
Authority
CN
China
Prior art keywords
character
area information
image
character area
target image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011004751.2A
Other languages
Chinese (zh)
Other versions
CN112101368A (en)
Inventor
曲福
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202011004751.2A
Publication of CN112101368A
Application granted
Publication of CN112101368B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition
    • G06V30/14 - Image acquisition
    • G06V30/148 - Segmentation of character regions
    • G06V30/153 - Segmentation of character regions using recognition of characters or words
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 - Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751 - Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Character Input (AREA)

Abstract

The application discloses a character image processing method, device, equipment and medium, relating to the technical field of cloud computing. The specific implementation scheme is as follows: recognize characters in a target image to obtain character area information and character form information in the target image; calibrate the character area information according to the character form information to obtain calibrated character area information; and remove the characters from the target image according to the calibrated character area information. Embodiments of the application calibrate the recognized character area information, thereby improving its accuracy and ensuring that the final character removal is accurate.

Description

Character image processing method, device, equipment and medium
Technical Field
The embodiment of the application relates to the technical field of image processing, in particular to a cloud computing technology, and particularly relates to a character image processing method, device, equipment and medium.
Background
With the continuous progress of artificial intelligence technology, it is increasingly common to use artificial intelligence to analyze image documents intelligently, for example to correct their orientation and skew, analyze their layout, and recognize their content. Such technology greatly facilitates the entry and review of image documents by staff, improving the efficiency of various business processes and greatly shortening processing time.
Disclosure of Invention
The embodiment of the application discloses a character image processing method, a device, equipment and a medium, which are used for solving the problem of low accuracy in character area information identification in the prior art.
According to an aspect of the present disclosure, there is provided a character image processing method including:
identifying characters in a target image to obtain character area information and character form information in the target image;
calibrating the character area information according to the character form information to obtain calibrated character area information;
and eliminating characters from the target image according to the calibrated character area information.
According to another aspect of the present disclosure, there is provided a character image processing apparatus including:
the character recognition module is used for recognizing characters in the target image to obtain character area information and character form information in the target image;
the area calibration module is used for calibrating the character area information according to the character form information to obtain calibrated character area information;
and the character rejecting module is used for rejecting characters from the target image according to the calibrated character area information.
According to another aspect of the disclosure, an embodiment of the present application further discloses an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the character image processing method according to any one of the embodiments of the present application.
According to another aspect of the disclosure, an embodiment of the present application also discloses a non-transitory computer-readable storage medium storing computer instructions for causing the computer to execute the character image processing method according to any one of the embodiments of the present application.
According to the technology provided by the application, the effect of calibrating the identified character area information is realized, so that the accuracy of the character area information is improved, and the accuracy of finally eliminating characters is ensured.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are included to provide a better understanding of the present application and are not to be construed as limiting the application.
FIG. 1 is a flow chart of a character image processing method disclosed in accordance with an embodiment of the present application;
FIG. 2A is a flow chart of a character image processing method disclosed in accordance with an embodiment of the present application;
FIG. 2B is a schematic diagram of a related character image processing scenario disclosed in accordance with an embodiment of the present application;
FIG. 2C is a schematic diagram of a character image processing scenario disclosed in accordance with an embodiment of the present application;
FIG. 3A is a flow chart of a character image processing method disclosed in accordance with an embodiment of the present application;
FIG. 3B is a schematic diagram of a character image process according to an embodiment of the present application;
FIG. 3C is a schematic diagram of a character image processing scenario disclosed in accordance with an embodiment of the present application;
fig. 4 is a schematic diagram of a character image processing apparatus according to an embodiment of the present application;
fig. 5 is a block diagram of an electronic device disclosed in accordance with an embodiment of the application.
Detailed Description
Exemplary embodiments of the present application will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present application are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In the research and development process, the applicant found that, at present, to reconstruct a table in a document image, character area information in the table is generally identified using character recognition technology, characters are then removed according to the identified character area information, and only the table lines are retained, completing the reconstruction. However, the character area information identified by existing optical character recognition technology has low accuracy: it may be too large or too small. If the recognized character area information is too large and the table grid lines are close to the characters in the table, the grid lines are misidentified as part of the character area and removed along with the characters; if it is too small, it cannot fully cover the character image, so the character removal is incomplete. In short, table reconstruction based on character removal with existing character recognition technology performs poorly.
Fig. 1 is a flowchart of a character image processing method according to an embodiment of the present application, which can be applied to removing character images from a document image. The method of this embodiment can be performed by a character image processing apparatus, which can be implemented by software and/or hardware and can be integrated on any electronic device with computing capability.
As shown in fig. 1, the character image processing method disclosed in the present embodiment may include:
s101, identifying characters in a target image to obtain character area information and character form information in the target image.
The target image may be any document image or video-frame image containing character information, and its format includes, but is not limited to, bmp, jpg, png, and the like.
In one embodiment, after the target image is acquired, it is preprocessed by, for example, image noise reduction, image enhancement, image smoothing, or image binarization. Characters in the preprocessed target image are then recognized using an existing character recognition technique, including but not limited to OCR (Optical Character Recognition), to obtain character area information and character form information in the target image. The character area information reflects the position of each character image in the target image and may take, among others, the following two alternative forms: 1. the coordinates of the four corners of the character image's circumscribed rectangle; 2. the height and width of the circumscribed rectangle, together with the distances from its center point to the left/right and upper/lower boundaries of the target image. The character form information reflects the appearance of the character image.
By identifying the characters in the target image, character area information and character form information in the target image are obtained, the area positioning of the character image is realized, and a foundation is laid for the subsequent calibration of the character area information.
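As an illustration of S101, the following minimal Python sketch preprocesses a target image and runs OCR to collect bounding-rectangle character area information (the first of the two forms described above) together with the recognized content. OpenCV and pytesseract are assumptions of this sketch; the embodiment names OCR only generically.

```python
import cv2
import pytesseract
from pytesseract import Output

def recognize_characters(image_path):
    """Sketch of S101: preprocess the target image, then run OCR to obtain
    character area information (bounding rectangles, form 1 above) and
    character form information (the recognized content)."""
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Image preprocessing as the embodiment suggests: noise reduction and binarization.
    gray = cv2.fastNlMeansDenoising(gray)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    data = pytesseract.image_to_data(binary, output_type=Output.DICT)
    results = []
    for i, text in enumerate(data["text"]):
        if not text.strip():
            continue
        x, y, w, h = (data[k][i] for k in ("left", "top", "width", "height"))
        results.append({"content": text, "box": (x, y, x + w, y + h)})
    return img, results
```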
S102, calibrating the character area information according to the character form information to obtain calibrated character area information.
In one embodiment, according to the character form information of any character image recognized in the target image, a clean image conforming to the appearance of that character image is created: the clean image contains all of the character pixels of the character image and none of the non-character pixels. The pixel points in the clean image are then matched against the pixel points included in the character area information. If every pixel point in the clean image can be matched with a pixel point in the character area information, calibrated character area information is obtained from the matched pixel points in the character area information; if any pixel point in the clean image cannot be matched, the character area information is expanded to obtain the calibrated character area information.
Optionally, S102 includes:
generating a character template image according to the character form information; and registering the character template image with the target image according to the character area information, and obtaining calibrated character area information according to a registration result.
In one embodiment, a character template image is generated from the character form information using an existing document editing tool. The character template image is registered against the original image located at the character area information in the target image using a classical registration algorithm, such as the mean absolute difference algorithm, the sum of squared errors algorithm, the normalized product correlation algorithm, or the sequential similarity detection algorithm. If every pixel point in the template image can be successfully registered with a pixel point in the original image, calibrated character area information is obtained from the successfully registered pixel points in the original image; if any pixel point in the template image cannot be matched, the character area information is expanded to obtain the calibrated character area information.
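This registration step can be sketched as classical template matching over a small search window around the recognized character area. The sketch below assumes OpenCV's normalized correlation criterion stands in for the classical algorithms listed above (cv2.TM_SQDIFF would give the sum-of-squared-errors variant); template_img is a grayscale numpy array of the rendered template, and margin is an assumed search slack, not something the embodiment specifies.

```python
import cv2

def register_template(target_gray, template_img, box, margin=5):
    """Sketch of the registration step: resize the rendered character template
    to the recognized character area, slide it over a search window around
    that area, and keep the position with the best correlation response."""
    x0, y0, x1, y1 = box
    h, w = target_gray.shape
    template = cv2.resize(template_img, (x1 - x0, y1 - y0))
    # Search window: the recognized area plus an assumed margin on all sides.
    sx0, sy0 = max(0, x0 - margin), max(0, y0 - margin)
    sx1, sy1 = min(w, x1 + margin), min(h, y1 + margin)
    window = target_gray[sy0:sy1, sx0:sx1]
    # Normalized correlation is one classical matching criterion named in the text.
    resp = cv2.matchTemplate(window, template, cv2.TM_CCOEFF_NORMED)
    _, score, _, (mx, my) = cv2.minMaxLoc(resp)
    th, tw = template.shape[:2]
    # Calibrated area: where the template actually landed in the target image.
    return (sx0 + mx, sy0 + my, sx0 + mx + tw, sy0 + my + th), score
```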
According to the character form information, a character template image is generated, the character template image and the target image are registered according to the character area information, and calibrated character area information is obtained according to a registration result.
The character area information is calibrated according to the character form information to obtain calibrated character area information, so that the effect of calibrating the identified character area information is achieved, and the accuracy of the character area information is improved.
S103, eliminating characters from the target image according to the calibrated character area information.
In one embodiment, a background image of the target image is acquired. Optionally, the acquisition method includes: detecting contour pixel points of the target image, determining all contour pixel points in the target image, and taking the target image with the contour pixel points removed as the background image. The average pixel value of the determined background image is then calculated as the background value, and finally the pixel values of the target-image pixel points inside the calibrated character area information are set to the background value.
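A minimal sketch of this removal step, assuming Canny edge detection stands in for the contour-pixel detection the embodiment leaves unspecified (its thresholds here are arbitrary):

```python
import cv2

def remove_characters(img, calibrated_boxes):
    """Sketch of S103: estimate a background value from the non-contour
    pixels of the target image, then paint the calibrated character
    areas with that value."""
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Contour pixel detection; Canny thresholds are assumptions.
    edges = cv2.Canny(gray, 50, 150)
    # Background image: the target image with contour pixels removed;
    # its average pixel value serves as the background value.
    bg_value = img[edges == 0].mean(axis=0).astype(img.dtype)
    out = img.copy()
    for x0, y0, x1, y1 in calibrated_boxes:
        out[y0:y1, x0:x1] = bg_value  # set pixels in the calibrated area to the background value
    return out
```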
Removing characters from the target image according to the calibrated character area information achieves the removal of character images from the target image. Because the character area information has been calibrated, the accuracy of the final removal is guaranteed, meeting user needs for processing image documents, such as table reconstruction or watermark removal.
According to the technical scheme of this embodiment, the recognized character area information is calibrated according to the recognized character form information, and characters are removed from the target image according to the calibrated character area information. This achieves the calibration of the recognized character area information and improves its accuracy, avoiding the problems that over-large character area information may include non-character information and that over-small character area information cannot fully cover the character image. Accordingly, the accuracy of the final character removal is guaranteed, and the non-character information in the target image is effectively retained.
Fig. 2A is a flowchart of a character image processing method according to an embodiment of the present application, which is further optimized and expanded based on the above technical solution, and may be combined with the above various alternative embodiments.
As shown in fig. 2A, the method may include:
s201, identifying characters in a target image to obtain character area information and character form information in the target image.
S202, generating the character template image according to the character content and the character font in the character form information.
The character content is the textual appearance of the character image; for example, the characters "项" (item) or "目" (mesh) can serve as character content. The character font is the typeface of the character image; for example, "Song Ti", "regular script", or "cursive script" can serve as the character font.
Generating the character template image from the character content and character font in the character form information makes the template image more standard, ensuring the reliability of the final character area information calibration.
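As a sketch of S202, the template can be rendered with Pillow from the recognized content and font. Mapping a recognized font name (e.g. Song Ti) to a concrete font file is environment-specific, so font_path here is a hypothetical parameter.

```python
from PIL import Image, ImageDraw, ImageFont

def render_character_template(content, font_path, box):
    """Sketch of S202: render a clean character template image from the
    recognized character content and character font."""
    x0, y0, x1, y1 = box
    height = max(y1 - y0, 1)
    font = ImageFont.truetype(font_path, size=height)
    # White canvas sized from the recognized area, black glyphs, matching
    # a typical binarized document image.
    canvas = Image.new("L", (max(x1 - x0, 1), height), color=255)
    ImageDraw.Draw(canvas).text((0, 0), content, fill=0, font=font)
    return canvas
```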
And S203, registering the character template image and the target image according to the character area information to obtain target pixel points in the target image associated with the template pixel points in the character template image.
In one embodiment, the original image located at the character area information in the target image is determined according to the character area information; the character template image is registered with this original image, a mapping relationship between pixel points in the character template image and pixel points in the original image is established, and the target pixel points in the target image associated with the template pixel points in the character template image are obtained according to this mapping relationship.
S204, the area information of the target pixel point in the target image is used as calibrated character area information.
In one embodiment, according to the position information of each target pixel point in the target image, the area information of all the target pixel points in the target image is obtained, and the area information is used as the calibrated character area information.
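A one-function sketch of S204, taking the axis-aligned rectangle spanned by the registered target pixel points as the calibrated character area information (the (x, y) point-list format is an assumption):

```python
import numpy as np

def region_from_pixels(target_points):
    """Sketch of S204: take the rectangle spanned by the target pixel
    points associated with template pixel points as the calibrated
    character area information."""
    pts = np.asarray(target_points)   # shape (N, 2), one (x, y) per target pixel
    x0, y0 = pts.min(axis=0)
    x1, y1 = pts.max(axis=0)
    return int(x0), int(y0), int(x1) + 1, int(y1) + 1
```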
S205, eliminating characters from the target image according to the calibrated character area information.
Fig. 2B is a schematic diagram of a character image processing scenario in the related art. As shown in fig. 2B, characters in a target image 20 are recognized to obtain character area information 21 in the target image 20, and characters are removed from the target image 20 according to the character area information 21, yielding an image 22 with the characters removed. Because the accuracy of the character area information 21 is limited by the character recognition accuracy, the character area information 21 includes not only the characters but also a non-character table line 23; if characters are removed directly according to the character area information 21, the table line 23 is removed along with them.
Fig. 2C is a schematic diagram of a character image processing scenario according to an embodiment of the present application, where, as shown in fig. 2C, characters in the target image 20 are identified, so as to obtain character area information 21 and character form information in the target image 20. According to the embodiment of the application, the character template image 24 is generated according to the character content and the character font in the character form information, and the registration is carried out on the character template image 24 and the target image 20, so that the target pixel point in the target image 20 associated with the template pixel point in the character template image 24 is obtained. Finally, the area information of the target pixel point in the target image 20 is used as calibrated character area information 25, and characters are removed from the target image 20 according to the calibrated character area information 25, so that a character-removed image 26 is obtained.
According to the technical scheme of the embodiment, the target pixel point in the target image associated with the template pixel point in the character template image is obtained by registering the character template image and the target image according to the character area information, and the area information of the target pixel point in the target image is used as the calibrated character area information, so that the effect of calibrating the identified character area information is achieved, the accuracy of the character area information is improved, the problem that non-character information is possibly included if the character area information is overlarge is avoided, and the non-character information in the target image is effectively reserved when characters are removed according to the calibrated character area information.
Fig. 3A is a flowchart of a character image processing method according to an embodiment of the present application, which is further optimized and expanded based on the above technical solution, and may be combined with the above various alternative embodiments.
As shown in fig. 3A, the method may include:
s301, identifying characters in a target image to obtain character area information and character form information in the target image.
S302, generating the character template image according to the character content and the character font in the character form information.
S303, determining a mapping relation between template pixel points in the character template image and pixel points at the character region information based on an image registration technology.
In one embodiment, the image located at the character area information in the target image is taken as the original image. Based on an image registration technique, the character template image is registered with the original image: features of both images are first extracted, feature descriptors are then generated, and finally the features of the two images are matched according to the similarity of their descriptors, thereby determining the mapping relationship between template pixel points in the character template image and pixel points at the character area information. The features include, but are not limited to, point features, edge features, and area features.
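A sketch of this feature-based registration using ORB keypoints and a ratio test; ORB and the ratio threshold are assumptions, since the embodiment names point/edge/area features only generically.

```python
import cv2

def match_template_pixels(template_gray, region_gray, ratio=0.75):
    """Sketch of S303 as feature-based registration: extract features,
    generate descriptors, then match by descriptor similarity to build
    the template-pixel-to-region-pixel mapping."""
    orb = cv2.ORB_create()
    kp_t, des_t = orb.detectAndCompute(template_gray, None)
    kp_r, des_r = orb.detectAndCompute(region_gray, None)
    if des_t is None or des_r is None:
        return []  # no features found: registration fails outright
    matches = cv2.BFMatcher(cv2.NORM_HAMMING).knnMatch(des_t, des_r, k=2)
    mapping = []
    for pair in matches:
        # Lowe-style ratio test keeps only sufficiently distinctive matches.
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            mapping.append((kp_t[pair[0].queryIdx].pt, kp_r[pair[0].trainIdx].pt))
    return mapping
```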
S304, if the registration of any template pixel point fails, expanding the character area information, and continuing the registration according to the expanded character area information.
In one embodiment, if registration of any template pixel point fails, that is, a template pixel point fails to establish a mapping relationship with any pixel point at the character area information, this indicates that the current character area information does not cover the whole character image. The current character area information is therefore enlarged so that it can cover the whole character image, and registration continues according to the enlarged character area information.
Optionally, expanding the character area information includes:
and expanding the character area information according to the relative position of the character area information in the target image.
Both the character area information and the target image default to rectangles. The relative position is one of the following. Relative position 1: only the left boundary of the character area information coincides with the left boundary of the target image. Relative position 2: only the right boundary coincides with the right boundary of the target image. Relative position 3: only the upper boundary coincides with the upper boundary of the target image. Relative position 4: only the lower boundary coincides with the lower boundary of the target image. Relative position 5: the left boundary coincides with the left boundary of the target image, and the upper boundary also coincides with the upper boundary. Relative position 6: the left boundary coincides with the left boundary, and the lower boundary also coincides with the lower boundary. Relative position 7: the right boundary coincides with the right boundary, and the upper boundary also coincides with the upper boundary. Relative position 8: the right boundary coincides with the right boundary, and the lower boundary also coincides with the lower boundary. Relative position 9: no boundary of the character area information coincides with a boundary of the target image.
In one embodiment, the character area information is expanded by a preset number of pixels on every side that does not coincide with a boundary of the target image. For relative position 1, the areas above, below, and to the right of the character area information are expanded; for relative position 2, the areas above, below, and to the left; for relative position 3, the areas to the left, below, and to the right; for relative position 4, the areas above, to the left, and to the right; for relative position 5, the areas below and to the right; for relative position 6, the areas above and to the right; for relative position 7, the areas below and to the left; for relative position 8, the areas above and to the left; and for relative position 9, the areas on all four sides.
By expanding the character area information according to the relative positions of the character area information in the target image, the effect of expanding the character area information according to different relative positions is realized, and the flexibility of expanding the character area information is improved.
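The nine enumerated cases collapse into one rule: expand every side of the character area that does not already coincide with the corresponding boundary of the target image. A sketch of that rule, with an assumed preset pixel count pad:

```python
def expand_region(box, img_w, img_h, pad=4):
    """Sketch of the expansion rule: grow the character area by a preset
    number of pixels on every side that does not already coincide with
    the corresponding boundary of the target image."""
    x0, y0, x1, y1 = box
    if x0 > 0:
        x0 = max(0, x0 - pad)        # expand left unless on the left edge
    if y0 > 0:
        y0 = max(0, y0 - pad)        # expand up unless on the top edge
    if x1 < img_w:
        x1 = min(img_w, x1 + pad)    # expand right unless on the right edge
    if y1 < img_h:
        y1 = min(img_h, y1 + pad)    # expand down unless on the bottom edge
    return x0, y0, x1, y1
```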
S305, obtaining target pixel points in a target image associated with the template pixel points in the character template image, and taking the area information of the target pixel points in the target image as calibrated character area information.
S306, eliminating characters from the target image according to the calibrated character area information.
According to the technical scheme of the embodiment, the mapping relation between the template pixel points in the character template image and the pixel points at the character area information is determined based on the image registration technology, if any template pixel point fails to register, the character area information is enlarged, and the registration is continued according to the enlarged character area information, so that the effect of calibrating the recognized character area information is achieved, the accuracy of the character area information is improved, the problem that if the character area information is too small, the character area information cannot fully cover the character image is avoided, and then when characters are removed according to the calibrated character area information, the problem that only part of the character image is removed, and the character removing effect is poor is avoided.
Fig. 3B is a schematic diagram of character image processing according to an embodiment of the present application, as shown in fig. 3B, including:
s30, acquiring a target image;
s31, identifying the characters in the target image to obtain character area information, character content and character fonts in the target image.
S32, generating a character template image according to the character content and the character font in the character form information.
S331, registering the character template image and the target image according to the character area information to obtain target pixel points in the target image associated with the template pixel points in the character template image, and taking the area information of the target pixel points in the target image as calibrated character area information.
S332, based on an image registration technology, determining a mapping relation between template pixel points in the character template image and pixel points at the character region information, if any template pixel point fails to register, expanding the character region information, and continuing to register according to the expanded character region information.
S3321, obtaining target pixel points in the target image associated with the template pixel points in the character template image, and taking the area information of the target pixel points in the target image as calibrated character area information.
S34, eliminating characters from the target image according to the calibrated character area information.
Steps S30 to S34 have been explained in the above embodiments and are not repeated here.
Fig. 3C is a schematic diagram of a character image processing scenario according to an embodiment of the present application. As shown in fig. 3C, characters in a target image 300 are recognized to obtain character area information 301 and character form information in the target image 300. The character area information 301 covers only part of the character; if characters were removed directly according to it, the removal would be incomplete. Further, a character template image 302 is generated according to the character content and character font in the character form information. Based on an image registration technique, the mapping relationship between template pixel points in the character template image 302 and pixel points at the character area information 301 is determined; if registration of any template pixel point fails, the character area information 301 is enlarged, and registration continues according to the enlarged character area information 303. Finally, the target pixel points in the target image 300 associated with the template pixel points in the character template image 302 are obtained, the area where these target pixel points are located is taken as calibrated character area information 304, and characters are removed from the target image 300 according to the calibrated character area information 304, yielding an image 305 with the characters removed.
Fig. 4 is a schematic structural diagram of a character image processing apparatus according to an embodiment of the present application, which can be applied to removing character images from a document image. The apparatus of this embodiment can be implemented by software and/or hardware and can be integrated on any electronic device with computing capability.
As shown in fig. 4, the character image processing apparatus 40 disclosed in the present embodiment may include a character recognition module 41, a region calibration module 42, and a character culling module 43, wherein:
a character recognition module 41, configured to recognize characters in a target image, and obtain character area information and character morphology information in the target image;
the area calibration module 42 is configured to calibrate the character area information according to the character form information to obtain calibrated character area information;
and a character rejecting module 43, configured to reject characters from the target image according to the calibrated character area information.
Optionally, the area calibration module 42 is specifically configured to:
generating a character template image according to the character form information;
and registering the character template image with the target image according to the character area information, and obtaining calibrated character area information according to a registration result.
Optionally, the area calibration module 42 is specifically further configured to:
and generating the character template image according to the character content and the character font in the character form information.
Optionally, the area calibration module 42 is specifically further configured to:
registering the character template image and the target image according to the character area information to obtain target pixel points in the target image associated with the template pixel points in the character template image;
and taking the area information of the target pixel point in the target image as calibrated character area information.
Optionally, the area calibration module 42 is specifically further configured to:
determining a mapping relation between template pixel points in the character template image and pixel points at the character region information based on an image registration technology;
if the registration of any template pixel point fails, the character area information is enlarged, and the registration is continued according to the enlarged character area information.
Optionally, the area calibration module 42 is specifically further configured to:
and expanding the character area information according to the relative position of the character area information in the target image.
The character image processing device 40 disclosed in the embodiment of the application can execute the character image processing method disclosed in the embodiment of the application, and has the corresponding functional modules and beneficial effects of the execution method. Reference is made to the description of any method embodiment of the application for details not described in this embodiment.
According to an embodiment of the present application, the present application also provides an electronic device and a readable storage medium.
As shown in fig. 5, there is a block diagram of an electronic device for the character image processing method according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown here, their connections and relationships, and their functions are meant to be exemplary only and are not meant to limit implementations of the application described and/or claimed herein.
As shown in fig. 5, the electronic device includes: one or more processors 501, memory 502, and interfaces for connecting components, including high-speed interfaces and low-speed interfaces. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executing within the electronic device, including instructions stored in or on memory to display graphical information of a GUI on an external input/output device, such as a display device coupled to the interface. In other embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories and multiple storage devices. Also, multiple electronic devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). One processor 501 is illustrated in fig. 5.
Memory 502 is a non-transitory computer readable storage medium provided by the present application. The memory stores instructions executable by the at least one processor to cause the at least one processor to perform the character image processing method provided by the application. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to execute the character image processing method provided by the present application.
The memory 502 is used as a non-transitory computer readable storage medium for storing non-transitory software programs, non-transitory computer executable programs, and modules such as program instructions/modules (e.g., the character recognition module 41, the region calibration module 42, and the character culling module 43 shown in fig. 4) corresponding to the character image processing method according to the embodiment of the present application. The processor 501 executes various functional applications of the server and data processing, that is, implements the character image processing method in the above-described method embodiment, by running a non-transitory software program, instructions, and modules stored in the memory 502.
Memory 502 may include a program storage area and a data storage area; the program storage area may store an operating system and the application programs required for at least one function, and the data storage area may store data created according to the use of the electronic device of the character image processing method, and the like. In addition, memory 502 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, memory 502 may optionally include memory located remotely from processor 501, which may be connected to the electronic device of the character image processing method via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device of the character image processing method may further include: an input device 503 and an output device 504. The processor 501, memory 502, input devices 503 and output devices 504 may be connected by a bus or otherwise, for example in fig. 5.
The input device 503 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device of the character image processing method, such as a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointer stick, one or more mouse buttons, a track ball, a joystick, etc. The output devices 504 may include a display device, auxiliary lighting devices (e.g., LEDs), and haptic feedback devices (e.g., vibration motors), among others. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also referred to as programs, software, software applications, or code) include machine instructions for a programmable processor, and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic disks, optical disks, memory, programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local area networks (LANs), wide area networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the technical scheme provided by the embodiment of the application, the effect of calibrating the identified character region information is realized, so that the accuracy of the character region information is improved, and the accuracy of finally eliminating characters is ensured.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, provided that the desired results of the disclosed embodiments are achieved, and are not limited herein.
The above embodiments do not limit the scope of the present application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application should be included in the scope of the present application.

Claims (8)

1. A character image processing method, the method comprising:
identifying characters in a target image to obtain character area information and character form information in the target image;
calibrating the character area information according to the character form information to obtain calibrated character area information;
removing characters from the target image according to the calibrated character area information;
the step of calibrating the character area information according to the character form information to obtain calibrated character area information comprises the following steps:
generating a character template image according to the character content and the character font in the character form information;
registering the character template image and the target image according to the character area information, and obtaining calibrated character area information according to a registration result;
registering the character template image and the target image according to the character area information, and obtaining calibrated character area information according to a registration result, wherein the registering comprises the following steps:
registering the character template image and the target image according to the character area information to obtain target pixel points in the target image associated with the template pixel points in the character template image;
the area information of the target pixel point in the target image is used as calibrated character area information;
the registering the character template image and the target image according to the character area information comprises the following steps:
if the registration of any template pixel point fails, the character area information is enlarged, and the registration is continued according to the enlarged character area information.
2. The method of claim 1, wherein the registering the character template image with the target image according to the character region information comprises:
and determining a mapping relation between template pixel points in the character template image and pixel points at the character region information based on an image registration technology.
3. The method of claim 1, wherein expanding the character region information comprises:
and expanding the character area information according to the relative position of the character area information in the target image.
4. A character image processing apparatus, the apparatus comprising:
the character recognition module is used for recognizing characters in the target image to obtain character area information and character form information in the target image;
the area calibration module is used for calibrating the character area information according to the character form information to obtain calibrated character area information;
the character rejecting module is used for rejecting characters from the target image according to the calibrated character area information;
the area calibration module is specifically configured to:
generating a character template image according to the character content and the character font in the character form information;
registering the character template image and the target image according to the character area information, and obtaining calibrated character area information according to a registration result;
the area calibration module is specifically further configured to:
registering the character template image and the target image according to the character area information to obtain target pixel points in the target image associated with the template pixel points in the character template image;
the area information of the target pixel point in the target image is used as calibrated character area information;
the area calibration module is specifically further configured to:
if the registration of any template pixel point fails, the character area information is enlarged, and the registration is continued according to the enlarged character area information.
5. The apparatus of claim 4, wherein the zone calibration module is further specifically configured to:
and determining a mapping relation between template pixel points in the character template image and pixel points at the character region information based on an image registration technology.
6. The apparatus of claim 4, wherein the zone calibration module is further specifically configured to:
and expanding the character area information according to the relative position of the character area information in the target image.
7. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the character image processing method of any one of claims 1-3.
8. A non-transitory computer-readable storage medium storing computer instructions for causing the computer to execute the character image processing method according to any one of claims 1 to 3.
CN202011004751.2A 2020-09-22 2020-09-22 Character image processing method, device, equipment and medium Active CN112101368B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011004751.2A CN112101368B (en) 2020-09-22 2020-09-22 Character image processing method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011004751.2A CN112101368B (en) 2020-09-22 2020-09-22 Character image processing method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN112101368A CN112101368A (en) 2020-12-18
CN112101368B (en) 2023-08-18

Family

ID=73754996

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011004751.2A Active CN112101368B (en) 2020-09-22 2020-09-22 Character image processing method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN112101368B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108388898A (en) * 2018-01-31 2018-08-10 宁波市科技园区明天医网科技有限公司 Character identifying method based on connector and template
CN108830275A (en) * 2018-05-07 2018-11-16 广东省电信规划设计院有限公司 Dot character, the recognition methods of dot matrix digit and device
CN109145904A (en) * 2018-08-24 2019-01-04 讯飞智元信息科技有限公司 A kind of character identifying method and device
CN110895696A (en) * 2019-11-05 2020-03-20 泰康保险集团股份有限公司 Image information extraction method and device
CN111639636A (en) * 2020-05-29 2020-09-08 北京奇艺世纪科技有限公司 Character recognition method and device
CN111680688A (en) * 2020-06-10 2020-09-18 创新奇智(成都)科技有限公司 Character recognition method and device, electronic equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6041836B2 (en) * 2014-07-30 2016-12-14 京セラドキュメントソリューションズ株式会社 Image processing apparatus and image processing program
CN105844205B (en) * 2015-01-15 2019-05-31 新天科技股份有限公司 Character information recognition methods based on image procossing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
PROTECTIVE RELAYING PERFORMANCE REPORTING; R. P. Taylor et al.; IEEE Transactions on Power Delivery; Vol. 7, No. 4; pp. 1892-1899 *

Also Published As

Publication number Publication date
CN112101368A (en) 2020-12-18

Similar Documents

Publication Publication Date Title
CN111753727B (en) Method, apparatus, device and readable storage medium for extracting structured information
US20230409771A1 (en) Building information model (bim) element extraction from floor plan drawings using machine learning
CN111523468B (en) Human body key point identification method and device
US10769427B1 (en) Detection and definition of virtual objects in remote screens
CN104182748B (en) One kind is based on the matched Chinese-character stroke extraction method of fractionation
CN111507354B (en) Information extraction method, device, equipment and storage medium
CN111753717B (en) Method, device, equipment and medium for extracting structured information of text
CN113627439A (en) Text structuring method, processing device, electronic device and storage medium
CN111209909B (en) Construction method, device, equipment and storage medium for qualification recognition template
CN111640123B (en) Method, device, equipment and medium for generating background-free image
CN112115921A (en) True and false identification method and device and electronic equipment
CN111832396B (en) Method and device for analyzing document layout, electronic equipment and storage medium
CN113610809B (en) Fracture detection method, fracture detection device, electronic equipment and storage medium
CN111709428A (en) Method and device for identifying key point positions in image, electronic equipment and medium
JP7389824B2 (en) Object identification method and device, electronic equipment and storage medium
CN111563453B (en) Method, apparatus, device and medium for determining table vertices
CN111523292B (en) Method and device for acquiring image information
CN111552829B (en) Method and apparatus for analyzing image material
Belhedi et al. Adaptive scene‐text binarisation on images captured by smartphones
CN112101368B (en) Character image processing method, device, equipment and medium
US20230005171A1 (en) Visual positioning method, related apparatus and computer program product
CN114511863B (en) Table structure extraction method and device, electronic equipment and storage medium
CN111476090A (en) Watermark identification method and device
CN111767859A (en) Image correction method and device, electronic equipment and computer-readable storage medium
CN114998906B (en) Text detection method, training method and device of model, electronic equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant