CN112101368B - Character image processing method, device, equipment and medium

Character image processing method, device, equipment and medium

Info

Publication number
CN112101368B
CN112101368B
Authority
CN
China
Prior art keywords
character
area information
image
character area
target image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011004751.2A
Other languages
Chinese (zh)
Other versions
CN112101368A (en)
Inventor
曲福
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202011004751.2A
Publication of CN112101368A
Application granted
Publication of CN112101368B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition
    • G06V30/14 - Image acquisition
    • G06V30/148 - Segmentation of character regions
    • G06V30/153 - Segmentation of character regions using recognition of characters or words
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 - Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751 - Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Character Input (AREA)

Abstract

The application discloses a character image processing method, device, equipment and medium, relating to the technical field of cloud computing. The specific implementation scheme is as follows: recognize characters in a target image to obtain character area information and character form information in the target image; calibrate the character area information according to the character form information to obtain calibrated character area information; and remove the characters from the target image according to the calibrated character area information. Embodiments of the application calibrate the recognized character area information, thereby improving its accuracy and ensuring that the final character removal is accurate.

Description

Character image processing method, device, equipment and medium
Technical Field
The embodiment of the application relates to the technical field of image processing, in particular to a cloud computing technology, and particularly relates to a character image processing method, device, equipment and medium.
Background
With the continuous progress of artificial intelligence technology, it is increasingly common to use artificial intelligence to analyze image documents intelligently, for example to correct their orientation and skew, analyze their layout, and recognize their content. Such technology greatly facilitates the entry and review of image documents by staff, improving the efficiency of various business processes and greatly shortening processing time.
Disclosure of Invention
The embodiment of the application discloses a character image processing method, a device, equipment and a medium, which are used for solving the problem of low accuracy in character area information identification in the prior art.
According to an aspect of the present disclosure, there is provided a character image processing method including:
identifying characters in a target image to obtain character area information and character form information in the target image;
calibrating the character area information according to the character form information to obtain calibrated character area information;
and eliminating characters from the target image according to the calibrated character area information.
According to another aspect of the present disclosure, there is provided a character image processing apparatus including:
the character recognition module is used for recognizing characters in the target image to obtain character area information and character form information in the target image;
the area calibration module is used for calibrating the character area information according to the character form information to obtain calibrated character area information;
and the character rejecting module is used for rejecting characters from the target image according to the calibrated character area information.
According to another aspect of the disclosure, an embodiment of the present application further discloses an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the character image processing method according to any one of the embodiments of the present application.
According to another aspect of the disclosure, an embodiment of the present application also discloses a non-transitory computer-readable storage medium storing computer instructions for causing the computer to execute the character image processing method according to any one of the embodiments of the present application.
According to the technology provided by the application, the effect of calibrating the identified character area information is realized, so that the accuracy of the character area information is improved, and the accuracy of finally eliminating characters is ensured.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are included to provide a better understanding of the present application and are not to be construed as limiting the application.
FIG. 1 is a flow chart of a character image processing method disclosed in accordance with an embodiment of the present application;
FIG. 2A is a flow chart of a character image processing method disclosed in accordance with an embodiment of the present application;
FIG. 2B is a schematic diagram of a related character image processing scenario disclosed in accordance with an embodiment of the present application;
FIG. 2C is a schematic diagram of a character image processing scenario disclosed in accordance with an embodiment of the present application;
FIG. 3A is a flow chart of a character image processing method disclosed in accordance with an embodiment of the present application;
FIG. 3B is a schematic diagram of a character image process according to an embodiment of the present application;
FIG. 3C is a schematic diagram of a character image processing scenario disclosed in accordance with an embodiment of the present application;
fig. 4 is a schematic diagram of a character image processing apparatus according to an embodiment of the present application;
fig. 5 is a block diagram of an electronic device disclosed in accordance with an embodiment of the application.
Detailed Description
Exemplary embodiments of the present application will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present application are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In the research and development process, the applicant found that, at present, to reconstruct a table in a document image, character area information in the table is generally identified using character recognition technology, characters are then removed according to the identified character area information, and only the table lines are retained, completing the reconstruction. However, the character area information identified by existing optical character recognition technology has low accuracy: it may be too large or too small. If the recognized character area information is too large and the table grid lines are close to the characters in the table, the grid lines are misidentified as part of the character area and removed along with the characters; if it is too small, it cannot fully cover the character image, so the character removal is incomplete. In short, table reconstruction based on character removal with existing character recognition technology performs poorly.
Fig. 1 is a flowchart of a character image processing method according to an embodiment of the present application, which can be applied to removing character images from a document image. The method of this embodiment can be performed by a character image processing apparatus, which can be implemented by software and/or hardware and can be integrated on any electronic device with computing capability.
As shown in fig. 1, the character image processing method disclosed in the present embodiment may include:
s101, identifying characters in a target image to obtain character area information and character form information in the target image.
The target image may be any document image or video-frame image containing character information, and its format includes, but is not limited to, bmp, jpg, png, and the like.
In one embodiment, after the target image is acquired, it is preprocessed by, for example, image noise reduction, image enhancement, image smoothing, or image binarization. Characters in the preprocessed target image are then recognized using an existing character recognition technique, including but not limited to OCR (Optical Character Recognition), to obtain character area information and character form information in the target image. The character area information reflects the position of each character image in the target image and may take, among others, the following two alternative forms: 1. the coordinates of the four corners of the character image's circumscribed rectangle; 2. the height and width of the circumscribed rectangle, together with the distances from its center point to the left/right and upper/lower boundaries of the target image. The character form information reflects the appearance of the character image.
By identifying the characters in the target image, character area information and character form information in the target image are obtained, the area positioning of the character image is realized, and a foundation is laid for the subsequent calibration of the character area information.
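As an illustration of S101, the following minimal Python sketch preprocesses a target image and runs OCR to collect bounding-rectangle character area information (the first of the two forms described above) together with the recognized content. OpenCV and pytesseract are assumptions of this sketch; the embodiment names OCR only generically.

```python
import cv2
import pytesseract
from pytesseract import Output

def recognize_characters(image_path):
    """Sketch of S101: preprocess the target image, then run OCR to obtain
    character area information (bounding rectangles, form 1 above) and
    character form information (the recognized content)."""
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Image preprocessing as the embodiment suggests: noise reduction and binarization.
    gray = cv2.fastNlMeansDenoising(gray)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    data = pytesseract.image_to_data(binary, output_type=Output.DICT)
    results = []
    for i, text in enumerate(data["text"]):
        if not text.strip():
            continue
        x, y, w, h = (data[k][i] for k in ("left", "top", "width", "height"))
        results.append({"content": text, "box": (x, y, x + w, y + h)})
    return img, results
```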
S102, calibrating the character area information according to the character form information to obtain calibrated character area information.
In one embodiment, according to the character form information of any character image recognized in the target image, a clean image conforming to the appearance of that character image is created: the clean image contains all of the character pixels of the character image and none of the non-character pixels. The pixel points in the clean image are then matched against the pixel points included in the character area information. If every pixel point in the clean image can be matched with a pixel point in the character area information, calibrated character area information is obtained from the matched pixel points in the character area information; if any pixel point in the clean image cannot be matched, the character area information is expanded to obtain the calibrated character area information.
Optionally, S102 includes:
generating a character template image according to the character form information; and registering the character template image with the target image according to the character area information, and obtaining calibrated character area information according to a registration result.
In one embodiment, a character template image is generated from the character form information using an existing document editing tool. The character template image is registered against the original image located at the character area information in the target image using a classical registration algorithm, such as the mean absolute difference algorithm, the sum of squared errors algorithm, the normalized product correlation algorithm, or the sequential similarity detection algorithm. If every pixel point in the template image can be successfully registered with a pixel point in the original image, calibrated character area information is obtained from the successfully registered pixel points in the original image; if any pixel point in the template image cannot be matched, the character area information is expanded to obtain the calibrated character area information.
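This registration step can be sketched as classical template matching over a small search window around the recognized character area. The sketch below assumes OpenCV's normalized correlation criterion stands in for the classical algorithms listed above (cv2.TM_SQDIFF would give the sum-of-squared-errors variant); template_img is a grayscale numpy array of the rendered template, and margin is an assumed search slack, not something the embodiment specifies.

```python
import cv2

def register_template(target_gray, template_img, box, margin=5):
    """Sketch of the registration step: resize the rendered character template
    to the recognized character area, slide it over a search window around
    that area, and keep the position with the best correlation response."""
    x0, y0, x1, y1 = box
    h, w = target_gray.shape
    template = cv2.resize(template_img, (x1 - x0, y1 - y0))
    # Search window: the recognized area plus an assumed margin on all sides.
    sx0, sy0 = max(0, x0 - margin), max(0, y0 - margin)
    sx1, sy1 = min(w, x1 + margin), min(h, y1 + margin)
    window = target_gray[sy0:sy1, sx0:sx1]
    # Normalized correlation is one classical matching criterion named in the text.
    resp = cv2.matchTemplate(window, template, cv2.TM_CCOEFF_NORMED)
    _, score, _, (mx, my) = cv2.minMaxLoc(resp)
    th, tw = template.shape[:2]
    # Calibrated area: where the template actually landed in the target image.
    return (sx0 + mx, sy0 + my, sx0 + mx + tw, sy0 + my + th), score
```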
According to the character form information, a character template image is generated, the character template image and the target image are registered according to the character area information, and calibrated character area information is obtained according to a registration result.
The character area information is calibrated according to the character form information to obtain calibrated character area information, so that the effect of calibrating the identified character area information is achieved, and the accuracy of the character area information is improved.
S103, eliminating characters from the target image according to the calibrated character area information.
In one embodiment, a background image of the target image is acquired. Optionally, the acquisition method includes: detecting contour pixel points of the target image, determining all contour pixel points in the target image, and taking the target image with the contour pixel points removed as the background image. The average pixel value of the determined background image is then calculated as the background value, and finally the pixel values of the target-image pixel points inside the calibrated character area information are set to the background value.
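A minimal sketch of this removal step, assuming Canny edge detection stands in for the contour-pixel detection the embodiment leaves unspecified (its thresholds here are arbitrary):

```python
import cv2

def remove_characters(img, calibrated_boxes):
    """Sketch of S103: estimate a background value from the non-contour
    pixels of the target image, then paint the calibrated character
    areas with that value."""
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Contour pixel detection; Canny thresholds are assumptions.
    edges = cv2.Canny(gray, 50, 150)
    # Background image: the target image with contour pixels removed;
    # its average pixel value serves as the background value.
    bg_value = img[edges == 0].mean(axis=0).astype(img.dtype)
    out = img.copy()
    for x0, y0, x1, y1 in calibrated_boxes:
        out[y0:y1, x0:x1] = bg_value  # set pixels in the calibrated area to the background value
    return out
```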
Removing characters from the target image according to the calibrated character area information achieves the removal of character images from the target image. Because the character area information has been calibrated, the accuracy of the final removal is guaranteed, meeting user needs for processing image documents, such as table reconstruction or watermark removal.
According to the technical scheme of this embodiment, the recognized character area information is calibrated according to the recognized character form information, and characters are removed from the target image according to the calibrated character area information. This achieves the calibration of the recognized character area information and improves its accuracy, avoiding the problems that over-large character area information may include non-character information and that over-small character area information cannot fully cover the character image. Accordingly, the accuracy of the final character removal is guaranteed, and the non-character information in the target image is effectively retained.
Fig. 2A is a flowchart of a character image processing method according to an embodiment of the present application, which is further optimized and expanded based on the above technical solution, and may be combined with the above various alternative embodiments.
As shown in fig. 2A, the method may include:
s201, identifying characters in a target image to obtain character area information and character form information in the target image.
S202, generating the character template image according to the character content and the character font in the character form information.
The character content is the textual appearance of the character image; for example, the characters "项" (item) or "目" (mesh) can serve as character content. The character font is the typeface of the character image; for example, "Song Ti", "regular script", or "cursive script" can serve as the character font.
Generating the character template image from the character content and character font in the character form information makes the template image more standard, ensuring the reliability of the final character area information calibration.
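As a sketch of S202, the template can be rendered with Pillow from the recognized content and font. Mapping a recognized font name (e.g. Song Ti) to a concrete font file is environment-specific, so font_path here is a hypothetical parameter.

```python
from PIL import Image, ImageDraw, ImageFont

def render_character_template(content, font_path, box):
    """Sketch of S202: render a clean character template image from the
    recognized character content and character font."""
    x0, y0, x1, y1 = box
    height = max(y1 - y0, 1)
    font = ImageFont.truetype(font_path, size=height)
    # White canvas sized from the recognized area, black glyphs, matching
    # a typical binarized document image.
    canvas = Image.new("L", (max(x1 - x0, 1), height), color=255)
    ImageDraw.Draw(canvas).text((0, 0), content, fill=0, font=font)
    return canvas
```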
And S203, registering the character template image and the target image according to the character area information to obtain target pixel points in the target image associated with the template pixel points in the character template image.
In one embodiment, the original image located at the character area information in the target image is determined according to the character area information; the character template image is registered with this original image, a mapping relationship between pixel points in the character template image and pixel points in the original image is established, and the target pixel points in the target image associated with the template pixel points in the character template image are obtained according to this mapping relationship.
S204, the area information of the target pixel point in the target image is used as calibrated character area information.
In one embodiment, according to the position information of each target pixel point in the target image, the area information of all the target pixel points in the target image is obtained, and the area information is used as the calibrated character area information.
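A one-function sketch of S204, taking the axis-aligned rectangle spanned by the registered target pixel points as the calibrated character area information (the (x, y) point-list format is an assumption):

```python
import numpy as np

def region_from_pixels(target_points):
    """Sketch of S204: take the rectangle spanned by the target pixel
    points associated with template pixel points as the calibrated
    character area information."""
    pts = np.asarray(target_points)   # shape (N, 2), one (x, y) per target pixel
    x0, y0 = pts.min(axis=0)
    x1, y1 = pts.max(axis=0)
    return int(x0), int(y0), int(x1) + 1, int(y1) + 1
```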
S205, eliminating characters from the target image according to the calibrated character area information.
Fig. 2B is a schematic diagram of a character image processing scenario in the related art. As shown in fig. 2B, characters in a target image 20 are recognized to obtain character area information 21 in the target image 20, and characters are removed from the target image 20 according to the character area information 21, yielding an image 22 with the characters removed. Because the accuracy of the character area information 21 is limited by the character recognition accuracy, the character area information 21 includes not only the characters but also a non-character table line 23; if characters are removed directly according to the character area information 21, the table line 23 is removed along with them.
Fig. 2C is a schematic diagram of a character image processing scenario according to an embodiment of the present application, where, as shown in fig. 2C, characters in the target image 20 are identified, so as to obtain character area information 21 and character form information in the target image 20. According to the embodiment of the application, the character template image 24 is generated according to the character content and the character font in the character form information, and the registration is carried out on the character template image 24 and the target image 20, so that the target pixel point in the target image 20 associated with the template pixel point in the character template image 24 is obtained. Finally, the area information of the target pixel point in the target image 20 is used as calibrated character area information 25, and characters are removed from the target image 20 according to the calibrated character area information 25, so that a character-removed image 26 is obtained.
According to the technical scheme of the embodiment, the target pixel point in the target image associated with the template pixel point in the character template image is obtained by registering the character template image and the target image according to the character area information, and the area information of the target pixel point in the target image is used as the calibrated character area information, so that the effect of calibrating the identified character area information is achieved, the accuracy of the character area information is improved, the problem that non-character information is possibly included if the character area information is overlarge is avoided, and the non-character information in the target image is effectively reserved when characters are removed according to the calibrated character area information.
Fig. 3A is a flowchart of a character image processing method according to an embodiment of the present application, which is further optimized and expanded based on the above technical solution, and may be combined with the above various alternative embodiments.
As shown in fig. 3A, the method may include:
s301, identifying characters in a target image to obtain character area information and character form information in the target image.
S302, generating the character template image according to the character content and the character font in the character form information.
S303, determining a mapping relation between template pixel points in the character template image and pixel points at the character region information based on an image registration technology.
In one embodiment, the image located at the character area information in the target image is taken as the original image. Based on an image registration technique, the character template image is registered with the original image: features of both images are first extracted, feature descriptors are then generated, and finally the features of the two images are matched according to the similarity of their descriptors, thereby determining the mapping relationship between template pixel points in the character template image and pixel points at the character area information. The features include, but are not limited to, point features, edge features, and area features.
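A sketch of this feature-based registration using ORB keypoints and a ratio test; ORB and the ratio threshold are assumptions, since the embodiment names point/edge/area features only generically.

```python
import cv2

def match_template_pixels(template_gray, region_gray, ratio=0.75):
    """Sketch of S303 as feature-based registration: extract features,
    generate descriptors, then match by descriptor similarity to build
    the template-pixel-to-region-pixel mapping."""
    orb = cv2.ORB_create()
    kp_t, des_t = orb.detectAndCompute(template_gray, None)
    kp_r, des_r = orb.detectAndCompute(region_gray, None)
    if des_t is None or des_r is None:
        return []  # no features found: registration fails outright
    matches = cv2.BFMatcher(cv2.NORM_HAMMING).knnMatch(des_t, des_r, k=2)
    mapping = []
    for pair in matches:
        # Lowe-style ratio test keeps only sufficiently distinctive matches.
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            mapping.append((kp_t[pair[0].queryIdx].pt, kp_r[pair[0].trainIdx].pt))
    return mapping
```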
S304, if the registration of any template pixel point fails, expanding the character area information, and continuing the registration according to the expanded character area information.
In one embodiment, if registration of any template pixel point fails, that is, a template pixel point fails to establish a mapping relationship with any pixel point at the character area information, this indicates that the current character area information does not cover the whole character image. The current character area information is therefore enlarged so that it can cover the whole character image, and registration continues according to the enlarged character area information.
Optionally, expanding the character area information includes:
and expanding the character area information according to the relative position of the character area information in the target image.
Both the character area information and the target image default to rectangles. The relative position is one of the following. Relative position 1: only the left boundary of the character area information coincides with the left boundary of the target image. Relative position 2: only the right boundary coincides with the right boundary of the target image. Relative position 3: only the upper boundary coincides with the upper boundary of the target image. Relative position 4: only the lower boundary coincides with the lower boundary of the target image. Relative position 5: the left boundary coincides with the left boundary of the target image, and the upper boundary also coincides with the upper boundary. Relative position 6: the left boundary coincides with the left boundary, and the lower boundary also coincides with the lower boundary. Relative position 7: the right boundary coincides with the right boundary, and the upper boundary also coincides with the upper boundary. Relative position 8: the right boundary coincides with the right boundary, and the lower boundary also coincides with the lower boundary. Relative position 9: no boundary of the character area information coincides with a boundary of the target image.
In one embodiment, the character area information is expanded by a preset number of pixels on every side that does not coincide with a boundary of the target image. For relative position 1, the areas above, below, and to the right of the character area information are expanded; for relative position 2, the areas above, below, and to the left; for relative position 3, the areas to the left, below, and to the right; for relative position 4, the areas above, to the left, and to the right; for relative position 5, the areas below and to the right; for relative position 6, the areas above and to the right; for relative position 7, the areas below and to the left; for relative position 8, the areas above and to the left; and for relative position 9, the areas on all four sides.
By expanding the character area information according to the relative positions of the character area information in the target image, the effect of expanding the character area information according to different relative positions is realized, and the flexibility of expanding the character area information is improved.
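The nine enumerated cases collapse into one rule: expand every side of the character area that does not already coincide with the corresponding boundary of the target image. A sketch of that rule, with an assumed preset pixel count pad:

```python
def expand_region(box, img_w, img_h, pad=4):
    """Sketch of the expansion rule: grow the character area by a preset
    number of pixels on every side that does not already coincide with
    the corresponding boundary of the target image."""
    x0, y0, x1, y1 = box
    if x0 > 0:
        x0 = max(0, x0 - pad)        # expand left unless on the left edge
    if y0 > 0:
        y0 = max(0, y0 - pad)        # expand up unless on the top edge
    if x1 < img_w:
        x1 = min(img_w, x1 + pad)    # expand right unless on the right edge
    if y1 < img_h:
        y1 = min(img_h, y1 + pad)    # expand down unless on the bottom edge
    return x0, y0, x1, y1
```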
S305, obtaining target pixel points in a target image associated with the template pixel points in the character template image, and taking the area information of the target pixel points in the target image as calibrated character area information.
S306, eliminating characters from the target image according to the calibrated character area information.
According to the technical scheme of the embodiment, the mapping relation between the template pixel points in the character template image and the pixel points at the character area information is determined based on the image registration technology, if any template pixel point fails to register, the character area information is enlarged, and the registration is continued according to the enlarged character area information, so that the effect of calibrating the recognized character area information is achieved, the accuracy of the character area information is improved, the problem that if the character area information is too small, the character area information cannot fully cover the character image is avoided, and then when characters are removed according to the calibrated character area information, the problem that only part of the character image is removed, and the character removing effect is poor is avoided.
Fig. 3B is a schematic diagram of character image processing according to an embodiment of the present application, as shown in fig. 3B, including:
s30, acquiring a target image;
s31, identifying the characters in the target image to obtain character area information, character content and character fonts in the target image.
S32, generating a character template image according to the character content and the character font in the character form information.
S331, registering the character template image and the target image according to the character area information to obtain target pixel points in the target image associated with the template pixel points in the character template image, and taking the area information of the target pixel points in the target image as calibrated character area information.
S332, based on an image registration technology, determining a mapping relation between template pixel points in the character template image and pixel points at the character region information, if any template pixel point fails to register, expanding the character region information, and continuing to register according to the expanded character region information.
S3321, obtaining target pixel points in the target image associated with the template pixel points in the character template image, and taking the area information of the target pixel points in the target image as calibrated character area information.
S34, eliminating characters from the target image according to the calibrated character area information.
Steps S30 to S34 have been explained in the above embodiments and are not repeated here.
Fig. 3C is a schematic diagram of a character image processing scenario according to an embodiment of the present application. As shown in fig. 3C, characters in a target image 300 are recognized to obtain character area information 301 and character form information in the target image 300. The character area information 301 covers only part of the character; if characters were removed directly according to it, the removal would be incomplete. Further, a character template image 302 is generated according to the character content and character font in the character form information. Based on an image registration technique, the mapping relationship between template pixel points in the character template image 302 and pixel points at the character area information 301 is determined; if registration of any template pixel point fails, the character area information 301 is enlarged, and registration continues according to the enlarged character area information 303. Finally, the target pixel points in the target image 300 associated with the template pixel points in the character template image 302 are obtained, the area where these target pixel points are located is taken as calibrated character area information 304, and characters are removed from the target image 300 according to the calibrated character area information 304, yielding an image 305 with the characters removed.
Fig. 4 is a schematic structural diagram of a character image processing apparatus according to an embodiment of the present application, which can be applied to removing character images from a document image. The apparatus of this embodiment can be implemented by software and/or hardware and can be integrated on any electronic device with computing capability.
As shown in fig. 4, the character image processing apparatus 40 disclosed in the present embodiment may include a character recognition module 41, a region calibration module 42, and a character culling module 43, wherein:
a character recognition module 41, configured to recognize characters in a target image, and obtain character area information and character morphology information in the target image;
the area calibration module 42 is configured to calibrate the character area information according to the character form information to obtain calibrated character area information;
and a character rejecting module 43, configured to reject characters from the target image according to the calibrated character area information.
Optionally, the area calibration module 42 is specifically configured to:
generating a character template image according to the character form information;
and registering the character template image with the target image according to the character area information, and obtaining calibrated character area information according to a registration result.
Optionally, the area calibration module 42 is specifically further configured to:
and generating the character template image according to the character content and the character font in the character form information.
Optionally, the area calibration module 42 is specifically further configured to:
registering the character template image and the target image according to the character area information to obtain target pixel points in the target image associated with the template pixel points in the character template image;
and taking the area information of the target pixel point in the target image as calibrated character area information.
Optionally, the area calibration module 42 is specifically further configured to:
determining a mapping relation between template pixel points in the character template image and pixel points at the character region information based on an image registration technology;
if the registration of any template pixel point fails, the character area information is enlarged, and the registration is continued according to the enlarged character area information.
Optionally, the area calibration module 42 is specifically further configured to:
and expanding the character area information according to the relative position of the character area information in the target image.
The character image processing device 40 disclosed in the embodiment of the application can execute the character image processing method disclosed in the embodiment of the application, and has the corresponding functional modules and beneficial effects of the execution method. Reference is made to the description of any method embodiment of the application for details not described in this embodiment.
According to an embodiment of the present application, the present application also provides an electronic device and a readable storage medium.
As shown in fig. 5, there is a block diagram of an electronic device for the character image processing method according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown here, their connections and relationships, and their functions are meant to be exemplary only and are not meant to limit implementations of the application described and/or claimed herein.
As shown in fig. 5, the electronic device includes: one or more processors 501, memory 502, and interfaces for connecting components, including high-speed interfaces and low-speed interfaces. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executing within the electronic device, including instructions stored in or on memory to display graphical information of a GUI on an external input/output device, such as a display device coupled to the interface. In other embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories and multiple storage devices. Also, multiple electronic devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). One processor 501 is illustrated in fig. 5.
Memory 502 is a non-transitory computer readable storage medium provided by the present application. The memory stores instructions executable by the at least one processor to cause the at least one processor to perform the character image processing method provided by the application. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to execute the character image processing method provided by the present application.
The memory 502 is used as a non-transitory computer readable storage medium for storing non-transitory software programs, non-transitory computer executable programs, and modules such as program instructions/modules (e.g., the character recognition module 41, the region calibration module 42, and the character culling module 43 shown in fig. 4) corresponding to the character image processing method according to the embodiment of the present application. The processor 501 executes various functional applications of the server and data processing, that is, implements the character image processing method in the above-described method embodiment, by running a non-transitory software program, instructions, and modules stored in the memory 502.
Memory 502 may include a program storage area and a data storage area; the program storage area may store an operating system and the application programs required for at least one function, and the data storage area may store data created according to the use of the electronic device of the character image processing method, and the like. In addition, memory 502 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, memory 502 may optionally include memory located remotely from processor 501, which may be connected to the electronic device of the character image processing method via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device of the character image processing method may further include: an input device 503 and an output device 504. The processor 501, memory 502, input devices 503 and output devices 504 may be connected by a bus or otherwise, for example in fig. 5.
The input device 503 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device of the character image processing method, such as a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointer stick, one or more mouse buttons, a track ball, a joystick, etc. The output devices 504 may include a display device, auxiliary lighting devices (e.g., LEDs), and haptic feedback devices (e.g., vibration motors), among others. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also referred to as programs, software, software applications, or code) include machine instructions for a programmable processor, and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic disks, optical disks, memory, programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local area networks (LANs), wide area networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the technical scheme provided by the embodiment of the application, the effect of calibrating the identified character region information is realized, so that the accuracy of the character region information is improved, and the accuracy of finally eliminating characters is ensured.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, provided that the desired results of the disclosed embodiments are achieved, and are not limited herein.
The above embodiments do not limit the scope of the present application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application should be included in the scope of the present application.

Claims (8)

1. A character image processing method, the method comprising:
identifying characters in a target image to obtain character area information and character form information in the target image;
calibrating the character area information according to the character form information to obtain calibrated character area information;
removing characters from the target image according to the calibrated character area information;
the step of calibrating the character area information according to the character form information to obtain calibrated character area information comprises the following steps:
generating a character template image according to the character content and the character font in the character form information;
registering the character template image and the target image according to the character area information, and obtaining calibrated character area information according to a registration result;
registering the character template image and the target image according to the character area information, and obtaining calibrated character area information according to a registration result, wherein the registering comprises the following steps:
registering the character template image and the target image according to the character area information to obtain target pixel points in the target image associated with the template pixel points in the character template image;
the area information of the target pixel point in the target image is used as calibrated character area information;
the registering the character template image and the target image according to the character area information comprises the following steps:
if the registration of any template pixel point fails, the character area information is enlarged, and the registration is continued according to the enlarged character area information.
2. The method of claim 1, wherein the registering the character template image with the target image according to the character region information comprises:
and determining a mapping relation between template pixel points in the character template image and pixel points at the character region information based on an image registration technology.
3. The method of claim 1, wherein expanding the character region information comprises:
and expanding the character area information according to the relative position of the character area information in the target image.
4. A character image processing apparatus, the apparatus comprising:
the character recognition module is used for recognizing characters in the target image to obtain character area information and character form information in the target image;
the area calibration module is used for calibrating the character area information according to the character form information to obtain calibrated character area information;
the character rejecting module is used for rejecting characters from the target image according to the calibrated character area information;
the area calibration module is specifically configured to:
generating a character template image according to the character content and the character font in the character form information;
registering the character template image and the target image according to the character area information, and obtaining calibrated character area information according to a registration result;
the area calibration module is specifically further configured to:
registering the character template image and the target image according to the character area information to obtain target pixel points in the target image associated with the template pixel points in the character template image;
the area information of the target pixel point in the target image is used as calibrated character area information;
the area calibration module is specifically further configured to:
if the registration of any template pixel point fails, the character area information is enlarged, and the registration is continued according to the enlarged character area information.
5. The apparatus of claim 4, wherein the zone calibration module is further specifically configured to:
and determining a mapping relation between template pixel points in the character template image and pixel points at the character region information based on an image registration technology.
6. The apparatus of claim 4, wherein the zone calibration module is further specifically configured to:
and expanding the character area information according to the relative position of the character area information in the target image.
7. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the character image processing method of any one of claims 1-3.
8. A non-transitory computer-readable storage medium storing computer instructions for causing the computer to execute the character image processing method according to any one of claims 1 to 3.
CN202011004751.2A 2020-09-22 2020-09-22 Character image processing method, device, equipment and medium Active CN112101368B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011004751.2A CN112101368B (en) 2020-09-22 2020-09-22 Character image processing method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011004751.2A CN112101368B (en) 2020-09-22 2020-09-22 Character image processing method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN112101368A CN112101368A (en) 2020-12-18
CN112101368B (en) 2023-08-18

Family

ID=73754996

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011004751.2A Active CN112101368B (en) 2020-09-22 2020-09-22 Character image processing method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN112101368B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108388898A (en) * 2018-01-31 2018-08-10 宁波市科技园区明天医网科技有限公司 Character identifying method based on connector and template
CN108830275A (en) * 2018-05-07 2018-11-16 广东省电信规划设计院有限公司 Dot character, the recognition methods of dot matrix digit and device
CN109145904A (en) * 2018-08-24 2019-01-04 讯飞智元信息科技有限公司 A kind of character identifying method and device
CN110895696A (en) * 2019-11-05 2020-03-20 泰康保险集团股份有限公司 Image information extraction method and device
CN111639636A (en) * 2020-05-29 2020-09-08 北京奇艺世纪科技有限公司 Character recognition method and device
CN111680688A (en) * 2020-06-10 2020-09-18 创新奇智(成都)科技有限公司 Character recognition method and device, electronic equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6041836B2 (en) * 2014-07-30 2016-12-14 京セラドキュメントソリューションズ株式会社 Image processing apparatus and image processing program
CN105844205B (en) * 2015-01-15 2019-05-31 新天科技股份有限公司 Character information recognition methods based on image procossing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
PROTECTIVE RELAYING PERFORMANCE REPORTING; R. P. Taylor et al.; IEEE Transactions on Power Delivery; Vol. 7, No. 4; pp. 1892-1899 *

Also Published As

Publication number Publication date
CN112101368A (en) 2020-12-18

Similar Documents

Publication Publication Date Title
CN111753727B (en) Method, apparatus, device and readable storage medium for extracting structured information
US20230409771A1 (en) Building information model (bim) element extraction from floor plan drawings using machine learning
CN111523468B (en) Human body key point identification method and device
US10769427B1 (en) Detection and definition of virtual objects in remote screens
CN104182748B (en) One kind is based on the matched Chinese-character stroke extraction method of fractionation
CN111507354B (en) Information extraction method, device, equipment and storage medium
CN111753717B (en) Method, device, equipment and medium for extracting structured information of text
CN113627439A (en) Text structuring method, processing device, electronic device and storage medium
CN111209909B (en) Construction method, device, equipment and storage medium for qualification recognition template
CN111640123B (en) Method, device, equipment and medium for generating background-free image
CN112115921A (en) True and false identification method and device and electronic equipment
CN111832396B (en) Method and device for analyzing document layout, electronic equipment and storage medium
CN113610809B (en) Fracture detection method, fracture detection device, electronic equipment and storage medium
CN111709428A (en) Method and device for identifying key point positions in image, electronic equipment and medium
JP7389824B2 (en) Object identification method and device, electronic equipment and storage medium
CN111563453B (en) Method, apparatus, device and medium for determining table vertices
CN111523292B (en) Method and device for acquiring image information
CN111552829B (en) Method and apparatus for analyzing image material
Belhedi et al. Adaptive scene‐text binarisation on images captured by smartphones
CN112101368B (en) Character image processing method, device, equipment and medium
US20230005171A1 (en) Visual positioning method, related apparatus and computer program product
CN114511863B (en) Table structure extraction method and device, electronic equipment and storage medium
CN111476090A (en) Watermark identification method and device
CN111767859A (en) Image correction method and device, electronic equipment and computer-readable storage medium
CN114998906B (en) Text detection method, training method and device of model, electronic equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant