WO2023071119A1 - Text detection and recognition method and apparatus, electronic device, and storage medium - Google Patents

Text detection and recognition method and apparatus, electronic device, and storage medium

Info

Publication number
WO2023071119A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
frame line
area
table frame
recognized
Prior art date
Application number
PCT/CN2022/090193
Other languages
English (en)
French (fr)
Inventor
侯丽
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2023071119A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/60 Analysis of geometric attributes
    • G06T7/62 Analysis of geometric attributes of area, perimeter, diameter or volume
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20112 Image segmentation details
    • G06T2207/20132 Image cropping

Definitions

  • the present application relates to the technical fields of artificial intelligence and image recognition, and in particular to a text detection and recognition method, device, electronic equipment and storage medium.
  • OCR: Optical Character Recognition
  • OCR is a deep-learning-based technology that has become relatively mature in recent years. It refers to the process in which an electronic device examines the characters printed on an object, determines their shapes by detecting patterns of dark and bright areas, and then translates the shapes into computer text using character recognition methods.
  • In general, OCR technology is capable of locating and recognizing text. However, given the limitations of deep neural networks in terms of implementation mechanism and resource consumption, the accuracy of text detection and recognition is affected when the object contains interference, noise, distortion, or the like.
  • the present application provides a character detection and recognition method, device, electronic equipment and storage medium, which is beneficial to improve the accuracy of character detection and recognition.
  • A first aspect of the embodiments of the present application provides a text detection and recognition method. The method includes: performing seal detection on an original image to obtain a seal area; filling the seal area with the mean value of the background color of the original image to obtain an image to be detected; performing text detection on the image to be detected to obtain an image of a text area to be recognized; performing table frame line detection on the image of the text area to be recognized to obtain a table frame line detection result; determining a cropping position of the image of the text area to be recognized according to the table frame line detection result, and cropping the image of the text area to be recognized based on the cropping position to obtain a cropped image of the text area to be recognized; and obtaining a text recognition result based on the cropped image of the text area to be recognized.
  • In one implementation, performing seal detection on the original image to obtain the seal area includes: converting the original image into a first binary image; determining a circular contour in the original image according to the first binary image; determining the hue of the circular contour according to the original image; and obtaining the seal area according to the hue of the circular contour.
  • In one implementation, determining the circular contour in the original image according to the first binary image includes: determining the contours in the original image according to the first binary image; calculating the ratio of the area enclosed by each contour to the area of its minimum circumscribed circle to obtain the area ratios of the plurality of contours; and comparing the area ratios of the plurality of contours with a preset area threshold, and determining a contour whose area ratio is greater than or equal to the preset area threshold as a circular contour.
  • In one implementation, performing table frame line detection on the image of the text area to be recognized to obtain the table frame line detection result includes: converting the image of the text area to be recognized into a second binary image; traversing each column of pixels of the second binary image along the height direction and summing each column of pixels; storing the summation result of each column as an element in a list to obtain a first list of length w, where w is an integer greater than 1; traversing each row of pixels of the second binary image along the width direction and summing each row of pixels; storing the summation result of each row as an element in a list to obtain a second list of length h, where h is an integer greater than 1; and obtaining the table frame line detection result according to the first list and the second list.
  • In one implementation, the table frame line detection result includes the presence of vertical table frame lines, the presence of horizontal table frame lines, and the absence of table frame lines. Obtaining the table frame line detection result according to the first list and the second list includes: calculating the first difference between the summation result at each position in the first list and the summation result at the adjacent position, and determining that the detection result is the presence of vertical table frame lines if there is a target first difference greater than or equal to a first preset value; calculating the second difference between the summation result at each position in the second list and the summation result at the adjacent position, and determining that the detection result is the presence of horizontal table frame lines if there is a target second difference greater than or equal to a second preset value; and determining that the detection result is the absence of table frame lines if no target first difference exists among the first differences and no target second difference exists among the second differences.
  • In one implementation, determining the cropping position of the image of the text area to be recognized according to the table frame line detection result includes: when the table frame line detection result is that vertical table frame lines and/or horizontal table frame lines are present, determining the cropping position according to the column where the vertical table frame line is located and/or the row where the horizontal table frame line is located; and when the table frame line detection result is that no table frame lines are present, determining the cropping position according to the leading and trailing consecutive 0 elements in the first list and the second list.
  • A second aspect of the embodiments of the present application provides a text detection and recognition apparatus. The apparatus includes a detection unit and a recognition unit, wherein:
  • the detection unit is configured to perform seal detection on an original image to obtain a seal area;
  • the recognition unit is configured to fill the seal area with the mean value of the background color of the original image to obtain an image to be detected;
  • the detection unit is further configured to perform text detection on the image to be detected to obtain an image of a text area to be recognized;
  • the detection unit is further configured to perform table frame line detection on the image of the text area to be recognized to obtain a table frame line detection result;
  • the recognition unit is further configured to determine a cropping position of the image of the text area to be recognized according to the table frame line detection result, and to crop the image of the text area to be recognized based on the cropping position to obtain a cropped image of the text area to be recognized;
  • the recognition unit is further configured to obtain a text recognition result based on the cropped image of the text area to be recognized.
  • A third aspect of the embodiments of the present application provides an electronic device. The electronic device includes an input device and an output device, and further includes a processor adapted to implement one or more instructions, and a memory storing one or more computer programs, the one or more computer programs being adapted to be loaded by the processor to perform the steps of the above text detection and recognition method.
  • A fourth aspect of the embodiments of the present application provides a computer storage medium. The computer storage medium stores one or more instructions, and the one or more instructions are adapted to be loaded by a processor to perform the steps of the above text detection and recognition method.
  • In the embodiments of the present application, seal detection is performed on the original image to obtain the seal area, and the seal area is filled with the mean value of the background color of the original image to obtain the image to be detected. Text detection is then performed on the image to be detected to obtain the image of the text area to be recognized, table frame line detection is performed on the image of the text area to be recognized to obtain a table frame line detection result, the cropping position of the image of the text area to be recognized is determined according to the table frame line detection result, the image of the text area to be recognized is cropped based on the cropping position to obtain the cropped image of the text area to be recognized, and a text recognition result is obtained based on the cropped image of the text area to be recognized.
  • In this way, after the seal area is detected, it is filled with the mean value of the background color of the original image so that the seal area becomes part of the background. The cropping position of the image of the text area to be recognized is then determined from the table frame line detection result, and the image is cropped based on the cropping position so that the blank margins on its upper and lower sides and/or left and right sides are removed and the table frame lines lie at the edge of the image. This reduces the interference of the table frame lines with text recognition and thus helps improve the accuracy of text detection.
  • In addition, because this way of eliminating interference is relatively lightweight, it also helps improve the efficiency of text recognition to a certain extent.
  • FIG. 1 is a schematic diagram of an application environment provided by an embodiment of the present application
  • FIG. 2 is a schematic flow diagram of a text detection and recognition method provided in an embodiment of the present application
  • FIG. 3A is a schematic diagram of a curve without vertical table frame lines provided by the embodiment of the present application.
  • FIG. 3B is a schematic diagram of a curve with vertical table frame lines provided by the embodiment of the present application.
  • FIG. 4A is a schematic diagram of a curve without horizontal table frame lines provided by the embodiment of the present application.
  • FIG. 4B is a schematic diagram of a curve with horizontal table frame lines provided by the embodiment of the present application.
  • FIG. 5 is a schematic diagram of an image of a character area to be recognized before and after cropping provided by an embodiment of the present application
  • FIG. 6 is a schematic flowchart of another text detection and recognition method provided in the embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a text detection and recognition device provided in an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • the application environment includes an electronic device 101 and a terminal device 102 connected to the electronic device 101 through a network.
  • the terminal device 102 is used to provide the original image to be detected and identified.
  • the original image may be the electronic version of some documents and bills.
  • In some scenarios, the terminal device 102 is also provided with an image acquisition device, which may be a camera, or a scanner, a sensor, or the like connected to the terminal device 102.
  • The image acquisition device is used to collect the original images of documents, bills, and the like, and to send them to the communication device of the terminal device 102.
  • The communication device of the terminal device 102 compresses and packages the original image and sends the compressed and packaged original image to the electronic device 101.
  • the electronic device 101 receives the original image sent by the terminal device 102 through its own communication device, and performs a decompression operation.
  • The electronic device 101 uses a graphics processor to invoke program instructions to perform seal detection and table frame line detection on the original image, and to perform the seal-area filling and cropping operations, so as to eliminate the interference of the text arranged on the seal and of the table frame lines with the text to be recognized in the original image.
  • Finally, OCR technology is used to perform text recognition on the cropped image of the text area to be recognized, which relatively improves the accuracy of text recognition.
  • Illustratively, the electronic device 101 may be an independent server, or may be a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (CDN), big data, and artificial intelligence platforms.
  • the terminal device 102 may be a smart phone, a computer, a personal digital assistant, a self-service terminal, and so on.
  • FIG. 2 is a schematic flowchart of a text detection and recognition method provided in the embodiment of the present application. The method is applied to electronic equipment, as shown in FIG. 2, including steps 201-206:
  • In the specific embodiments of the present application, the original image may be an image of a bill, a statement, or any of various kinds of reports.
  • Illustratively, performing seal detection on the original image to obtain the seal area includes: converting the original image into a first binary image; determining a circular contour in the original image according to the first binary image; determining the hue of the circular contour according to the original image; and obtaining the seal area according to the hue of the circular contour.
  • Specifically, the original image is converted into a grayscale image, a threshold is calculated for the grayscale image, and the grayscale image is converted into the first binary image according to the threshold.
  • Illustratively, determining the circular contour in the original image according to the first binary image includes: determining the contours in the original image according to the first binary image; calculating the ratio of the area enclosed by each contour to the area of its minimum circumscribed circle to obtain the area ratios of the plurality of contours; and comparing the area ratios of the plurality of contours with a preset area threshold, and determining a contour whose area ratio is greater than or equal to the preset area threshold as a circular contour.
  • It should be understood that the first binary image presents all the contours of the non-background areas in the original image (that is, the contours mentioned above). For each contour in the original image, its minimum circumscribed circle is obtained in turn, and the ratio of the area enclosed by the contour to the area of the minimum circumscribed circle is then calculated. The value range of this area ratio is (0, 1]. If the area ratio is greater than or equal to the preset area threshold, the contour is considered close to a circle and can be determined as a circular contour.
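  • As a concrete illustration of this screening step, the following Python sketch (assuming OpenCV 4.x and NumPy; the Otsu binarization and the 0.8 threshold are illustrative choices, not values taken from the present application) binarizes the original image, extracts the contours, and keeps those whose enclosed area is close to the area of their minimum circumscribed circle:

```python
import cv2
import numpy as np

def find_circular_contours(original_bgr, area_ratio_thresh=0.8):
    """Return contours whose enclosed-area-to-circumscribed-circle ratio
    meets the preset threshold (the ratio always lies in (0, 1])."""
    # Convert the original image to grayscale and binarize it; Otsu's method
    # is used here as one possible way to compute the threshold.
    gray = cv2.cvtColor(original_bgr, cv2.COLOR_BGR2GRAY)
    _, first_binary = cv2.threshold(
        gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    # The first binary image exposes the contours of the non-background areas.
    contours, _ = cv2.findContours(
        first_binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    circular = []
    for cnt in contours:
        contour_area = cv2.contourArea(cnt)
        (_, _), radius = cv2.minEnclosingCircle(cnt)
        circle_area = np.pi * radius * radius
        if circle_area == 0:
            continue
        # Ratio of the enclosed area to the minimum circumscribed circle area.
        if contour_area / circle_area >= area_ratio_thresh:
            circular.append(cnt)
    return circular, first_binary
```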
  • In the specific embodiments of the present application, a seal is judged by two characteristics: its shape is circular and its color is red. Once a circular contour has been determined, it only remains to determine whether its hue in the original image is red.
  • Specifically, the original image is converted to the HSV (hue, saturation, value) mode. If the hue value of the circular contour falls within the preset range [0°, 10°] ∪ [156°, 180°], the color of the circular contour is considered red, and if the color of the circular contour is red, the area covered by the circular contour is determined to be the seal area.
  • HSV: Hue; Saturation; Value (brightness)
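  • A sketch of the hue check, assuming OpenCV's HSV representation in which hue runs over 0-180, so the ranges [0°, 10°] and [156°, 180°] map onto it directly; the 0.5 majority fraction is an assumption about how the hue of the whole circular contour is aggregated, which the present application does not specify:

```python
import cv2
import numpy as np

def is_red_contour(original_bgr, contour, red_fraction=0.5):
    """Decide whether the area covered by a circular contour is red,
    using the hue ranges [0, 10] and [156, 180] on OpenCV's 0-180 scale."""
    hsv = cv2.cvtColor(original_bgr, cv2.COLOR_BGR2HSV)

    # Rasterize the contour into a mask covering the area it encloses.
    mask = np.zeros(original_bgr.shape[:2], dtype=np.uint8)
    cv2.drawContours(mask, [contour], -1, 255, thickness=cv2.FILLED)

    hue = hsv[:, :, 0][mask == 255]
    if hue.size == 0:
        return False
    # Fraction of the covered pixels whose hue falls in the red ranges.
    red = np.logical_or(hue <= 10, hue >= 156)
    return red.mean() >= red_fraction
```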
  • In the specific embodiments of the present application, in order to prevent the text arranged around the seal from being drawn into the text to be recognized (or the detection boxes) close to it, the seal area is filled with the mean value of the background color of the original image, so that the seal area becomes a background area and the interference of the text arranged on the seal with the nearby text to be recognized is reduced.
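  • A sketch of this filling step; the present application does not spell out how the mean background color is computed, so the version below estimates it from the pixels that the first binary image marks as background, which is one plausible choice:

```python
import cv2
import numpy as np

def fill_seal_area(original_bgr, seal_contour, first_binary):
    """Fill the seal area with the mean background color so that the seal
    becomes part of the background of the image to be detected."""
    image = original_bgr.copy()

    # Estimate the background color as the mean over the pixels that the
    # (inverted) binarization marked as background, i.e. value 0.
    background_pixels = image[first_binary == 0]
    if background_pixels.size:
        background_mean = background_pixels.mean(axis=0)
    else:
        background_mean = np.array([255.0, 255.0, 255.0])  # fall back to white

    # Paint every pixel covered by the seal contour with that mean color.
    seal_mask = np.zeros(image.shape[:2], dtype=np.uint8)
    cv2.drawContours(seal_mask, [seal_contour], -1, 255, thickness=cv2.FILLED)
    image[seal_mask == 255] = background_mean.astype(np.uint8)
    return image, background_mean  # image to be detected, reusable mean color
```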
  • In the specific embodiments of the present application, after the seal area is filled, the interference of the seal with text recognition is largely eliminated. For text inside a table, however, the table frame lines still interfere with recognition of the text in the table: for example, when the recognition area mistakenly includes horizontal and/or vertical table frame lines due to detection errors, the model performs poorly during recognition or recognizes the frame lines as characters such as the digit "1", the letter "l", or the Chinese character "一" (one). It is therefore necessary to eliminate the impact of table frame lines on recognition accuracy.
  • Specifically, an object detection algorithm is first used to detect the text areas to be recognized in the image to be detected, and the image of each text area to be recognized is then cut out of the image to be detected based on its detection box; for example, each cell containing text is cut out as an image of a text area to be recognized, and the subsequent operations for eliminating frame line interference are performed on the images of the text areas to be recognized.
  • The object detection algorithm may be, for example, Faster R-CNN or YOLO, which is not limited here; a minimal cropping sketch based on such detection boxes is given after the two definitions below.
  • Faster R-CNN: Faster Region-based Convolutional Neural Networks, a faster region-proposal convolutional neural network detector
  • YOLO: You Only Look Once, a single-pass ("one glance") object detector
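  • The present application does not tie itself to one detector, so the sketch below simply assumes that some detector (Faster R-CNN, YOLO, or another model) has already returned axis-aligned boxes as (x1, y1, x2, y2) pixel coordinates on the image to be detected (a NumPy array as produced by cv2.imread), and shows how each box is cut out as an image of a text area to be recognized:

```python
def crop_text_regions(image_to_detect, boxes):
    """Cut each detected text box (x1, y1, x2, y2) out of the image to be
    detected, yielding the images of the text areas to be recognized."""
    h, w = image_to_detect.shape[:2]
    regions = []
    for x1, y1, x2, y2 in boxes:
        # Clamp the box to the image bounds and round to integer pixels.
        x1, y1 = max(0, int(round(x1))), max(0, int(round(y1)))
        x2, y2 = min(w, int(round(x2))), min(h, int(round(y2)))
        if x2 > x1 and y2 > y1:
            regions.append(image_to_detect[y1:y2, x1:x2].copy())
    return regions
```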
  • In the specific embodiments of the present application, illustratively, performing table frame line detection on the image of the text area to be recognized to obtain the table frame line detection result includes: converting the image of the text area to be recognized into a second binary image; traversing each column of pixels of the second binary image along the height direction and summing each column of pixels; storing the summation result of each column as an element in a list to obtain a first list of length w, where w is an integer greater than 1; traversing each row of pixels of the second binary image along the width direction and summing each row of pixels; storing the summation result of each row as an element in a list to obtain a second list of length h, where h is an integer greater than 1; and obtaining the table frame line detection result according to the first list and the second list.
  • It should be understood that the image of the text area to be recognized is first converted into a grayscale image, a threshold is then calculated, and the grayscale image is converted into a binary image, that is, the second binary image, according to the threshold.
  • When the width of the image of the text area to be recognized is w, summing each column of pixels of the second binary image yields w summation results, which are stored in a list as elements in sequence to obtain the first list. Similarly, when the height of the image of the text area to be recognized is h, summing each row of pixels yields h summation results, which are stored in another list as elements in sequence to obtain the second list.
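  • A sketch of how the first list (per-column sums, length w) and the second list (per-row sums, length h) can be built from the second binary image; the Otsu binarization is again an illustrative choice, and each foreground pixel is counted at a value of 255:

```python
import cv2

def build_projection_lists(text_region_bgr):
    """Binarize the image of the text area to be recognized and return the
    first list (column sums, length w) and the second list (row sums, length h)."""
    gray = cv2.cvtColor(text_region_bgr, cv2.COLOR_BGR2GRAY)
    _, second_binary = cv2.threshold(
        gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    # Summing along the height direction gives one value per column (w values);
    # summing along the width direction gives one value per row (h values).
    first_list = second_binary.sum(axis=0, dtype=int).tolist()   # length w
    second_list = second_binary.sum(axis=1, dtype=int).tolist()  # length h
    return first_list, second_list, second_binary
```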
  • Illustratively, the table frame line detection result includes the presence of vertical table frame lines, the presence of horizontal table frame lines, and the absence of table frame lines. Obtaining the table frame line detection result according to the first list and the second list includes: calculating the first difference between the summation result at each position in the first list and the summation result at the adjacent position, and determining that the detection result is the presence of vertical table frame lines if there is a target first difference greater than or equal to a first preset value; calculating the second difference between the summation result at each position in the second list and the summation result at the adjacent position, and determining that the detection result is the presence of horizontal table frame lines if there is a target second difference greater than or equal to a second preset value; and determining that the detection result is the absence of table frame lines if no target first difference exists among the first differences and no target second difference exists among the second differences.
  • For a vertical table frame line, the sum of the pixels in the column where it is located is usually large. Calculating the first difference between the summation result at each position in the first list and the summation result at the adjacent position reflects how much the column sums change between each column and its adjacent column. When a target first difference greater than or equal to the first preset value exists among all the calculated first differences, it can be determined that the image of the text area to be recognized contains vertical table frame lines. For example, if the difference between the pixel sum of the third column and the pixel sum of the fourth column reaches the first preset value, a vertical table frame line appears in the third or fourth column. As shown in FIG. 3B, by performing curve fitting on the first list, it is not difficult to see that large jumps occur at both the beginning and the end, which indicates that there are vertical table frame lines on the left and right of the image of the text area to be recognized.
  • FIG. 3A shows the case where there are no vertical table frame lines at the beginning and end of the image. Compared with FIG. 3B, it is not difficult to see that without vertical table frame lines the jumps in the pixel sums at the beginning and end are obviously smaller. In FIG. 3A and FIG. 3B, the abscissa represents each position in the first list, that is, each column of the image of the text area to be recognized, and the ordinate represents the value at each position in the first list, that is, the sum of the pixels of each column of the image.
  • For a horizontal table frame line, the sum of the pixels in the row where it is located is usually large. Calculating the second difference between the summation result at each position in the second list and the summation result at the adjacent position reflects how much the row sums change between each row and its adjacent row. When a target second difference greater than or equal to the second preset value exists among all the calculated second differences, it can be determined that the image of the text area to be recognized contains horizontal table frame lines. For example, if the difference between the pixel sum of the third row and the pixel sum of the fourth row reaches the second preset value, a horizontal table frame line appears in the third or fourth row. As shown in FIG. 4B, by performing curve fitting on the second list, it is not difficult to see that large jumps likewise occur at the beginning and the end, which indicates that there are horizontal table frame lines above and below the image of the text area to be recognized.
  • FIG. 4A shows the case where there are no horizontal table frame lines above and below the text in the image. Compared with FIG. 4B, it is not difficult to see that without horizontal table frame lines the jumps in the pixel sums above and below the text are obviously smaller. In FIG. 4A and FIG. 4B, the abscissa represents each position in the second list, that is, each row of the image of the text area to be recognized, and the ordinate represents the value at each position in the second list, that is, the sum of the pixels of each row of the image.
  • It should be understood that if curve fitting on the first list and the second list shows no large jump between the summation result at any position and that at the adjacent position, that is, no target first difference and no target second difference exist, it can be determined that the image of the text area to be recognized contains no table frame lines, for example because the document corresponding to the original image has no table at all or uses a borderless table. A code sketch of this decision rule is given below.
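  • A minimal sketch of the decision rule: adjacent differences are taken over each list and compared against the preset values. The present application does not give concrete first and second preset values, so they appear here as parameters whose defaults are purely illustrative (suitable values depend on the image size and on the 0/255 scale of the binary image):

```python
import numpy as np

def detect_frame_lines(first_list, second_list,
                       first_preset=5000, second_preset=5000):
    """Tell whether vertical and/or horizontal table frame lines are present,
    based on jumps between the sums at adjacent positions in each list."""
    first_diffs = np.abs(np.diff(np.asarray(first_list, dtype=np.int64)))
    second_diffs = np.abs(np.diff(np.asarray(second_list, dtype=np.int64)))

    # A "target" difference is one that reaches the corresponding preset value.
    has_vertical = bool((first_diffs >= first_preset).any())
    has_horizontal = bool((second_diffs >= second_preset).any())

    return {
        "vertical": has_vertical,      # vertical table frame lines present
        "horizontal": has_horizontal,  # horizontal table frame lines present
        "none": not (has_vertical or has_horizontal),
    }
```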
  • In the specific embodiments of the present application, determining the cropping position of the image of the text area to be recognized according to the table frame line detection result includes: when the table frame line detection result is that vertical table frame lines and/or horizontal table frame lines are present, determining the cropping position according to the column where the vertical table frame line is located and/or the row where the horizontal table frame line is located; and when the table frame line detection result is that no table frame lines are present, determining the cropping position according to the leading and trailing consecutive 0 elements in the first list and the second list.
  • Specifically, referring to parts (b) and (d) of FIG. 5, when vertical table frame lines are present in the image of the text area to be recognized, the columns where the vertical table frame lines are located can be determined from the first list. For example, if the vertical table frame lines on the two sides are in the 10th column and the 200th column respectively, the vertical cropping positions are the 9th column and the 201st column. Similarly, referring to parts (b) and (c) of FIG. 5, when horizontal table frame lines are present, the rows where the horizontal table frame lines are located can be determined from the second list. For example, if the upper and lower horizontal table frame lines are in the 8th row and the 100th row respectively, the horizontal cropping positions are the 7th row and the 101st row. By cropping away the blank areas on the two sides and/or above and below the image, the table frame lines appear at the very edge of the cropped image and can easily be distinguished from the text, which reduces their interference with the text to be recognized.
  • Referring to part (a) of FIG. 5, when no table frame lines are present in the image of the text area to be recognized, the blank areas on the left and right sides of the image can be determined from the leading and trailing consecutive 0 elements in the first list, and the blank areas above and below can likewise be determined from the leading and trailing consecutive 0 elements in the second list. In this way, the column immediately before the text begins and the column immediately after it ends in the horizontal direction, and the row immediately above where the text begins and the row immediately below where it ends in the vertical direction, can be determined as the cropping positions, which reduces the overhead of recognizing blank areas.
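  • A sketch of the cropping logic under the same assumptions: for the frame-line case the crop is taken one column or row outside the first and last large jump, in the spirit of the 10th/200th-column and 8th/100th-row examples above, and for the no-frame-line case one blank column or row is kept around the text based on the leading and trailing runs of 0 elements. Treating the position of a large jump as the column or row of the frame line is a simplification:

```python
import numpy as np

def crop_text_region(region, first_list, second_list,
                     first_preset=5000, second_preset=5000):
    """Crop the image of the text area to be recognized so that table frame
    lines end up at the image edge, or so that surrounding blank margins
    are removed when no frame lines are present."""
    h, w = region.shape[:2]

    def bounds(values, preset, size):
        values = np.asarray(values, dtype=np.int64)
        jumps = np.where(np.abs(np.diff(values)) >= preset)[0]
        if jumps.size:
            # Frame lines present: cut one position outside the first and
            # last jump so that the frame lines lie at the very edge.
            return max(0, int(jumps[0]) - 1), min(size, int(jumps[-1]) + 2)
        nonzero = np.where(values > 0)[0]
        if nonzero.size:
            # No frame lines: drop the leading and trailing runs of 0 elements,
            # keeping one blank column/row around the text.
            return max(0, int(nonzero[0]) - 1), min(size, int(nonzero[-1]) + 2)
        return 0, size

    x1, x2 = bounds(first_list, first_preset, w)
    y1, y2 = bounds(second_list, second_preset, h)
    return region[y1:y2, x1:x2]
```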
  • For the cropped image of the text area to be recognized, the interference caused by the seal and the table frame lines has been eliminated, and OCR technology is then used to recognize the text in it, which helps improve the accuracy of the text recognition result.
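  • The present application does not name a particular OCR engine; purely for illustration, the cropped image could be handed to an off-the-shelf engine such as Tesseract through pytesseract (the chi_sim language pack is an assumption about the documents being processed and must be installed separately):

```python
import cv2
import pytesseract

def recognize_text(cropped_region_bgr):
    """Run OCR on the cropped image of the text area to be recognized."""
    # Tesseract expects RGB (or single-channel) input; OpenCV images are BGR.
    rgb = cv2.cvtColor(cropped_region_bgr, cv2.COLOR_BGR2RGB)
    return pytesseract.image_to_string(rgb, lang="chi_sim").strip()
```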
  • Further, the table frame line detection result may also include the presence of slanted table frame lines. While the pixel sums are used in step 204 to detect vertical and horizontal table frame lines, the present application may also apply a straight-line detection algorithm to the image of the text area to be recognized so as to detect slanted table frame lines. If it is determined that a slanted table frame line is present, the mean value of the background color of the original image can be used to cover the slanted table frame line.
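  • For this extension, a straight-line detection algorithm such as the probabilistic Hough transform can be used. The sketch below flags detected line segments whose angle is clearly neither horizontal nor vertical and paints them over with the background-color mean computed earlier; the Canny and Hough parameters and the 5 degree angle tolerance are illustrative assumptions:

```python
import cv2
import numpy as np

def cover_slanted_frame_lines(region_bgr, background_mean,
                              angle_tol_deg=5.0, thickness=3):
    """Detect slanted straight lines and cover them with the mean
    background color of the original image."""
    gray = cv2.cvtColor(region_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=80,
                            minLineLength=min(region_bgr.shape[:2]) // 2,
                            maxLineGap=5)
    out = region_bgr.copy()
    if lines is None:
        return out
    color = tuple(int(c) for c in background_mean)
    for x1, y1, x2, y2 in lines.reshape(-1, 4):
        angle = abs(np.degrees(np.arctan2(y2 - y1, x2 - x1))) % 180.0
        # Keep only segments that are clearly neither horizontal nor vertical.
        if angle_tol_deg < angle < 90.0 - angle_tol_deg or \
           90.0 + angle_tol_deg < angle < 180.0 - angle_tol_deg:
            cv2.line(out, (int(x1), int(y1)), (int(x2), int(y2)),
                     color, thickness)
    return out
```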
  • It can be seen that, in the embodiments of the present application, seal detection is performed on the original image to obtain the seal area, the seal area is filled with the mean value of the background color of the original image to obtain the image to be detected, text detection is then performed on the image to be detected to obtain the image of the text area to be recognized, table frame line detection is performed on the image of the text area to be recognized to obtain the table frame line detection result, the cropping position of the image of the text area to be recognized is determined according to the table frame line detection result, the image of the text area to be recognized is cropped based on the cropping position to obtain the cropped image of the text area to be recognized, and a text recognition result is obtained based on the cropped image of the text area to be recognized.
  • In this way, after the seal area is detected, it is filled with the mean value of the background color of the original image so that the seal area becomes part of the background. The cropping position of the image of the text area to be recognized is then determined from the table frame line detection result, and the image is cropped based on the cropping position so that the blank margins on its upper and lower sides and/or left and right sides are removed and the table frame lines lie at the edge of the image. This reduces the interference of the table frame lines with text recognition and thus helps improve the accuracy of text detection.
  • In addition, because this way of eliminating interference is relatively lightweight, it also helps improve the efficiency of text recognition to a certain extent.
  • Referring to FIG. 6, FIG. 6 is a schematic flowchart of another text detection and recognition method provided by an embodiment of the present application. As shown in FIG. 6, the method includes steps 601-609:
  • 601: Convert the original image into a first binary image;
  • 602: Determine the circular contour in the original image according to the first binary image;
  • 603: Determine the hue of the circular contour according to the original image;
  • 604: Obtain the seal area according to the hue of the circular contour;
  • 605: Fill the seal area with the mean value of the background color of the original image to obtain the image to be detected;
  • 606: Perform text detection on the image to be detected to obtain an image of the text area to be recognized;
  • 607: Perform table frame line detection on the image of the text area to be recognized to obtain a table frame line detection result;
  • 608: Determine the cropping position of the image of the text area to be recognized according to the table frame line detection result, and crop the image of the text area to be recognized based on the cropping position to obtain a cropped image of the text area to be recognized;
  • 609: Obtain a text recognition result based on the cropped image of the text area to be recognized.
  • steps 601-609 have been described in the embodiments shown in FIG. 2-FIG. 5, and can achieve the same or similar beneficial effects. In order to avoid repetition, details are not repeated here.
  • FIG. 7 is a schematic structural diagram of a text detection and recognition device provided by the embodiment of the present application. As shown in FIG. 7, the device includes a detection unit 701 and a recognition unit 702; where,
  • the detection unit 701 is configured to perform seal detection on the original image to obtain a seal area;
  • the recognition unit 702 is configured to fill the seal area with the mean value of the background color of the original image to obtain an image to be detected;
  • the detection unit 701 is further configured to perform text detection on the image to be detected to obtain an image of the text area to be recognized;
  • the detection unit 701 is further configured to perform table frame line detection on the image of the text area to be recognized to obtain a table frame line detection result;
  • the recognition unit 702 is further configured to determine the cropping position of the image of the text area to be recognized according to the table frame line detection result, and to crop the image of the text area to be recognized based on the cropping position to obtain a cropped image of the text area to be recognized;
  • the recognition unit 702 is further configured to obtain a text recognition result based on the cropped image of the text area to be recognized.
  • It can be seen that, in the text detection and recognition apparatus shown in FIG. 7, seal detection is performed on the original image to obtain the seal area, the seal area is filled with the mean value of the background color of the original image to obtain the image to be detected, text detection is then performed on the image to be detected to obtain the image of the text area to be recognized, table frame line detection is performed on the image of the text area to be recognized to obtain the table frame line detection result, the cropping position of the image of the text area to be recognized is determined according to the table frame line detection result, the image of the text area to be recognized is cropped based on the cropping position to obtain the cropped image of the text area to be recognized, and a text recognition result is obtained based on the cropped image of the text area to be recognized.
  • In this way, after the seal area is detected, it is filled with the mean value of the background color of the original image so that the seal area becomes part of the background. The cropping position of the image of the text area to be recognized is then determined from the table frame line detection result, and the image is cropped based on the cropping position so that the blank margins on its upper and lower sides and/or left and right sides are removed and the table frame lines lie at the edge of the image. This reduces the interference of the table frame lines with text recognition and thus helps improve the accuracy of text detection.
  • In addition, because this way of eliminating interference is relatively lightweight, it also helps improve the efficiency of text recognition to a certain extent.
  • In a possible implementation, in terms of performing seal detection on the original image to obtain the seal area, the detection unit 701 is specifically configured to: convert the original image into a first binary image; determine a circular contour in the original image according to the first binary image; determine the hue of the circular contour according to the original image; and obtain the seal area according to the hue of the circular contour.
  • In a possible implementation, in terms of determining the circular contour in the original image according to the first binary image, the detection unit 701 is specifically configured to: determine the contours in the original image according to the first binary image; calculate the ratio of the area enclosed by each contour to the area of its minimum circumscribed circle to obtain the area ratios of the plurality of contours; and compare the area ratios of the plurality of contours with a preset area threshold, and determine a contour whose area ratio is greater than or equal to the preset area threshold as a circular contour.
  • In a possible implementation, in terms of performing table frame line detection on the image of the text area to be recognized to obtain the table frame line detection result, the detection unit 701 is specifically configured to: convert the image of the text area to be recognized into a second binary image; traverse each column of pixels of the second binary image along the height direction and sum each column of pixels; store the summation result of each column as an element in a list to obtain a first list of length w, where w is an integer greater than 1; traverse each row of pixels of the second binary image along the width direction and sum each row of pixels; store the summation result of each row as an element in a list to obtain a second list of length h, where h is an integer greater than 1; and obtain the table frame line detection result according to the first list and the second list.
  • In a possible implementation, the table frame line detection result includes the presence of vertical table frame lines, the presence of horizontal table frame lines, and the absence of table frame lines; in terms of obtaining the table frame line detection result according to the first list and the second list, the detection unit 701 is specifically configured to: calculate the first difference between the summation result at each position in the first list and the summation result at the adjacent position, and determine that the detection result is the presence of vertical table frame lines if there is a target first difference greater than or equal to a first preset value; calculate the second difference between the summation result at each position in the second list and the summation result at the adjacent position, and determine that the detection result is the presence of horizontal table frame lines if there is a target second difference greater than or equal to a second preset value; and determine that the detection result is the absence of table frame lines if no target first difference exists among the first differences and no target second difference exists among the second differences.
  • In a possible implementation, in terms of determining the cropping position of the image of the text area to be recognized according to the table frame line detection result, the recognition unit 702 is specifically configured to: determine the cropping position according to the column where the vertical table frame line is located and/or the row where the horizontal table frame line is located when the table frame line detection result is that vertical table frame lines and/or horizontal table frame lines are present; and determine the cropping position according to the leading and trailing consecutive 0 elements in the first list and the second list when the table frame line detection result is that no table frame lines are present.
  • According to an embodiment of the present application, the units of the text detection and recognition apparatus shown in FIG. 7 may be separately or entirely combined into one or several other units, or one (or some) of the units may be further split into multiple functionally smaller units. This can achieve the same operations without affecting the realization of the technical effects of the embodiments of the present application.
  • The above units are divided on the basis of logical functions. In practical applications, the functions of one unit may also be realized by multiple units, or the functions of multiple units may be realized by one unit.
  • In other embodiments of the present application, the text detection and recognition apparatus may also include other units. In practical applications, these functions may also be realized with the assistance of other units, and may be realized by multiple units in cooperation.
  • According to another embodiment of the present application, the text detection and recognition apparatus shown in FIG. 7 can be constructed, and the text detection and recognition method of the embodiments of the present application can be implemented, by running a computer program (including program code) capable of executing the steps of the corresponding method shown in FIG. 2 or FIG. 6 on a general-purpose computing device, such as a computer, that includes processing elements and storage elements such as a central processing unit (CPU), a random access memory (RAM), and a read-only memory (ROM).
  • CPU: central processing unit
  • RAM: random access memory
  • ROM: read-only memory
  • the computer program can be recorded in, for example, a computer-readable recording medium, loaded into the above-mentioned computing device through the computer-readable recording medium, and executed therein.
  • an embodiment of the present application further provides an electronic device.
  • the electronic device at least includes a processor 801 , an input device 802 , an output device 803 and a memory 804 .
  • the processor 801, the input device 802, the output device 803 and the memory 804 in the electronic device may be connected through a bus or in other ways.
  • The memory 804 may be the storage within the electronic device; the memory 804 is configured to store a computer program, the computer program includes program instructions, and the processor 801 is configured to execute the program instructions stored in the memory 804.
  • The processor 801 (also called a CPU, central processing unit) is the computing core and control core of the electronic device. It is adapted to implement one or more instructions, and is specifically adapted to load and execute one or more instructions so as to realize the corresponding method flow or corresponding function.
  • CPU: Central Processing Unit
  • the processor 801 of the electronic device provided in the embodiment of this application can be used to perform a series of text detection and recognition processing:
  • seal detection is performed on the original image to obtain the seal area; the seal area is filled with the mean value of the background color of the original image to obtain the image to be detected; text detection is performed on the image to be detected to obtain the image of the text area to be recognized; table frame line detection is performed on the image of the text area to be recognized to obtain a table frame line detection result; the cropping position of the image of the text area to be recognized is determined according to the table frame line detection result, and the image of the text area to be recognized is cropped based on the cropping position to obtain a cropped image of the text area to be recognized; and a text recognition result is obtained based on the cropped image of the text area to be recognized.
  • In this way, after the seal area is detected, it is filled with the mean value of the background color of the original image so that the seal area becomes part of the background. The cropping position of the image of the text area to be recognized is then determined from the table frame line detection result, and the image is cropped based on the cropping position so that the blank margins on its upper and lower sides and/or left and right sides are removed and the table frame lines lie at the edge of the image. This reduces the interference of the table frame lines with text recognition and thus helps improve the accuracy of text detection.
  • In addition, because this way of eliminating interference is relatively lightweight, it also helps improve the efficiency of text recognition to a certain extent.
  • In another embodiment, when the processor 801 performs seal detection on the original image to obtain the seal area, it: converts the original image into a first binary image; determines a circular contour in the original image according to the first binary image; determines the hue of the circular contour according to the original image; and obtains the seal area according to the hue of the circular contour.
  • In another embodiment, when the processor 801 determines the circular contour in the original image according to the first binary image, it: determines the contours in the original image according to the first binary image; calculates the ratio of the area enclosed by each contour to the area of its minimum circumscribed circle to obtain the area ratios of the plurality of contours; and compares the area ratios of the plurality of contours with a preset area threshold, and determines a contour whose area ratio is greater than or equal to the preset area threshold as a circular contour.
  • In another embodiment, when the processor 801 performs table frame line detection on the image of the text area to be recognized to obtain the table frame line detection result, it: converts the image of the text area to be recognized into a second binary image; traverses each column of pixels of the second binary image along the height direction and sums each column of pixels; stores the summation result of each column as an element in a list to obtain a first list of length w, where w is an integer greater than 1; traverses each row of pixels of the second binary image along the width direction and sums each row of pixels; stores the summation result of each row as an element in a list to obtain a second list of length h, where h is an integer greater than 1; and obtains the table frame line detection result according to the first list and the second list.
  • In another embodiment, the table frame line detection result includes the presence of vertical table frame lines, the presence of horizontal table frame lines, and the absence of table frame lines; when the processor 801 obtains the table frame line detection result according to the first list and the second list, it: calculates the first difference between the summation result at each position in the first list and the summation result at the adjacent position, and determines that the detection result is the presence of vertical table frame lines if there is a target first difference greater than or equal to a first preset value; calculates the second difference between the summation result at each position in the second list and the summation result at the adjacent position, and determines that the detection result is the presence of horizontal table frame lines if there is a target second difference greater than or equal to a second preset value; and determines that the detection result is the absence of table frame lines if no target first difference exists among the first differences and no target second difference exists among the second differences.
  • In another embodiment, when the processor 801 determines the cropping position of the image of the text area to be recognized according to the table frame line detection result, it: determines the cropping position according to the column where the vertical table frame line is located and/or the row where the horizontal table frame line is located when the detection result is that vertical table frame lines and/or horizontal table frame lines are present; and determines the cropping position according to the leading and trailing consecutive 0 elements in the first list and the second list when the detection result is that no table frame lines are present.
  • Illustratively, the electronic device includes, but is not limited to, the processor 801, the input device 802, the output device 803, and the memory 804, and may further include internal memory, a power supply, an application client module, and the like.
  • the input device 802 may be a keyboard, a touch screen, a radio frequency receiver, etc.
  • the output device 803 may be a speaker, a display, a radio frequency transmitter, etc.
  • Those skilled in the art can understand that the schematic diagram is only an example of the electronic device and does not constitute a limitation on the electronic device; the electronic device may include more or fewer components than shown in the figure, may combine certain components, or may have different components.
  • It should be noted that, because the processor 801 of the electronic device implements the steps of the above text detection and recognition method when executing the computer program, the embodiments of the above text detection and recognition method are all applicable to the electronic device and can achieve the same or similar beneficial effects.
  • the embodiment of the present application also provides a computer storage medium (Memory).
  • the computer storage medium is a memory device in an electronic device and is used to store programs and data. It can be understood that the computer storage medium here may include a built-in storage medium in the terminal, and of course may also include an extended storage medium supported by the terminal.
  • the computer storage medium provides storage space, and the storage space stores the operating system of the terminal.
  • one or more instructions suitable for being loaded and executed by the processor 801 are also stored in the storage space, and these instructions may be one or more computer programs (including program codes).
  • The computer storage medium here may be a high-speed RAM memory or a non-volatile memory, for example at least one disk memory; optionally, it may also be at least one computer storage medium located away from the aforementioned processor 801.
  • one or more instructions stored in the computer storage medium may be loaded and executed by the processor 801, so as to implement the corresponding steps of the above-mentioned text detection and recognition method.
  • the computer-readable storage medium may be non-volatile or volatile.
  • Illustratively, the computer program on the computer storage medium includes computer program code, and the computer program code may be in the form of source code, object code, an executable file, or some intermediate form.
  • The computer-readable medium may include any entity or apparatus capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Geometry (AREA)
  • Character Input (AREA)
  • Character Discrimination (AREA)

Abstract

The present application provides a text detection and recognition method and apparatus, an electronic device, and a storage medium. The method includes: performing seal detection on an original image to obtain a seal area; filling the seal area with the mean value of the background color of the original image to obtain an image to be detected; performing text detection on the image to be detected to obtain an image of a text area to be recognized; performing table frame line detection on the image of the text area to be recognized to obtain a table frame line detection result; determining a cropping position of the image of the text area to be recognized according to the table frame line detection result, and cropping the image of the text area to be recognized based on the cropping position to obtain a cropped image of the text area to be recognized; and obtaining a text recognition result based on the cropped image of the text area to be recognized. The embodiments of the present application help improve the accuracy of text detection and recognition.

Description

文字检测识别方法、装置、电子设备及存储介质
优先权申明
本申请要求于2021年10月30日提交中国专利局、申请号为202111279385.6,发明名称为“文字检测识别方法、装置、电子设备及存储介质”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及人工智能、图像识别技术领域,尤其涉及一种文字检测识别方法、装置、电子设备及存储介质。
背景技术
随着计算机性能的不断提高,高度依赖中央处理器或图形处理器等计算资源的深度学习技术广泛应用于社会各行各业中,并取得了突出的成果。OCR(Optical Character Recognition,光学字符识别)技术是近年来发展较为成熟的基于深度学习的技术,其是指电子设备检查物件上打印的字符,通过检测暗、亮的模式确定其形状,然后用字符识别方法将形状翻译成计算机文字的过程。发明人意识到一般情况下,OCR技术能够胜任文字的定位和识别,但是考虑到深度学习中神经网络在实现机制和资源占用等方面的限制,当物件存在干扰、噪声、失真等情况时,文字检测和识别的精度会受到影响。
发明内容
针对上述问题,本申请提供了一种文字检测识别方法、装置、电子设备及存储介质,有利于提升文字检测和识别的精度。
为实现上述目的,本申请实施例第一方面提供了一种文字检测识别方法,该方法包括:
对原始图像进行印章检测,得到印章区域;
采用原始图像的背景颜色的均值对印章区域进行填充,得到待检测图像;
对待检测图像进行文字检测,得到待识别文字区域图像;
对待识别文字区域图像进行表格框线检测,得到表格框线检测结果;
根据表格框线检测结果确定待识别文字区域图像的裁剪位置,以及基于裁剪位置对待识别文字区域图像进行裁剪,得到裁剪后的待识别文字区域图像;
基于裁剪后的待识别文字区域图像,得到文字识别结果。
结合第一方面,在一种可能的实施方式中,对原始图像进行印章检测,得到印章区域,包括:
将原始图像转换为第一二值图像;
根据第一二值图像确定原始图像中的圆形轮廓;
根据原始图像确定圆形轮廓的色调;
根据圆形轮廓的色调,得到印章区域。
结合第一方面,在一种可能的实施方式中,根据第一二值图像确定原始图像中的圆形轮廓,包括:
根据第一二值图像确定出原始图像中的轮廓;
计算轮廓围成图形的面积与轮廓的最小外接圆的面积之比,得到原始图像中多个轮廓的面积比;
将多个轮廓的面积比与预设的面积阈值进行比对,并将面积比大于或等于预设的面积阈值的轮廓确定为圆形轮廓。
结合第一方面,在一种可能的实施方式中,对待识别文字区域图像进行表格框线检测,得到表格框线检测结果,包括:
将待识别文字区域图像转换为第二二值图像;
沿高度方向遍历第二二值图像的每列像素,对每列像素进行求和;
将每列像素的求和结果作为元素存入列表,得到长为w的第一列表,w为大于1的整数;
沿宽度方向遍历第二二值图像的每行像素,对每行像素进行求和;
将每行像素的求和结果作为元素存入列表,得到长为h的第二列表,h为大于1的整数;
根据第一列表和第二列表,得到表格框线检测结果。
结合第一方面,在一种可能的实施方式中,表格框线检测结果包括存在竖向表格框线、存在横向表格框线和不存在表格框线;根据第一列表和第二列表,得到表格框线检测结果,包括:
计算第一列表中每个位置上的求和结果与相邻位置上的求和结果的第一差值,若第一差值中存在大于或等于第一预设值的目标第一差值,则确定表格框线检测结果为存在竖向表格框线;
计算第二列表中每个位置上的求和结果与相邻位置上的求和结果的第二差值,若第二差值中存在大于或等于第二预设值的目标第二差值,则确定表格框线检测结果为存在横向表格框线;
若第一差值中不存在目标第一差值且第二差值中不存在目标第二差值,则确定表格框线检测结果为不存在表格框线。
结合第一方面,在一种可能的实施方式中,根据表格框线检测结果确定待识别文字区域图像的裁剪位置,包括:
在表格框线检测结果为存在竖向表格框线和/或存在横向表格框线的情况下,根据竖向表格框线所在的列和/或横向表格框线所在的行确定裁剪位置;
在表格框线检测结果为不存在表格框线的情况下,根据第一列表和第二列表中首尾连续的0元素确定裁剪位置。
本申请实施例第二方面提供了一种文字检测识别装置,该装置包括检测单元和识别单元;其中,
检测单元,用于对原始图像进行印章检测,得到印章区域;
识别单元,用于采用原始图像的背景颜色的均值对印章区域进行填充,得到待检测图像;
检测单元,还用于对待检测图像进行文字检测,得到待识别文字区域图像;
检测单元,还用于对待识别文字区域图像进行表格框线检测,得到表格框线检测结果;
识别单元,还用于根据表格框线检测结果确定待识别文字区域图像的裁剪位置,以及基于裁剪位置对待识别文字区域图像进行裁剪,得到裁剪后的待识别文字区域图像;
识别单元,还用于基于裁剪后的待识别文字区域图像,得到文字识别结果。
本申请实施例第三方面提供了一种电子设备,该电子设备包括输入设备和输出设备,还包括处理器,适于实现一条或多条指令;以及,存储器,所述存储器存储有一条或多条计算机程序,所述一条或多条计算机程序适于由所述处理器加载并执行如下步骤:
对原始图像进行印章检测,得到印章区域;
采用原始图像的背景颜色的均值对印章区域进行填充,得到待检测图像;
对待检测图像进行文字检测,得到待识别文字区域图像;
对待识别文字区域图像进行表格框线检测,得到表格框线检测结果;
根据表格框线检测结果确定待识别文字区域图像的裁剪位置,以及基于裁剪位置对待识别文字区域图像进行裁剪,得到裁剪后的待识别文字区域图像;
基于裁剪后的待识别文字区域图像,得到文字识别结果。
本申请实施例第四方面提供了一种计算机存储介质,所述计算机存储介质存储有一条或多条指令,所述一条或多条指令适于由处理器加载并执行如下步骤:
对原始图像进行印章检测,得到印章区域;
采用原始图像的背景颜色的均值对印章区域进行填充,得到待检测图像;
对待检测图像进行文字检测,得到待识别文字区域图像;
对待识别文字区域图像进行表格框线检测,得到表格框线检测结果;
根据表格框线检测结果确定待识别文字区域图像的裁剪位置,以及基于裁剪位置对待识别文字区域图像进行裁剪,得到裁剪后的待识别文字区域图像;
基于裁剪后的待识别文字区域图像,得到文字识别结果。
本申请的上述方案至少包括以下有益效果:
本申请实施例中,通过对原始图像进行印章检测,得到印章区域,采用原始图像的背景颜色的均值对印章区域进行填充,得到待检测图像,然后对待检测图像进行文字检测,得到待识别文字区域图像,对待识别文字区域图像进行表格框线检测,得到表格框线检测结果,根据表格框线检测结果确定待识别文字区域图像的裁剪位置,以及基于裁剪位置对待识别文字区域图像进行裁剪,得到裁剪后的待识别文字区域图像,基于裁剪后的待识别文字区域图像,得到文字识别结果。这样检测出印章区域后,采用原始图像的背景颜色的均值对印章区域进行填充,将印章区域变为背景区域,然后通过表格框线检测结果确定待识别文字区域图像的裁剪位置,基于裁剪位置对待识别文字区域图像进行裁剪,以将待识别文字区域图像中上下侧和/或左右侧的空白裁剪掉,让表格框线位于图像的边缘,以降低表格框线对文字识别的干扰,从而有利于提升文字检测的精度,另外,由于这种消除干扰的处理方式相对比较精简,在一定程度上还有利于提升文字识别的效率。
附图说明
为了更清楚地说明本申请实施例或现有技术中的技术方案,下面将对实施例或现有技术描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1为本申请实施例提供的一种应用环境的示意图;
图2为本申请实施例提供的一种文字检测识别方法的流程示意图;
图3A为本申请实施例提供的一种不存在竖向表格框线的曲线示意图;
图3B为本申请实施例提供的一种存在竖向表格框线的曲线示意图;
图4A为本申请实施例提供的一种不存在横向表格框线的曲线示意图;
图4B为本申请实施例提供的一种存在横向表格框线的曲线示意图;
图5为本申请实施例提供的一种待识别文字区域图像裁剪前和裁剪后的示意图;
图6为本申请实施例提供的另一种文字检测识别方法的流程示意图;
图7为本申请实施例提供的一种文字检测识别装置的结构示意图;
图8为本申请实施例提供的一种电子设备的结构示意图。
具体实施方式
为了使本技术领域的人员更好地理解本申请方案,下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本申请一部分的实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都应当属于本申请保护的范围。
本申请说明书、权利要求书和附图中出现的术语“包括”和“具有”以及它们任何变形, 意图在于覆盖不排他的包含。例如包含了一系列步骤或单元的过程、方法、系统、产品或设备没有限定于已列出的步骤或单元,而是可选地还包括没有列出的步骤或单元,或可选地还包括对于这些过程、方法、产品或设备固有的其它步骤或单元。此外,术语“第一”、“第二”和“第三”等是用于区别不同的对象,而并非用于描述特定的顺序。
本申请实施例提供一种文字检测识别方法,可基于图1所示的应用环境实施,请参见图1,该应用环境中包括电子设备101和通过网络与电子设备101连接的终端设备102。其中,终端设备102用于提供待检测识别的原始图像,比如该原始图像可以是一些文档、票据的电子版,在一些场景中,终端设备102还用于提供有图像采集装置,该图像采集装置可以是摄像头,或与终端设备102连接的扫描枪、传感器等,图像采集装置用于采集文档、票据等的原始图像,并将其发送给终端设备102的通信装置,由终端设备102的通信装置对该原始图像进行压缩打包,并将压缩打包后的原始图像发送给电子设备101。电子设备101通过自身的通信装置接收终端设备102发送的原始图像,并执行解压缩操作,电子设备101通过图形处理器调用程序指令对原始图像进行印章检测和表格框线检测,并执行印章区域填充和裁剪操作,以消除印章上排布的文字和表格框线对原始图像中待识别文字的干扰,最后采用OCR技术对裁剪后的待识别文字区域图像进行文字识别,相对能够提升文字识别的精度。
示例性的,电子设备101可以是独立的服务器,也可以是提供云服务、云数据库、云计算、云函数、云存储、网络服务、云通信、中间件服务、域名服务、安全服务、内容分发网络(Content Delivery Network,CDN)、以及大数据和人工智能平台等基础云计算服务的云服务器。终端设备102可以是智能手机、电脑、个人数字助理、以及自助终端,等等。
基于图1所示的应用环境,以下结合其他附图对本申请实施例提供的文字检测识别方法进行详细阐述。
请参见图2,图2为本申请实施例提供的一种文字检测识别方法的流程示意图,该方法应用于电子设备,如图2所示,包括步骤201-206:
201:对原始图像进行印章检测,得到印章区域。
本申请具体实施例中,原始图像可以是票据、报表、各种报告的图像,示例性的,对原始图像进行印章检测,得到印章区域,包括:
将原始图像转换为第一二值图像;
根据第一二值图像确定原始图像中的圆形轮廓;
根据原始图像确定圆形轮廓的色调;
根据圆形轮廓的色调,得到印章区域。
具体的,将原始图像转换为灰度图,对灰度图进行阈值计算,按照该阈值将灰度图转化为第一二值图像。示例性的,根据第一二值图像确定原始图像中的圆形轮廓,包括:
根据第一二值图像确定出原始图像中的轮廓;
计算轮廓围成图形的面积与轮廓的最小外接圆的面积之比,得到原始图像中多个轮廓的面积比;
将多个轮廓的面积比与预设的面积阈值进行比对,并将面积比大于或等于预设的面积阈值的轮廓确定为圆形轮廓。
应理解,第一二值图像能够呈现出原始图像中非背景区域的所有轮廓(即上述各个轮廓),针对原始图像中的各个轮廓,依次获取其最小外接圆,然后计算其与最小外接圆的面积之比,该面积之比的取值范围为(0,1],若该面积之比大于或等于预设面积阈值,则认为该轮廓近似于圆形,即可将该轮廓确定为圆形轮廓。
本申请具体实施例中,印章通过两个特点进行判定,一是形状为圆形,二是颜色为红色,在确定出圆形轮廓后,只需确定其在原始图像中的色调是否为红色即可。具体的,将原始图像转换为HSV(Hue,色调;Saturation,饱和度;Value,亮度)模式,若圆形轮廓的色调的值在[0°,10°]∪[156°,180°]这一预设范围内,则认为圆形轮廓的颜色为红色系,在圆形轮廓的颜色为红色系的情况下,确定该圆形轮廓所覆盖的区域为印章区域。
202:采用原始图像的背景颜色的均值对印章区域进行填充,得到待检测图像。
本申请具体实施例中,为了避免印章上环绕排布的文字被划入距离其较近的待识别文字(或检测框)中,采用原始图像的背景颜色的均值对印章区域进行填充,从而将印章区域变为背景区域,以降低印章上排布的文字对距其较近的待识别文字的干扰。
203:对待检测图像进行文字检测,得到待识别文字区域图像。
本申请具体实施例中,印章区域被填充后,基本排除了印章对文字识别的干扰,但是对于表格中的文字而言,表格框线仍然会对表格中的文字的识别造成干扰,比如识别区域因检测失误包含了横向和(或)竖向的表格框线,导致识别时模型发挥不佳或将框线识别为数字“1”、字母“l”、汉字“一”等字符,因此,需要消除表格框线对识别精度的影响。具体的,首先采用目标检测算法检测出待识别图像中的待识别文字区域,然后基于待识别文字区域的检测框从待检测图像中截取出待识别文字区域图像,例如:将有文字的每个单元格截取为待识别文字区域图像,基于待识别文字区域图像执行后续的消除框线干扰的操作。
其中,目标检测算法可以采用Faster R-CNN(Faster Region-Convolutional Neural Networks,更快速的候选区域卷积神经网络检测器)、YOLO(You Only Look Once,一瞥目标检测器)等,此处不作限定。
204:对待识别文字区域图像进行表格框线检测,得到表格框线检测结果。
本申请具体实施例中,示例性的,对待识别文字区域图像进行表格框线检测,得到表格框线检测结果,包括:
将待识别文字区域图像转换为第二二值图像;
沿高度方向遍历第二二值图像的每列像素,对每列像素进行求和;
将每列像素的求和结果作为元素存入列表,得到长为w的第一列表,w为大于1的整 数;
沿宽度方向遍历第二二值图像的每行像素,对每行像素进行求和;
将每行像素的求和结果作为元素存入列表,得到长为h的第二列表,h为大于1的整数;
根据第一列表和第二列表,得到表格框线检测结果。
应理解,对于待识别文字区域图像,先将其转为灰度图,再进行阈值计算,按照该阈值将灰度图转化为二值图像,即第二二值图像。在待识别文字区域图像的宽度为w的情况下,对第二二值图像的每列像素进行求和,相应会得到w个求和结果,将w个求和结果作为元素依次存入列表,得到第一列表。同理,在待识别文字区域图像的高度为h的情况下,对第二二值图像的每行像素进行求和,相应会得到h个求和结果,将h个求和结果作为元素依次存入另一列表,得到第二列表。
示例性的,表格框线检测结果包括存在竖向表格框线、存在横向表格框线和不存在表格框线;根据第一列表和第二列表,得到表格框线检测结果,包括:
计算第一列表中每个位置上的求和结果与相邻位置上的求和结果的第一差值,若第一差值中存在大于或等于第一预设值的目标第一差值,则确定表格框线检测结果为存在竖向表格框线;
计算第二列表中每个位置上的求和结果与相邻位置上的求和结果的第二差值,若第二差值中存在大于或等于第二预设值的目标第二差值,则确定表格框线检测结果为存在横向表格框线;
若第一差值中不存在目标第一差值且第二差值中不存在目标第二差值,则确定表格框线检测结果为不存在表格框线。
对于竖向表格框线,其所在列的像素之和通常较大,通过计算第一列表中每个位置上的求和结果与相邻位置上的求和结果的第一差值,能够反映出每列与相邻列之间的像素求和结果的变化幅度,当计算出的所有第一差值中存在大于或等于第一预设值的目标第一差值,则可确定待识别文字区域图像中包含了竖向表格框线,例如:第三列像素求和结果与第四列像素求和结果之差达到第一预设值,说明第三列或第四列上出现了竖向表格框线,如图3B所示,通过对第一列表进行曲线拟合,不难看出在首尾两处均出现了幅度较大的跳变,这就表明待识别文字区域图像的左右存在竖向表格框线。
其中,图3A示出了图像中首尾不存在竖向表格框线的情况,与图3B进行对比,不难看出,首尾不存在竖向表格框线的像素之和的跳变幅度明显较小。其中,图3A和图3B中的横坐标表示第一列表中的每个位置,即待识别文字区域图像的每一列,纵坐标表示第一列表中每个位置上的值,即待识别文字区域图像每一列的像素之和。
对于横向表格框线,其所在行的像素之和通常较大,通过计算第二列表中每个位置上的求和结果与相邻位置上的求和结果的第二差值,能够反映出每行与相邻行之间的像素求 和结果的变化幅度,当计算出的所有第二差值中存在大于或等于第二预设值的目标第二差值,则可确定待识别文字区域图像中包含了横向表格框线,例如:第三行像素求和结果与第四行像素求和结果之差达到第二预设值,说明第三行或第四行上出现了横向表格框线,如图4B所示,通过对第二列表进行曲线拟合,不难出来在首尾两处也出现了幅度较大的跳变,这就表明待识别文字区域图像的上下存在竖向表格框线。
其中,图4A示出了图像中文字上下两侧不存在横向表格框线的情况,与图4B进行对比,不难看出,文字上下两侧不存在横向表格框线的像素之和的跳变幅度明显较小。其中,图4A和图4B图的横坐标表示第二列表中的每个位置,即待识别文字区域图像的每一行,纵坐标表示第二列表中每个位置上的值,即待识别文字区域图像每一行的像素之和。
应理解,通过对第一列表和第二列表进行曲线拟合发现第一列表和第二列表中每个位置上的像素求和结果与相邻位置上的像素求和结果之间并没有出现幅度较大的跳变,即不存在目标第一差值和目标第二差值,则可确定待识别文字区域图像中不包含表格框线,例如:原始图像对应的文档确实没有表格,或者文档采用了无框线表格。
205:根据表格框线检测结果确定待识别文字区域图像的裁剪位置,以及基于裁剪位置对待识别文字区域图像进行裁剪,得到裁剪后的待识别文字区域图像。
本申请具体实施例中,根据表格框线检测结果确定待识别文字区域图像的裁剪位置,包括:
在表格框线检测结果为存在竖向表格框线和/或存在横向表格框线的情况下,根据竖向表格框线所在的列和/或横向表格框线所在的行确定裁剪位置;
在表格框线检测结果为不存在表格框线的情况下,根据第一列表和第二列表中首尾连续的0元素确定裁剪位置。
具体的,请参见图5中(b)、(d)图,对于待识别文字区域图像中存在竖向表格框线的情况,通过第一列表可以确定出竖向表格框线所在的列,例如:两侧的竖向表格框线分别在第10列和第200列,则竖向的裁剪位置为第9列和第201列。同理,请参见图5中(b)、(c)图,对于待识别文字区域图像中存在横向表格框线的情况,通过第二列表可以确定出横向表格框线所在的行,例如:上下的横向表格框线分别在第8行和第100行,则横向的裁剪位置为第7行和第101行。通过将待识别文字区域图像两侧和/或上下的空白区域裁剪掉,让表格框线出现在裁剪后的待识别文字区域图像的最边缘,就很容易将表格框线与文字区别开,以此降低表格框线对待识别文字的干扰。请参见图5中(a)图,对于待识别文字区域图像中不存在表格框线的情况,通过第一列表中首尾连续的0元素,可以确定待识别文字区域图像中左右两侧的空白区域,同理,通过第二列表中首尾连续的0元素,可以确定待识别文字区域图像中上下的空白区域,如此,便能将横向文字开始的前一列和结束的后一列确定为裁剪位置,将竖向文字开始的上一行和结束的下一行确定为裁剪位置,以降低识别空白区域带来的开销。
206:基于裁剪后的待识别文字区域图像,得到文字识别结果。
本申请具体实施例中,对于裁剪后的待识别文字区域图像,消除了印章和表格框线带来的干扰,再采用OCR技术对其中的文字进行识别,有利于提升文字识别结果的精度。
进一步的,表格框线检测结果还包括存在倾斜的表格框线,在步骤204中采用像素之和检测竖向表格框线和横向表格框线的同时,本申请还可采用直线检测算法对待识别文字区域图像进行检测,以检测出倾斜的表格框线,若确定存在倾斜的表格框线,则可采用原始图像的背景颜色的均值对该倾斜的表格框线进行覆盖。
可以看出,本申请实施例通过对原始图像进行印章检测,得到印章区域,采用原始图像的背景颜色的均值对印章区域进行填充,得到待检测图像,然后对待检测图像进行文字检测,得到待识别文字区域图像,对待识别文字区域图像进行表格框线检测,得到表格框线检测结果,根据表格框线检测结果确定待识别文字区域图像的裁剪位置,以及基于裁剪位置对待识别文字区域图像进行裁剪,得到裁剪后的待识别文字区域图像,基于裁剪后的待识别文字区域图像,得到文字识别结果。这样检测出印章区域后,采用原始图像的背景颜色的均值对印章区域进行填充,将印章区域变为背景区域,然后通过表格框线检测结果确定待识别文字区域图像的裁剪位置,基于裁剪位置对待识别文字区域图像进行裁剪,以将待识别文字区域图像中上下侧和/或左右侧的空白裁剪掉,让表格框线位于图像的边缘,以降低表格框线对文字识别的干扰,从而有利于提升文字检测的精度,另外,由于这种消除干扰的处理方式相对比较精简,在一定程度上还有利于提升文字识别的效率。
请参见图6,图6本申请实施例提供的另一种文字检测识别方法的流程示意图,如图6所示,包括步骤601-609:
601:将原始图像转换为第一二值图像;
602:根据第一二值图像确定原始图像中的圆形轮廓;
603:根据原始图像确定圆形轮廓的色调;
604:根据圆形轮廓的色调,得到印章区域;
605:采用原始图像的背景颜色的均值对印章区域进行填充,得到待检测图像;
606:对待检测图像进行文字检测,得到待识别文字区域图像;
607:对待识别文字区域图像进行表格框线检测,得到表格框线检测结果;
608:根据表格框线检测结果确定待识别文字区域图像的裁剪位置,以及基于裁剪位置对待识别文字区域图像进行裁剪,得到裁剪后的待识别文字区域图像;
609:基于裁剪后的待识别文字区域图像,得到文字识别结果。
其中,步骤601-609的具体实施方式在图2-图5所示的实施例中已有相关说明,且能达到相同或相似的有益效果,为避免重复,此处不再赘述。
基于上述文字检测识别方法实施例的描述,请参见图7,图7为本申请实施例提供的一 种文字检测识别装置的结构示意图,如图7所示,该装置包括检测单元701和识别单元702;其中,
检测单元701,用于对原始图像进行印章检测,得到印章区域;
识别单元702,用于采用原始图像的背景颜色的均值对印章区域进行填充,得到待检测图像;
检测单元701,还用于对待检测图像进行文字检测,得到待识别文字区域图像;
检测单元701,还用于对待识别文字区域图像进行表格框线检测,得到表格框线检测结果;
识别单元702,还用于根据表格框线检测结果确定待识别文字区域图像的裁剪位置,以及基于裁剪位置对待识别文字区域图像进行裁剪,得到裁剪后的待识别文字区域图像;
识别单元702,还用于基于裁剪后的待识别文字区域图像,得到文字识别结果。
可以看出,在图7所示的文字检测识别装置中,通过对原始图像进行印章检测,得到印章区域,采用原始图像的背景颜色的均值对印章区域进行填充,得到待检测图像,然后对待检测图像进行文字检测,得到待识别文字区域图像,对待识别文字区域图像进行表格框线检测,得到表格框线检测结果,根据表格框线检测结果确定待识别文字区域图像的裁剪位置,以及基于裁剪位置对待识别文字区域图像进行裁剪,得到裁剪后的待识别文字区域图像,基于裁剪后的待识别文字区域图像,得到文字识别结果。这样检测出印章区域后,采用原始图像的背景颜色的均值对印章区域进行填充,将印章区域变为背景区域,然后通过表格框线检测结果确定待识别文字区域图像的裁剪位置,基于裁剪位置对待识别文字区域图像进行裁剪,以将待识别文字区域图像中上下侧和/或左右侧的空白裁剪掉,让表格框线位于图像的边缘,以降低表格框线对文字识别的干扰,从而有利于提升文字检测的精度,另外,由于这种消除干扰的处理方式相对比较精简,在一定程度上还有利于提升文字识别的效率。
在一种可能的实施方式中,在对原始图像进行印章检测,得到印章区域方面,检测单元701具体用于:
将原始图像转换为第一二值图像;
根据第一二值图像确定原始图像中的圆形轮廓;
根据原始图像确定圆形轮廓的色调;
根据圆形轮廓的色调,得到印章区域。
在一种可能的实施方式中,在根据第一二值图像确定原始图像中的圆形轮廓方面,检测单元701具体用于:
根据第一二值图像确定出原始图像中的轮廓;
计算轮廓围成图形的面积与轮廓的最小外接圆的面积之比,得到原始图像中多个轮廓的面积比;
将多个轮廓的面积比与预设的面积阈值进行比对,并将面积比大于或等于预设的面积阈值的轮廓确定为圆形轮廓。
在一种可能的实施方式中,在对待识别文字区域图像进行表格框线检测,得到表格框线检测结果方面,检测单元701具体用于:
将待识别文字区域图像转换为第二二值图像;
沿高度方向遍历第二二值图像的每列像素,对每列像素进行求和;
将每列像素的求和结果作为元素存入列表,得到长为w的第一列表,w为大于1的整数;
沿宽度方向遍历第二二值图像的每行像素,对每行像素进行求和;
将每行像素的求和结果作为元素存入列表,得到长为h的第二列表,h为大于1的整数;
根据第一列表和第二列表,得到表格框线检测结果。
在一种可能的实施方式中,表格框线检测结果包括存在竖向表格框线、存在横向表格框线和不存在表格框线;在根据第一列表和第二列表,得到表格框线检测结果方面,检测单元701具体用于:
计算第一列表中每个位置上的求和结果与相邻位置上的求和结果的第一差值,若第一差值中存在大于或等于第一预设值的目标第一差值,则确定表格框线检测结果为存在竖向表格框线;
计算第二列表中每个位置上的求和结果与相邻位置上的求和结果的第二差值,若第二差值中存在大于或等于第二预设值的目标第二差值,则确定表格框线检测结果为存在横向表格框线;
若第一差值中不存在目标第一差值且第二差值中不存在目标第二差值,则确定表格框线检测结果为不存在表格框线。
在一种可能的实施方式中,在根据表格框线检测结果确定待识别文字区域图像的裁剪位置方面,识别单元702具体用于:
在表格框线检测结果为存在竖向表格框线和/或存在横向表格框线的情况下,根据竖向表格框线所在的列和/或横向表格框线所在的行确定裁剪位置;
在表格框线检测结果为不存在表格框线的情况下,根据第一列表和第二列表中首尾连续的0元素确定裁剪位置。
根据本申请的一个实施例,图7所示的文字检测识别装置的各个单元可以分别或全部合并为一个或若干个另外的单元来构成,或者其中的某个(些)单元还可以再拆分为功能上更小的多个单元来构成,这可以实现同样的操作,而不影响本申请的实施例的技术效果的实现。上述单元是基于逻辑功能划分的,在实际应用中,一个单元的功能也可以由多个单元来实现,或者多个单元的功能由一个单元实现。在本申请的其它实施例中,文字检测 识别装置也可以包括其它单元,在实际应用中,这些功能也可以由其它单元协助实现,并且可以由多个单元协作实现。
根据本申请的另一个实施例,可以通过在包括中央处理单元(CPU)、随机存取存储介质(RAM)、只读存储介质(ROM)等处理元件和存储元件的例如计算机的通用计算设备上运行能够执行如图2或图6中所示的相应方法所涉及的各步骤的计算机程序(包括程序代码),来构造如图7中所示的文字检测识别装置设备,以及来实现本申请实施例的文字检测识别方法。所述计算机程序可以记载于例如计算机可读记录介质上,并通过计算机可读记录介质装载于上述计算设备中,并在其中运行。
基于上述方法实施例和装置实施例的描述,本申请实施例还提供一种电子设备。请参见图8,该电子设备至少包括处理器801、输入设备802、输出设备803以及存储器804。其中,电子设备内的处理器801、输入设备802、输出设备803以及存储器804可通过总线或其他方式连接。
存储器804可以存储在电子设备的存储器中,所述存储器804用于存储计算机程序,所述计算机程序包括程序指令,所述处理器801用于执行所述存储器804存储的程序指令。处理器801(或称CPU(Central Processing Unit,中央处理器))是电子设备的计算核心以及控制核心,其适于实现一条或多条指令,具体适于加载并执行一条或多条指令从而实现相应方法流程或相应功能。
在一个实施例中,本申请实施例提供的电子设备的处理器801可以用于进行一系列文字检测识别处理:
对原始图像进行印章检测,得到印章区域;
采用原始图像的背景颜色的均值对印章区域进行填充,得到待检测图像;
对待检测图像进行文字检测,得到待识别文字区域图像;
对待识别文字区域图像进行表格框线检测,得到表格框线检测结果;
根据表格框线检测结果确定待识别文字区域图像的裁剪位置,以及基于裁剪位置对待识别文字区域图像进行裁剪,得到裁剪后的待识别文字区域图像;
基于裁剪后的待识别文字区域图像,得到文字识别结果。
可以看出,在图8所示的电子设备中,通过对原始图像进行印章检测,得到印章区域,采用原始图像的背景颜色的均值对印章区域进行填充,得到待检测图像,然后对待检测图像进行文字检测,得到待识别文字区域图像,对待识别文字区域图像进行表格框线检测,得到表格框线检测结果,根据表格框线检测结果确定待识别文字区域图像的裁剪位置,以及基于裁剪位置对待识别文字区域图像进行裁剪,得到裁剪后的待识别文字区域图像,基于裁剪后的待识别文字区域图像,得到文字识别结果。这样检测出印章区域后,采用原始图像的背景颜色的均值对印章区域进行填充,将印章区域变为背景区域,然后通过表格框线检测结果确定待识别文字区域图像的裁剪位置,基于裁剪位置对待识别文字区域图像进 行裁剪,以将待识别文字区域图像中上下侧和/或左右侧的空白裁剪掉,让表格框线位于图像的边缘,以降低表格框线对文字识别的干扰,从而有利于提升文字检测的精度,另外,由于这种消除干扰的处理方式相对比较精简,在一定程度上还有利于提升文字识别的效率。
再一个实施例中,处理器801执行对原始图像进行印章检测,得到印章区域,包括:
将原始图像转换为第一二值图像;
根据第一二值图像确定原始图像中的圆形轮廓;
根据原始图像确定圆形轮廓的色调;
根据圆形轮廓的色调,得到印章区域。
再一个实施例中,处理器801执行根据第一二值图像确定原始图像中的圆形轮廓,包括:
根据第一二值图像确定出原始图像中的轮廓;
计算轮廓围成图形的面积与轮廓的最小外接圆的面积之比,得到原始图像中多个轮廓的面积比;
将多个轮廓的面积比与预设的面积阈值进行比对,并将面积比大于或等于预设的面积阈值的轮廓确定为圆形轮廓。
再一个实施例中,处理器801执行对待识别文字区域图像进行表格框线检测,得到表格框线检测结果,包括:
将待识别文字区域图像转换为第二二值图像;
沿高度方向遍历第二二值图像的每列像素,对每列像素进行求和;
将每列像素的求和结果作为元素存入列表,得到长为w的第一列表,w为大于1的整数;
沿宽度方向遍历第二二值图像的每行像素,对每行像素进行求和;
将每行像素的求和结果作为元素存入列表,得到长为h的第二列表,h为大于1的整数;
根据第一列表和第二列表,得到表格框线检测结果。
再一个实施例中,表格框线检测结果包括存在竖向表格框线、存在横向表格框线和不存在表格框线;处理器801执行根据第一列表和第二列表,得到表格框线检测结果,包括:
计算第一列表中每个位置上的求和结果与相邻位置上的求和结果的第一差值,若第一差值中存在大于或等于第一预设值的目标第一差值,则确定表格框线检测结果为存在竖向表格框线;
计算第二列表中每个位置上的求和结果与相邻位置上的求和结果的第二差值,若第二差值中存在大于或等于第二预设值的目标第二差值,则确定表格框线检测结果为存在横向表格框线;
若第一差值中不存在目标第一差值且第二差值中不存在目标第二差值,则确定表格框 线检测结果为不存在表格框线。
再一个实施例中,处理器801执行根据表格框线检测结果确定待识别文字区域图像的裁剪位置,包括:
在表格框线检测结果为存在竖向表格框线和/或存在横向表格框线的情况下,根据竖向表格框线所在的列和/或横向表格框线所在的行确定裁剪位置;
在表格框线检测结果为不存在表格框线的情况下,根据第一列表和第二列表中首尾连续的0元素确定裁剪位置。
示例性的,电子设备包括但不仅限于处理器801、输入设备802、输出设备803以及存储器804。还可以包括内存、电源、应用客户端模块等。输入设备802可以是键盘、触摸屏、射频接收器等,输出设备803可以是扬声器、显示器、射频发送器等。本领域技术人员可以理解,所述示意图仅仅是电子设备的示例,并不构成对电子设备的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件。
需要说明的是,由于电子设备的处理器801执行计算机程序时实现上述的文字检测识别方法中的步骤,因此上述文字检测识别方法的实施例均适用于该电子设备,且均能达到相同或相似的有益效果。
本申请实施例还提供了一种计算机存储介质(Memory),所述计算机存储介质是电子设备中的记忆设备,用于存放程序和数据。可以理解的是,此处的计算机存储介质既可以包括终端中的内置存储介质,当然也可以包括终端所支持的扩展存储介质。计算机存储介质提供存储空间,该存储空间存储了终端的操作系统。并且,在该存储空间中还存放了适于被处理器801加载并执行的一条或多条的指令,这些指令可以是一个或一个以上的计算机程序(包括程序代码)。需要说明的是,此处的计算机存储介质可以是高速RAM存储器,也可以是非不稳定的存储器(non-volatile memory),例如至少一个磁盘存储器;可选的,还可以是至少一个位于远离前述处理器801的计算机存储介质。在一个实施例中,可由处理器801加载并执行计算机存储介质中存放的一条或多条指令,以实现上述有关文字检测识别方法的相应步骤。所述计算机可读存储介质可以是非易失性,也可以是易失性。
示例性的,计算机存储介质的计算机程序包括计算机程序代码,所述计算机程序代码可以为源代码形式、对象代码形式、可执行文件或某些中间形式等。所述计算机可读介质可以包括:能够携带所述计算机程序代码的任何实体或装置、记录介质、U盘、移动硬盘、磁碟、光盘、计算机存储器、只读存储器(ROM,Read-Only Memory)、随机存取存储器(RAM,Random Access Memory)、电载波信号、电信信号以及软件分发介质等。
需要说明的是,由于计算机存储介质的计算机程序被处理器执行时实现上述的文字检测识别方法中的步骤,因此上述文字检测识别方法的所有实施例均适用于该计算机存储介质,且均能达到相同或相似的有益效果。
以上对本申请实施例进行了详细介绍,本文中应用了具体个例对本申请的原理及实施 方式进行了阐述,以上实施例的说明只是用于帮助理解本申请的方法及其核心思想;同时,对于本领域的一般技术人员,依据本申请的思想,在具体实施方式及应用范围上均会有改变之处,综上所述,本说明书内容不应理解为对本申请的限制。

Claims (20)

  1. 一种文字检测识别方法,其中,所述方法包括:
    对原始图像进行印章检测,得到印章区域;
    采用所述原始图像的背景颜色的均值对所述印章区域进行填充,得到待检测图像;
    对所述待检测图像进行文字检测,得到待识别文字区域图像;
    对所述待识别文字区域图像进行表格框线检测,得到表格框线检测结果;
    根据所述表格框线检测结果确定所述待识别文字区域图像的裁剪位置,以及基于所述裁剪位置对所述待识别文字区域图像进行裁剪,得到裁剪后的待识别文字区域图像;
    基于所述裁剪后的待识别文字区域图像,得到文字识别结果。
  2. 根据权利要求1所述的方法,其中,所述对原始图像进行印章检测,得到印章区域,包括:
    将所述原始图像转换为第一二值图像;
    根据所述第一二值图像确定所述原始图像中的圆形轮廓;
    根据所述原始图像确定所述圆形轮廓的色调;
    根据所述圆形轮廓的色调,得到所述印章区域。
  3. 根据权利要求2所述的方法,其中,所述根据所述第一二值图像确定所述原始图像中的圆形轮廓,包括:
    根据所述第一二值图像确定出所述原始图像中的轮廓;
    计算所述轮廓围成图形的面积与所述轮廓的最小外接圆的面积之比,得到所述原始图像中多个轮廓的面积比;
    将所述多个轮廓的面积比与预设的面积阈值进行比对,并将面积比大于或等于预设的面积阈值的轮廓确定为所述圆形轮廓。
  4. 根据权利要求3所述的方法,其中,所述对所述待识别文字区域图像进行表格框线检测,得到表格框线检测结果,包括:
    将所述待识别文字区域图像转换为第二二值图像;
    沿高度方向遍历所述第二二值图像的每列像素,对所述每列像素进行求和;
    将所述每列像素的求和结果作为元素存入列表,得到长为w的第一列表,w为大于1的整数;
    沿宽度方向遍历所述第二二值图像的每行像素,对所述每行像素进行求和;
    将所述每行像素的求和结果作为元素存入列表,得到长为h的第二列表,h为大于1的整数;
    根据所述第一列表和所述第二列表,得到所述表格框线检测结果。
  5. 根据权利要求3所述的方法,其中,所述表格框线检测结果包括存在竖向表格框 线、存在横向表格框线和不存在表格框线;所述根据所述第一列表和所述第二列表,得到所述表格框线检测结果,包括:
    计算所述第一列表中每个位置上的求和结果与相邻位置上的求和结果的第一差值,若所述第一差值中存在大于或等于第一预设值的目标第一差值,则确定所述表格框线检测结果为存在竖向表格框线;
    计算所述第二列表中每个位置上的求和结果与相邻位置上的求和结果的第二差值,若所述第二差值中存在大于或等于第二预设值的目标第二差值,则确定所述表格框线检测结果为存在横向表格框线;
    若所述第一差值中不存在所述目标第一差值且所述第二差值中不存在所述目标第二差值,则确定所述表格框线检测结果为不存在表格框线。
  6. 根据权利要求4所述的方法,其中,所述根据所述表格框线检测结果确定所述待识别文字区域图像的裁剪位置,包括:
    在所述表格框线检测结果为存在竖向表格框线和/或存在横向表格框线的情况下,根据竖向表格框线所在的列和/或横向表格框线所在的行确定所述裁剪位置;
    在所述表格框线检测结果为不存在表格框线的情况下,根据所述第一列表和所述第二列表中首尾连续的0元素确定所述裁剪位置。
  7. 一种文字检测识别装置,其中,所述装置包括检测单元和识别单元;其中,
    所述检测单元,用于对原始图像进行印章检测,得到印章区域;
    所述识别单元,用于采用所述原始图像的背景颜色的均值对所述印章区域进行填充,得到待检测图像;
    所述检测单元,还用于对所述待检测图像进行文字检测,得到待识别文字区域图像;
    所述检测单元,还用于对所述待识别文字区域图像进行表格框线检测,得到表格框线检测结果;
    所述识别单元,还用于根据所述表格框线检测结果确定所述待识别文字区域图像的裁剪位置,以及基于所述裁剪位置对所述待识别文字区域图像进行裁剪,得到裁剪后的待识别文字区域图像;
    所述识别单元,还用于基于所述裁剪后的待识别文字区域图像,得到文字识别结果。
  8. 根据权利要求7所述的装置,其中,所述表格框线检测结果包括存在竖向表格框线、存在横向表格框线和不存在表格框线;在根据所述表格框线检测结果确定所述待识别文字区域图像的裁剪位置方面,所述识别单元具体用于:
    在所述表格框线检测结果为存在竖向表格框线和/或存在横向表格框线的情况下,根据竖向表格框线所在的列和/或横向表格框线所在的行确定所述裁剪位置;
    在所述表格框线检测结果为不存在表格框线的情况下,根据所述第一列表和所述第二列表中首尾连续的0元素确定所述裁剪位置。
  9. 一种电子设备,包括输入设备和输出设备,其中,还包括:
    处理器,适于实现一条或多条指令;以及,存储器,所述存储器存储有一条或多条计算机程序,所述一条或多条计算机程序适于由所述处理器加载并执行如下步骤:
    对原始图像进行印章检测,得到印章区域;
    采用原始图像的背景颜色的均值对印章区域进行填充,得到待检测图像;
    对待检测图像进行文字检测,得到待识别文字区域图像;
    对待识别文字区域图像进行表格框线检测,得到表格框线检测结果;
    根据表格框线检测结果确定待识别文字区域图像的裁剪位置,以及基于裁剪位置对待识别文字区域图像进行裁剪,得到裁剪后的待识别文字区域图像;
    基于裁剪后的待识别文字区域图像,得到文字识别结果。
  10. 根据权利要求9所述的电子设备,其中,所述对原始图像进行印章检测,得到印章区域,包括:
    将所述原始图像转换为第一二值图像;
    根据所述第一二值图像确定所述原始图像中的圆形轮廓;
    根据所述原始图像确定所述圆形轮廓的色调;
    根据所述圆形轮廓的色调,得到所述印章区域。
  11. 根据权利要求10所述的电子设备,其中,所述根据所述第一二值图像确定所述原始图像中的圆形轮廓,包括:
    根据所述第一二值图像确定出所述原始图像中的轮廓;
    计算所述轮廓围成图形的面积与所述轮廓的最小外接圆的面积之比,得到所述原始图像中多个轮廓的面积比;
    将所述多个轮廓的面积比与预设的面积阈值进行比对,并将面积比大于或等于预设的面积阈值的轮廓确定为所述圆形轮廓。
  12. 根据权利要求11所述的电子设备,其中,所述对所述待识别文字区域图像进行表格框线检测,得到表格框线检测结果,包括:
    将所述待识别文字区域图像转换为第二二值图像;
    沿高度方向遍历所述第二二值图像的每列像素,对所述每列像素进行求和;
    将所述每列像素的求和结果作为元素存入列表,得到长为w的第一列表,w为大于1的整数;
    沿宽度方向遍历所述第二二值图像的每行像素,对所述每行像素进行求和;
    将所述每行像素的求和结果作为元素存入列表,得到长为h的第二列表,h为大于1的整数;
    根据所述第一列表和所述第二列表,得到所述表格框线检测结果。
  13. 根据权利要求11所述的电子设备,其中,所述表格框线检测结果包括存在竖向 表格框线、存在横向表格框线和不存在表格框线;所述根据所述第一列表和所述第二列表,得到所述表格框线检测结果,包括:
    计算所述第一列表中每个位置上的求和结果与相邻位置上的求和结果的第一差值,若所述第一差值中存在大于或等于第一预设值的目标第一差值,则确定所述表格框线检测结果为存在竖向表格框线;
    计算所述第二列表中每个位置上的求和结果与相邻位置上的求和结果的第二差值,若所述第二差值中存在大于或等于第二预设值的目标第二差值,则确定所述表格框线检测结果为存在横向表格框线;
    若所述第一差值中不存在所述目标第一差值且所述第二差值中不存在所述目标第二差值,则确定所述表格框线检测结果为不存在表格框线。
  14. 根据权利要求12所述的电子设备,其中,所述根据所述表格框线检测结果确定所述待识别文字区域图像的裁剪位置,包括:
    在所述表格框线检测结果为存在竖向表格框线和/或存在横向表格框线的情况下,根据竖向表格框线所在的列和/或横向表格框线所在的行确定所述裁剪位置;
    在所述表格框线检测结果为不存在表格框线的情况下,根据所述第一列表和所述第二列表中首尾连续的0元素确定所述裁剪位置。
  15. 一种计算机存储介质,其中,所述计算机存储介质存储有一条或多条指令,所述一条或多条指令适于由处理器加载并执行如下步骤:
    对原始图像进行印章检测,得到印章区域;
    采用原始图像的背景颜色的均值对印章区域进行填充,得到待检测图像;
    对待检测图像进行文字检测,得到待识别文字区域图像;
    对待识别文字区域图像进行表格框线检测,得到表格框线检测结果;
    根据表格框线检测结果确定待识别文字区域图像的裁剪位置,以及基于裁剪位置对待识别文字区域图像进行裁剪,得到裁剪后的待识别文字区域图像;
    基于裁剪后的待识别文字区域图像,得到文字识别结果。
  16. 根据权利要求15所述的计算机存储介质,其中,所述对原始图像进行印章检测,得到印章区域,包括:
    将所述原始图像转换为第一二值图像;
    根据所述第一二值图像确定所述原始图像中的圆形轮廓;
    根据所述原始图像确定所述圆形轮廓的色调;
    根据所述圆形轮廓的色调,得到所述印章区域。
  17. 根据权利要求16所述的计算机存储介质,其中,所述根据所述第一二值图像确定所述原始图像中的圆形轮廓,包括:
    根据所述第一二值图像确定出所述原始图像中的轮廓;
    计算所述轮廓围成图形的面积与所述轮廓的最小外接圆的面积之比,得到所述原始图像中多个轮廓的面积比;
    将所述多个轮廓的面积比与预设的面积阈值进行比对,并将面积比大于或等于预设的面积阈值的轮廓确定为所述圆形轮廓。
  18. 根据权利要求17所述的计算机存储介质,其中,所述对所述待识别文字区域图像进行表格框线检测,得到表格框线检测结果,包括:
    将所述待识别文字区域图像转换为第二二值图像;
    沿高度方向遍历所述第二二值图像的每列像素,对所述每列像素进行求和;
    将所述每列像素的求和结果作为元素存入列表,得到长为w的第一列表,w为大于1的整数;
    沿宽度方向遍历所述第二二值图像的每行像素,对所述每行像素进行求和;
    将所述每行像素的求和结果作为元素存入列表,得到长为h的第二列表,h为大于1的整数;
    根据所述第一列表和所述第二列表,得到所述表格框线检测结果。
  19. 根据权利要求17所述的计算机存储介质,其中,所述表格框线检测结果包括存在竖向表格框线、存在横向表格框线和不存在表格框线;所述根据所述第一列表和所述第二列表,得到所述表格框线检测结果,包括:
    计算所述第一列表中每个位置上的求和结果与相邻位置上的求和结果的第一差值,若所述第一差值中存在大于或等于第一预设值的目标第一差值,则确定所述表格框线检测结果为存在竖向表格框线;
    计算所述第二列表中每个位置上的求和结果与相邻位置上的求和结果的第二差值,若所述第二差值中存在大于或等于第二预设值的目标第二差值,则确定所述表格框线检测结果为存在横向表格框线;
    若所述第一差值中不存在所述目标第一差值且所述第二差值中不存在所述目标第二差值,则确定所述表格框线检测结果为不存在表格框线。
  20. 根据权利要求18所述的计算机存储介质,其中,所述根据所述表格框线检测结果确定所述待识别文字区域图像的裁剪位置,包括:
    在所述表格框线检测结果为存在竖向表格框线和/或存在横向表格框线的情况下,根据竖向表格框线所在的列和/或横向表格框线所在的行确定所述裁剪位置;
    在所述表格框线检测结果为不存在表格框线的情况下,根据所述第一列表和所述第二列表中首尾连续的0元素确定所述裁剪位置。
PCT/CN2022/090193 2021-10-30 2022-04-29 文字检测识别方法、装置、电子设备及存储介质 WO2023071119A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111279385.6A CN113920295A (zh) 2021-10-30 2021-10-30 文字检测识别方法、装置、电子设备及存储介质
CN202111279385.6 2021-10-30

Publications (1)

Publication Number Publication Date
WO2023071119A1 true WO2023071119A1 (zh) 2023-05-04

Family

ID=79243884

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/090193 WO2023071119A1 (zh) 2021-10-30 2022-04-29 文字检测识别方法、装置、电子设备及存储介质

Country Status (2)

Country Link
CN (1) CN113920295A (zh)
WO (1) WO2023071119A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113920295A (zh) * 2021-10-30 2022-01-11 平安科技(深圳)有限公司 文字检测识别方法、装置、电子设备及存储介质
CN116311333B (zh) * 2023-02-21 2023-12-01 南京云阶电力科技有限公司 针对电气图纸中边缘细小文字识别的预处理方法及系统

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105988979A (zh) * 2015-02-16 2016-10-05 北京邮电大学 基于pdf文件的表格提取方法和装置
JP2017161969A (ja) * 2016-03-07 2017-09-14 日本電気株式会社 文字認識装置、方法およびプログラム
CN109670500A (zh) * 2018-11-30 2019-04-23 平安科技(深圳)有限公司 一种文字区域获取方法、装置、存储介质及终端设备
CN110610163A (zh) * 2019-09-18 2019-12-24 山东浪潮人工智能研究院有限公司 一种自然场景下基于椭圆拟合的表格提取方法及工具
CN111476109A (zh) * 2020-03-18 2020-07-31 深圳中兴网信科技有限公司 票据处理方法、票据处理装置和计算机可读存储介质
CN112528863A (zh) * 2020-12-14 2021-03-19 中国平安人寿保险股份有限公司 表格结构的识别方法、装置、电子设备及存储介质
CN113139445A (zh) * 2021-04-08 2021-07-20 招商银行股份有限公司 表格识别方法、设备及计算机可读存储介质
CN113920295A (zh) * 2021-10-30 2022-01-11 平安科技(深圳)有限公司 文字检测识别方法、装置、电子设备及存储介质


Also Published As

Publication number Publication date
CN113920295A (zh) 2022-01-11

Similar Documents

Publication Publication Date Title
WO2023071119A1 (zh) 文字检测识别方法、装置、电子设备及存储介质
US10896349B2 (en) Text detection method and apparatus, and storage medium
WO2017140233A1 (zh) 文字检测方法及系统、设备、存储介质
CN113313111B (zh) 文本识别方法、装置、设备和介质
WO2017121018A1 (zh) 二维码图像处理的方法和装置、终端、存储介质
RU2721188C2 (ru) Улучшение контраста и снижение шума на изображениях, полученных с камер
JP4494563B2 (ja) トークン化によるイメージ分割を用いたイメージ処理方法および装置
CN111353497A (zh) 一种身份证信息的识别方法和装置
CN105046254A (zh) 字符识别方法及装置
JP2020509436A (ja) システム言語切替方法およびシステム言語切替端末機器
CN112784835B (zh) 圆形印章的真实性识别方法、装置、电子设备及存储介质
CN112989995B (zh) 文本检测方法、装置及电子设备
CN111507324A (zh) 卡片边框识别方法、装置、设备和计算机存储介质
US20230196805A1 (en) Character detection method and apparatus , model training method and apparatus, device and storage medium
CN110895696A (zh) 一种图像信息提取方法和装置
CN110210467B (zh) 一种文本图像的公式定位方法、图像处理装置、存储介质
WO2021218183A1 (zh) 证件边沿检测方法、装置、设备及介质
CN113486881A (zh) 一种文本识别方法、装置、设备及介质
CN113627423A (zh) 圆形印章字符识别方法、装置、计算机设备和存储介质
CN116994272A (zh) 一种针对目标图片的识别方法和装置
CN115937039A (zh) 数据扩充方法、装置、电子设备及可读存储介质
CN116052202A (zh) 物流单据图像提取方法及其装置、设备、介质、产品
CN114267035A (zh) 一种文档图像处理方法、系统、电子设备及可读介质
CN111368572A (zh) 一种二维码的识别方法及系统
CN115188000A (zh) 基于ocr的文本识别方法、装置、存储介质及电子设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22885033

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE