CN112906532B - Image processing method and device, electronic equipment and storage medium - Google Patents


Publication number
CN112906532B
Authority
CN
China
Prior art keywords: line, region, frames, coordinate system, pixel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110169261.6A
Other languages
Chinese (zh)
Other versions
CN112906532A (en)
Inventor
徐青松
李青
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Ruisheng Software Co Ltd
Original Assignee
Hangzhou Ruisheng Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Ruisheng Software Co Ltd filed Critical Hangzhou Ruisheng Software Co Ltd
Priority to CN202110169261.6A priority Critical patent/CN112906532B/en
Publication of CN112906532A publication Critical patent/CN112906532A/en
Priority to PCT/CN2022/073988 priority patent/WO2022166707A1/en
Application granted granted Critical
Publication of CN112906532B publication Critical patent/CN112906532B/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/40 - Document-oriented image-based pattern recognition
    • G06V 30/41 - Analysis of document content
    • G06V 30/412 - Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 - Machine learning
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/20 - Image preprocessing
    • G06V 10/22 - Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V 10/267 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds

Abstract

An image processing method, an image processing apparatus, an electronic device, and a non-transitory computer-readable storage medium. The image processing method includes: acquiring an input image, where the input image includes a table area and the table area includes a plurality of object areas; performing region identification processing on the input image to obtain a plurality of object region frames in one-to-one correspondence with the object areas and a table region frame corresponding to the table area; performing table line detection processing on the input image to determine whether the table area includes a wired table; and, in response to the table area not including a wired table: performing alignment processing on the plurality of object region frames to obtain a plurality of region labeling frames in one-to-one correspondence with the object region frames; determining at least one dividing line based on the plurality of object region frames, and dividing the plurality of region labeling frames by the at least one dividing line to form a plurality of cells; and generating a cell table corresponding to the table area based on the plurality of cells.

Description

Image processing method and device, electronic equipment and storage medium
Technical Field
Embodiments of the present disclosure relate to an image processing method, an image processing apparatus, an electronic device, and a non-transitory computer-readable storage medium.
Background
Currently, users often photograph an object (for example, a business card, a test paper, a laboratory sheet, a document, etc.) and wish to process the captured image to obtain relevant information about the object in the image. Depending on practical requirements, in some cases a user wants the object-related information obtained from the image to be presented in the form of a table, so that the obtained information is more intuitive and standardized. Therefore, when an image is processed to obtain relevant information about an object in the image, a table also needs to be drawn based on the size, position, etc. of the area occupied by that information in the image, so that the information desired by the user can be presented in table form.
Disclosure of Invention
At least one embodiment of the present disclosure provides an image processing method, including: acquiring an input image, the input image including a table area, the table area including a plurality of object areas, each of the plurality of object areas including at least one object; performing region identification processing on the input image to obtain a plurality of object region frames in one-to-one correspondence with the plurality of object areas and a table region frame corresponding to the table area; performing table line detection processing on the input image to determine whether the table area includes a wired table; and in response to the table area not including a wired table: performing alignment processing on the plurality of object region frames to obtain a plurality of region labeling frames in one-to-one correspondence with the plurality of object region frames; determining at least one dividing line based on the plurality of object region frames, and dividing the plurality of region labeling frames by the at least one dividing line to form a plurality of cells; and generating a cell table corresponding to the table area based on the plurality of cells.
At least one embodiment of the present disclosure also provides an image processing apparatus, including: an image acquisition module, a region identification processing module, a table line detection processing module, and a cell table generation module. The image acquisition module is configured to acquire an input image, the input image including a table area, the table area including a plurality of object areas, each of the plurality of object areas including at least one object; the region identification processing module is configured to perform region identification processing on the input image to obtain a plurality of object region frames in one-to-one correspondence with the plurality of object areas and a table region frame corresponding to the table area; the table line detection processing module is configured to perform table line detection processing on the input image to determine whether the table area includes a wired table; and the cell table generation module is configured to, in response to the table area not including a wired table: perform alignment processing on the plurality of object region frames to obtain a plurality of region labeling frames in one-to-one correspondence with the plurality of object region frames; determine at least one dividing line based on the plurality of object region frames, and divide the plurality of region labeling frames by the at least one dividing line to form a plurality of cells; and generate a cell table corresponding to the table area based on the plurality of cells.
At least one embodiment of the present disclosure also provides an electronic device including a processor and a memory for storing computer readable instructions; the processor is configured to implement the steps of the method of any of the embodiments described above when executing the computer readable instructions.
At least one embodiment of the present disclosure also provides a non-transitory computer-readable storage medium for non-transitory storage of computer-readable instructions that, when executed by a processor, implement the steps of the method of any of the embodiments described above.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings of the embodiments are briefly described below. It is apparent that the drawings described below relate only to some embodiments of the present disclosure and are not intended to limit the present disclosure.
Fig. 1 is a flow chart of an image processing method according to at least one embodiment of the present disclosure;
FIG. 2A is a schematic illustration of an input image provided in accordance with at least one embodiment of the present disclosure;
FIGS. 2B-2H are schematic diagrams illustrating the image processing of the input image of FIG. 2A;
FIG. 3A is a schematic illustration of another input image provided in accordance with at least one embodiment of the present disclosure;
FIGS. 3B-3I are schematic diagrams illustrating the image processing of the input image of FIG. 3A;
fig. 4 is a flowchart of step S30 in an image processing method according to at least one embodiment of the present disclosure;
fig. 5 is a flowchart illustrating a partial operation of step S302 in an image processing method according to at least one embodiment of the present disclosure;
fig. 6 is a flowchart of step S3020 in an image processing method according to at least one embodiment of the present disclosure;
FIG. 7 is a flow chart of another image processing method according to at least one embodiment of the present disclosure;
fig. 8 is a flowchart of step S401 in an image processing method according to at least one embodiment of the present disclosure;
fig. 9 is a schematic flow chart of step S4012 in an image processing method according to at least one embodiment of the present disclosure;
fig. 10 is a partial flowchart of step S402 in an image processing method according to at least one embodiment of the present disclosure;
FIG. 11 is a flowchart of yet another image processing method according to at least one embodiment of the present disclosure;
FIG. 12 is a schematic block diagram of an image processing apparatus provided in at least one embodiment of the present disclosure;
FIG. 13 is a schematic diagram of an electronic device according to at least one embodiment of the present disclosure; and
fig. 14 is a schematic diagram of a non-transitory computer readable storage medium according to at least one embodiment of the present disclosure.
Detailed Description
For the purpose of making the objects, technical solutions, and advantages of the embodiments of the present disclosure more apparent, the technical solutions of the embodiments of the present disclosure will be described clearly and completely below with reference to the accompanying drawings of the embodiments. It is apparent that the described embodiments are some, but not all, of the embodiments of the present disclosure. Based on the described embodiments of the present disclosure, all other embodiments that can be obtained by one of ordinary skill in the art without inventive effort fall within the scope of the present disclosure.
Unless otherwise defined, technical or scientific terms used in this disclosure have the ordinary meaning understood by one of ordinary skill in the art to which this disclosure belongs. The terms "first," "second," and the like used in this disclosure do not denote any order, quantity, or importance, but are only used to distinguish one element from another. Likewise, the terms "a," "an," or "the" and similar terms do not denote a limitation of quantity, but rather denote the presence of at least one. Words such as "comprising" or "comprises" mean that the element or item preceding the word covers the elements or items listed after the word and their equivalents, without excluding other elements or items. Terms such as "connected" are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "Upper," "lower," "left," "right," and the like are used only to indicate relative positional relationships, which may change accordingly when the absolute position of the described object changes.
At least one embodiment of the present disclosure provides an image processing method, an image processing apparatus, an electronic device, and a non-transitory computer-readable storage medium. The image processing method includes: acquiring an input image, where the input image includes a table area, the table area includes a plurality of object areas, and each of the plurality of object areas includes at least one object; performing region identification processing on the input image to obtain a plurality of object region frames in one-to-one correspondence with the object areas and a table region frame corresponding to the table area; performing table line detection processing on the input image to determine whether the table area includes a wired table; and, in response to the table area not including a wired table: performing alignment processing on the plurality of object region frames to obtain a plurality of region labeling frames in one-to-one correspondence with the object region frames; determining at least one dividing line based on the plurality of object region frames, and dividing the plurality of region labeling frames by the at least one dividing line to form a plurality of cells; and generating a cell table corresponding to the table area based on the plurality of cells.
In the image processing method provided by the embodiments of the present disclosure, table line detection processing is performed on the input image, and when it is determined that the table area of the input image does not include a wired table, alignment processing is performed on the plurality of object region frames obtained by the region identification processing to obtain the corresponding region labeling frames, and the dividing lines are determined based on the plurality of object region frames. After the region labeling frames are divided by the dividing lines, a plurality of cells can be formed, and a cell table corresponding to the table area of the input image is generated based on these cells. In this way, after the objects in the object areas are filled into the cells of the cell table, an object table containing the relevant information of the objects in the input image can be generated, so that the relevant information of the objects in the acquired input image can be presented to the user in the form of a cell table in a more intuitive and standardized manner.
The image processing method provided by the embodiment of the present disclosure may be applied to the image processing apparatus provided by the embodiment of the present disclosure, which may be configured on an electronic device. The electronic device may be a personal computer, a mobile terminal, etc., and the mobile terminal may be a hardware device such as a mobile phone, a tablet computer, etc.
Hereinafter, embodiments of the present disclosure will be described in detail with reference to the accompanying drawings. It should be noted that the present disclosure is not limited to these specific embodiments.
Fig. 1 is a flowchart of an image processing method according to at least one embodiment of the present disclosure.
As shown in fig. 1, an image processing method provided in at least one embodiment of the present disclosure includes the following steps S10 to S40.
Step S10: an input image is acquired. For example, the input image includes a table area including a plurality of object areas, each object area of the plurality of object areas including at least one object.
Step S20: the input image is subjected to region identification processing to obtain a plurality of object region frames corresponding to the plurality of object regions one by one and a table region frame corresponding to the table region.
Step S30: the input image is subjected to a form line detection process to determine whether the form area includes a wired form.
Step S40: in response to the table area not including the wired table, the following steps S401 to S403 are performed.
Step S401: and performing alignment processing on the plurality of object region frames to obtain a plurality of region annotation frames corresponding to the plurality of object region frames one by one.
Step S402: at least one dividing line is determined based on the plurality of object region frames, and the plurality of region labeling frames are divided through the at least one dividing line to form a plurality of cells.
Step S403: based on the plurality of cells, a cell table corresponding to the table area is generated.
For step S10, the input image may be, for example, an image obtained by the user photographing a certain object, where the object may be, for example, a business card, a test paper, a laboratory sheet, a document, an invoice, etc. Correspondingly, the objects in the input image may be the text (Chinese and/or foreign text; printed and/or handwritten text), data, graphics, symbols, etc. contained in that photographed object.
For example, the shape of the input image may be a regular shape such as a rectangle or a square, or may be an irregular shape, and the shape, size, etc. of the input image may be set by the user according to the actual situation. For example, the input image may be an image captured by a digital camera, a mobile phone, or the like, and may be an original image directly captured by a digital camera, a mobile phone, or the like, or an image obtained by preprocessing the original image. For example, the input image may be a grayscale image, a color image, or the like.
For example, fig. 2A and 3A are examples of two input images, respectively. The input image shown in fig. 2A includes a table area 201, and the table area 201 includes a plurality of object areas 202, and each object area 202 includes at least one text or data. The input image shown in fig. 3A includes a table area 301, and the table area 301 includes a plurality of object areas 302, and each object area 302 includes at least one text or data. For example, in the object regions 202 and 302, the text and data are arranged in a row in the horizontal direction.
Note that, in the example shown in fig. 2A and 3A, the text and data included in the object region are arranged in one line in the horizontal direction, and in other examples of the present disclosure, the input image may also include a case where the text or data included in the object region are arranged in one line in the vertical direction or in a plurality of lines in the horizontal direction and the vertical direction, respectively, which is not limited by the embodiments of the present disclosure. In the examples shown in fig. 2A and 3A, the objects contained in the object region are text or data, while in other examples of the present disclosure, the objects contained in the object region may also include graphics, symbols, etc., as embodiments of the present disclosure are not limited in this regard.
For example, in the examples shown in fig. 2A and 3A, the form areas 201 and 301 and the object areas 202 and 302 to be recognized in the input image are rectangular in shape, and in other examples of the present disclosure, the form areas and the object areas to be recognized in the input image may also be other regular shapes such as diamond, square, or the like, or may also be irregular shapes or the like, as long as it is satisfied that the form areas can cover all the objects to be recognized and each object area can cover the corresponding objects to be recognized.
For example, for the input image shown in fig. 2A, the text "2019 annual report" located in the upper right corner may be divided into the table area 201, that is, the table area 201 includes the area occupied by the text "2019 annual report"; alternatively, the text "2019 annual report" may be divided outside the table area 201, that is, the table area 201 does not include the area occupied by the text "2019 annual report"; the embodiments of the present disclosure are not limited in this regard. For the input image shown in fig. 3A, the text "blood routine test" located at the top may be divided outside the table area 301, that is, the table area 301 does not include the area occupied by the text "blood routine test"; alternatively, the text "blood routine test" may also be divided into the table area 301, that is, the table area 301 also includes the area occupied by the text "blood routine test"; the embodiments of the present disclosure are not limited in this regard either.
For example, in some embodiments of the present disclosure, after the input image is acquired, each operation in the subsequent step may be further performed after the input image is preprocessed, so as to improve accuracy and reliability of each operation in the subsequent step. For example, the input image may be subjected to a correction process, which may include, for example, global correction and local correction, the global correction may correct, for example, a global offset condition of a text line, and since details may be left unadjusted after the global correction, some supplementary correction may be performed by the local correction for details that are ignored in the global correction process, thereby reducing or preventing loss of details due to the global correction, and improving accuracy and reliability of the obtained correction process result.
For step S20, for example, the table area and the plurality of object areas in the input image may be identified by a region identification model, which may be implemented using machine learning techniques and run, for example, on a general-purpose or special-purpose computing device. The region identification model may be, for example, a neural network model trained in advance, for example a deep convolutional neural network (Deep CNN) or another suitable neural network.
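As a non-authoritative illustration of step S20, the sketch below assumes a two-class detection model (table region / object region) that has already been trained; torchvision's generic Faster R-CNN interface is used only as a stand-in, since the disclosure only requires a pre-trained neural network such as a deep CNN, and the label ids and weights file name are hypothetical.

```python
# A minimal sketch of the region identification processing (step S20).
# Assumptions: a detection model fine-tuned for "table region" and "object
# region" classes; label ids and the weights file name are hypothetical.
import torch
import torchvision

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(num_classes=3)  # background + 2 classes
# model.load_state_dict(torch.load("region_model.pth"))  # hypothetical fine-tuned weights
model.eval()

def identify_regions(image_tensor, score_threshold=0.5):
    """Return (table_region_frames, object_region_frames) as [x1, y1, x2, y2] lists."""
    with torch.no_grad():
        out = model([image_tensor])[0]          # image_tensor: float CxHxW in [0, 1]
    keep = out["scores"] >= score_threshold
    boxes = out["boxes"][keep].tolist()
    labels = out["labels"][keep].tolist()
    table_frames = [b for b, l in zip(boxes, labels) if l == 1]   # assumed label id for table region
    object_frames = [b for b, l in zip(boxes, labels) if l == 2]  # assumed label id for object region
    return table_frames, object_frames
```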
For example, the specific shapes of the table area frame and the object area frame may be determined according to the specific shapes, sizes, and the like of the table area and the object area, respectively, the table area frame surrounds the table area and is capable of containing all objects located in the table area therein, and the object area frame surrounds the corresponding object area and is capable of containing all objects located in the object area therein. For example, the distance between the border of the object region frame and the object located at the edge of the object region may approach 0 so that the shape of the object region frame is closer to the actual shape of the object region. For example, the distance between the border of the form region frame and objects located at the edges of the form region may be adaptively increased as compared to the object region frame, such that the form region frame may contain all objects therein.
For example, taking the case of performing the region recognition processing on the input image shown in fig. 2A and 3A, as shown in fig. 2B, after performing the region recognition processing on the input image shown in fig. 2A, a table region frame 210 corresponding to the table region 201 and a plurality of object region frames 220 corresponding to the plurality of object regions 202 one by one can be obtained; as shown in fig. 3B, after the area recognition processing is performed on the input image shown in fig. 3A, a table area frame 310 corresponding to the table area 301 and a plurality of object area frames 320 corresponding to the plurality of object areas 302 one by one can be obtained.
For example, in order to facilitate subsequent operations, in embodiments provided by the present disclosure, the shape of the object region frame may be set to a regular shape such as a rectangle, a square, or the like, for example, so that subsequent alignment processing operations are performed on the plurality of object region frames in response to the table region not including the wired table.
In the embodiments of the present disclosure, "the shape of the table region" and "the shape of the object region" represent the general shape of the table region or the object region, and similarly, "the shape of the table region frame" and "the shape of the object region frame" represent the general shape of the table region frame or the object region frame.
For step S30, for example, table line detection processing may be performed on the input image based on an edge detection algorithm to identify table line segments in the input image, and whether the table area of the input image includes a wired table is then determined according to the result of this table line segment identification.
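As a non-authoritative sketch of this entry point into step S30, the following assumes an edge detection plus probabilistic Hough transform pipeline implemented with OpenCV; the disclosure itself only requires an edge detection algorithm, and all thresholds here are illustrative assumptions.

```python
# A minimal sketch of detecting candidate table line segments (step S30 / S3011),
# assuming an edge-detection based approach; thresholds are illustrative only.
import cv2
import numpy as np

def detect_line_segments(image_bgr):
    """Return candidate line segments as (x1, y1, x2, y2) tuples in pixel coordinates."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=60,
                            minLineLength=30, maxLineGap=5)
    if lines is None:
        # No table line segments at all: per step S301 the table area is treated
        # as not containing a wired table.
        return []
    return [tuple(int(v) for v in l[0]) for l in lines]
```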
Fig. 4 is a flowchart of step S30 in an image processing method according to at least one embodiment of the present disclosure.
For example, as shown in fig. 4, step S30 may include the following steps S301 to S302.
Step S301: in a case where the table line detection processing is performed on the input image and it is detected that the input image does not have a table line segment, it is determined that the table area does not include a wired table.
Step S302: in the case where the input image is subjected to the table line detection processing and one or more table line segments are obtained, it is determined whether or not the table area includes a wired table based on the one or more table line segments.
For step S301, in the case where it is determined from the result of the table line detection processing that there is no table line segment in the input image, it may be determined that the table area of the input image does not include a wired table, whereby the operation of step S40 is performed in response to the table area of the input image not including a wired table.
For step S302, in the case where at least one table line segment is obtained from the result of the table line detection processing, it is necessary to further determine, based on the obtained table line segments, whether the table area of the input image includes a wired table.
The specific operation procedure in step S302 will be described below taking the example of performing the table line detection processing on the input image shown in fig. 2A.
Fig. 5 is a flowchart illustrating a partial operation of step S302 in an image processing method according to at least one embodiment of the present disclosure.
For example, as shown in fig. 5, performing the table line detection processing on the input image in step S302 to obtain one or more table line segments may include the following steps S3011 to S3016.
Step S3011: and performing line segment detection on the input image to obtain a plurality of detection line segments.
Step S3012: and combining the plurality of detection line segments to redraw to obtain a plurality of first intermediate table line segments.
Step S3013: and respectively performing expansion processing on the plurality of first middle table line segments to obtain a plurality of second middle table line segments.
Step S3014: deleting a second middle table line segment which is positioned in any object area frame in the object area frames, and taking the rest second middle table line segments as a plurality of third middle table line segments.
Step S3015: and combining the plurality of third intermediate table line segments to obtain a plurality of fourth intermediate table line segments.
Step S3016: and respectively performing expansion treatment on the plurality of fourth middle table line segments to obtain one or more fifth middle table line segments, and taking the one or more fifth middle table line segments as one or more table line segments.
For step S3011, for example, taking the input image shown in fig. 2A as an example, as shown in fig. 2C, after the line segment detection is performed on the input image shown in fig. 2A, a plurality of detection line segments L0 may be obtained, whereby operations such as merging processing, expanding processing, and the like in the subsequent steps may be performed based on the detected plurality of detection line segments L0 to obtain corresponding table line segments, so that whether the table area 201 of the input image shown in fig. 2A includes a wired table is determined based on the obtained table line segments.
For step S3012, the merging process includes: and for the first line segment to be merged and the second line segment to be merged, merging the first line segment to be merged and the second line segment to be merged in response to the difference between the slope of the first line segment to be merged and the slope of the second line segment to be merged being smaller than a slope threshold and the distance between the end point of the first line segment to be merged, which is close to the second line segment to be merged, and the end point of the second line segment to be merged, which is close to the first line segment to be merged, being smaller than or equal to a distance threshold. For example, the first line segment to be merged and the second line segment to be merged are any two detected line segments of the plurality of detected line segments.
For example, with respect to a plurality of detection line segments L0 detected based on the input image shown in fig. 2A, any two detection line segments L0 among the plurality of detection line segments L0 are taken as a first line segment to be merged and a second line segment to be merged, it is determined whether the any two detection line segments L0 satisfy the condition for performing the merging process, that is, whether the difference between the slopes of the any two detection line segments L0 is smaller than a slope threshold value and whether the distance between the end points of the any two detection line segments L0 adjacent to each other is smaller than or equal to a distance threshold value, and in the case that the condition for the merging process is satisfied, the any two detection line segments L0 are merged to obtain a first intermediate table line segment L1.
For example, the slope threshold may range from 0° to 10°, and the distance threshold may be a value in units of pixels, for example in the range of 0 to 10 pixels, thereby improving the accuracy and reliability of the table line segments obtained from the detected line segments.
For example, take the detected line segments L11 and L12 located in the region RN1 in fig. 2C as an example. As shown in fig. 2D, the detected line segments L11 and L12 serve as the first line segment to be merged and the second line segment to be merged, respectively. The difference between the slope of the first line segment to be merged L11 and the slope of the second line segment to be merged L12 approaches zero, so this difference can be determined to be smaller than the slope threshold; in addition, the distance between the end point D11 of the first line segment to be merged L11 close to the second line segment to be merged L12 and the end point D12 of the second line segment to be merged L12 close to the first line segment to be merged L11 is smaller than or equal to the distance threshold. Therefore, the first line segment to be merged L11 and the second line segment to be merged L12 can be merged to obtain a first intermediate table line segment L1. In this way, after the merging process is performed on the detected line segments L0 in fig. 2C, a plurality of first intermediate table line segments L1 can be obtained accordingly.
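The merging condition of steps S3012 and S3015 can be sketched as follows; this is only an illustrative reading of the slope and endpoint-distance criteria, with thresholds taken from the ranges mentioned above (0° to 10° and 0 to 10 pixels), not a definitive implementation.

```python
# A sketch of the merging process for two line segments to be merged
# (steps S3012 and S3015); threshold values are illustrative assumptions.
import math

SLOPE_THRESHOLD_DEG = 10.0    # within the 0-10 degree range mentioned above
DISTANCE_THRESHOLD_PX = 10.0  # within the 0-10 pixel range mentioned above

def _angle_deg(seg):
    x1, y1, x2, y2 = seg
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0

def try_merge(seg_a, seg_b):
    """Merge seg_a and seg_b into one redrawn segment if the slope difference is
    below the slope threshold and their nearest endpoints are within the distance
    threshold; return None if the two segments cannot be merged."""
    d = abs(_angle_deg(seg_a) - _angle_deg(seg_b))
    if min(d, 180.0 - d) >= SLOPE_THRESHOLD_DEG:
        return None
    pts = [(seg_a[0], seg_a[1]), (seg_a[2], seg_a[3]),
           (seg_b[0], seg_b[1]), (seg_b[2], seg_b[3])]
    nearest = min(math.dist(p, q) for p in pts[:2] for q in pts[2:])
    if nearest > DISTANCE_THRESHOLD_PX:
        return None
    # Redraw the merged segment between the two outermost endpoints.
    p, q = max(((p, q) for p in pts for q in pts), key=lambda pq: math.dist(*pq))
    return (p[0], p[1], q[0], q[1])
```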
For step S3013, the obtained plurality of first intermediate table segments L1 are respectively subjected to expansion processing to obtain a plurality of second intermediate table segments L2. For example, the width of the expanded second intermediate table segment L2 may be 1 to 4 times that of the corresponding first intermediate table segment L1, so as to facilitate the merging processing operation in the subsequent step.
For step S3014, the second intermediate table line segments L2 located within any of the object region frames 220 are deleted, and the remaining second intermediate table line segments L2 are taken as the plurality of third intermediate table line segments L3. For example, in step S3014, if a second intermediate table line segment L2 is located entirely within one object region frame 220, that is, it does not pass through the object region frame 220, the second intermediate table line segment L2 is deleted. In this way, detected line segments that originate, for example, from the text or data shown in fig. 2C can be removed, which further improves the accuracy and reliability of the subsequently obtained table line segments. For example, after step S3014, the plurality of third intermediate table line segments L3 shown in fig. 2E can be obtained.
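Step S3014 can be read as a simple containment filter; the sketch below assumes segments as (x1, y1, x2, y2) tuples and object region frames as (left, top, right, bottom) boxes in the same pixel coordinate system.

```python
# A sketch of step S3014: drop second intermediate table line segments that lie
# entirely inside a single object region frame (they typically come from strokes
# of text or data rather than from table lines).
def filter_segments_inside_frames(segments, object_frames):
    def inside(seg, frame):
        x1, y1, x2, y2 = seg
        left, top, right, bottom = frame
        return (left <= x1 <= right and left <= x2 <= right and
                top <= y1 <= bottom and top <= y2 <= bottom)
    return [s for s in segments
            if not any(inside(s, f) for f in object_frames)]
```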
For step S3015 and step S3016, after the plurality of third intermediate table line segments L3 shown in fig. 2E are obtained, the merging process and the expansion process of step S3012 and step S3013 are repeated on the plurality of third intermediate table line segments L3 to obtain the plurality of fifth intermediate table line segments shown in fig. 2F, and the fifth intermediate table line segments in fig. 2F are taken as the table line segments TL. This improves the accuracy and reliability of the obtained table line segments TL, and therefore also the accuracy and reliability of the process of judging, based on the table line segments TL, whether the table area of the input image includes a wired table.
For example, the merging process in step S3015 includes: and for the first line segment to be merged and the second line segment to be merged, merging the first line segment to be merged and the second line segment to be merged in response to the difference between the slope of the first line segment to be merged and the slope of the second line segment to be merged being smaller than a slope threshold and the distance between the end point of the first line segment to be merged, which is close to the second line segment to be merged, and the end point of the second line segment to be merged, which is close to the first line segment to be merged, being smaller than or equal to a distance threshold. For example, the first segment to be merged and the second segment to be merged are any two third intermediate table segments of the plurality of third intermediate table segments.
For the operation procedures of step S3015 and step S3016, reference may be made to the description of the operation procedures of step S3012 and step S3013, which are not repeated here.
Thus, according to the result of the table line detection processing on the input image, it is possible to determine whether to directly perform step S40 shown in fig. 1, or whether to further determine, based on the obtained table line segments, whether the table area of the input image includes a wired table.
For example, after performing the table line detection processing on the input image and obtaining at least one table line segment, the above-described determination in step S302 as to whether the table area includes a wired table based on one or more table line segments may include the following steps S3019 to S3022.
In response to obtaining a single table line segment:
Step S3019: It is determined that the table area does not include a wired table.
In response to obtaining a plurality of table line segments:
Step S3020: The intersection points between the plurality of table line segments are determined.
Step S3021: In response to the number of intersection points being greater than or equal to a second reference value, it is determined that the table area includes a wired table.
Step S3022: In response to the number of intersection points being less than the second reference value, it is determined that the table area does not include a wired table.
For step S3019, in the case where the table line detection processing is performed on the input image and only one table line segment is detected, the table area of the input image can be determined not to include a wired table, since a single table line segment cannot form a complete table structure, and the operation of step S40 shown in fig. 1 is performed.
For steps S3020 to S3022, in the case where the input image is subjected to the table line detection processing and it is detected that the input image has a plurality of table line segments, it is necessary to further determine whether a complete table structure can be formed between the plurality of table line segments based on the plurality of table line segments to determine whether the table area of the input image includes a wired table. For example, in steps S3020 to S3022, it is determined whether a complete table structure can be formed between a plurality of table line segments by determining the number of intersections based on the plurality of table line segments, thereby further determining whether the table area of the input image includes a wired table.
For example, as shown in fig. 6, the intersection points between the plurality of table line segments in step S3020 may be determined by the following steps S3020A to S3020D.
Step S3020A: the plurality of table line segments is divided into a plurality of first table line segments and a plurality of second table line segments.
Step S3020B: the method comprises the steps of dividing a plurality of first table line segments into a plurality of first line segment rows and marking row numbers of first line segment rows to which each first table line segment in the plurality of first table line segments belongs. For example, each first line segment row includes at least one first table line segment arranged along a third direction.
Step S3020C: the plurality of second table line segments are divided into a plurality of second line segment columns and column numbers of the second line segment columns to which each of the plurality of second table line segments belongs are marked. For example, each second line segment column includes at least one second table line segment arranged along the fourth direction.
Step S3030D: a plurality of intersection points between the plurality of first table line segments and the plurality of second table line segments are identified and coordinates of the plurality of intersection points are determined. For example, the coordinates of any of the plurality of intersections include a row number corresponding to a first table segment and a column number corresponding to a second table segment that intersect to constitute any of the intersections.
For example, in step S3020A, the included angle between each first table line segment and the third direction is in the first angle range, the included angle between each first table line segment and the fourth direction is in the second angle range, the included angle between each second table line segment and the third direction is in the second angle range, the included angle between each second table line segment and the fourth direction is in the first angle range, and the third direction and the fourth direction are perpendicular to each other.
For example, taking the plurality of table line segments TL shown in fig. 2F as an example, as shown in fig. 2G, the third direction R3 may be the horizontal direction shown in fig. 2G, and the fourth direction R4 may be the vertical direction shown in fig. 2G. For example, the first angle range may be 0° to 45°, and the second angle range may be 45° to 90°, whereby the plurality of table line segments TL can be divided into a plurality of first table line segments TL1 and a plurality of second table line segments TL2. Further, the plurality of first table line segments TL1 are divided into a plurality of first line segment rows along the fourth direction R4, and the row number of the first line segment row to which each first table line segment TL1 belongs is marked; for example, the plurality of first line segment rows include the 1st line segment row to the 43rd line segment row as shown in fig. 2G. The plurality of second table line segments TL2 are divided into a plurality of second line segment columns along the third direction R3, which include, for example, the 1st line segment column to the 5th line segment column as shown in fig. 2G, and the column number of the second line segment column to which each second table line segment TL2 belongs is marked. Thus, the coordinates of each intersection point N1 shown in fig. 2G can be obtained from the row number corresponding to the first table line segment TL1 and the column number corresponding to the second table line segment TL2 that form that intersection point N1.
For example, after the coordinates of each intersection point N1 are determined, steps S3021 and S3022 are performed based on the number of intersection points N1 to determine whether a wired table is included in the table area of the input image.
For example, the second reference value in steps S3021 and S3022 may be a larger value of the number of the plurality of first line segment rows and the number of the plurality of second line segment columns. For example, taking the case shown in fig. 2G as an example, the number of the first line segment rows is 43, and the number of the second line segment columns is 5, the second reference value is 43. Thus, it can be judged whether the table area of the input image includes a wired table according to the magnitude relation between the number of intersections and the second reference value.
For example, taking the case shown in fig. 2G as an example, the number of intersection points N1 is 215, which is larger than the second reference value 43, it can be determined that the table area 201 of the input image shown in fig. 2A includes a wired table.
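Putting steps S3019 to S3022 together, the decision can be sketched as below. For brevity the sketch assumes the table line segments are roughly axis-aligned after merging and expansion, approximates each first line segment row and second line segment column by a single segment, and takes the second reference value as the larger of the two group sizes; these simplifications are assumptions of the sketch, not part of the disclosure.

```python
# A sketch of the wired-table decision (steps S3019-S3022), assuming roughly
# axis-aligned table line segments given as (x1, y1, x2, y2) tuples.
import math

def has_wired_table(table_segments):
    if len(table_segments) <= 1:
        return False                              # step S3019
    def angle(seg):
        x1, y1, x2, y2 = seg
        return abs(math.degrees(math.atan2(y2 - y1, x2 - x1))) % 180.0
    first  = [s for s in table_segments if min(angle(s), 180 - angle(s)) <= 45]  # row-like
    second = [s for s in table_segments if min(angle(s), 180 - angle(s)) > 45]   # column-like
    intersections = 0                             # step S3020 (simplified)
    for hx1, hy1, hx2, hy2 in first:
        hy = (hy1 + hy2) / 2.0                    # treat the row segment as horizontal
        for vx1, vy1, vx2, vy2 in second:
            vx = (vx1 + vx2) / 2.0                # treat the column segment as vertical
            if min(hx1, hx2) <= vx <= max(hx1, hx2) and min(vy1, vy2) <= hy <= max(vy1, vy2):
                intersections += 1
    # Second reference value: the larger of the row count and the column count
    # (approximated here by the sizes of the two groups).
    second_reference = max(len(first), len(second))
    return intersections >= second_reference      # steps S3021 / S3022
```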
For example, after the input image shown in fig. 3A is processed in step S30 described above, it is determined that the table area 301 of the input image shown in fig. 3A does not include a wired table. Accordingly, in response to the table area 301 of the input image shown in fig. 3A not including the wired table, the above-described step S40 is performed to generate a cell table corresponding to the table area 301 of the input image shown in fig. 3A; in response to the table area 201 of the input image shown in fig. 2A including a wired table, the following step S50 is performed to generate a cell table corresponding to the table area 201 of the input image shown in fig. 2A.
Fig. 7 is a flowchart of another image processing method according to at least one embodiment of the present disclosure. It should be noted that, except for the step S50, the steps S10 to S30 shown in fig. 7 are substantially the same as the steps S10 to S30 shown in fig. 1, and the repetition is not repeated.
For example, as shown in fig. 7, in response to the form area including a wired form, the image processing method provided by the embodiment of the present disclosure further includes the following step S50.
Step S50: a cell table corresponding to the table region is generated based on the plurality of table line segments.
For example, taking the input image shown in fig. 2A as an example, after determining that the table area 201 of the input image shown in fig. 2A includes a wired table through step S30, a corresponding cell table may be generated based on the plurality of table line segments TL1 and TL2 shown in fig. 2G.
For example, in some embodiments of the present disclosure, step S50 may include the following step S501.
Step S501: each cell in the cell table is determined based on the plurality of intersections. For example, the vertex of each cell in the cell table is made up of at least three of the plurality of intersections.
For example, the obtained intersection is used as the vertex of each cell in the determination cell table, and each cell in the determination cell table is determined based on the coordinates of the intersection. For example, the cells may take the form of rectangles, squares, etc., so that one cell may be determined by three or more intersections, and a corresponding cell table may be generated by constructing a table structure from a plurality of cells.
For example, in some embodiments of the present disclosure, step S501 may include the following steps S5011 to S5014.
Step S5011: the current intersection point is determined. For example, the current intersection point is any one of a plurality of intersection points.
Step S5012: and determining a first current table line segment and a second current table line segment corresponding to the current intersection point based on the coordinates of the current intersection point. For example, the first current table segment is any one first table segment, and the second current table segment is any one second table segment.
Step S5013: a first intersection point on the first current table segment adjacent to the current intersection point is determined, and a second intersection point on the second current table segment adjacent to the current intersection point is determined.
Step S5014: a cell is determined based on the current intersection, the first intersection, and the second intersection.
Thus, from the table line segments on which the current intersection point lies, the first intersection point and the second intersection point adjacent to the current intersection point, for example in the horizontal direction and the vertical direction, can be determined, so that one cell is formed based on the determined intersection points and the cell table presented in the form of a table structure is generated.
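A compact reading of steps S5011 to S5014 is sketched below, with intersection points already expressed as (row number, column number) coordinates as in step S3020D; how further corners of a cell are validated is left open in this sketch.

```python
# A sketch of forming cells from grid intersection points (steps S5011-S5014).
# Each intersection point is a (row_number, column_number) pair.
def build_cells(intersection_points):
    points = set(intersection_points)
    cells = []
    for (r, c) in sorted(points):                         # current intersection point
        cols_right = sorted(cc for (rr, cc) in points if rr == r and cc > c)
        rows_below = sorted(rr for (rr, cc) in points if cc == c and rr > r)
        if cols_right and rows_below:
            first_point  = (r, cols_right[0])             # adjacent point on the first current table line segment
            second_point = (rows_below[0], c)             # adjacent point on the second current table line segment
            cells.append(((r, c), first_point, second_point))
    return cells
```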
For the case where the table area of the input image does not include a wired table, the above-described step S40 is performed, whereby a cell table corresponding to the table area is generated based on the object area frame identified in the input image.
Fig. 8 is a flowchart of step S401 in an image processing method according to at least one embodiment of the present disclosure.
For example, as shown in fig. 8, step S401 includes the following steps S4011 to S4013.
Step S4011: and dividing the table region frame into a plurality of coordinate grid regions which are arranged in M rows and N columns along the first direction and the second direction by taking the datum reference value as a coordinate unit so as to establish a table coordinate system. For example, M rows of grid regions are arranged in a first direction, N columns of grid regions are arranged in a second direction, and M and N are positive integers.
Step S4012: coordinates of the plurality of object region frames in the table coordinate system are determined.
Step S4013: and expanding the plurality of object region frames based on the coordinates of the plurality of object region frames in the table coordinate system to obtain a plurality of region labeling frames.
For example, taking the object region frames 320 located in the region RN2 of the input image shown in fig. 3A and 3B as an example, as shown in fig. 3C, after the table region frame 310 is divided into a plurality of coordinate grid regions 311 arranged in a plurality of rows and columns along the first direction R1 and the second direction R2, the coordinates of each object region frame 320 in the table coordinate system are determined; for example, the row numbers and column numbers of the coordinate grid regions 311 corresponding to the respective sides of each object region frame 320 in the table coordinate system are determined. The plurality of object region frames 320 are then expanded based on their coordinates in the table coordinate system to obtain the region labeling frames corresponding to the object region frames 320.
For example, the base reference value in step S4011 may be determined according to the average height of the plurality of object region frames in the first direction. In this way, the relative positions of the object region frames can be determined accurately based on the generated table coordinate system, which facilitates the subsequent alignment processing of the object region frames, based on their relative positions, to determine the region labeling frames.
For example, taking text or data as the objects contained in the object regions 302 of the input image shown in fig. 3A and 3B, the table region frame 310 may be divided into a plurality of coordinate grid regions 311 along the first direction R1 and the second direction R2 with half the text height or data height as the base reference value, thereby forming, based on the table region frame 310, a high-density table coordinate system whose row and column pitch is half the text height or data height. The relative positions between the object region frames 320 can thus be determined more accurately based on the generated table coordinate system.
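The construction of the table coordinate system in steps S4011 and S4012 can be sketched as follows; frames are assumed to be (left, top, right, bottom) pixel boxes, the base reference value is taken as half the average object frame height as suggested above, and the rounding convention is an assumption of the sketch.

```python
# A sketch of steps S4011/S4012: build a table coordinate system whose grid pitch
# is the base reference value, then express each object region frame in it.
def build_table_coordinates(table_frame, object_frames):
    avg_height = sum(b - t for (_, t, _, b) in object_frames) / len(object_frames)
    base = max(1.0, avg_height / 2.0)           # base reference value (half average height)
    left, top, right, bottom = table_frame
    grid_coords = []
    for (x1, y1, x2, y2) in object_frames:
        start_row = int((y1 - top) // base)     # first start coordinate
        end_row   = int((y2 - top) // base)     # first end coordinate
        start_col = int((x1 - left) // base)    # second start coordinate
        end_col   = int((x2 - left) // base)    # second end coordinate
        grid_coords.append((start_row, end_row, start_col, end_col))
    max_row = int((bottom - top) // base)       # maximum row value of the table coordinate system
    max_col = int((right - left) // base)       # maximum column value of the table coordinate system
    return base, grid_coords, max_row, max_col
```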
Fig. 9 is a flowchart of step S4012 in an image processing method according to at least one embodiment of the present disclosure. As shown in fig. 9, step S4012 includes the following steps S4012A to S4012C.
Step S4012A: a plurality of slopes of a plurality of object region boxes is determined. For example, the slope of each of the plurality of object region boxes represents the slope of the side of each object region box extending in the second direction relative to the second direction.
Step S4012B: and performing correction processing on the input image according to a plurality of slopes of a plurality of object region frames to obtain a corrected input image.
Step S4012C: coordinates of the plurality of object region frames in the table coordinate system are determined based on the corrected input image.
Therefore, before the coordinates of the plurality of object region frames in the table coordinate system are determined, the input image can be corrected according to the slope, relative to the second direction R2, of the side of each object region frame extending along the second direction R2; for example, the rotation angle of the input image in the plane formed by the first direction R1 and the second direction R2 is adjusted. This achieves a global correction of the input image, improves the global offset of the text lines in the input image, and improves the accuracy and reliability of the determined coordinates of the object region frames in the table coordinate system, so that the relative positions between the object region frames can be determined accurately based on the table coordinate system.
For example, in some examples, step S4012B described above may include the following steps S4012D and S4012E.
Step S4012D: an average of the plurality of slopes is calculated based on the plurality of slopes of the plurality of object region boxes.
Step S4012E: the input image is rotated in a plane constituted by the first direction and the second direction based on an average value of the plurality of slopes so that the average value of the plurality of slopes approaches 0.
By performing the rotation processing on the input image in the plane formed by the first direction and the second direction, the inclination angles of the plurality of object region frames relative to the first direction or the second direction in this plane can be kept relatively consistent, for example within a certain angle range. This improves situations such as line offset that may occur in the objects contained in the object regions as a whole, and achieves the global correction of the input image.
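Steps S4012D and S4012E amount to a global deskew; the sketch below assumes the per-frame inclination angles (in degrees) have already been measured from the object region frames, and uses OpenCV's rotation helpers purely for illustration.

```python
# A sketch of the global correction (steps S4012D/S4012E): rotate the input image
# so that the average slope of the object region frames approaches 0.
import cv2

def deskew_by_average_slope(image, frame_angles_deg):
    mean_angle = sum(frame_angles_deg) / len(frame_angles_deg)
    h, w = image.shape[:2]
    rotation = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), mean_angle, 1.0)
    return cv2.warpAffine(image, rotation, (w, h),
                          flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_REPLICATE)
```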
In other examples of the present disclosure, with respect to the above-described step S4012A and step S4012B, the correction processing may be performed on the input image according to the slope of the side of each object region frame extending in the first direction R1 with respect to the first direction R1, and the embodiment of the present disclosure is not limited thereto.
In some examples of the present disclosure, step S4013 may include the following steps S4013A to S4013D.
Step S4013A: first start coordinates and first end coordinates of the plurality of object region frames in the first direction and second start coordinates and second end coordinates in the second direction in the table coordinate system are determined. For example, the first start coordinate of any one of the plurality of object region frames includes the coordinates of the start row of the grid region occupied by any one of the object region frames in the table coordinate system, the first end coordinate of any one of the object region frames includes the coordinates of the end row of the grid region occupied by any one of the object region frames in the table coordinate system, the second start coordinate of any one of the object region frames includes the coordinates of the start column of the grid region occupied by any one of the object region frames in the table coordinate system, and the second end coordinate of any one of the object region frames includes the coordinates of the end column of the grid region occupied by any one of the object region frames in the table coordinate system.
Step S4013B: dividing the plurality of object region frames into a plurality of rows and a plurality of columns, performing row-by-row expansion processing on the plurality of object region frames according to the direction from a start row to a stop row in the table coordinate system, and sequentially performing expansion processing on each row of object region frames according to the direction from the start column to the stop column in the table coordinate system.
For the i-th object region box of the plurality of object region boxes, for example, i is a positive integer:
step S4013C: performing expansion processing on the ith object area frame in a first direction, so that the starting line of the coordinate grid area occupied by the ith object area frame moves the basic reference value in the first direction along the direction away from the ending line of the coordinate grid area occupied by the ith object area frame each time, so that the ending line of the coordinate grid area occupied by the ith object area frame moves the basic reference value in the first direction along the direction away from the starting line of the coordinate grid area occupied by the ith object area frame each time until the first starting coordinate of the ith object area frame is equal to 0 or equal to the first ending coordinate of any object area frame except for the ith object area frame in the plurality of object area frames, and the first ending coordinate of the ith object area frame is equal to the maximum line value of a table coordinate system or equal to the first starting coordinate of any object area frame except for the ith object area frame in the plurality of object area frames.
Step S4013D: and expanding the ith object area frame in a second direction, so that the starting column of the coordinate grid area occupied by the ith object area frame moves the basic reference value in the second direction along the direction away from the ending column of the coordinate grid area occupied by the ith object area frame each time, so that the ending column of the coordinate grid area occupied by the ith object area frame moves the basic reference value in the second direction along the direction away from the starting column of the coordinate grid area occupied by the ith object area frame each time until the second starting coordinate of the ith object area frame is equal to 0 or equal to the second ending coordinate of any one object area frame except the ith object area frame in the plurality of object area frames, and the second ending coordinate of the ith object area frame is equal to the maximum column value of a table coordinate system or equal to the second starting coordinate of any one object area frame except the ith object area frame in the plurality of object area frames, thereby obtaining the area labeling frame corresponding to the ith object area frame.
For example, taking the input image shown in fig. 3A as an example, the plurality of object region frames 320 may be divided into 23 rows and 7 columns. The object region frames 320 of each row may be sequentially subjected to the expansion processing in the direction from "serial number" to "22", for example, so as to align the object region frames 320 of each row, and the object region frames 320 of each column may be sequentially subjected to the expansion processing in the direction from "serial number" to "reference value", for example, so as to align the object region frames 320 of each column.
For example, in the above-mentioned process of performing the expansion processing on the plurality of object region frames 320, the plurality of object region frames 320 may be sequentially subjected to the expansion processing once to obtain the corresponding plurality of region labeling frames, or the plurality of object region frames 320 may be sequentially subjected to the expansion processing repeatedly for a plurality of times to obtain the corresponding plurality of region labeling frames, that is, each object region frame 320 may be subjected to the expansion processing once or a plurality of times to obtain the region labeling frame after the final expansion processing. The embodiments of the present disclosure do not particularly limit the number of expansion processes.
For example, taking the determination of the second end coordinate of the object region frames 320 as an example, as shown in fig. 3C and 3D, the object region frame 321 is expanded in the second direction R2 such that the end column of the coordinate grid region 311 occupied by the object region frame 321 moves by the reference value in the second direction R2 each time, in the direction away from the start column of the coordinate grid region 311 occupied by the object region frame 321, until the second end coordinate of the object region frame 321 is equal to the second start coordinate of the object region frame 325 (i.e., the second start coordinate of the object region frame 326, the object region frame 327, or the object region frame 328), thereby determining the second end coordinate of the object region frame 321; the object region frame 322 is expanded in the second direction R2 in the same manner until its second end coordinate is equal to the second start coordinate of the object region frame 325, thereby determining the second end coordinate of the object region frame 322; the object region frame 323 is expanded in the second direction R2 in the same manner until its second end coordinate is equal to the second start coordinate of the object region frame 325, thereby determining the second end coordinate of the object region frame 323; and the object region frame 324 is expanded in the second direction R2 in the same manner until its second end coordinate is equal to the second start coordinate of the object region frame 325, thereby determining the second end coordinate of the object region frame 324. The second start coordinates and the first end coordinates of the object region frames may be determined by referring to the above process for the second end coordinates, which is not repeated here.
Thus, after the expansion processing is performed on each of the object region frames 320 in the first direction R1 and the second direction R2, a plurality of aligned region labeling frames in one-to-one correspondence with the plurality of object region frames 320 can be obtained.
It should be noted that, in other examples of the present disclosure, for example when adjacent object region frames are relatively far apart in the second direction, the plurality of object region frames may be expanded only in the first direction and not in the second direction, so as to obtain region labeling frames aligned in the first direction. This simplifies the alignment processing of the plurality of object region frames and streamlines the implementation of the image processing method provided in the present disclosure.
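As a minimal, non-limiting sketch of steps S4013A to S4013D, the following Python fragment expands each object region frame by one coordinate grid unit (the basic reference value) at a time in the table coordinate system. The GridBox representation and the stopping test are simplified assumptions; in particular, the test deliberately ignores whether the other frame overlaps in the orthogonal direction, which a fuller implementation would also consider.

```python
from dataclasses import dataclass


@dataclass
class GridBox:
    """Coordinates of one object region frame in the table coordinate system
    (grid units); the field names are illustrative, not from the disclosure."""
    start_row: int
    end_row: int
    start_col: int
    end_col: int


def expand_boxes(boxes, max_row, max_col):
    """One pass of the expansion processing: every box grows by one grid
    unit at a time until it reaches the table border or meets another box."""
    for i, box in enumerate(boxes):
        others = [b for j, b in enumerate(boxes) if j != i]
        # Grow the start row towards 0 (first direction).
        while box.start_row > 0 and not any(o.end_row == box.start_row for o in others):
            box.start_row -= 1
        # Grow the end row towards the maximum row value.
        while box.end_row < max_row and not any(o.start_row == box.end_row for o in others):
            box.end_row += 1
        # Grow the start column towards 0 (second direction).
        while box.start_col > 0 and not any(o.end_col == box.start_col for o in others):
            box.start_col -= 1
        # Grow the end column towards the maximum column value.
        while box.end_col < max_col and not any(o.start_col == box.end_col for o in others):
            box.end_col += 1
    return boxes
```

Per the note above, this pass may be run once or repeated several times until no frame can grow any further.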
Fig. 10 is a partial flowchart of step S402 in an image processing method according to at least one embodiment of the present disclosure.
For example, as shown in fig. 10, determining at least one dividing line based on a plurality of region labeling frames in step S402 includes the following steps S421 to S426.
Step S421: and establishing a pixel coordinate system based on the table area frame by taking the pixels as coordinate units. For example, the pixel coordinate system includes a plurality of pixel units, a first coordinate axis of the pixel coordinate system is parallel to a first direction, and a second coordinate axis of the pixel coordinate system is parallel to a second direction.
Step S422: coordinates of the plurality of object region frames in the pixel coordinate system are determined to obtain a plurality of pixel regions corresponding to the plurality of object region frames one by one.
Step S423: the pixel units occupied by the plurality of pixel regions in the pixel coordinate system are marked as first pixel units, and the pixel units except the first pixel units occupied by the plurality of pixel regions in the pixel coordinate system are marked as second pixel units.
Step S424: and sequentially determining the number of the first pixel units included in each column of pixel units in the pixel coordinate system along the second direction.
Step S425: and responding to the number of the first pixel units included in the Ren Yiyi column pixel units being smaller than or equal to the first pixel reference value, taking any one column of pixel units as a first intermediate dividing line to obtain at least one first intermediate dividing line.
Step S426: at least one first split line extending along the first direction is determined in the table coordinate system based on the at least one first intermediate split line. For example, the at least one split line comprises at least one first split line.
For example, taking the input image shown in fig. 3A as an example, referring to fig. 3E and 3F, a pixel coordinate system in pixel units as shown in fig. 3E is established based on the table area frame 310 shown in fig. 3B, and after determining coordinates of the plurality of object area frames 320 in the pixel coordinate system, a plurality of pixel areas 321 corresponding to the plurality of object area frames 320 one by one can be obtained.
For example, as shown in fig. 3E, a pixel unit occupied by the pixel region 321 in the pixel coordinate system is marked as a first pixel unit PX1, for example, a white pixel unit shown in fig. 3E; the pixel units in the pixel coordinate system except for the first pixel unit PX1 occupied by the plurality of pixel areas 321 are each labeled as a second pixel unit PX2, for example, a black pixel unit shown in fig. 3E, whereby the relative positions of the plurality of pixel areas 321 corresponding to the plurality of object area frames 320 in the pixel coordinate system can be represented by the first pixel unit PX1 and the second pixel unit PX 2.
Further, after each pixel unit in the pixel coordinate system is marked as either a first pixel unit PX1 or a second pixel unit PX2, the number of first pixel units PX1 included in each column of pixel units in the pixel coordinate system is determined sequentially along the second direction R2, and the first intermediate dividing lines are determined according to the number of first pixel units PX1 included in each column of pixel units. When the number of first pixel units PX1 included in one column of pixel units is less than or equal to the first pixel reference value, for example when the column includes no first pixel unit PX1 or includes significantly fewer first pixel units PX1 than the other columns, that column of pixel units may serve as a first intermediate dividing line. Thus, one or more first dividing lines extending in the first direction R1, for example one or more first dividing lines corresponding to the line segment CL1 shown in fig. 3F, may be determined in the table coordinate system based on the one or more first intermediate dividing lines determined in the pixel coordinate system, so that the division processing of the plurality of region labeling frames in the second direction R2 may be implemented in a subsequent step based on the obtained first dividing lines.
For example, in some examples, the first pixel reference value may be 0, so that in response to the number of first pixel units PX1 included in a column of pixel units being equal to the first pixel reference value (i.e., equal to 0), that is, when the column of pixel units contains no first pixel unit PX1 and every pixel unit in the column is a second pixel unit PX2, the column of pixel units may be taken as a first intermediate dividing line. Alternatively, in some examples, the first pixel reference value may be determined based on the image height of the input image, for example determined to be a positive number greater than 0, whereby a column of pixel units is taken as a first intermediate dividing line in response to the number of first pixel units PX1 included in the column being less than the first pixel reference value. For example, when the first pixel reference value is determined to be a positive number PR1 greater than 0 based on the image height of the input image, a column of pixel units may be regarded as one first intermediate dividing line if the number N1 of first pixel units PX1 included in that column satisfies 0 ≤ N1 < PR1. In this way, the parts of the input image that are not occupied, or only sparsely occupied, by the object regions or by the objects they contain can be accurately determined, the division processing of the region labeling frames corresponding to the object regions can be realized based on those parts, and the table structure corresponding to the table region can be generated based on the region labeling frames after the division processing.
For example, taking the case where the first pixel reference value PR1 is determined based on the image height of the input image, the image height of the input image is, for example, the total number PN1 of pixels included in an entire column of pixels along the first direction R1 (i.e., the column direction). The first pixel reference value PR1 may be, for example, 0.3 times the image height of the input image, that is, PR1 = 0.3 × PN1. Thus, when the number of first pixel units PX1 included in a column of pixel units in the pixel coordinate system is less than 0.3 × PN1, it may be determined that this number is significantly smaller than the number of first pixel units PX1 included in the other columns of pixel units, and the column of pixel units may then be used as a first intermediate dividing line. This improves the accuracy and reliability of the first intermediate dividing lines determined in the pixel coordinate system and facilitates the subsequent division of the region labeling frames.
It should be noted that, in other examples of the present disclosure, the first pixel reference value may also be 0.1 times, 0.15 times, 0.2 times, 0.25 times, 0.35 times, 0.4 times, or another suitable multiple of the image height of the input image, which the embodiments of the present disclosure do not limit.
It should be noted that, in other examples of the present disclosure, the first pixel reference value may also be determined based on the image height of the table area or the table area frame, that is, the first pixel reference value may be based on the total number of pixels included in an entire column of pixels along the first direction R1 in the table area, for example, the first pixel reference value may be 0.15 times, 0.2 times, 0.3 times, or other suitable values of the image height of the table area or the table area frame, which is not limited in this embodiment of the present disclosure.
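The following Python sketch illustrates steps S421 to S425 under the assumption that the object region frames have already been converted to axis-aligned rectangles in the pixel coordinate system; NumPy and the 0.3 ratio are illustrative choices, not requirements of the disclosure.

```python
import numpy as np


def first_intermediate_split_columns(pixel_regions, height, width, ratio=0.3):
    """Build a mask of first pixel units from the object region frames and
    return the columns whose count of first pixel units falls below the
    first pixel reference value.

    `pixel_regions` is assumed to be a list of rectangles
    (x_min, y_min, x_max, y_max) in the pixel coordinate system of the
    table region frame; `ratio` follows the 0.3 example above and is
    assumed to be greater than 0.
    """
    mask = np.zeros((height, width), dtype=bool)  # all second pixel units PX2
    for x_min, y_min, x_max, y_max in pixel_regions:
        mask[y_min:y_max, x_min:x_max] = True      # first pixel units PX1
    reference = ratio * height                     # e.g. PR1 = 0.3 * PN1
    counts = mask.sum(axis=0)                      # PX1 count per column
    return [x for x in range(width) if counts[x] < reference]
```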
For example, step S426 may include the following steps S426A and S426B.
Step S426A: in response to any one of the at least one first intermediate split line not having an adjacent first intermediate split line in the second direction, mapping any one first intermediate split line from the pixel coordinate system into the table coordinate system to obtain one first split line corresponding to any one first intermediate split line in the table coordinate system.
Step S426B: in response to the at least one first intermediate split line comprising X first intermediate split lines that are consecutive in the second direction, mapping any one of the X first intermediate split lines from the pixel coordinate system into the table coordinate system to obtain one first split line corresponding to the X first intermediate split lines in the table coordinate system. For example, X is a positive integer.
For example, when an obtained first intermediate dividing line has no adjacent first intermediate dividing line in the second direction R2, that is, when the number of first pixel units PX1 included in one column of pixel units is equal to 0 and the number of first pixel units PX1 included in any column of pixel units adjacent to that column in the second direction R2 is greater than 0, or when the number of first pixel units PX1 included in one column of pixel units is smaller than the first pixel reference value PR1 (PR1 > 0) and the number of first pixel units PX1 included in any column of pixel units adjacent to that column in the second direction R2 is greater than or equal to the first pixel reference value PR1, the column of pixel units may be regarded as one first intermediate dividing line, and this first intermediate dividing line may be mapped from the pixel coordinate system to the table coordinate system.
For example, when an obtained first intermediate dividing line has adjacent first intermediate dividing lines in the second direction R2, that is, when the number of first pixel units PX1 included in one column of pixel units is smaller than the first pixel reference value PR1 and the number of first pixel units PX1 included in any column of pixel units adjacent to that column in the second direction R2 is also smaller than the first pixel reference value PR1 (i.e., both numbers are equal to 0 or are positive integers smaller than PR1), then any one of the adjacent first intermediate dividing lines may be mapped from the pixel coordinate system to the table coordinate system to obtain one first dividing line.
For example, in some examples, when there are a plurality of first intermediate dividing lines that are continuous in the second direction R2 in the pixel coordinate system, one first intermediate dividing line located at an intermediate position in the plurality of first intermediate dividing lines in the second direction R2 may be mapped from the pixel coordinate system into the table coordinate system to obtain one first dividing line, for example, a center line of the plurality of first intermediate dividing lines may be taken and mapped from the pixel coordinate system into the table coordinate system to obtain one first dividing line, so that accuracy and reliability of dividing the plurality of region labeling frames based on the obtained dividing line in the subsequent step are improved, and thus the obtained table structure corresponding to the table region in the input image is optimized.
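A small sketch of steps S426A and S426B follows; grouping consecutive columns and keeping the centre line of each run mirrors the example above, while the final division by the basic reference value is an assumed way of mapping a pixel column into the table coordinate system.

```python
def collapse_consecutive(split_columns, base_reference_value=1):
    """Group intermediate split lines that are consecutive in the second
    direction, keep the centre line of each run, and map it into the table
    coordinate system (the mapping rule is an illustrative assumption)."""
    if not split_columns:
        return []
    runs, current = [], [split_columns[0]]
    for col in split_columns[1:]:
        if col == current[-1] + 1:      # adjacent column, same run
            current.append(col)
        else:
            runs.append(current)
            current = [col]
    runs.append(current)
    # One first split line per run: the centre column, converted to grid units.
    return [run[len(run) // 2] // base_reference_value for run in runs]
```

For example, collapse_consecutive([3, 4, 5, 12]) keeps column 4 (the centre of the run 3–5) and column 12, so the two runs yield exactly two first split lines.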
For example, as shown in fig. 10, determining at least one dividing line based on a plurality of object region frames in step S402 further includes the following steps S427 to S432.
Step S427: an object included in each of a plurality of object regions is identified.
Step S428: coordinates of the object included in each object region in the pixel coordinate system are determined.
Step S429: the pixel units occupied by the objects included in each object region in the pixel coordinate system are marked as third pixel units, and the pixel units in the pixel coordinate system except for the third pixel units occupied by the objects included in each object region are marked as fourth pixel units.
Step S430: the number of the third pixel units included in each row of pixel units in the pixel coordinate system is sequentially determined along the first direction.
Step S431: and responding to the fact that the number of the third pixel units included in any one row of pixel units is smaller than or equal to the second pixel reference value, taking any one row of pixel units as a second intermediate dividing line to obtain at least one second intermediate dividing line.
Step S432: at least one second split line extending in the second direction is determined in the table coordinate system based on the at least one second intermediate split line. For example, the at least one parting line comprises at least one second parting line.
For example, taking the input image shown in fig. 3A as an example, referring to fig. 3G and 3H, after a pixel coordinate system in units of pixels is established based on the table area frame 310 shown in fig. 3B, coordinates of an object included in the identified object area in the pixel coordinate system are determined.
For example, as shown in fig. 3G, a pixel unit occupied by an object included in each object region in the pixel coordinate system is marked as a third pixel unit PX3, for example, a white pixel unit shown in fig. 3G; the pixel units in the pixel coordinate system except for the third pixel unit PX3 occupied by the object included in each object region are each marked as a fourth pixel unit PX4, for example, a black pixel unit shown in fig. 3G, whereby the relative position of the object included in each object region in the pixel coordinate system can be represented by the third pixel unit PX3 and the fourth pixel unit PX 4.
Further, after each pixel unit in the pixel coordinate system is marked as either a third pixel unit PX3 or a fourth pixel unit PX4, the number of third pixel units PX3 included in each row of pixel units in the pixel coordinate system is determined sequentially along the first direction R1, and the second intermediate dividing lines are determined according to the number of third pixel units PX3 included in each row of pixel units. When the number of third pixel units PX3 included in one row of pixel units is less than or equal to the second pixel reference value, for example when the row includes no third pixel unit PX3 or includes significantly fewer third pixel units PX3 than the other rows, that row of pixel units may serve as a second intermediate dividing line. Thus, one or more second dividing lines extending in the second direction R2, for example one or more second dividing lines corresponding to the line segment CL2 shown in fig. 3H, may be determined in the table coordinate system based on the one or more second intermediate dividing lines determined in the pixel coordinate system, so that the division processing of the plurality of region labeling frames in the first direction R1 may be implemented in a subsequent step based on the obtained second dividing lines.
In the above example, since the distance between objects such as text or data adjacent in the first direction R1 in the input image is small, the corresponding second intermediate division line extending in the second direction R2 may be determined in the pixel coordinate system based on the determined coordinates of the objects included in each object region, thereby improving the accuracy and reliability of the second division line in the table coordinate system obtained based on the second intermediate division line, and further improving the accuracy and reliability of the subsequent division processing of the plurality of region label frames based on the division line.
It should be noted that, in other examples of the present disclosure, the second intermediate dividing line may also be determined based on the object region frame with reference to the determination method of the first intermediate dividing line; alternatively, the first intermediate dividing line may also be determined based on the object included in each object region in the input image with reference to the determination method of the second intermediate dividing line according to the specific arrangement manner of the objects in the input image, which is not limited in the embodiments of the present disclosure.
For example, in some examples, the second pixel reference value may be 0, so that in response to the number of third pixel units PX3 included in a row of pixel units being equal to the second pixel reference value (i.e., equal to 0), that is, when the row of pixel units contains no third pixel unit PX3 and every pixel unit in the row is a fourth pixel unit PX4, the row of pixel units may be taken as a second intermediate dividing line. Alternatively, in some examples, the second pixel reference value may be determined based on the image width of the input image or the image length of the input image, for example determined to be a positive number greater than 0 based on the image width of the input image, whereby a row of pixel units is taken as a second intermediate dividing line in response to the number of third pixel units PX3 included in the row being smaller than the second pixel reference value. For example, when the second pixel reference value is determined to be a positive number PR2 greater than 0 based on the image width of the input image, a row of pixel units may be regarded as one second intermediate dividing line if the number N2 of third pixel units PX3 included in that row satisfies 0 ≤ N2 < PR2. In this way, the parts of the input image that are not occupied, or only sparsely occupied, by the objects included in the object regions can be accurately determined, the division processing of the region labeling frames corresponding to the object regions can be realized based on those parts, and the table structure corresponding to the table region can be generated based on the region labeling frames after the division processing.
For example, taking the case where the second pixel reference value PR2 is determined based on the image width of the input image, the image width of the input image is, for example, the total number PN2 of pixels included in an entire row of pixels along the second direction R2 (i.e., the row direction). The second pixel reference value PR2 may be, for example, 0.3 times the image width of the input image, that is, PR2 = 0.3 × PN2. Thus, when the number of third pixel units PX3 included in a row of pixel units in the pixel coordinate system is less than 0.3 × PN2, it may be determined that this number is significantly smaller than the number of third pixel units PX3 included in the other rows of pixel units, and the row of pixel units may then be used as a second intermediate dividing line. This improves the accuracy and reliability of the second intermediate dividing lines determined in the pixel coordinate system and facilitates the subsequent accurate division of the region labeling frames.
It should be noted that, in other examples of the present disclosure, the second pixel reference value may also be 0.1 times, 0.15 times, 0.2 times, 0.25 times, 0.35 times, 0.4 times, or another suitable multiple of the image width of the input image, which the embodiments of the present disclosure do not limit.
It should be noted that, in other examples of the present disclosure, the second pixel reference value may also be determined based on the image width or the image length of the table area or the table area frame, that is, the second pixel reference value may be based on the total number of pixels included in an entire row of pixels along the second direction R2 in the table area, for example, the second pixel reference value may be 0.15 times, 0.2 times, 0.3 times, or other suitable value of the image width of the table area or the table area frame, which is not limited in this embodiment of the present disclosure.
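For completeness, a sketch of the row-direction scan of steps S427 to S431 is given below; it differs from the column scan shown earlier only in that the mask is built from the recognized objects (for example text boxes) rather than from the object region frames. The box list and the 0.3 ratio are again assumptions made for illustration.

```python
import numpy as np


def second_intermediate_split_rows(object_boxes, height, width, ratio=0.3):
    """Return the rows whose count of third pixel units falls below the
    second pixel reference value.

    `object_boxes` is an assumed list of (x_min, y_min, x_max, y_max)
    rectangles for the recognized objects in the pixel coordinate system.
    """
    mask = np.zeros((height, width), dtype=bool)  # all fourth pixel units PX4
    for x_min, y_min, x_max, y_max in object_boxes:
        mask[y_min:y_max, x_min:x_max] = True      # third pixel units PX3
    reference = ratio * width                      # e.g. PR2 = 0.3 * PN2
    counts = mask.sum(axis=1)                      # PX3 count per row
    return [y for y in range(height) if counts[y] < reference]
```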
For example, step S432 may include the following step S432A and step S432B.
Step S432A: in response to any one of the at least one second intermediate split line not having an adjacent second intermediate split line in the first direction, mapping the any one second intermediate split line from the pixel coordinate system into the table coordinate system to obtain one second split line corresponding to the any one second intermediate split line in the table coordinate system.
Step S432B: in response to the at least one second intermediate split line comprising Y second intermediate split lines that are continuous in the first direction, mapping any one of the Y second intermediate split lines from the pixel coordinate system into the table coordinate system to obtain one second split line in the table coordinate system corresponding to the Y second intermediate split lines. For example, Y is a positive integer.
For example, when an obtained second intermediate dividing line has no adjacent second intermediate dividing line in the first direction R1, that is, when the number of third pixel units PX3 included in one row of pixel units is equal to 0 and the number of third pixel units PX3 included in any row of pixel units adjacent to that row in the first direction R1 is greater than 0, or when the number of third pixel units PX3 included in one row of pixel units is smaller than the second pixel reference value PR2 (PR2 > 0) and the number of third pixel units PX3 included in any row of pixel units adjacent to that row in the first direction R1 is greater than or equal to the second pixel reference value PR2, the row of pixel units may be regarded as one second intermediate dividing line, and this second intermediate dividing line may be mapped from the pixel coordinate system to the table coordinate system.
For example, when an obtained second intermediate dividing line has adjacent second intermediate dividing lines in the first direction R1, that is, when the number of third pixel units PX3 included in one row of pixel units is smaller than the second pixel reference value PR2 and the number of third pixel units PX3 included in any row of pixel units adjacent to that row in the first direction R1 is also smaller than the second pixel reference value PR2 (i.e., both numbers are equal to 0 or are positive integers smaller than PR2), then any one of the adjacent second intermediate dividing lines may be mapped from the pixel coordinate system to the table coordinate system to obtain one second dividing line.
For example, in some examples, when there are a plurality of second intermediate dividing lines in the pixel coordinate system, which are continuous in the first direction R1, one second intermediate dividing line located at an intermediate position in the plurality of second intermediate dividing lines in the first direction R1 may be mapped from the pixel coordinate system into the table coordinate system to obtain one second dividing line, for example, a center line of the plurality of second intermediate dividing lines may be taken and mapped from the pixel coordinate system into the table coordinate system to obtain one second dividing line, so that accuracy and reliability of dividing the plurality of region labeling frames based on the obtained dividing line in the subsequent step are improved, and thus the obtained table structure corresponding to the table region in the input image is optimized.
For example, in some examples of the present disclosure, the dividing the plurality of region labeling frames by the at least one dividing line in step S402 to form the plurality of cells may include: and dividing the plurality of region labeling frames according to at least one first dividing line and at least one second dividing line to obtain a plurality of cells. Thus, after the first dividing line and the second dividing line in the table coordinate system are determined according to the first intermediate dividing line and the second intermediate dividing line in the pixel coordinate system, the dividing process of the plurality of region marking frames can be realized in the table coordinate system based on the obtained first dividing line and second dividing line, and accordingly a corresponding table structure is generated according to the region marking frames after the dividing process.
For example, in some examples of the present disclosure, dividing the plurality of region labeling frames by the at least one dividing line in step S402 to form the plurality of cells further includes: in response to performing the table line detection processing on the input image and obtaining at least one table line segment through the table line detection processing, performing the correction processing on the at least one dividing line based on the at least one table line segment, and performing the division processing on the plurality of region labeling frames based on the at least one corrected dividing line, so as to obtain the plurality of cells. In this way, the determined dividing lines are corrected according to the way the object regions are actually separated in the input image, which further improves the accuracy and reliability of the obtained dividing lines and of the subsequent division processing of the plurality of region labeling frames based on those dividing lines.
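The disclosure does not spell out how the detected table line segments correct the dividing lines, so the following Python sketch is only one plausible reading offered as an assumption: each dividing line is snapped to the nearest detected segment when one lies within a small tolerance.

```python
def snap_split_lines(split_lines, detected_line_positions, tolerance=2):
    """Illustrative correction step (an assumption, not the prescribed rule):
    `detected_line_positions` are the positions of detected table line
    segments along the relevant axis, in the same coordinate units as the
    dividing lines."""
    corrected = []
    for line in split_lines:
        nearest = min(detected_line_positions,
                      key=lambda pos: abs(pos - line),
                      default=None)
        if nearest is not None and abs(nearest - line) <= tolerance:
            corrected.append(nearest)  # align to the detected table line
        else:
            corrected.append(line)     # keep the computed dividing line
    return corrected
```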
For example, in some examples of the present disclosure, step S403 includes the following step S4031.
Step S4031: the multiple cells are combined and/or partitioned to obtain multiple target cells based on the multiple cells. For example, the cell table includes a plurality of target cells.
For example, in some examples, step S4031 includes the following steps S4031A to S4031C.
Step S4031A: and determining whether the splitting processing is needed for any region labeling frame or not according to the fact that at least one splitting line passes through any region labeling frame.
Step S4031B: responding to the need of splitting treatment of any region labeling frame, splitting any region labeling frame into a plurality of splitting labeling frames, and taking a cell of each splitting labeling frame in the plurality of splitting labeling frames as a target cell.
Step S4031C: and in response to the fact that any region labeling frame does not need to be split, merging a plurality of cells occupied by any region labeling frame to obtain a target cell.
In this way, in the image processing method provided by the embodiments of the present disclosure, the basic table structure of the cell table corresponding to the table region can be determined based on the plurality of region labeling frames after the alignment processing, for example, the approximate relative positions of the cells in the table structure can be obtained. Then, after the dividing lines are determined based on the plurality of object region frames that have not undergone the alignment processing, the plurality of region labeling frames can be divided by the dividing lines, so that accurate division and positioning of the cells are achieved by combining the dividing lines with the aligned region labeling frames. Furthermore, after the plurality of cells are obtained, the cells formed by the division processing can be further merged or split according to the positions of the dividing lines relative to the region labeling frames, so that more accurate target cells are obtained, which further improves the accuracy and reliability of the cell table corresponding to the table region in the input image generated based on the target cells.
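A compact sketch of steps S4031A to S4031C follows; the frame and dividing-line representations in the table coordinate system are assumptions chosen for brevity.

```python
def target_cells_for_frame(frame, row_lines, col_lines):
    """Split the region labeling frame at every dividing line that crosses
    it; each resulting piece becomes one target cell, and a frame crossed by
    no line yields a single merged target cell. `frame` is assumed to be
    (start_row, end_row, start_col, end_col) in the table coordinate system.
    """
    start_row, end_row, start_col, end_col = frame
    # Only dividing lines strictly inside the frame force a split.
    rows = sorted(r for r in row_lines if start_row < r < end_row)
    cols = sorted(c for c in col_lines if start_col < c < end_col)
    row_edges = [start_row] + rows + [end_row]
    col_edges = [start_col] + cols + [end_col]
    cells = []
    for r0, r1 in zip(row_edges, row_edges[1:]):
        for c0, c1 in zip(col_edges, col_edges[1:]):
            cells.append((r0, r1, c0, c1))
    return cells  # exactly one element when no dividing line crosses the frame
```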
Fig. 11 is a flowchart of still another image processing method according to at least one embodiment of the present disclosure.
It should be noted that, except for steps S60 and S70, steps S10 to S40 shown in fig. 11 are substantially the same as steps S10 to S40 shown in fig. 1, and the repeated description is omitted.
For example, as shown in fig. 11, the image processing method further includes the following steps S60 and S70.
Step S60: an object included in each of a plurality of object regions is identified.
Step S70: and correspondingly filling objects included in the object areas into each target cell of the cell table respectively to generate an object table.
For step S60, the objects contained in each object region of the input image may be identified by, for example, a character recognition model to achieve extraction of the related information of the objects contained in the input image. For example, the character recognition model may be implemented based on optical character recognition or the like and run on a general purpose computing device or a special purpose computing device, for example, the character recognition model may be a pre-trained neural network model.
In step S70, the identified objects are filled into the corresponding target cells of the cell table to generate an object table containing the relevant information of the objects in the input image, so that through the generated object table the user can obtain information such as data and text content in the input image in a more intuitive and standardized way.
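As an illustrative sketch of step S70, the recognized text can be written into a simple row-by-column grid; the mapping from target cells to recognized text is assumed to have been produced by the preceding steps and is not part of the disclosure.

```python
def fill_object_table(cell_texts, n_rows, n_cols):
    """Place the text recognized for each object region into its target cell.

    `cell_texts` is an assumed mapping from the (row, column) anchor of a
    target cell to the recognized text for that cell.
    """
    table = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for (row, col), text in cell_texts.items():
        table[row][col] = text
    return table
```

The resulting list of lists can then be exported to any convenient format (for example CSV or a spreadsheet) to present the object table to the user.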
For example, taking the input image shown in fig. 3A as an example, the image processing method provided by the embodiments of the present disclosure may generate an object table as shown in fig. 3I. Compared with the input image shown in fig. 3A, the object table shown in fig. 3I presents the relevant information of the objects in the input image to the user in a more concise, standardized and intuitive manner, which improves the efficiency with which the user obtains the relevant information of the objects in the input image and improves the user experience.
In some embodiments of the present disclosure, after the step S70, the image processing method may further include: based on the input image, the objects filled in the cells of the cell table are adjusted. For example, it may be determined whether the object filled in each cell is accurate, such as whether an error or omission occurs, in comparison with the input image, thereby improving the accuracy and reliability of the generated object table. For example, the object in the input image includes text, that is, the related information of the object can be shown by text such as letters, numbers, symbols, etc., so that the generated object table can be clearer and more standard by adjusting the word height and/or the font of the text filled in each cell of the cell table, thereby being helpful for users to intuitively and conveniently acquire the required information. For example, the word height of the original text may be recorded and the text filled in using a preset font.
For example, in the case where the table area of the input image includes a wired table, the above steps S60 and S70 are performed after step S50 shown in fig. 7, thereby generating an object table corresponding to the table area of the input image.
For example, taking the input image shown in fig. 2A as an example, the image processing method provided by the embodiments of the present disclosure may generate an object table as shown in fig. 2H. Compared with the input image shown in fig. 2A, the object table shown in fig. 2H presents the relevant information of the objects in the input image to the user in a more concise, standardized and intuitive manner, helping the user obtain the required information more intuitively and conveniently.
At least one embodiment of the present disclosure further provides an image processing apparatus, and fig. 12 is a schematic block diagram of an image processing apparatus provided by at least one embodiment of the present disclosure.
As shown in fig. 12, the image processing apparatus 500 may include: an image acquisition module 501, a region identification processing module 502, a table line detection processing module 503, and a cell table generation module 504.
For example, the image acquisition module 501 is configured to acquire an input image. For example, the input image includes a table area including a plurality of object areas, each object area of the plurality of object areas including at least one object.
For example, the region identification processing module 502 is configured to perform region identification processing on an input image to obtain a plurality of object region frames corresponding to a plurality of object regions one to one and a table region frame corresponding to a table region.
For example, the table line detection processing module 503 is configured to perform table line detection processing on the input image to determine whether the table region includes a wired table.
For example, the cell table generation module 504 is configured to, in response to the table region not including the wired table: performing alignment treatment on the multiple object region frames to obtain multiple region labeling frames corresponding to the multiple object region frames one by one; determining at least one dividing line based on the plurality of object region frames, and dividing the plurality of region labeling frames through the at least one dividing line to form a plurality of cells; based on the plurality of cells, a cell table corresponding to the table area is generated.
For example, the image acquisition module 501, the region identification processing module 502, the table line detection processing module 503, and the cell table generation module 504 may include codes and programs stored in a memory; the processor may execute the code and program to implement some or all of the functions of the image acquisition module 501, the region identification processing module 502, the table line detection processing module 503, and the cell table generation module 504 as described above. For example, the image acquisition module 501, the region identification processing module 502, the table line detection processing module 503, and the cell table generation module 504 may be dedicated hardware devices for implementing some or all of the functions of the image acquisition module 501, the region identification processing module 502, the table line detection processing module 503, and the cell table generation module 504 as described above. For example, the image acquisition module 501, the region identification processing module 502, the table line detection processing module 503, and the cell table generation module 504 may be one circuit board or a combination of a plurality of circuit boards for realizing the functions as described above. In an embodiment of the present application, the circuit board or the combination of the circuit boards may include: (1) one or more processors; (2) One or more non-transitory memories coupled to the processor; and (3) firmware stored in the memory that is executable by the processor.
It should be noted that the image acquisition module 501 is configured to implement step S10 shown in fig. 1, the region identification processing module 502 is configured to implement step S20 shown in fig. 1, the table line detection processing module 503 is configured to implement step S30 shown in fig. 1, and the cell table generation module 504 is configured to implement step S40 shown in fig. 1, which includes, for example, steps S401 to S403. Thus, for specific descriptions of the functions that can be implemented by the image acquisition module 501, the region identification processing module 502, the table line detection processing module 503, and the cell table generation module 504, reference may be made to the descriptions of steps S10 to S40 in the above embodiments of the image processing method, and the repeated description is omitted. In addition, the image processing apparatus can achieve technical effects similar to those of the foregoing image processing method, which are not described herein again.
At least one embodiment of the present disclosure further provides an electronic device, and fig. 13 is a schematic diagram of an electronic device provided by at least one embodiment of the present disclosure.
For example, as shown in fig. 13, the electronic device includes a processor 601, a communication interface 602, a memory 603, and a communication bus 604. The processor 601, the communication interface 602, and the memory 603 communicate with each other via the communication bus 604, and the components of the processor 601, the communication interface 602, and the memory 603 may also communicate with each other via a network connection. The present disclosure is not limited herein with respect to the type and functionality of the network. It should be noted that the components of the electronic device shown in fig. 13 are exemplary only and not limiting, and that the electronic device may have other components as desired for practical applications.
For example, the memory 603 is used to non-transitory store computer readable instructions. The processor 601 is configured to implement the image processing method according to any of the embodiments described above when executing computer readable instructions. For specific implementation of each step of the image processing method and related explanation, reference may be made to the above embodiment of the image processing method, which is not described herein.
For example, other implementations of the image processing method implemented by the processor 601 executing computer readable instructions stored on the memory 603 are the same as those mentioned in the foregoing method embodiment, and will not be described herein again.
For example, the communication bus 604 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be classified as an address bus, a data bus, a control bus, or the like. For ease of illustration, only one thick line is shown in the figure, but this does not mean that there is only one bus or only one type of bus.
For example, the communication interface 602 is used to enable communication between an electronic device and other devices.
For example, the processor 601 and the memory 603 may be provided at a server side (or cloud).
For example, the processor 601 may control other components in the electronic device to perform desired functions. The processor 601 may be a device with data processing and/or program execution capabilities, such as a Central Processing Unit (CPU), Network Processor (NP), Tensor Processing Unit (TPU), or Graphics Processing Unit (GPU); it may also be a Digital Signal Processor (DSP), Application Specific Integrated Circuit (ASIC), Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic device, or discrete hardware component. The Central Processing Unit (CPU) may have an X86 or ARM architecture, etc.
For example, memory 603 may include any combination of one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. Volatile memory can include, for example, random Access Memory (RAM) and/or cache memory (cache) and the like. The non-volatile memory may include, for example, read-only memory (ROM), hard disk, erasable programmable read-only memory (EPROM), portable compact disc read-only memory (CD-ROM), USB memory, flash memory, and the like. One or more computer readable instructions may be stored on the computer readable storage medium that can be executed by the processor 601 to implement various functions of the electronic device. Various applications and various data, etc. may also be stored in the storage medium.
For example, in some embodiments, the electronic device may further include an image acquisition component. The image acquisition section is for acquiring an input image. The memory 603 is also used to store input images.
For example, the image acquisition component may be a camera of a smart phone, a camera of a tablet computer, a camera of a personal computer, a lens of a digital camera, or even a web cam.
For example, the input image may be an original image directly acquired by the image acquisition component, or an image obtained after preprocessing the original image. Preprocessing can eliminate irrelevant information or noise in the original image, so that the input image can be processed better. The preprocessing may include, for example, data augmentation, image scaling, gamma correction, image enhancement, or noise-reduction filtering of the original image.
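A brief sketch of two of the preprocessing operations listed above (scaling and gamma correction) is given below; the scale factor and gamma value are arbitrary assumptions rather than values prescribed by the disclosure, and OpenCV is used only as an example library.

```python
import cv2
import numpy as np


def preprocess(original, scale=0.5, gamma=1.2):
    """Scale the original 8-bit image and apply gamma correction; the
    parameter values are illustrative assumptions."""
    resized = cv2.resize(original, None, fx=scale, fy=scale,
                         interpolation=cv2.INTER_AREA)
    # Build a gamma look-up table for 8-bit values and apply it.
    lut = np.array([((i / 255.0) ** (1.0 / gamma)) * 255
                    for i in range(256)]).astype("uint8")
    return cv2.LUT(resized, lut)
```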
For example, a detailed description of a procedure of performing image processing by the electronic device may refer to a related description in an embodiment of an image processing method, and a detailed description is omitted.
Fig. 14 is a schematic diagram of a non-transitory computer readable storage medium according to at least one embodiment of the present disclosure. For example, as shown in FIG. 14, one or more computer-readable instructions 701 may be non-transitory stored on the storage medium 700. For example, computer readable instructions 701, when executed by a processor, may perform one or more steps in accordance with the image processing methods described above.
For example, the storage medium 700 may be applied to the above-described electronic device, and for example, the storage medium 700 may include the memory 603 in the electronic device.
For example, the description of the storage medium 700 may refer to the description of the memory in the embodiment of the electronic device, and the repetition is omitted.
For the purposes of this disclosure, the following points are also noted:
(1) The drawings of the embodiments of the present disclosure relate only to the structures related to the embodiments of the present disclosure, and other structures may refer to the general design.
(2) In the drawings for describing embodiments of the present invention, thicknesses and dimensions of layers or structures are exaggerated for clarity. It will be understood that when an element such as a layer, film, region or substrate is referred to as being "on" or "under" another element, it can be "directly on" or "under" the other element or intervening elements may be present.
(3) The embodiments of the present disclosure and features in the embodiments may be combined with each other to arrive at a new embodiment without conflict.
The foregoing is merely a specific embodiment of the disclosure, but the scope of the disclosure is not limited thereto and should be determined by the scope of the claims.

Claims (29)

1. An image processing method, comprising:
acquiring an input image, wherein the input image comprises a table area, the table area comprises a plurality of object areas, and each object area in the plurality of object areas comprises at least one object;
Performing region identification processing on the input image to obtain a plurality of object region frames corresponding to the plurality of object regions one by one and a table region frame corresponding to the table region;
performing table line detection processing on the input image to judge whether the table area comprises a wired table or not; and
in response to the table region not including a wired table:
performing alignment processing on the plurality of object region frames to obtain a plurality of region labeling frames corresponding to the plurality of object region frames one by one;
determining at least one dividing line based on the multiple object region frames, and dividing the multiple region labeling frames through the at least one dividing line to form multiple cells; and
based on the plurality of cells, a cell table corresponding to the table region is generated.
2. The image processing method according to claim 1, wherein performing alignment processing on the plurality of object region frames to obtain the plurality of region labeling frames corresponding to the plurality of object region frames one to one comprises:
dividing the table region frame into a plurality of coordinate grid regions arranged in M rows and N columns along a first direction and a second direction by taking a base reference value as a coordinate unit to establish a table coordinate system, wherein the M rows of coordinate grid regions are arranged along the first direction, the N columns of coordinate grid regions are arranged along the second direction, and M and N are positive integers;
Determining coordinates of the plurality of object region frames in the table coordinate system; and
and expanding the plurality of object region frames based on the coordinates of the plurality of object region frames in the table coordinate system to obtain the plurality of region labeling frames.
3. The image processing method according to claim 2, wherein the base reference value is determined from an average height of the plurality of object region frames in the first direction.
4. The image processing method according to claim 2, wherein determining coordinates of the plurality of object region frames in the table coordinate system includes:
determining a plurality of slopes of the plurality of object region boxes, wherein the slope of each object region box of the plurality of object region boxes represents a slope of an edge of each object region box extending in the second direction relative to the second direction;
correcting the input image according to a plurality of slopes of the plurality of object area frames to obtain a corrected input image; and
coordinates of the plurality of object region frames in the table coordinate system are determined based on the corrected input image.
5. The image processing method according to claim 4, wherein performing correction processing on the input image according to a plurality of slopes of the plurality of object region frames to obtain the corrected input image, comprises:
calculating an average value of a plurality of slopes according to the plurality of slopes of the plurality of object region frames; and
the input image is rotated in a plane constituted by the first direction and the second direction based on an average value of the plurality of slopes such that the average value of the plurality of slopes approaches 0.
6. The image processing method according to any one of claims 2 to 5, wherein performing expansion processing on the plurality of object region frames based on coordinates of the plurality of object region frames in the table coordinate system to obtain the plurality of region labeling frames, comprises:
determining first start and end coordinates of the plurality of object region frames in the first direction and second start and end coordinates of the plurality of object region frames in the table coordinate system, wherein the first start coordinates of any one of the plurality of object region frames comprise coordinates of a start row of a grid region occupied by the any one of the object region frames in the table coordinate system, the first end coordinates of any one of the object region frames comprise coordinates of an end row of a grid region occupied by the any one of the object region frames in the table coordinate system, the second start coordinates of any one of the object region frames comprise coordinates of a start column of a grid region occupied by the any one of the object region frames in the table coordinate system, and the second end coordinates of any one of the object region frames comprise coordinates of an end column of a grid region occupied by the any one of the object region frames in the table coordinate system;
Dividing the object region frames into a plurality of rows and a plurality of columns, performing row-by-row expansion processing on the object region frames according to the direction from a starting row to a terminating row in the table coordinate system, and sequentially performing expansion processing on each row of object region frames according to the direction from the starting column to the terminating column in the table coordinate system;
for an ith object region frame of the plurality of object region frames, where i is a positive integer,
expanding the ith object region frame in the first direction, so that the start row of the grid region occupied by the ith object region frame moves by a reference value in the first direction each time, in a direction away from the end row of the grid region occupied by the ith object region frame, and the end row of the grid region occupied by the ith object region frame moves by the reference value in the first direction each time, in a direction away from the start row of the grid region occupied by the ith object region frame, until the first start coordinate of the ith object region frame is equal to 0 or equal to the first end coordinate of any one of the plurality of object region frames other than the ith object region frame, and the first end coordinate of the ith object region frame is equal to the maximum row value of the table coordinate system or equal to the first start coordinate of any one of the plurality of object region frames other than the ith object region frame, and
expanding the ith object region frame in the second direction, so that the start column of the grid region occupied by the ith object region frame moves by the reference value in the second direction each time, in a direction away from the end column of the grid region occupied by the ith object region frame, and the end column of the grid region occupied by the ith object region frame moves by the reference value in the second direction each time, in a direction away from the start column of the grid region occupied by the ith object region frame, until the second start coordinate of the ith object region frame is equal to 0 or equal to the second end coordinate of any one of the plurality of object region frames other than the ith object region frame, and the second end coordinate of the ith object region frame is equal to the maximum column value of the table coordinate system or equal to the second start coordinate of any one of the plurality of object region frames other than the ith object region frame, thereby obtaining the region labeling frame corresponding to the ith object region frame.
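A minimal sketch of the expansion processing of claim 6, assuming each object region frame is stored as [row_start, row_end, col_start, col_end] in the table coordinate system; the list layout, the function name expand_frame, and the default reference value are illustrative assumptions.

    def expand_frame(frames, i, max_row, max_col, reference_value=1):
        # Claim 6 sketch: grow frame i by the reference value per step until a
        # coordinate reaches 0, the table boundary, or the facing edge of another frame.
        r0, r1, c0, c1 = frames[i]
        others = [f for j, f in enumerate(frames) if j != i]
        while r0 > 0 and all(r0 != f[1] for f in others):        # start row, first direction
            r0 = max(0, r0 - reference_value)
        while r1 < max_row and all(r1 != f[0] for f in others):  # end row, first direction
            r1 = min(max_row, r1 + reference_value)
        while c0 > 0 and all(c0 != f[3] for f in others):        # start column, second direction
            c0 = max(0, c0 - reference_value)
        while c1 < max_col and all(c1 != f[2] for f in others):  # end column, second direction
            c1 = min(max_col, c1 + reference_value)
        frames[i] = [r0, r1, c0, c1]
        return frames[i]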
7. The image processing method according to any one of claims 2 to 5, wherein determining the at least one dividing line based on the plurality of object region frames includes:
establishing a pixel coordinate system based on the table region frame, with pixels as coordinate units, wherein the pixel coordinate system comprises a plurality of pixel units, a first coordinate axis of the pixel coordinate system is parallel to the first direction, and a second coordinate axis of the pixel coordinate system is parallel to the second direction;
determining coordinates of the object region frames in the pixel coordinate system to obtain a plurality of pixel regions corresponding to the object region frames one by one;
marking pixel units occupied by the plurality of pixel regions in the pixel coordinate system as first pixel units, and marking pixel units in the pixel coordinate system other than the first pixel units as second pixel units;
sequentially determining the number of first pixel units included in each column of pixel units in the pixel coordinate system along the second direction;
in response to the number of first pixel units included in any one column of pixel units being less than or equal to a first pixel reference value, taking that column of pixel units as a first intermediate split line, to obtain at least one first intermediate split line; and
determining, in the table coordinate system, at least one first split line extending along the first direction based on the at least one first intermediate split line, wherein the at least one dividing line includes the at least one first split line.
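A sketch of the column projection of claims 7 and 9, assuming the pixel regions are given as (x0, y0, x1, y1) boxes in the pixel coordinate system; the box layout and function name are assumptions, and claim 10 repeats the same projection along rows with the second pixel reference value.

    import numpy as np

    def first_intermediate_split_columns(pixel_regions, height, width, first_pixel_ref=0):
        # Claims 7/9 sketch: pixel units covered by any object region frame are first
        # pixel units; every column whose count of first pixel units is at most the
        # first pixel reference value becomes a first intermediate split line.
        occupancy = np.zeros((height, width), dtype=bool)
        for x0, y0, x1, y1 in pixel_regions:
            occupancy[y0:y1, x0:x1] = True         # mark first pixel units
        counts_per_column = occupancy.sum(axis=0)   # first pixel units per column
        return [x for x, n in enumerate(counts_per_column) if n <= first_pixel_ref]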
8. The image processing method of claim 7, wherein determining the at least one first split line extending along the first direction in the table coordinate system based on the at least one first intermediate split line comprises:
in response to any one first intermediate split line of the at least one first intermediate split line having no adjacent first intermediate split line in the second direction, mapping that first intermediate split line from the pixel coordinate system into the table coordinate system to obtain one first split line corresponding to that first intermediate split line in the table coordinate system;
in response to the at least one first intermediate split line comprising X first intermediate split lines that are continuous in the second direction, mapping any one of the X first intermediate split lines from the pixel coordinate system into the table coordinate system to obtain one of the first split lines in the table coordinate system that corresponds to the X first intermediate split lines, wherein X is a positive integer.
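A sketch of claim 8, assuming the intermediate split lines are given as a sorted list of column indices; the choice of the middle column as the representative of a consecutive run is an assumption, since the claim only requires that one of the X lines be mapped.

    def collapse_consecutive_columns(intermediate_cols):
        # Claim 8 sketch: an isolated first intermediate split line maps to one first
        # split line; a run of X consecutive intermediate lines maps to a single one.
        if not intermediate_cols:
            return []
        runs, current = [], [intermediate_cols[0]]
        for col in intermediate_cols[1:]:
            if col == current[-1] + 1:             # still the same consecutive run
                current.append(col)
            else:
                runs.append(current)
                current = [col]
        runs.append(current)
        return [run[len(run) // 2] for run in runs]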
9. The image processing method according to claim 7, wherein the first pixel reference value is 0 or the first pixel reference value is determined based on an image height of the input image.
10. The image processing method of claim 7, wherein determining the at least one dividing line based on the plurality of object region frames further comprises:
identifying an object included in each of the plurality of object regions;
determining coordinates of an object included in each object region in the pixel coordinate system;
marking pixel units occupied by the objects included in each object region in the pixel coordinate system as third pixel units, and marking pixel units in the pixel coordinate system other than the third pixel units as fourth pixel units;
sequentially determining the number of third pixel units included in each row of pixel units in the pixel coordinate system along the first direction;
in response to the number of third pixel units included in any one row of pixel units being less than or equal to a second pixel reference value, taking that row of pixel units as a second intermediate split line, to obtain at least one second intermediate split line; and
determining, in the table coordinate system, at least one second split line extending along the second direction based on the at least one second intermediate split line, wherein the at least one dividing line includes the at least one second split line.
11. The image processing method of claim 10, wherein determining the at least one second split line extending along the second direction in the table coordinate system based on the at least one second intermediate split line comprises:
in response to any one second intermediate split line of the at least one second intermediate split line having no adjacent second intermediate split line in the first direction, mapping that second intermediate split line from the pixel coordinate system into the table coordinate system to obtain one second split line corresponding to that second intermediate split line in the table coordinate system;
in response to the at least one second intermediate split line including Y second intermediate split lines that are continuous in the first direction, mapping any one of the Y second intermediate split lines from the pixel coordinate system into the table coordinate system to obtain one of the second split lines corresponding to the Y second intermediate split lines in the table coordinate system, wherein Y is a positive integer.
12. The image processing method according to claim 10, wherein the second pixel reference value is 0 or the second pixel reference value is determined based on an image width of the input image.
13. The image processing method according to claim 1, wherein dividing the plurality of region labeling frames by the at least one dividing line to form the plurality of cells comprises:
in response to at least one table line segment being detected when the table line detection processing is performed on the input image, performing correction processing on the at least one dividing line based on the at least one table line segment, and performing dividing processing on the plurality of region labeling frames based on the corrected at least one dividing line to obtain the plurality of cells.
14. The image processing method according to claim 1, wherein generating the cell table corresponding to the table region based on the plurality of cells comprises:
merging and/or splitting the plurality of cells based on the plurality of cells to obtain a plurality of target cells, wherein the cell table comprises the plurality of target cells.
15. The image processing method according to claim 14, wherein merging and/or splitting the plurality of cells based on the plurality of cells to obtain the plurality of target cells comprises:
in response to the at least one dividing line passing through any one region labeling frame, determining whether the region labeling frame needs to be split;
in response to the region labeling frame needing to be split, splitting the region labeling frame into a plurality of split labeling frames, and taking the cell occupied by each of the plurality of split labeling frames as one target cell; and
in response to the region labeling frame not needing to be split, merging the plurality of cells occupied by the region labeling frame to obtain one target cell.
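A sketch of the split/merge step of claim 15 along one direction, assuming a region labeling frame and the cells it occupies are described by (start, end) extents and that the split decision is supplied externally; the extent representation and the needs_split flag are assumptions.

    def target_cells_for_frame(frame_extent, dividing_lines, occupied_cells, needs_split):
        # Claim 15 sketch: split the frame at dividing lines that pass through it,
        # or merge the cells it occupies into a single target cell.
        start, end = frame_extent
        crossing = sorted(x for x in dividing_lines if start < x < end)
        if needs_split and crossing:
            bounds = [start, *crossing, end]       # each piece yields one target cell
            return [(bounds[k], bounds[k + 1]) for k in range(len(bounds) - 1)]
        # No split: merge the occupied cells into one target cell.
        return [(min(c[0] for c in occupied_cells), max(c[1] for c in occupied_cells))]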
16. The image processing method according to claim 14 or 15, further comprising:
identifying an object included in each of the plurality of object regions; and
filling the objects included in the plurality of object regions into the corresponding target cells of the cell table, respectively, to generate an object table.
17. The image processing method according to claim 1, wherein performing a table line detection process on the input image to determine whether the table region includes a wired table comprises:
in a case where the table line detection processing is performed on the input image and one or more table line segments are obtained, determining whether the table region includes a wired table based on the one or more table line segments; and
in a case where the table line detection processing is performed on the input image and no table line segment is detected, determining that the table region does not include a wired table.
18. The image processing method of claim 17, wherein performing a table line detection process on the input image to obtain the one or more table line segments comprises:
performing line segment detection on the input image to obtain a plurality of detection line segments;
merging and redrawing the plurality of detection line segments to obtain a plurality of first intermediate table line segments;
performing expansion processing on the plurality of first intermediate table line segments respectively to obtain a plurality of second intermediate table line segments;
deleting, from the plurality of second intermediate table line segments, any second intermediate table line segment located within any one of the plurality of object region frames, and taking the remaining second intermediate table line segments as third intermediate table line segments;
performing merging processing on the plurality of third intermediate table line segments to obtain a plurality of fourth intermediate table line segments; and
performing expansion processing on the plurality of fourth intermediate table line segments respectively to obtain one or more fifth intermediate table line segments, and taking the one or more fifth intermediate table line segments as the one or more table line segments.
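For illustration, a rough sketch of the first and fourth steps of claim 18 (segment detection and removal of segments inside object region frames), using OpenCV's Canny/HoughLinesP only as a stand-in for the claimed line segment detection; the merging and expansion steps of the claim are omitted, and the parameter values and frame layout are assumptions.

    import cv2
    import numpy as np

    def detect_table_line_segments(image_bgr, object_frames):
        # Detect raw segments and drop those lying inside any object region frame
        # (these are usually character strokes rather than table lines).
        gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
        edges = cv2.Canny(gray, 50, 150)
        raw = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=80,
                              minLineLength=40, maxLineGap=5)
        if raw is None:
            return []
        segments = [tuple(l[0]) for l in raw]      # each segment is (x1, y1, x2, y2)

        def inside(seg, frame):
            fx0, fy0, fx1, fy1 = frame             # frame as (x0, y0, x1, y1), assumed layout
            return all(fx0 <= x <= fx1 and fy0 <= y <= fy1
                       for x, y in [(seg[0], seg[1]), (seg[2], seg[3])])

        return [s for s in segments if not any(inside(s, f) for f in object_frames)]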
19. The image processing method according to claim 18, wherein the merging processing includes: for a first line segment to be merged and a second line segment to be merged, in response to the difference between the slope of the first line segment to be merged and the slope of the second line segment to be merged being less than a slope threshold, and the distance between the end point of the first line segment to be merged nearest the second line segment to be merged and the end point of the second line segment to be merged nearest the first line segment to be merged being less than or equal to a distance threshold, merging the first line segment to be merged and the second line segment to be merged,
wherein the first line segment to be merged and the second line segment to be merged are any two detection line segments of the plurality of detection line segments, or any two third intermediate table line segments of the plurality of third intermediate table line segments.
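A sketch of the merging criterion of claim 19, assuming segments are (x1, y1, x2, y2) tuples; the threshold values are placeholders, not values taken from the specification.

    import math

    def should_merge(seg_a, seg_b, slope_threshold=0.05, distance_threshold=10.0):
        # Claim 19 sketch: merge when the slopes differ by less than the slope
        # threshold and the nearest pair of endpoints is within the distance threshold.
        def slope(s):
            dx = s[2] - s[0]
            return (s[3] - s[1]) / dx if dx != 0 else float("inf")
        if abs(slope(seg_a) - slope(seg_b)) >= slope_threshold:
            return False
        ends_a = [(seg_a[0], seg_a[1]), (seg_a[2], seg_a[3])]
        ends_b = [(seg_b[0], seg_b[1]), (seg_b[2], seg_b[3])]
        nearest_gap = min(math.dist(p, q) for p in ends_a for q in ends_b)
        return nearest_gap <= distance_threshold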
20. The image processing method of claim 17, wherein determining whether the table region includes a wired table based on the one or more table line segments comprises:
in response to obtaining one table line segment, determining that the table region does not include a wired table;
in response to obtaining a plurality of table line segments:
determining intersections between the plurality of table line segments;
determining that the table region includes a wired table in response to the number of intersection points being greater than or equal to a second reference value; and
in response to the number of intersection points being less than the second reference value, determining that the table region does not include a wired table.
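A sketch of the decision rule of claim 20, with the second reference value chosen as in claim 23; the argument names and the grouping of segments into rows and columns are assumed inputs.

    def table_is_wired(table_segments, intersections, segment_rows, segment_columns):
        # Claims 20/23 sketch: a single segment means no wired table; otherwise the
        # table is wired only when the intersection count reaches the second
        # reference value, taken here as max(#segment rows, #segment columns).
        if len(table_segments) <= 1:
            return False
        second_reference = max(len(segment_rows), len(segment_columns))
        return len(intersections) >= second_reference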
21. The image processing method of claim 20, wherein determining the intersection between the plurality of table line segments comprises:
dividing the plurality of table line segments into a plurality of first table line segments and a plurality of second table line segments, wherein an included angle between each first table line segment and a third direction is in a first angle range, an included angle between each first table line segment and a fourth direction is in a second angle range, an included angle between each second table line segment and the third direction is in the second angle range, an included angle between each second table line segment and the fourth direction is in the first angle range, and the third direction and the fourth direction are perpendicular to each other;
dividing the plurality of first table line segments into a plurality of first line segment rows and marking the row number of the first line segment row to which each of the plurality of first table line segments belongs, wherein each first line segment row comprises at least one first table line segment arranged along the third direction;
dividing the plurality of second table line segments into a plurality of second line segment columns and marking the column number of the second line segment column to which each of the plurality of second table line segments belongs, wherein each second line segment column comprises at least one second table line segment arranged along the fourth direction; and
identifying a plurality of intersection points between the plurality of first table line segments and the plurality of second table line segments, and determining coordinates of the plurality of intersection points, wherein the coordinates of any one intersection point of the plurality of intersection points comprise the row number corresponding to the first table line segment and the column number corresponding to the second table line segment that intersect to form that intersection point.
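A sketch of claims 21 and 22, splitting segments into first (within 0 to 45 degrees of the third direction) and second table line segments and recording each intersection as a (row number, column number) pair; row_of and col_of are assumed callbacks returning the pre-assigned line-segment row or column number, and the orientation-based crossing test ignores collinear overlaps.

    import math

    def _orient(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def _segments_cross(s, t):
        # Standard orientation test for two segments given as (x1, y1, x2, y2).
        p1, p2, q1, q2 = (s[0], s[1]), (s[2], s[3]), (t[0], t[1]), (t[2], t[3])
        return (_orient(p1, p2, q1) * _orient(p1, p2, q2) <= 0 and
                _orient(q1, q2, p1) * _orient(q1, q2, p2) <= 0)

    def group_and_intersect(segments, row_of, col_of):
        first, second = [], []
        for seg in segments:
            x1, y1, x2, y2 = seg
            angle = abs(math.degrees(math.atan2(y2 - y1, x2 - x1)))
            angle = min(angle, 180.0 - angle)      # fold the angle into [0, 90] degrees
            (first if angle <= 45.0 else second).append(seg)
        intersections = [(row_of(h), col_of(v))
                         for h in first for v in second if _segments_cross(h, v)]
        return first, second, intersections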
22. The image processing method according to claim 21, wherein the first angle range is 0 ° to 45 °, and the second angle range is 45 ° to 90 °.
23. The image processing method according to claim 21, wherein the second reference value is a larger value of a number of the plurality of first line segment rows and a number of the plurality of second line segment columns.
24. The image processing method according to claim 21, further comprising:
in response to the table region comprising a wired table:
a cell table corresponding to the table region is generated based on the plurality of table line segments.
25. The image processing method of claim 24, wherein generating a cell table corresponding to the table region based on the plurality of table line segments comprises:
determining each cell in the cell table based on the plurality of intersection points, wherein the vertices of each cell in the cell table are formed by at least three of the plurality of intersection points.
26. The image processing method of claim 25, wherein determining each cell in the cell table based on the plurality of intersections comprises:
determining a current intersection point, wherein the current intersection point is any intersection point in the plurality of intersection points;
determining a first current table line segment and a second current table line segment corresponding to the current intersection point based on the coordinates of the current intersection point,
wherein the first current table line segment is any one of the plurality of first table line segments, and the second current table line segment is any one of the plurality of second table line segments;
determining a first intersection point adjacent to the current intersection point on the first current table line segment, and determining a second intersection point adjacent to the current intersection point on the second current table line segment; and
determining one cell based on the current intersection point, the first intersection point, and the second intersection point.
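A sketch of claim 26, assuming each intersection point is a (row number, column number) pair and that the intersections lying on the current point's first and second table line segments are available; reading "adjacent" as the next intersection to the right and the next intersection below is an assumption of the sketch.

    def cell_from_intersection(current, points_on_first_segment, points_on_second_segment):
        # Claim 26 sketch: the current intersection plus its neighbours along its row
        # segment and its column segment bound one cell.
        row, col = current
        right = min((p for p in points_on_first_segment if p[1] > col),
                    key=lambda p: p[1], default=None)   # next intersection along the row
        below = min((p for p in points_on_second_segment if p[0] > row),
                    key=lambda p: p[0], default=None)   # next intersection along the column
        if right is None or below is None:
            return None                                 # current point lies on a border
        return (row, col, below[0], right[1])           # rows [row, below], columns [col, right]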
27. An image processing apparatus comprising:
an image acquisition module configured to acquire an input image, wherein the input image includes a table region including a plurality of object regions, each of the plurality of object regions including at least one object;
a region identification processing module configured to perform region identification processing on the input image to obtain a plurality of object region frames corresponding to the plurality of object regions one by one and a table region frame corresponding to the table region;
a table line detection processing module configured to perform table line detection processing on the input image to determine whether the table region includes a wired table; and
a cell table generation module configured to, in response to the table region not including a wired table:
performing alignment processing on the plurality of object region frames to obtain a plurality of region labeling frames corresponding to the plurality of object region frames one by one;
determining at least one dividing line based on the plurality of object region frames, and dividing the plurality of region labeling frames through the at least one dividing line to form a plurality of cells; and
generating, based on the plurality of cells, a cell table corresponding to the table region.
28. An electronic device, comprising a processor and a memory,
wherein the memory is configured to store computer readable instructions; and
the processor is configured to implement, when executing the computer readable instructions, the steps of the method according to any one of claims 1-26.
29. A non-transitory computer readable storage medium for non-transitory storage of computer readable instructions which, when executed by a processor, implement the steps of the method of any one of claims 1-26.
CN202110169261.6A 2021-02-07 2021-02-07 Image processing method and device, electronic equipment and storage medium Active CN112906532B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110169261.6A CN112906532B (en) 2021-02-07 2021-02-07 Image processing method and device, electronic equipment and storage medium
PCT/CN2022/073988 WO2022166707A1 (en) 2021-02-07 2022-01-26 Image processing method and apparatus, electronic device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110169261.6A CN112906532B (en) 2021-02-07 2021-02-07 Image processing method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112906532A CN112906532A (en) 2021-06-04
CN112906532B true CN112906532B (en) 2024-01-05

Family

ID=76123794

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110169261.6A Active CN112906532B (en) 2021-02-07 2021-02-07 Image processing method and device, electronic equipment and storage medium

Country Status (2)

Country Link
CN (1) CN112906532B (en)
WO (1) WO2022166707A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112906532B (en) * 2021-02-07 2024-01-05 杭州睿胜软件有限公司 Image processing method and device, electronic equipment and storage medium
CN113657274B (en) * 2021-08-17 2022-09-20 北京百度网讯科技有限公司 Table generation method and device, electronic equipment and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5774584A (en) * 1993-01-07 1998-06-30 Canon Kk Method and apparatus for identifying table areas in documents
CN109948507A (en) * 2019-03-14 2019-06-28 北京百度网讯科技有限公司 Method and apparatus for detecting table
CN110008923A (en) * 2019-04-11 2019-07-12 网易有道信息技术(北京)有限公司 Image processing method and training method and device, calculate equipment at medium
CN111160234A (en) * 2019-12-27 2020-05-15 掌阅科技股份有限公司 Table recognition method, electronic device and computer storage medium
CN111325110A (en) * 2020-01-22 2020-06-23 平安科技(深圳)有限公司 Form format recovery method and device based on OCR and storage medium
CN111368744A (en) * 2020-03-05 2020-07-03 中国工商银行股份有限公司 Method and device for identifying unstructured table in picture
CN111382717A (en) * 2020-03-17 2020-07-07 腾讯科技(深圳)有限公司 Table identification method and device and computer readable storage medium
CN112149561A (en) * 2020-09-23 2020-12-29 杭州睿琪软件有限公司 Image processing method and apparatus, electronic device, and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108446264B (en) * 2018-03-26 2022-02-15 阿博茨德(北京)科技有限公司 Method and device for analyzing table vector in PDF document
CN109635268B (en) * 2018-12-29 2023-05-05 南京吾道知信信息技术有限公司 Method for extracting form information in PDF file
CN112906532B (en) * 2021-02-07 2024-01-05 杭州睿胜软件有限公司 Image processing method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN112906532A (en) 2021-06-04
WO2022166707A1 (en) 2022-08-11

Similar Documents

Publication Publication Date Title
CN112926421B (en) Image processing method and device, electronic equipment and storage medium
CN112149561B (en) Image processing method and device, electronic equipment and storage medium
CN110502985B (en) Form identification method and device and form identification equipment
CN113486828B (en) Image processing method, device, equipment and storage medium
CN112906532B (en) Image processing method and device, electronic equipment and storage medium
US20230222631A1 (en) Method and device for removing handwritten content from text image, and storage medium
CN111428717B (en) Text recognition method, text recognition device, electronic equipment and computer readable storage medium
CN111275139A (en) Handwritten content removal method, handwritten content removal device, and storage medium
CN107679442A (en) Method, apparatus, computer equipment and the storage medium of document Data Enter
CN111368638A (en) Spreadsheet creation method and device, computer equipment and storage medium
CN113343740A (en) Table detection method, device, equipment and storage medium
CN113436222A (en) Image processing method, image processing apparatus, electronic device, and storage medium
CN112419207A (en) Image correction method, device and system
CN112215811A (en) Image detection method and device, electronic equipment and storage medium
CN116402020A (en) Signature imaging processing method, system and storage medium based on OFD document
WO2020232866A1 (en) Scanned text segmentation method and apparatus, computer device and storage medium
US20230101426A1 (en) Method and apparatus for recognizing text, storage medium, and electronic device
CN108804978B (en) Layout analysis method and device
CN112580499A (en) Text recognition method, device, equipment and storage medium
CN111325106B (en) Method and device for generating training data
US9734610B2 (en) Image processing device, image processing method, and image processing program
CN113591433A (en) Text typesetting method and device, storage medium and computer equipment
WO2022183907A1 (en) Image processing method and apparatus, intelligent invoice recognition device, and storage medium
CN111753832B (en) Image processing method, image processing apparatus, electronic device, and storage medium
CN113096217B (en) Picture generation method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant