WO2023029116A1 - Text image typesetting method and apparatus, electronic device, and storage medium - Google Patents

Info

Publication number
WO2023029116A1
Authority
WO
WIPO (PCT)
Prior art keywords
text
text image
line
image block
current
Prior art date
Application number
PCT/CN2021/119226
Other languages
French (fr)
Chinese (zh)
Inventor
华杰
Original Assignee
广东艾檬电子科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广东艾檬电子科技有限公司
Publication of WO2023029116A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/103 Formatting, i.e. changing of presentation of documents
    • G06F40/109 Font handling; Temporal or kinetic typography
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/146 Aligning or centring of the image pick-up or image-field
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means

Definitions

  • The present application relates to the technical field of graphic typesetting, and in particular to a text image typesetting method, device, electronic device, and storage medium.
  • Text typesetting can improve the user's reading experience.
  • Current text typesetting methods mainly target text images with regular text content. The text image is segmented into multiple sub-regions containing text lines, and the sub-regions are sorted by image coordinates from top to bottom and from left to right. However, in a text image with slanted or curved text content, it is difficult to sort the text lines accurately.
  • The embodiments of the present application disclose a text image typesetting method, device, electronic device, and storage medium, capable of accurately sorting text lines in a distorted text image whose text content is slanted or curved.
  • The first aspect of the embodiments of the present application discloses a text image typesetting method, the method comprising:
  • according to the line information corresponding to each text image block included in the first text image, matching each text image block to generate at least one text line, so as to obtain a line list corresponding to the first text image, wherein the matching value between two adjacent text image blocks in each text line satisfies the threshold condition.
  • Matching each text image block according to the line information corresponding to each text image block included in the first text image to generate at least one text line includes:
  • according to the line information corresponding to each text image block contained in the first text image, determining the matching values between a first text image block and each of the other text image blocks, wherein the first text image block is any text image block in the first text image, and the other text image blocks are the text image blocks in the first text image other than the first text image block;
  • After the line information corresponding to each text image block contained in the first text image is determined, the method further includes:
  • pre-sorting the text image blocks to obtain a first image block sequence corresponding to the first text image.
  • Matching each text image block according to the line information corresponding to each text image block included in the first text image to generate at least one text line, so as to obtain a line list corresponding to the first text image, includes:
  • if the current text image block is successfully matched with the text image block at the end of a target text line, adding the current text image block to the end of the target text line, so as to update the text image block at the end of the target text line;
  • Determining the current text image block from the first image block sequence, and matching the line information of the current text image block with the line information of the text image block at the end of each text line in the line list, includes:
  • wherein the first text line is any text line in the line list;
  • if the matching value is greater than the matching threshold, the current text image block is successfully matched with the text image block at the end of the first text line, and the first text line is used as the target text line;
  • if the matching value is not greater than the matching threshold, the current text image block is not successfully matched with the text image block at the end of the first text line.
  • Adding the current text image block to the end of the target text line to update the text image block at the end of the target text line includes:
  • adding the current text image block to the end of the target text line to update the text image block at the end of the target text line;
  • if the matching values between the current text image block and the end text image blocks of at least two target text lines all satisfy the threshold condition, determining the maximum of those matching values, and adding the current text image block to the end of the target text line corresponding to the maximum matching value, so as to update the text image block at the end of that target text line.
  • Determining the current text image block from the first image block sequence, and matching the line information of the current text image block with the line information of the text image block at the end of each text line in the line list, includes:
  • matching the line information of the current text image block in turn with the line information of the text image block at the end of each text line in the line list, wherein the text lines in the line list are arranged by creation time from earliest to latest;
  • adding the current text image block to the end of the target text line to update the text image block at the end of the target text line, and stopping further matching of the current text image block.
  • The second aspect of the embodiments of the present application discloses a text image typesetting device, the device including:
  • a text detection module configured to perform text line detection on the first text image and determine the line information corresponding to each text image block contained in the first text image, wherein the line information includes line height, line head position coordinates, and line end position coordinates;
  • a text typesetting module configured to match each text image block according to the line information corresponding to each text image block contained in the first text image and generate at least one text line, so as to obtain a line list corresponding to the first text image, wherein the matching value between two adjacent text image blocks in each text line satisfies the threshold condition.
  • The third aspect of the embodiments of the present application discloses an electronic device, including a memory and a processor, wherein the memory stores a computer program, and when the computer program is executed by the processor, the processor implements the text image typesetting method disclosed in the first aspect of the embodiments of the present application.
  • The fourth aspect of the embodiments of the present application discloses a computer-readable storage medium storing a computer program, wherein when the computer program is executed by a processor, the text image typesetting method disclosed in the first aspect of the embodiments of the present application is implemented.
  • The line information of each text image block contained in the first text image is determined.
  • The line information may include line height, line head position coordinates, and line end position coordinates.
  • According to the line information corresponding to each text image block, the text image blocks are matched to generate at least one text line, and the line list corresponding to the first text image is then obtained, wherein the matching value between two adjacent text image blocks in each text line satisfies the threshold condition.
  • According to the line information of each text image block contained in the first text image, each text image block is matched with the other text image blocks in the first text image to generate at least one text line, and the text lines in the first text image constitute the line list corresponding to the first text image. In this way, the text lines of a skewed or curved distorted text image can be typeset. Because matching during typesetting is based on line information such as line height, line head position coordinates, and line end position coordinates, the accuracy of text typesetting for skewed or curved distorted text images can be improved.
  • FIG. 1 is a schematic flowchart of a text image typesetting method disclosed in an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of a text image typesetting method disclosed in an embodiment of the present application.
  • FIG. 3 is a flowchart of line list construction disclosed in an embodiment of the present application.
  • FIG. 4 is a structural diagram of a constructed image block sequence and line list disclosed in an embodiment of the present application.
  • FIG. 5 is a flowchart of matching the current text image block in an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a text image typesetting device disclosed in an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of another text image typesetting device disclosed in an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of another text image typesetting device disclosed in an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of an electronic device disclosed in an embodiment.
  • The embodiments of the present application disclose a text image typesetting method, device, electronic device, and storage medium, capable of accurately sorting text lines in a distorted text image whose text content is slanted or curved.
  • FIG. 1 is a schematic flowchart of a text image typesetting method disclosed in an embodiment of the present application.
  • The text image typesetting method can be applied to a terminal device, which may include but is not limited to a smart watch, a smartphone, a smart bracelet, or a tablet computer. The method may include the following steps:
  • The terminal device performs text line detection on the first text image. Text line detection is used to determine each text image block contained in the first text image and the line information corresponding to each text image block. Text line detection can be implemented using the PSENet (Progressive Scale Expansion Network) algorithm, the CTPN (Connectionist Text Proposal Network) algorithm, the CRAFT (Character Region Awareness for Text Detection) algorithm, the LOMO (Look More Than Once) algorithm, or the like. Any deep learning algorithm that supports curved text line detection can be used, which is not specifically limited here.
  • The first text image is an image containing text, and the length of the text contained in the image is not limited.
  • The first text image may be generated by inputting text with a scanning device such as a scanning pen.
  • Text line detection may first preprocess the first text image, for example by grayscale conversion, binarization, and smoothing, to unify the specifications of the first text image. It then marks one or more characters in the first text image with one or more bounding boxes, each bounding box being a text image block, and finally extracts the line height, line head position coordinates, and line end position coordinates of each bounding box as the line information of the corresponding text image block.
  • In this way, the text image blocks included in each first text image and the line information corresponding to those text image blocks can be determined conveniently.
  • The terminal device performs text line detection on the first text image and determines each text image block in the first text image; establishes a coordinate system corresponding to the first text image area and determines the four corner points corresponding to each text image block; and, according to the four corner points, determines line information such as the line height, line head position coordinates, and line end position coordinates of each text image block.
  • Each text image block may include four corner points. In the coordinate system corresponding to the first text image area, the coordinates of the midpoint of the line connecting the two left corner points and the coordinates of the midpoint of the line connecting the two right corner points are determined. From the four corner points and the two midpoints, the line height in the line information can be determined as the difference between the vertical coordinates of the two left corner points, the line head position coordinates can be the coordinates of the left midpoint, and the line end position coordinates can be the coordinates of the right midpoint.
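  • As an illustrative sketch of the corner-point computation described above (not the applicant's implementation), the following Python derives line information from four corner points. The `LineInfo` type, the function names, and the corner ordering are assumptions introduced here; the absolute value is used so that the line height is non-negative whichever way the vertical axis points.

```python
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[float, float]  # (x, y) in the coordinate system of the first text image


@dataclass(frozen=True)
class LineInfo:
    """Hypothetical container for the line information of one text image block."""
    height: float  # line height
    head: Point    # line head position (midpoint of the left edge)
    tail: Point    # line end position (midpoint of the right edge)


def midpoint(a: Point, b: Point) -> Point:
    """Midpoint of the segment connecting two corner points."""
    return ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)


def line_info_from_corners(top_left: Point, bottom_left: Point,
                           top_right: Point, bottom_right: Point) -> LineInfo:
    """Derive line height, line head, and line end from four corner points."""
    # Line height: difference between the vertical coordinates of the two left corners.
    height = float(abs(top_left[1] - bottom_left[1]))
    return LineInfo(height=height,
                    head=midpoint(top_left, bottom_left),
                    tail=midpoint(top_right, bottom_right))


# A slightly slanted block: the right edge sits 6 units higher than the left edge.
info = line_info_from_corners((10, 40), (10, 20), (110, 46), (110, 26))
print(info)
```

  • Because the head and end points are edge midpoints rather than box corners, a slanted block keeps distinct head and end ordinates, which is what lets neighbouring blocks on a tilted line be compared end to head.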
  • Before performing text line detection on the first text image in step 110 and determining the line information corresponding to each text image block contained in the first text image, the method further includes:
  • The terminal device first obtains a full-text image and then uses a region segmentation algorithm to segment the full-text image into at least one first text image, wherein the full-text image is an image containing multiple pieces of text, such as an image of a paper containing multiple paragraphs, or an image of a student assignment containing a major question with three sub-questions.
  • The full-text image may be captured or scanned by the terminal device using a camera installed on the terminal device, or obtained by accessing a server or another terminal device.
  • The first text image is each sub-region obtained after region segmentation of the full-text image.
  • The first text image can be an image of the region for the three sub-questions.
  • The region segmentation algorithm may be an edge-based image segmentation algorithm, a region growing algorithm, a region splitting and merging algorithm, a level set algorithm, or the like, which is not specifically limited here.
  • According to the line information corresponding to each text image block included in the first text image, match each text image block to generate at least one text line, so as to obtain a line list corresponding to the first text image, wherein the matching value between two adjacent text image blocks in each text line satisfies the threshold condition.
  • The terminal device matches the text image blocks according to the line information of each text image block included in the first text image, that is, according to the line height, line head position coordinates, and line end position coordinates of each text image block. If one text image block is successfully matched with another text image block, the former is added to the latter to generate a text line. As one criterion for a successful match, if the line heights of two text image blocks are equal and the line head position coordinates of one are the same as the line end position coordinates of the other, the match is considered successful.
  • The line head of one text image block may be connected to the line end of another text image block.
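  • The simple success criterion above (equal line heights, and the line head of one block coinciding with the line end of the other) can be sketched as follows. The `LineInfo` tuple, the function name, and the `tol` tolerance parameter are assumptions added for illustration; with `tol=0.0` the check reduces to the exact equality described in the text.

```python
import math
from typing import NamedTuple, Tuple


class LineInfo(NamedTuple):
    """Hypothetical line information of one text image block."""
    height: float
    head: Tuple[float, float]  # line head position (x, y)
    tail: Tuple[float, float]  # line end position (x, y)


def is_simple_match(left: LineInfo, right: LineInfo, tol: float = 0.0) -> bool:
    """True if `right` can be connected to the end of `left`.

    `tol` is an assumed tolerance; detected boxes rarely line up exactly.
    """
    same_height = abs(left.height - right.height) <= tol
    touching = math.dist(left.tail, right.head) <= tol
    return same_height and touching


a = LineInfo(20.0, (0.0, 30.0), (100.0, 30.0))
b = LineInfo(20.0, (100.0, 30.0), (180.0, 30.0))
c = LineInfo(21.0, (101.0, 31.0), (150.0, 31.0))

print(is_simple_match(a, b))           # exact criterion: heights equal, end meets head
print(is_simple_match(a, c))           # fails the exact criterion
print(is_simple_match(a, c, tol=2.0))  # passes once a small tolerance is allowed
```

  • The check is directional: `is_simple_match(a, b)` asks whether the head of `b` continues from the end of `a`, which mirrors connecting a line head to a line end.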
  • The terminal device forms at least one text line from the matched text image blocks, and forms the line list corresponding to the first text image from the at least one text line formed in the first text image, wherein the line list is the list of typeset text in the first text image.
  • The terminal device matches each text image block with the other text image blocks in the first text image according to the line information of each text image block contained in the first text image, so as to generate at least one text line, and the text lines in the first text image constitute the line list corresponding to the first text image. In this way, the text lines of a skewed or curved distorted text image can be typeset. Because matching during typesetting is based on line information such as line height, line head position coordinates, and line end position coordinates, the accuracy of text typesetting for skewed or curved distorted text images can be improved.
  • Matching each text image block to generate at least one text line may include the following steps:
  • According to the line information corresponding to each text image block contained in the first text image, determine the matching values between a first text image block and each of the other text image blocks, wherein the first text image block is any text image block in the first text image, and the other text image blocks are the text image blocks in the first text image other than the first text image block;
  • Determine the maximum matching value among the matching values, and add the first text image block to the other text image block corresponding to the maximum matching value, so as to generate at least one text line.
  • The terminal device can determine the matching values between a text image block and the other text image blocks in the first text image according to the line information corresponding to each text image block included in the first text image. For each text image block, the maximum matching value is selected from the matching values between that text image block and the other text image blocks. For example, if the matching values between the first text image block and the other three text image blocks in the first text image are 50, 60, and 90 respectively, the maximum matching value of 90 is selected, and the first text image block is added to the text image block with a matching value of 90.
  • The terminal device can also select, from the matching values between the text image block and the other text image blocks, the matching value that best satisfies the threshold condition, that is, the value that falls within the range of the threshold condition and is closest to its bound. For example, if the threshold condition is "within 100" and the matching values between the first text image block and the other three text image blocks in the first text image are 80, 90, and 110 respectively, the matching value that best satisfies the threshold condition is 90, and the first text image block is added to the text image block with a matching value of 90.
  • Each text image block in the first text image is matched with the other text image blocks in the first text image to determine the best-matching text image block, and is spliced with that block to generate at least one text line. The terminal device then forms the line list corresponding to the first text image from the at least one text line generated in the first text image.
  • In this way, a text image block can be matched with the other text image blocks, and the best-matching text image block can be determined and joined to it to generate a text line and the line list corresponding to the first text image, effectively realizing the typesetting of the first text image.
  • Adding the first text image block to another text image block specifically means connecting the line head of the first text image block to the line end of the other text image block.
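  • The two selection rules in the examples above can be sketched in Python as follows. The function names and the dictionary layout are assumptions, and "within 100" is read here as "less than or equal to 100", which the source does not specify.

```python
from typing import Dict, Optional


def pick_max(scores: Dict[str, float]) -> str:
    """Rule 1: choose the candidate block with the maximum matching value."""
    return max(scores, key=lambda name: scores[name])


def pick_closest_within(scores: Dict[str, float], limit: float) -> Optional[str]:
    """Rule 2: among values inside the threshold range, choose the one closest to the bound.

    Returns None when no candidate falls inside the range.
    """
    eligible = {name: value for name, value in scores.items() if value <= limit}
    if not eligible:
        return None
    return max(eligible, key=lambda name: eligible[name])


# Matching values of the first text image block against three other blocks.
print(pick_max({"A": 50, "B": 60, "C": 90}))                   # block with value 90
print(pick_closest_within({"A": 80, "B": 90, "C": 110}, 100))  # 110 is out of range, so 90 wins
```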
  • FIG. 2 is a schematic flowchart of a text image typesetting method disclosed in an embodiment of the present application.
  • The text image typesetting method can be applied to a terminal device and may include the following steps:
  • The process by which the terminal device performs this step is the same as in the above embodiments and is not repeated here.
  • According to the line head position coordinates of each text image block included in the first text image, pre-sort the text image blocks in ascending order of abscissa to obtain the image block sequence corresponding to the first text image.
  • After the terminal device determines the line information corresponding to each text image block contained in the first text image, it pre-sorts the text image blocks in the first text image according to the abscissa of the line head position coordinates in each block's line information. The pre-sorting arranges the blocks in ascending order of the abscissa of their line head position coordinates to obtain the image block sequence corresponding to the first text image; that is, the image block sequence consists of the pre-sorted text image blocks of the first text image. This ensures that the text image blocks follow the regular left-to-right text order in the first text image, thereby effectively improving the accuracy of the subsequently generated text lines and line list.
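  • A minimal sketch of the pre-sorting step, assuming each text image block carries its line head position as an `(x, y)` pair; the `Block` type and the `presort` name are illustrative, not taken from the application.

```python
from typing import List, NamedTuple, Tuple


class Block(NamedTuple):
    """Hypothetical text image block reduced to the field the pre-sort needs."""
    name: str
    head: Tuple[float, float]  # line head position (x, y)


def presort(blocks: List[Block]) -> List[Block]:
    """Return the blocks in ascending order of line head abscissa."""
    return sorted(blocks, key=lambda block: block.head[0])


blocks = [Block("c", (120.0, 31.0)), Block("a", (4.0, 30.0)), Block("b", (61.0, 29.0))]
sequence = presort(blocks)
print([block.name for block in sequence])  # ['a', 'b', 'c']
```

  • `sorted` returns a new list, so the detection output stays in its original order while the image block sequence is built separately.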
  • According to the line information corresponding to each text image block included in the first text image, match each text image block to generate at least one text line, so as to obtain a line list corresponding to the first text image, wherein the matching value between two adjacent text image blocks in each text line satisfies the threshold condition.
  • The terminal device matches the text image blocks with one another according to the line information corresponding to each text image block included in the first text image. If the matching value between two text image blocks satisfies the threshold condition, the splicing relationship between the two text image blocks can be determined according to their order in the image block sequence. For example, if the matching value between text image block 1 and text image block 2 satisfies the threshold condition, and text image block 1 precedes text image block 2 in the image block sequence, then the line head of text image block 2 needs to be connected to the line end of text image block 1. The splicing relationships between all text image blocks are determined in this way, so that at least one text line is generated and the line list corresponding to the first text image is obtained.
  • FIG. 3 is a flowchart of line list construction disclosed in an embodiment of the present application.
  • In step 230, matching each text image block according to the line information corresponding to each text image block contained in the first text image to generate at least one text line, so as to obtain the line list corresponding to the first text image, may include the following steps:
  • FIG. 4 is a structural diagram of the constructed image block sequence and line list disclosed in an embodiment of the present application.
  • The terminal device constructs a line list 420 for the first text image.
  • The terminal device creates a new text line in the line list from the text image block arranged first in the image block sequence corresponding to the first text image, that is, the text image block with the smallest abscissa in the first text image. In other words, the text image block with the smallest abscissa first forms a text line in the line list on its own.
  • If multiple text image blocks have the same line head abscissa, a secondary sort is performed on them in descending order of ordinate, and the text image block arranged first after the secondary sort is selected to create a new text line in the line list.
  • In the image block sequence corresponding to the first text image, if multiple text image blocks have the same abscissa in their line head position coordinates, the terminal device can reorder these text image blocks by their ordinates, so that the text image block with the largest ordinate is selected to create a new text line in the line list. This ensures that the text image blocks follow the regular left-to-right and top-to-bottom text order in the first text image, which further improves the accuracy of the subsequently generated text lines and line list.
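  • The secondary sort can be folded into a single composite sort key, as in the sketch below. The `Block` type and function name are illustrative assumptions. The descending ordinate follows the text, which implies a coordinate system whose vertical axis grows upward; in image coordinates with the origin at the top-left corner the sign of the second key would presumably flip.

```python
from typing import List, NamedTuple, Tuple


class Block(NamedTuple):
    """Hypothetical text image block reduced to its line head position."""
    name: str
    head: Tuple[float, float]  # line head position (x, y)


def presort_with_tiebreak(blocks: List[Block]) -> List[Block]:
    """Ascending line head abscissa; ties broken by descending ordinate."""
    return sorted(blocks, key=lambda block: (block.head[0], -block.head[1]))


blocks = [
    Block("lower", (10.0, 40.0)),
    Block("right", (90.0, 95.0)),
    Block("upper", (10.0, 100.0)),  # same abscissa as "lower", larger ordinate
]
sequence = presort_with_tiebreak(blocks)
first_line_seed = sequence[0]  # the block that opens the first text line
print([block.name for block in sequence])  # ['upper', 'lower', 'right']
```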
  • The terminal device determines the current text image block from the first image block sequence. Referring again to FIG. 4, the text image blocks in the first image block sequence corresponding to the first text image obtained by the terminal device are, in order, text image block 1, text image block 2, text image block 3, text image block 4, and so on. After the terminal device creates a new text line in the line list from the text image block arranged first in the image block sequence, that is, text image block 1, the current text image block determined by the terminal device from the image block sequence is text image block 2, and the terminal device matches the line information of text image block 2 with the line information of the text image block at the end of each text line in the line list.
  • At this point the line list contains only one text line, so the terminal device matches text image block 2 against the line information of text image block 1, which is at the end of that text line.
  • The terminal device adds the current text image block to the end of the target text line, making the current text image block the new text image block at the end of the target text line.
  • The target text line is the text line whose end text image block successfully matches the current text image block. Referring again to FIG. 4, if the current text image block, that is, text image block 2, is successfully matched with text image block 1, the terminal device adds text image block 2 to the end of the text line where text image block 1 is located, so that text image block 2 replaces text image block 1 as the new end text image block of that text line.
  • The terminal device creates a new text line in the line list according to the current text image block.
  • For example, when the current text image block is text image block 2, the terminal device only needs to match text image block 2 with text image block 1. If text image block 2 and text image block 1 are not successfully matched, the terminal device creates a new text line in the line list from text image block 2, that is, opens another text line, which for the time being contains only text image block 2.
  • next text image block in the first image block sequence is the last text image block of the first image block sequence.
  • After the terminal device matches the current text image block with the text image block at the end of each text line, the next text image block in the first image block sequence is used as the new current text image block, and the new current text image block continues to be matched against the line information of the text image block at the end of each text line in the line list. The cycle continues until every text image block in the first image block sequence has been matched against the line information of the text image block at the end of each text line.
  • The terminal device matches text image block 2 with text image block 1, which is at the end of the only text line in the line list at the current moment. After the matching is unsuccessful, the terminal device creates a new text line in the line list from text image block 2. After that, the terminal device uses the next text image block in the image block sequence 410, that is, text image block 3, as the new current text image block, and matches text image block 3 with the text image block at the end of each text line in the line list 420. At this time there are only two text lines in the line list, and the text image blocks at the end of these two text lines are text image block 1 and text image block 2 respectively. The terminal device therefore matches text image block 3 with text image block 1 and text image block 2 respectively. Text image block 3 is successfully matched with text image block 1 but not with text image block 2, so the terminal device adds the current text image block, that is, text image block 3, to the end of the target text line, which is the text line where text image block 1 is located. Text image block 3 thus replaces text image block 1 as the text image block at the end of the target text line.
  • The terminal device then uses text image block 4 as the new current text image block according to the image block sequence 410, and matches text image block 4 with the text image block at the end of each text line in the line list 420. At this time there are still only two text lines in the line list, and the text image blocks at the end of these two text lines have been updated to text image block 3 and text image block 2. Therefore, the terminal device matches text image block 4 with text image block 3 and text image block 2 respectively.
  • If neither match succeeds, the terminal device creates a new text line in the line list from text image block 4, that is, opens yet another text line, which for the time being contains only text image block 4. At this time the line list contains three text lines.
  • The terminal device adds text image block 14 to one of the text lines in the line list 420, or creates a new text line from text image block 14, which ends the above process.
  • The number of text image blocks included in the first text image and in the image block sequence is not limited.
  • Each text image block is either added to an existing text line or used to create a new text line, following the conventional left-to-right text order in the first text image, which further improves the accuracy of the subsequently generated text lines and line list.
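  • The line list construction walked through above can be sketched as a single loop. This is a minimal illustration under stated assumptions, not the applicant's implementation: the application does not give a formula for the matching value, so `match_value` below is a hypothetical score based on the gap between a line end and a line head plus the line height difference, and the `Block` type, the default threshold of 0.2, and the sample coordinates are all made up. When several text lines match, the sketch picks the one with the maximum matching value.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Block:
    """Hypothetical text image block with its line information."""
    name: str
    height: float
    head: Point  # line head position
    tail: Point  # line end position


def match_value(end_block: Block, current: Block) -> float:
    """Hypothetical score in (0, 1]: larger means a better continuation."""
    gap = math.dist(end_block.tail, current.head)
    height_diff = abs(end_block.height - current.height)
    return 1.0 / (1.0 + gap + height_diff)


def build_line_list(blocks: List[Block], threshold: float = 0.2) -> List[List[Block]]:
    """Greedily assign each pre-sorted block to the best-matching text line."""
    # Pre-sort: ascending line head abscissa, ties by descending ordinate.
    sequence = sorted(blocks, key=lambda b: (b.head[0], -b.head[1]))
    lines: List[List[Block]] = []  # the line list, in creation order
    for current in sequence:
        best_line: Optional[List[Block]] = None
        best_score = threshold
        for line in lines:
            # Only the block at the end of each text line is compared.
            score = match_value(line[-1], current)
            if score > best_score:
                best_line, best_score = line, score
        if best_line is None:
            lines.append([current])    # no match: open a new text line
        else:
            best_line.append(current)  # match: becomes the new end block
    return lines


blocks = [
    Block("1", 20.0, (0.0, 100.0), (50.0, 100.0)),
    Block("2", 20.0, (5.0, 60.0), (55.0, 60.0)),
    Block("3", 20.0, (52.0, 101.0), (100.0, 102.0)),
    Block("4", 20.0, (57.0, 20.0), (110.0, 20.0)),
]
result = [[b.name for b in line] for line in build_line_list(blocks)]
print(result)  # [['1', '3'], ['2'], ['4']]
```

  • The sample data reproduces the walkthrough: block 2 opens a second text line, block 3 joins the line ending in block 1, and block 4 opens a third text line.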
  • FIG. 5 is a flowchart of matching the current text image block in an embodiment of the present application.
  • In step 320, determining the current text image block from the image block sequence and matching the line information of the current text image block with the line information of the text image block at the end of each text line in the line list may include the following steps:
  • the first text line is any text line in the line list.
  • After the terminal device determines the current text image block from the first image block sequence, it calculates the matching value between the current text image block and the text image block at the end of the first text line in the line list, and compares the matching value with a preset matching threshold. The matching value represents the degree of matching between two text image blocks: the larger the matching value, the better the two text image blocks match, and the smaller the matching value, the worse they match.
  • the first text line is any text line in the line list.
  • If the matching value is greater than the matching threshold, the current text image block is successfully matched with the text image block at the end of the first text line, and the first text line is used as the target text line.
  • The terminal device can determine that the two text image blocks match successfully and use the first text line as the target text line, where the target text line is the first text line whose end text image block successfully matches the current text image block.
  • If the matching value is not greater than the matching threshold, the current text image block is not successfully matched with the text image block at the end of the first text line.
  • The terminal device can determine that the two text image blocks are not successfully matched.
  • In step 320, determining the current text image block from the first image block sequence and matching the line information of the current text image block with the line information of the text image block at the end of each text line in the line list may include the following steps:
  • According to the order of the text lines in the line list, the line information of the current text image block is matched in turn with the line information of the text image block at the end of each text line, wherein the text lines in the line list are ordered by creation time from earliest to latest.
  • In step 330, if the current text image block successfully matches the text image block at the end of the target text line, adding the current text image block to the end of the target text line to update the text image block at the end of the target text line may include the following steps:
  • After the terminal device determines the current text image block from the first image block sequence, it matches the line information of the current text image block with that of the text image block at the end of each text line in a certain order. This order is the order of the text lines in the line list, which are arranged by creation time from earliest to latest.
  • When the terminal device detects that the current text image block successfully matches the text image block at the end of the target text line, it adds the current text image block to the end of the target text line, making it the new text image block at the end of the target text line, and stops matching the current text image block any further. Please refer to FIG. 4 again.
  • The line list currently contains three text lines. Ordered by creation time from earliest to latest, the text line containing text image block 1 and text image block 3 comes first, the text line containing text image block 2 comes second, and the text line containing text image block 4 comes third. The terminal device then matches the line information of text image block 5 with that of the text image blocks at the end of the three text lines in this order; that is, it matches text image block 5 first with text image block 3, then with text image block 2, and finally with text image block 4.
  • If text image block 5 successfully matches text image block 3, the terminal device adds text image block 5 to the end of the text line containing text image block 3, so that text image block 5 replaces text image block 3 as the new text image block at the end of that text line.
  • The terminal device then stops and does not match text image block 5 with text image block 2 or text image block 4.
  • If text image block 5 does not match text image block 3, the terminal device matches text image block 5 with text image block 2; if that match succeeds, it stops and does not match against text image block 4. If that match fails, the terminal device finally matches text image block 5 with text image block 4; if this match succeeds, it adds text image block 5 to the text line containing text image block 4, making text image block 5 the text image block at the end of that text line. If the match still fails, a new text line is created in the line list from text image block 5.
  • This reduces the amount of calculation and improves the efficiency of text image typesetting.
  • In step 330, if the current text image block successfully matches the text image block at the end of the target text line, adding the current text image block to the end of the target text line to update the text image block at the end of the target text line may include the following steps:
  • If the matching value between the current text image block and the end text image block of only one target text line satisfies the threshold condition, the current text image block is added to the end of that target text line to update the text image block at the end of the target text line;
  • If the matching values between the current text image block and the end text image blocks of at least two target text lines all satisfy the threshold condition, the maximum of those matching values is determined, and the current text image block is added to the end of the target text line corresponding to the maximum matching value, so as to update the text image block at the end of that target text line.
  • When the terminal device matches the current text image block with the text image block at the end of each text line, if the matching value between the current text image block and only one of those text image blocks satisfies the threshold condition, the terminal device adds the current text image block to the end of the target text line containing that text image block, so as to update the text image block at the end of the target text line.
  • The terminal device can compare the matching values between the current text image block and the end text image blocks of the multiple target text lines, select the maximum, add the current text image block to the target text line containing the text image block with the maximum value, and use the current text image block as the new text image block at the end of that target text line.
  • The text image block with the largest matching value is selected and the current text image block is added to the text line containing it, which effectively ensures the typesetting accuracy of the text image.
  • After each text image block is matched to generate at least one text line, so as to obtain the line list corresponding to the first text image, the following steps can also be performed:
  • Combining the first text images to obtain a typeset text image, wherein the first text image is any one of at least one first text image, and the at least one first text image is obtained by performing region segmentation on the full text image.
  • The terminal device can typeset the text image blocks in all first text images at the same time, or typeset the text image blocks in each first text image one by one. After the text image blocks of each first text image have been typeset, the terminal device combines the first text images to obtain a complete typeset text image, thereby typesetting the entire text image.
  • FIG. 6 is a schematic structural diagram of a text image typesetting device disclosed in an embodiment of the present application.
  • the text image typesetting device includes: a text detection module 610 and a text typesetting module 620 .
  • the text detection module 610 is configured to perform text line detection on the first text image respectively, and determine the line information corresponding to each text image block contained in the first text image, wherein the line information includes line height, line head position coordinates, and line end Position coordinates;
  • the text typesetting module 620 is configured to match each text image block according to the line information corresponding to each text image block included in the first text image, and generate at least one text line to obtain a line list corresponding to the first text image, wherein , the matching value between two adjacent text image blocks in each text line satisfies the threshold condition.
  • the text typesetting module 620 is also used for:
  • According to the line information corresponding to each text image block contained in the first text image, determine the respective matching values between a first text image block and each other text image block, wherein the first text image block is any text image block in the first text image, and the other text image blocks are the text image blocks in the first text image other than the first text image block;
  • Determine the maximum matching value among the respective matching values, and add the first text image block to the other text image block corresponding to the maximum matching value, so as to generate at least one text line.
  • FIG. 7 is a schematic structural diagram of another text image typesetting device disclosed in an embodiment of the present application.
  • the text image typesetting device shown in FIG. 7 is further optimized from the text image typesetting device shown in FIG. 6 .
  • the text image typesetting device 600 shown in FIG. 7 may further include:
  • the text sorting module 630 is configured to pre-sort the text image blocks in ascending order of the abscissa of the line head position coordinates of each text image block contained in the first text image, to obtain the image block sequence corresponding to the first text image.
  • the text typesetting module 620 is also used for:
  • If the current text image block successfully matches the text image block at the end of the target text line, add the current text image block to the end of the target text line to update the text image block at the end of the target text line;
  • Take the next text image block in the first image block sequence as the new current text image block, and continue to perform the step of matching the line information of the current text image block with the line information of the text image block at the end of each text line in the line list, until the current text image block is the last text image block of the first image block sequence.
  • the text typesetting module 620 is also used for:
  • According to the line information of the current text image block and the line information of the text image block at the end of a first text line in the line list, determine the matching value between the current text image block and the text image block at the end of the first text line, wherein the first text line is any text line in the line list;
  • If the matching value is greater than the matching threshold, the current text image block is successfully matched with the text image block at the end of the first text line, and the first text line is used as the target text line;
  • If the matching value is not greater than the matching threshold, the current text image block is not successfully matched with the text image block at the end of the first text line.
  • the text typesetting module 620 is also used for:
  • According to the order of the text lines in the line list, match the line information of the current text image block in turn with the line information of the text image block at the end of each text line, wherein the text lines in the line list are arranged by creation time from earliest to latest;
  • the text typesetting module 620 is also used for:
  • If the matching value between the current text image block and the end text image block of only one target text line satisfies the threshold condition, add the current text image block to the end of that target text line to update the text image block at the end of the target text line;
  • If the matching values between the current text image block and the end text image blocks of at least two target text lines all satisfy the threshold condition, determine the maximum of those matching values, and add the current text image block to the end of the target text line corresponding to the maximum matching value, so as to update the text image block at the end of that target text line.
  • FIG. 8 is a schematic structural diagram of another text image typesetting device disclosed in an embodiment of the present application. Wherein, the text image typesetting device shown in FIG. 8 is further optimized from the text image typesetting device shown in FIG. 6 . Compared with the text image typesetting device shown in FIG. 6, the text image typesetting device 600 shown in FIG. 8 may further include:
  • the text segmentation module 640 is configured to perform region segmentation on the full text image to obtain at least one first text image, wherein the first text image is any first text image in the at least one first text image.
  • FIG. 9 is a schematic structural diagram of an electronic device disclosed by an embodiment. As shown in FIG. 9, the electronic device 900 may include:
  • a memory 910 storing executable program code
  • processor 920 coupled to the memory 910;
  • the processor 920 invokes the executable program code stored in the memory 910 to execute any text image typesetting method disclosed in the embodiments of the present application.
  • the electronic device shown in FIG. 9 may also include components not shown, such as a power supply, an input button, a camera, a speaker, a screen, an RF circuit, a Wi-Fi module, a Bluetooth module, and a sensor, which will not be described in detail in this embodiment.
  • the embodiment of the present application discloses a computer-readable storage medium, which stores a computer program, wherein the computer program causes a computer to execute any typesetting method of a text image disclosed in the embodiment of the present application.
  • the embodiment of the present application discloses a computer program product, the computer program product includes a non-transitory computer-readable storage medium storing the computer program, and the computer program is operable to cause the computer to execute any text image typesetting method disclosed in the embodiments of the present application.
  • the units described above as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, located in one place, or distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
  • If the above-mentioned integrated units are realized in the form of software functional units and sold or used as independent products, they can be stored in a computer-accessible memory.
  • The technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like, and specifically a processor in the computer device) to execute some or all of the steps of the above-mentioned methods in the various embodiments of the present application.
  • ROM Read-Only Memory
  • RAM Random Access Memory
  • PROM Programmable Read-Only Memory
  • EPROM Erasable Programmable Read-Only Memory
  • OTPROM One-time Programmable Read-Only Memory
  • EEPROM Electrically Erasable Programmable Read-Only Memory
  • CD-ROM Compact Disc Read-Only Memory

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Processing Or Creating Images (AREA)
  • Character Input (AREA)

Abstract

A text image typesetting method and apparatus, an electronic device, and a storage medium. The method comprises: performing text line detection on first text region images respectively to determine line information corresponding to text image blocks comprised in each of the first text region images, wherein the line information comprises line height, line head position coordinates, and line tail position coordinates (110); and matching the text image blocks according to the line information corresponding to the text image blocks comprised in the first text image to generate at least one text line to obtain a line list corresponding to the first text image, wherein a matching value between two adjacent text image blocks in each text line satisfies a threshold condition (120). According to the method, the accuracy of the text typesetting of skewed or curved distorted text images can be improved.

Description

Text image typesetting method and apparatus, electronic device, and storage medium
Technical field
The present application relates to the technical field of image-text typesetting, and in particular to a text image typesetting method and apparatus, an electronic device, and a storage medium.
Background
Text typesetting can improve the user's reading experience. Current text typesetting methods mainly target text images with regular text content: the text image is segmented into multiple sub-regions containing text lines, and the sub-regions are sorted from top to bottom and from left to right according to their image coordinates. In a text image whose text content is skewed or curved, however, it is difficult to sort the text lines accurately.
Summary of the invention
The embodiments of the present application disclose a text image typesetting method and apparatus, an electronic device, and a storage medium, which can accurately sort the text lines in a distorted text image whose text content is skewed or curved.
A first aspect of the embodiments of the present application discloses a text image typesetting method, the method comprising:
performing text line detection on a first text image, and determining line information corresponding to each text image block contained in the first text image, wherein the line information includes line height, line head position coordinates, and line end position coordinates;
matching the text image blocks according to the line information corresponding to each text image block contained in the first text image, and generating at least one text line to obtain a line list corresponding to the first text image, wherein the matching value between two adjacent text image blocks in each text line satisfies a threshold condition.
As an optional implementation, in the first aspect of the embodiments of the present application, the matching of the text image blocks according to the line information corresponding to each text image block contained in the first text image to generate at least one text line includes:
determining, according to the line information corresponding to each text image block contained in the first text image, the respective matching values between a first text image block and each other text image block, wherein the first text image block is any text image block in the first text image, and the other text image blocks are the text image blocks in the first text image other than the first text image block;
determining the maximum matching value among the respective matching values, and adding the first text image block to the other text image block corresponding to the maximum matching value, so as to generate at least one text line.
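The optional implementation above can be pictured as a best-partner search. The sketch below is illustrative only: `matching_value` stands for whatever score an embodiment derives from the line information (this passage does not give a formula), `toy_value` is a hypothetical stand-in, and the tuple layout of a block is an assumption. How the chosen pairs are chained into text lines is not specified here, so the code only returns the best partner of one block.

```python
from typing import Callable, List, Tuple

# Assumed block layout: (line_height, (head_x, head_y), (tail_x, tail_y)).
Block = Tuple[float, Tuple[float, float], Tuple[float, float]]


def best_partner(first: int, blocks: List[Block],
                 matching_value: Callable[[Block, Block], float]) -> int:
    """Return the index of the other block with the largest matching value."""
    others = [i for i in range(len(blocks)) if i != first]
    if not others:
        raise ValueError("need at least two text image blocks")
    return max(others, key=lambda i: matching_value(blocks[first], blocks[i]))


def toy_value(a: Block, b: Block) -> float:
    """Hypothetical score: the closer the head ordinates, the larger the value."""
    return -abs(a[1][1] - b[1][1])
```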
As an optional implementation, in the first aspect of the embodiments of the present application, after the performing of text line detection on the first text image and determining of the line information corresponding to each text image block contained in the first text image, the method further includes:
pre-sorting the text image blocks in ascending order of the abscissa of the line head position coordinates of each text image block contained in the first text image, to obtain an image block sequence corresponding to the first text image.
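The pre-sorting step can be sketched as a sort keyed on the abscissa of the line head position. The `TextBlock` record and its field names are assumptions made for illustration; only the three kinds of line information (line height, line head coordinates, line end coordinates) come from the text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TextBlock:
    name: str                     # illustrative label, not part of the line information
    line_height: float
    head: Tuple[float, float]     # (x, y) of the line head position
    tail: Tuple[float, float]     # (x, y) of the line end position


def presort(blocks: List[TextBlock]) -> List[TextBlock]:
    """Order blocks by the abscissa of their line head, smallest first."""
    # sorted() is stable, so blocks with equal head abscissa keep their input order.
    return sorted(blocks, key=lambda b: b.head[0])
```

Because the sort is stable, blocks that start at the same abscissa keep their detection order; the text does not say how such ties should be handled, so this is only one reasonable choice.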
As an optional implementation, in the first aspect of the embodiments of the present application, the matching of the text image blocks according to the line information corresponding to each text image block contained in the first text image to generate at least one text line, so as to obtain the line list corresponding to the first text image, includes:
establishing the line list corresponding to the first text image, and creating a new text line in the line list from the text image block arranged first in the first image block sequence corresponding to the first text image;
determining a current text image block from the first image block sequence, and matching the line information of the current text image block with the line information of the text image block at the end of each text line in the line list;
if the current text image block successfully matches the text image block at the end of a target text line, adding the current text image block to the end of the target text line to update the text image block at the end of the target text line;
if the current text image block matches none of the text image blocks at the end of the text lines, creating a new text line in the line list from the current text image block;
taking the next text image block in the first image block sequence as the new current text image block, and continuing to perform the step of matching the line information of the current text image block with the line information of the text image block at the end of each text line in the line list, until the current text image block is the last text image block of the first image block sequence.
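The loop described above can be sketched as follows. This is a minimal illustration rather than the claimed implementation: `is_match` is a hypothetical predicate standing in for the comparison of line information, and the sketch takes the first line whose end block matches, which is only one of the options described in the application for choosing among several matching lines.

```python
from typing import Callable, List, Sequence, TypeVar

B = TypeVar("B")  # any representation of a text image block


def build_line_list(sequence: Sequence[B],
                    is_match: Callable[[B, B], bool]) -> List[List[B]]:
    """Assign each block of the pre-sorted sequence to a text line."""
    line_list: List[List[B]] = []
    for current in sequence:
        # The first block finds an empty line list and therefore opens the first line.
        for line in line_list:
            # Only the block at the end of each line is compared.
            if is_match(current, line[-1]):
                line.append(current)      # current becomes the new end block
                break                     # stop matching this block
        else:
            line_list.append([current])   # no line matched: create a new text line
    return line_list
```

Iterating `line_list` in insertion order corresponds to visiting the text lines by creation time, earliest first.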
As an optional implementation, in the first aspect of the embodiments of the present application, the determining of the current text image block from the first image block sequence and matching of the line information of the current text image block with the line information of the text image block at the end of each text line in the line list includes:
determining, according to the line information of the current text image block and the line information of the text image block at the end of a first text line in the line list, the matching value between the current text image block and the text image block at the end of the first text line, wherein the first text line is any text line in the line list;
if the matching value is greater than a matching threshold, the current text image block is successfully matched with the text image block at the end of the first text line, and the first text line is used as the target text line;
if the matching value is not greater than the matching threshold, the current text image block is not successfully matched with the text image block at the end of the first text line.
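A small sketch of the threshold test follows. The formula for the matching value is not given in this passage, so `matching_value` below is purely hypothetical: it scores how close the head of the current block is, vertically, to the tail of the end block, normalised by line height. Only the strict "greater than the threshold" comparison is taken from the text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TextBlock:
    line_height: float
    head: Tuple[float, float]   # (x, y) of the line head position
    tail: Tuple[float, float]   # (x, y) of the line end position


def matching_value(current: TextBlock, end_block: TextBlock) -> float:
    """Hypothetical score in [0, 1]; 1 means head and tail are at the same height."""
    scale = max(current.line_height, end_block.line_height)
    if scale <= 0:
        raise ValueError("line heights must be positive")
    offset = abs(current.head[1] - end_block.tail[1])
    return max(0.0, 1.0 - offset / scale)


def is_match(current: TextBlock, end_block: TextBlock, threshold: float) -> bool:
    # Strictly greater: a value equal to the threshold counts as not matched.
    return matching_value(current, end_block) > threshold
```

Comparing the head of the new block with the tail of the previous one, rather than two fixed row coordinates, is one way such a score could follow a skewed or curved line; whether the application does this is not stated here.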
As an optional implementation, in the first aspect of the embodiments of the present application, the adding of the current text image block to the end of the target text line to update the text image block at the end of the target text line, if the current text image block successfully matches the text image block at the end of the target text line, includes:
if the matching value between the current text image block and the end text image block of only one target text line satisfies the threshold condition, adding the current text image block to the end of that target text line to update the text image block at the end of the target text line;
if the matching values between the current text image block and the end text image blocks of at least two target text lines all satisfy the threshold condition, determining the maximum of those matching values, and adding the current text image block to the end of the target text line corresponding to the maximum matching value, so as to update the text image block at the end of that target text line.
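The two cases above reduce to picking, among the lines whose matching value satisfies the threshold condition, the one with the largest value. The sketch assumes the threshold condition is "greater than the threshold" and takes the matching values as already computed, one per text line; the function name is hypothetical.

```python
from typing import List, Optional


def choose_target_line(values: List[float], threshold: float) -> Optional[int]:
    """Return the index of the target text line, or None if no line qualifies."""
    candidates = [i for i, v in enumerate(values) if v > threshold]
    if not candidates:
        return None               # caller would create a new text line
    # With one candidate this returns it; with several it returns the maximum.
    # On ties max() keeps the earliest line, which is an assumption of this sketch.
    return max(candidates, key=lambda i: values[i])
```

A `None` result corresponds to the case where no end block matches, in which a new text line is created from the current block.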
As an optional implementation, in the first aspect of the embodiments of the present application, the determining of the current text image block from the first image block sequence and matching of the line information of the current text image block with the line information of the text image block at the end of each text line in the line list includes:
matching, according to the order of the text lines in the line list, the line information of the current text image block in turn with the line information of the text image block at the end of each text line, wherein the text lines in the line list are arranged by creation time from earliest to latest;
and the adding of the current text image block to the end of the target text line to update the text image block at the end of the target text line, if the current text image block successfully matches the text image block at the end of the target text line, includes:
when it is detected that the current text image block successfully matches the text image block at the end of the target text line, adding the current text image block to the end of the target text line to update the text image block at the end of the target text line, and stopping any further matching of the current text image block.
As an optional implementation, in the first aspect of the embodiments of the present application, before the performing of text line detection on the first text image and determining of the line information corresponding to each text image block contained in the first text image, the method further includes:
performing region segmentation on a full text image to obtain at least one first text image, wherein the first text image is any one of the at least one first text image.
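The description also notes that the segmented regions may be typeset simultaneously or one by one and then combined. The sketch below illustrates only that orchestration, under stated assumptions: the segmentation itself is not shown, `typeset_region` is a hypothetical callable that turns one first text image into its line list, and combining is modelled as collecting the line lists in region order.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

Region = TypeVar("Region")   # one first text image obtained by region segmentation
Line = TypeVar("Line")       # one typeset text line


def typeset_full_image(regions: Sequence[Region],
                       typeset_region: Callable[[Region], List[Line]],
                       parallel: bool = False) -> List[List[Line]]:
    """Typeset every region and return one line list per region, in region order."""
    if parallel:
        with ThreadPoolExecutor() as pool:
            # Executor.map yields results in the order of the inputs.
            return list(pool.map(typeset_region, regions))
    return [typeset_region(region) for region in regions]
```

Both modes return the same result; the choice only affects how the per-region work is scheduled.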
A second aspect of the embodiments of the present application discloses a text image typesetting apparatus, the apparatus comprising:
a text detection module, configured to perform text line detection on a first text image and determine line information corresponding to each text image block contained in the first text image, wherein the line information includes line height, line head position coordinates, and line end position coordinates;
a text typesetting module, configured to match the text image blocks according to the line information corresponding to each text image block contained in the first text image, and generate at least one text line to obtain a line list corresponding to the first text image, wherein the matching value between two adjacent text image blocks in each text line satisfies a threshold condition.
A third aspect of the embodiments of the present application discloses an electronic device, including a memory and a processor, wherein the memory stores a computer program which, when executed by the processor, causes the processor to implement the text image typesetting method disclosed in the first aspect of the embodiments of the present application.
A fourth aspect of the embodiments of the present application discloses a computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the text image typesetting method disclosed in the first aspect of the embodiments of the present application.
Compared with the related art, the embodiments of the present application have the following beneficial effects:
Text line detection is performed on the first text image to determine the line information of each text image block contained in the first text image, where the line information may include a line height, line-start position coordinates, and line-end position coordinates. The text image blocks are then matched according to this line information to generate at least one text line, which yields the line list corresponding to the first text image; the matching value between any two adjacent text image blocks in each text line satisfies a threshold condition. In the embodiments of the present application, each text image block is matched against the other text image blocks in the first text image according to its line information, at least one text line is generated in this way, and the text lines of the first text image together form the line list corresponding to that image. The text lines of a skewed or curved (distorted) text image can therefore be typeset, and because matching relies on line information such as line height, line-start position coordinates, and line-end position coordinates, the accuracy of typesetting text in skewed or curved text images is improved.
Description of drawings
To describe the technical solutions in the embodiments of the present application more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below are merely some embodiments of the present application, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a text image typesetting method disclosed in an embodiment of the present application;
FIG. 2 is a schematic flowchart of a text image typesetting method disclosed in an embodiment of the present application;
FIG. 3 is a flowchart of line list construction disclosed in an embodiment of the present application;
FIG. 4 is a structural diagram of the constructed image block sequence and line list disclosed in an embodiment of the present application;
FIG. 5 is a flowchart of matching the current text image block in an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a text image typesetting apparatus disclosed in an embodiment of the present application;
FIG. 7 is a schematic structural diagram of another text image typesetting apparatus disclosed in an embodiment of the present application;
FIG. 8 is a schematic structural diagram of another text image typesetting apparatus disclosed in an embodiment of the present application;
FIG. 9 is a schematic structural diagram of an electronic device disclosed in an embodiment.
Detailed description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort fall within the scope of protection of this application.
It should be noted that the terms "including" and "having", and any variations thereof, in the embodiments and drawings of the present application are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not limited to the listed steps or units; it optionally also includes steps or units that are not listed, or other steps or units inherent to the process, method, product, or device.
The embodiments of the present application disclose a text image typesetting method and apparatus, an electronic device, and a storage medium, which can accurately order the text lines in a distorted text image whose text content is skewed or curved. A detailed description is given below with reference to the drawings.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of a text image typesetting method disclosed in an embodiment of the present application. The method can be applied to a terminal device, which may include but is not limited to a smart watch, a smartphone, a smart band, or a tablet computer. The method may include the following steps:
110. Perform text line detection on the first text image, and determine line information corresponding to each text image block contained in the first text image, where the line information includes a line height, line-start position coordinates, and line-end position coordinates.
In this embodiment of the present application, the terminal device performs text line detection on the first text image. Text line detection determines each text image block contained in the first text image and the line information corresponding to each text image block. It may be implemented with, for example, the PSENet (Progressive Scale Expansion Network) algorithm, the CTPN (Connectionist Text Proposal Network) algorithm, the CRAFT (Character Region Awareness for Text Detection) algorithm, or the LOMO (Look More Than Once) algorithm. Other deep-learning algorithms that support curved text line detection may also be used; no specific limitation is imposed here.
In this embodiment of the present application, the first text image is an image containing text, and the length of the text it contains is not limited. The first text image may be produced by capturing text with a scanning device such as a scanning pen, which generates the first text image corresponding to the captured text.
In some embodiments, text line detection may first preprocess the first text image, for example by grayscale conversion, binarization, and smoothing, to bring first text images to a uniform specification. One or more characters in the first text image may then be marked with one or more bounding boxes, each bounding box being a text image block. Finally, information such as the line height, line-start position coordinates, and line-end position coordinates of each bounding box is extracted as the line information of the corresponding text image block. In this way the text image blocks contained in each first text image, and their line information, can be determined conveniently.
In some embodiments, the terminal device performs text line detection on the first text image to determine each text image block in it, establishes a coordinate system for the first text image region, determines the four corner points of each text image block, and derives line information such as the line height, line-start position coordinates, and line-end position coordinates of each text image block from those four corner points. Specifically, each text image block may have four corner points. In the coordinate system of the first text image region, the midpoint of the segment joining the two left corner points and the midpoint of the segment joining the two right corner points are computed. The line height may then be taken as the difference between the ordinates of the two left corner points, the line-start position coordinates as the coordinates of the left midpoint, and the line-end position coordinates as the coordinates of the right midpoint. Representing each text image block by its four corner points when determining line information gives all text image blocks a uniform line information standard.
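The corner-point rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `LineInfo`, `midpoint`, and `line_info_from_corners` are hypothetical and do not appear in the application, and the line height is taken as the absolute difference of the two left ordinates so that the sign does not depend on the axis direction.

```python
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[float, float]  # (x, y) in the coordinate system of the first text image


@dataclass
class LineInfo:
    height: float  # line height
    start: Point   # line-start position: midpoint of the left edge
    end: Point     # line-end position: midpoint of the right edge


def midpoint(a: Point, b: Point) -> Point:
    """Midpoint of the segment joining two points."""
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)


def line_info_from_corners(top_left: Point, bottom_left: Point,
                           top_right: Point, bottom_right: Point) -> LineInfo:
    """Derive line information from the four corner points of one text image block."""
    # Line height: difference between the ordinates of the two left corner points
    # (absolute value is an assumption of this sketch).
    height = abs(top_left[1] - bottom_left[1])
    return LineInfo(height=height,
                    start=midpoint(top_left, bottom_left),
                    end=midpoint(top_right, bottom_right))


info = line_info_from_corners((10, 40), (10, 20), (110, 44), (110, 24))
print(info)
```

For a slightly skewed block such as the one in the usage line, the start and end points sit at different heights, which is the information the later matching step relies on.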
As an optional embodiment, before step 110 performs text line detection on the first text image and determines the line information corresponding to each text image block contained in the first text image, the method further includes:
performing region segmentation on a full-text image to obtain at least one first text image, where the first text image is any one of the at least one first text image.
In this embodiment of the present application, the terminal device first acquires a full-text image and then applies a region segmentation algorithm to it to obtain at least one first text image. A full-text image is an image containing multiple pieces of text, for example an image of a paper with several paragraphs, or an image of a student assignment containing one main question with three sub-questions. The terminal device may acquire the full-text image by photographing or scanning the text with a camera mounted on the terminal device, or by retrieving it from a server or another terminal device. The first text images are the sub-regions obtained by region segmentation of the full-text image; for example, when the full-text image is an image of an assignment containing one main question with three sub-questions, the first text images may be the region images of the three sub-questions. The region segmentation algorithm may be an edge-based image segmentation algorithm, a region growing algorithm, a region splitting and merging algorithm, a level set algorithm, or the like; no specific limitation is imposed here. Segmenting the full-text image into regions yields smaller first text images that are easier to process in text line detection and typesetting when the full-text image contains a lot of content, which improves the typesetting result for the full-text image.
120. According to the line information corresponding to each text image block contained in the first text image, match the text image blocks against one another and generate at least one text line, so as to obtain the line list corresponding to the first text image, where the matching value between any two adjacent text image blocks in each text line satisfies a threshold condition.
In this embodiment of the present application, the terminal device matches the text image blocks according to the line information of each text image block contained in the first text image, that is, according to information such as the line height, line-start position coordinates, and line-end position coordinates of each block. If one text image block matches another, it is appended to that other block to form a text line. A match may be deemed successful when, for example, the line heights are equal and the line-start position coordinates of one text image block coincide with the line-end position coordinates of the other. Alternatively, a custom algorithm may combine line information such as line height, line-start position coordinates, and line-end position coordinates into a matching value between two text image blocks; the matching value is compared with the threshold condition, and the match is deemed successful if the condition is satisfied. Neither the threshold condition nor the custom algorithm is specifically limited. Appending one text image block to another may mean connecting the line start of the one block to the line end of the other.
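As a minimal sketch of the first matching rule described above (equal line heights, and the line start of one block coinciding with the line end of the other), the predicate below compares two blocks. The dictionary layout, the function name `blocks_match`, and the `tol` tolerance parameter are assumptions made for illustration; the application itself leaves the threshold condition and any custom matching algorithm open.

```python
import math


def blocks_match(prev, curr, tol=0.0):
    """Return True if `curr` can follow `prev` on the same text line.

    Each block is a dict with 'height', 'start' (x, y) and 'end' (x, y).
    `tol` is a hypothetical tolerance; tol=0.0 reproduces the exact-equality
    example given in the text.
    """
    same_height = abs(prev["height"] - curr["height"]) <= tol
    # Distance between the line end of `prev` and the line start of `curr`.
    gap = math.dist(prev["end"], curr["start"])
    return same_height and gap <= tol


a = {"height": 20, "start": (0, 10), "end": (50, 10)}
b = {"height": 20, "start": (50, 10), "end": (90, 10)}
c = {"height": 20, "start": (53, 14), "end": (95, 18)}
print(blocks_match(a, b), blocks_match(a, c), blocks_match(a, c, tol=5))
```

Exact equality is rarely met on skewed or curved text, which is presumably why the text also allows a computed matching value compared against a threshold; the tolerance here is one simple way to relax the rule.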
The terminal device forms at least one text line from the matched text image blocks, and forms the line list corresponding to the first text image from the at least one text line formed in that image. The line list is the list of typeset text in the first text image.
In this embodiment of the present application, the terminal device matches each text image block against the other text image blocks in the first text image according to its line information and thereby generates at least one text line; the text lines of the first text image form the line list corresponding to that image. The text lines of a skewed or curved (distorted) text image can therefore be typeset, and because matching relies on line information such as line height, line-start position coordinates, and line-end position coordinates, the accuracy of typesetting text in skewed or curved text images is improved.
As an optional embodiment, matching the text image blocks according to the line information corresponding to each text image block contained in the first text image and generating at least one text line in step 120 may include the following steps:
determining, according to the line information corresponding to each text image block contained in the first text image, the matching values between a first text image block and each of the other text image blocks, where the first text image block is any text image block in the first text image and the other text image blocks are the text image blocks in the first text image other than the first text image block;
determining the maximum of these matching values, and appending the first text image block to the other text image block corresponding to the maximum matching value, so as to generate at least one text line.
In this embodiment of the present application, the terminal device may determine the matching value between one text image block and each of the other text image blocks in the first text image according to the line information corresponding to each text image block. For each text image block, the maximum is selected from its matching values with the other blocks. For example, if the matching values between the first text image block and three other text image blocks in the first text image are 50, 60, and 90, the maximum matching value 90 is selected and the first text image block is appended to the block whose matching value is 90.
In this embodiment of the present application, for each text image block, the terminal device may instead select, from its matching values with the other blocks, the matching value that best satisfies the threshold condition, that is, the value that lies within the range of the threshold condition and is closest to the threshold. For example, if the threshold condition is "within 100" and the matching values between the first text image block and three other text image blocks in the first text image are 80, 90, and 110, then 90 is the matching value that best satisfies the threshold condition, and the first text image block is appended to the block whose matching value is 90.
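The two selection rules above can be illustrated with the numbers from the text. The sketch assumes the matching values have already been computed and are held in a dictionary keyed by a block identifier; the function names `pick_max` and `pick_closest_within` are hypothetical, and "within 100" is read here as "less than or equal to 100", which is an assumption.

```python
def pick_max(scores):
    """Return the id of the block with the largest matching value."""
    return max(scores, key=scores.get)


def pick_closest_within(scores, limit):
    """Return the id whose matching value is within `limit` and closest to it.

    Returns None when no matching value satisfies the threshold condition.
    """
    eligible = {block_id: v for block_id, v in scores.items() if v <= limit}
    if not eligible:
        return None
    return max(eligible, key=eligible.get)


print(pick_max({"b1": 50, "b2": 60, "b3": 90}))                   # b3
print(pick_closest_within({"b1": 80, "b2": 90, "b3": 110}, 100))  # b2
```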
Each text image block in the first text image is thus matched against the other text image blocks in the first text image, its best-matching block is determined, and it is joined to that block to generate at least one text line. The terminal device then forms the line list corresponding to the first text image from the at least one text line generated in that image. Because every text image block is compared with all the other blocks and appended to the one that matches best, the text lines and the line list corresponding to the first text image are generated and the first text image is effectively typeset. Appending the first text image block to another text image block may specifically mean connecting the line start of the first text image block to the line end of the other text image block.
Referring to FIG. 2, FIG. 2 is a schematic flowchart of a text image typesetting method disclosed in an embodiment of the present application. The method can be applied to a terminal device and may include the following steps:
210. Perform text line detection on the first text image, and determine line information corresponding to each text image block contained in the first text image, where the line information includes a line height, line-start position coordinates, and line-end position coordinates.
In this embodiment of the present application, the terminal device performs this step in the same way as in the foregoing embodiments, and details are not repeated here.
220. Pre-sort the text image blocks in ascending order of the abscissa of their line-start position coordinates to obtain the image block sequence corresponding to the first text image.
In this embodiment of the present application, after determining the line information corresponding to each text image block contained in the first text image, the terminal device pre-sorts the text image blocks of the first text image by the abscissa of their line-start position coordinates, from smallest to largest. The result is the image block sequence corresponding to the first text image, that is, the pre-sorted text image blocks of the first text image. This ensures that the text image blocks follow the conventional left-to-right text order within the first text image, which improves the accuracy of the text lines and the line list generated afterwards.
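The pre-sorting step amounts to an ordinary ascending sort on one coordinate. A minimal sketch, assuming each block is a dictionary with a hypothetical `start` field holding the line-start position as `(x, y)`:

```python
def presort(blocks):
    """Return the image block sequence: blocks ordered by ascending line-start abscissa."""
    return sorted(blocks, key=lambda block: block["start"][0])


blocks = [
    {"id": "A", "start": (30, 12)},
    {"id": "B", "start": (5, 40)},
    {"id": "C", "start": (12, 11)},
]
print([block["id"] for block in presort(blocks)])  # ['B', 'C', 'A']
```

Python's `sorted` is stable, so blocks with the same abscissa keep their detection order in this sketch.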
230. According to the line information corresponding to each text image block contained in the first text image, match the text image blocks against one another and generate at least one text line, so as to obtain the line list corresponding to the first text image, where the matching value between any two adjacent text image blocks in each text line satisfies a threshold condition.
In this embodiment of the present application, the terminal device matches the text image blocks against one another according to the line information corresponding to each text image block contained in the first text image. If the matching value between two text image blocks satisfies the threshold condition, the joining relationship between them is determined by their order in the image block sequence. For example, if the matching value between text image block 1 and text image block 2 satisfies the threshold condition and text image block 1 precedes text image block 2 in the image block sequence, then the line start of text image block 2 is connected to the line end of text image block 1. All text image blocks are joined in this way to generate at least one text line and obtain the line list corresponding to the first text image.
As an optional embodiment, referring to FIG. 3, FIG. 3 is a flowchart of line list construction disclosed in an embodiment of the present application. Matching the text image blocks according to the line information corresponding to each text image block contained in the first text image and generating at least one text line, so as to obtain the line list corresponding to the first text image, in step 230 may include the following steps:
310. Create a line list corresponding to the first text image, and create a new text line in the line list from the text image block ranked first in the first image block sequence corresponding to the first text image.
In this embodiment of the present application, referring to FIG. 4, FIG. 4 is a structural diagram of the constructed image block sequence and line list disclosed in an embodiment of the present application. After determining the image block sequence 410 corresponding to the first text image, the terminal device constructs a line list 420 for the first text image. The terminal device then creates a new text line in the line list from the text image block ranked first in the image block sequence, that is, the text image block with the smallest abscissa in the first text image. In other words, the block with the smallest abscissa first forms a text line of its own in the line list.
In some embodiments, if several text image blocks are tied for first place in the image block sequence, they are sorted a second time by the ordinate of their line-start position coordinates, from largest to smallest, and the block ranked first after this secondary sort is used to create a new text line in the line list.
In this embodiment of the present application, if several text image blocks in the image block sequence corresponding to the first text image have line-start position coordinates with the same abscissa, the terminal device may sort these blocks a second time by their ordinates and use the block with the largest ordinate to create a new text line in the line list. This ensures that the text image blocks follow the conventional left-to-right and top-to-bottom text order within the first text image, which further improves the accuracy of the text lines and the line list generated afterwards.
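The tie-break can be expressed as a compound key: smallest abscissa first and, among equal abscissas, largest ordinate first. The sketch below follows the rule exactly as stated, which corresponds to top-to-bottom order when the y axis of the coordinate system points upward; with image coordinates whose y axis points downward the sign of the second key component would need to be flipped. The `start` field and the function name `first_block` are assumptions.

```python
def first_block(blocks):
    """Pick the block that seeds the first text line.

    Primary key: smallest line-start abscissa.
    Tie-break:   largest line-start ordinate (hence the negation).
    """
    return min(blocks, key=lambda block: (block["start"][0], -block["start"][1]))


blocks = [
    {"id": "A", "start": (5, 20)},
    {"id": "B", "start": (5, 80)},
    {"id": "C", "start": (9, 95)},
]
print(first_block(blocks)["id"])  # B
```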
320. Determine the current text image block from the first image block sequence, and match the line information of the current text image block against the line information of the last text image block of each text line in the line list.
In this embodiment of the present application, the terminal device determines the current text image block from the first image block sequence. Referring again to FIG. 4, the text image blocks in the first image block sequence corresponding to the first text image are, in order, text image block 1, text image block 2, text image block 3, text image block 4, and so on. After the terminal device has created a new text line in the line list from the first block in the sequence, text image block 1, the current text image block it determines from the sequence is text image block 2. The terminal device matches text image block 2 against the line information of the last text image block of each text line in the line list. At this point the line list contains only the one text line created from text image block 1, so the last text image block of that line is text image block 1; that is, the terminal device matches text image block 2 against the line information of text image block 1.
330. If the current text image block matches the last text image block of a target text line, append the current text image block to the end of the target text line, so as to update the last text image block of the target text line.
In this embodiment of the present application, if the current text image block matches the last text image block of a target text line, the terminal device appends the current text image block to the end of the target text line, making it the new last text image block of that line. The target text line is the text line containing the last text image block that the current text image block matched. Referring again to FIG. 4, if the current text image block, text image block 2, matches text image block 1, the terminal device appends text image block 2 to the end of the text line containing text image block 1, so that text image block 2 replaces text image block 1 as the last text image block of that text line.
340. If the current text image block matches none of the last text image blocks of the text lines, create a new text line in the line list from the current text image block.
In this embodiment of the present application, if the current text image block matches none of the last text image blocks of the text lines, the terminal device creates a new text line in the line list from the current text image block. Referring again to FIG. 4, when the current text image block is text image block 2, the terminal device only needs to match it against text image block 1. If the two do not match, the terminal device creates a new text line in the line list from text image block 2; that is, it starts another text line that for the time being contains only text image block 2.
350. Take the next text image block in the first image block sequence as the new current text image block, and repeat the step of matching the line information of the current text image block against the line information of the last text image block of each text line in the line list, until the current text image block is the last text image block of the first image block sequence.
In this embodiment of the present application, after matching the current text image block against the last text image block of each text line, the terminal device takes the next text image block in the first image block sequence as the new current text image block and matches it against the line information of the last text image block of each text line in the line list. This loop repeats until every text image block in the first image block sequence has been matched against the line information of the last text image blocks of the text lines.
Referring again to FIG. 4, for example, the terminal device matches text image block 2 against text image block 1, which is the last text image block of the only text line in the line list at that moment. When the match fails, the terminal device creates a new text line in the line list based on text image block 2. The terminal device then takes the next text image block in image block sequence 410, namely text image block 3, as the new current text image block and matches it against the last text image block of each text line in line list 420. At this point the line list contains only two text lines, whose last text image blocks are text image block 1 and text image block 2 respectively, so the terminal device matches text image block 3 against each of them. Text image block 3 matches text image block 1 but not text image block 2, so the terminal device appends the current text image block, text image block 3, to the end of the target text line, which is the text line containing text image block 1, and text image block 3 replaces text image block 1 as the last text image block of that target text line.
After that, the terminal device takes text image block 4 as the new current text image block according to image block sequence 410 and matches it against the last text image block of each text line in line list 420. The line list still contains only two text lines, whose last text image blocks have now been updated to text image block 3 and text image block 2, so the terminal device matches text image block 4 against each of them. Text image block 4 matches neither text image block 3 nor text image block 2, so the terminal device creates another new text line in the line list based on text image block 4. This text line contains only text image block 4 for the moment, and the line list now contains three text lines.
The process continues in this way until every text image block in the image block sequence has been added to the line list in sequence order. For example, image block sequence 410 in FIG. 4 contains 14 text image blocks, so the terminal device ends the process only after text image block 14 has been appended to one of the text lines in line list 420 or used to create a new text line. The number of text image blocks contained in the first text image and in the image block sequence is not limited.
In this embodiment of the present application, the text image blocks are matched, in the order of the first image block sequence, against the last text image block of each existing text line in the line list, and each block is either appended to an existing text line or used to create a new one. Because this follows the conventional left-to-right text order of the first text image, it further improves the accuracy of the generated text lines and line list.
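The loop of steps 310 to 350 can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `build_line_list`, the `matches` predicate, and the `FIG4_MATCHES` relation are hypothetical, and the relation simply encodes the outcomes narrated in the FIG. 4 walk-through for text image blocks 1 to 4. When several lines match, this sketch appends to the first one, which is one of the optional embodiments.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def build_line_list(blocks: Sequence[T],
                    matches: Callable[[T, T], bool]) -> List[List[T]]:
    """Group a pre-sorted block sequence into text lines (steps 310-350)."""
    if not blocks:
        return []
    # Step 310: the first block of the sequence opens the first text line.
    line_list: List[List[T]] = [[blocks[0]]]
    # Steps 320/350: visit the remaining blocks in sequence order.
    for current in blocks[1:]:
        for line in line_list:
            # Only the last block of each existing line is compared.
            if matches(current, line[-1]):
                # Step 330: current becomes the new last block of the line.
                line.append(current)
                break
        else:
            # Step 340: no line matched, so open a new text line.
            line_list.append([current])
    return line_list


# Hypothetical match relation: (current, last) pairs that match in FIG. 4.
FIG4_MATCHES = {(3, 1)}

lines = build_line_list([1, 2, 3, 4],
                        lambda cur, last: (cur, last) in FIG4_MATCHES)
print(lines)  # [[1, 3], [2], [4]]
```

The result has three text lines, in agreement with the walk-through: blocks 1 and 3 share a line, while blocks 2 and 4 each open their own.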
As an optional embodiment, refer to FIG. 5, which is a flowchart of matching the current text image block in an embodiment of the present application. In step 320, determining the current text image block from the image block sequence and matching the line information of the current text image block against the line information of the last text image block of each text line in the line list may include the following steps:
510. Based on the line information of the current text image block and the line information of the last text image block of a first text line in the line list, determine a matching value between the current text image block and the last text image block of the first text line, where the first text line is any text line in the line list.
In this embodiment of the present application, after determining the current text image block from the first image block sequence, the terminal device computes the matching value between the current text image block and the last text image block of the first text line in the line list from line information such as the line height, the line-start position coordinates and the line-end position coordinates, and compares the matching value with a preset matching threshold. The matching value indicates the degree of matching between two text image blocks: the larger the matching value, the better the two text image blocks match, and the smaller the matching value, the worse they match. Here, the first text line is any text line in the line list.
520. If the matching value is greater than the matching threshold, the current text image block successfully matches the last text image block of the first text line, and the first text line is taken as the target text line.
In this embodiment of the present application, if the matching value between the two text image blocks is greater than the matching threshold, the terminal device may determine that the two text image blocks match, and takes the first text line as the target text line. Here, the target text line is the first text line containing the text image block that successfully matched the current text image block.
530. If the matching value is not greater than the matching threshold, the current text image block fails to match the last text image block of the first text line.
In this embodiment of the present application, corresponding to step 520, if the matching value between the two text image blocks is not greater than the matching threshold, the terminal device may determine that the two text image blocks do not match.
Computing a matching value and comparing it with a threshold makes it possible to decide quickly whether any two text image blocks match, which improves the efficiency of typesetting the text image.
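Steps 510 to 530 can be illustrated with the sketch below. The passage above names the inputs (line height, line-start and line-end coordinates) but does not give a formula here, so the scoring function `match_value`, the `LineInfo` type, and the value of `MATCH_THRESHOLD` are all assumptions made purely for illustration. Only the strict "greater than the threshold" comparison is taken from steps 520 and 530.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LineInfo:
    """Line information of one text image block (heights assumed positive)."""
    height: float
    start: Tuple[float, float]  # (x, y) of the line-start position
    end: Tuple[float, float]    # (x, y) of the line-end position


def match_value(current: LineInfo, last: LineInfo) -> float:
    """Hypothetical score in [0, 1]; larger means a better match."""
    tallest = max(current.height, last.height)
    # Similar line heights suggest the same text line.
    height_similarity = min(current.height, last.height) / tallest
    # Vertical gap between where the last block ends and the current one starts.
    vertical_gap = abs(current.start[1] - last.end[1])
    closeness = max(0.0, 1.0 - vertical_gap / tallest)
    return height_similarity * closeness


MATCH_THRESHOLD = 0.6  # assumed value, not taken from the application


def is_match(current: LineInfo, last: LineInfo,
             threshold: float = MATCH_THRESHOLD) -> bool:
    # Step 520: success only when the value is strictly greater than the threshold.
    # Step 530: a value equal to or below the threshold is a failed match.
    return match_value(current, last) > threshold


last_block = LineInfo(height=20, start=(0, 100), end=(50, 100))
same_line = LineInfo(height=20, start=(60, 102), end=(120, 102))
next_line = LineInfo(height=20, start=(60, 140), end=(120, 140))
print(is_match(same_line, last_block), is_match(next_line, last_block))
```

Any other scoring function with the same "larger is better" property could be substituted without changing the threshold test.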
As an optional embodiment, in step 320, determining the current text image block from the first image block sequence and matching the line information of the current text image block against the line information of the last text image block of each text line in the line list may include the following step:
Match the current text image block against the line information of the last text image block of each text line in turn, following the order of the text lines in the line list, where the text lines in the line list are ordered by creation time from earliest to latest.
Furthermore, in step 330, appending the current text image block to the end of the target text line to update the last text image block of the target text line, if the current text image block successfully matches the last text image block of the target text line, may include the following step:
On detecting that the current text image block successfully matches the last text image block of the target text line, append the current text image block to the end of the target text line to update the last text image block of the target text line, and stop matching the current text image block any further.
In this embodiment of the present application, after determining the current text image block from the first image block sequence, the terminal device matches the current text image block against the line information of the last text image block of each text line in a fixed order. This order is the order of the text lines in the line list, which are arranged by creation time from earliest to latest.
On detecting that the current text image block successfully matches the last text image block of the target text line, the terminal device appends the current text image block to the end of the target text line, so that it becomes the new last text image block of the target text line, and stops matching the current text image block any further. Referring again to FIG. 4, suppose for example that the current text image block determined by the terminal device is text image block 5. From the above embodiment, the line list currently contains three text lines. Ordered by creation time from earliest to latest, the text line containing text image blocks 1 and 3 comes first, the text line containing text image block 2 comes second, and the text line containing text image block 4 comes third. The terminal device therefore matches text image block 5 against the line information of the last text image blocks of these three text lines in that order: first against text image block 3, then against text image block 2, and finally against text image block 4. If text image block 5 matches text image block 3, the terminal device appends text image block 5 to the end of the text line containing text image block 3, so that text image block 5 replaces text image block 3 as the new last text image block of that line, and the terminal device does not go on to match text image block 5 against text image blocks 2 and 4. If text image block 5 does not match text image block 3, the terminal device matches it against text image block 2, and if that match succeeds, it skips the comparison with text image block 4. If that match also fails, the terminal device finally matches text image block 5 against text image block 4. If this match succeeds, text image block 5 is appended to the text line containing text image block 4 and becomes the last text image block of that line. If it still fails, a new text line is created in the line list based on text image block 5.
In this embodiment of the present application, matching the current text image block against the last text image block of each text line in order, and appending it to the text line of the first text image block that matches, reduces the amount of computation and improves the efficiency of text image typesetting.
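The first-match rule and its saving in comparisons can be shown with a short sketch. The helper name `first_matching_line` and the returned comparison count are hypothetical additions for illustration; the line list below mirrors the state described for text image block 5 in FIG. 4.

```python
from typing import Callable, List, Optional, Tuple, TypeVar

T = TypeVar("T")


def first_matching_line(current: T,
                        line_list: List[List[T]],
                        matches: Callable[[T, T], bool]
                        ) -> Tuple[Optional[int], int]:
    """Return (index of the first matching line or None, comparisons made).

    line_list is assumed to be ordered by creation time, earliest first.
    """
    comparisons = 0
    for index, line in enumerate(line_list):
        comparisons += 1
        if matches(current, line[-1]):
            # Stop immediately: later lines are not examined.
            return index, comparisons
    return None, comparisons


# State of the line list when block 5 becomes the current block.
line_list = [[1, 3], [2], [4]]

# Block 5 matches block 3: one comparison, blocks 2 and 4 are skipped.
print(first_matching_line(5, line_list, lambda cur, last: last == 3))  # (0, 1)
# Block 5 matches only block 2: two comparisons, block 4 is skipped.
print(first_matching_line(5, line_list, lambda cur, last: last == 2))  # (1, 2)
# No match at all: three comparisons, and the caller opens a new line.
print(first_matching_line(5, line_list, lambda cur, last: False))      # (None, 3)
```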
As an optional embodiment, in step 330, appending the current text image block to the end of the target text line to update the last text image block of the target text line, if the current text image block successfully matches the last text image block of the target text line, may include the following steps:
If the matching value between the current text image block and the last text image block of only one target text line satisfies the threshold condition, append the current text image block to the end of that target text line to update the last text image block of the target text line;
If the matching values between the current text image block and the last text image blocks of at least two target text lines satisfy the threshold condition, determine the maximum matching value among the matching values of the last text image blocks of those target text lines, and append the current text image block to the end of the target text line corresponding to the maximum matching value, to update the last text image block of that target text line.
In this embodiment of the present application, when the terminal device matches the current text image block against the last text image block of each text line and only one of those text image blocks has a matching value with the current text image block that satisfies the threshold condition, the terminal device appends the current text image block to the end of the target text line containing that text image block, to update the last text image block of the target text line.
If the matching values between the current text image block and the last text image blocks of multiple target text lines all satisfy the threshold condition, the terminal device may select the maximum of the computed matching values, append the current text image block to the target text line containing the text image block that corresponds to this maximum, and take the current text image block as the new last text image block of that target text line.
In this embodiment of the present application, when the matching values between the current text image block and multiple text image blocks all satisfy the threshold condition, the text image block with the largest matching value is selected and the current text image block is appended to the target text line containing it, which effectively ensures the accuracy of text image typesetting.
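The maximum-matching-value rule can be sketched as below. The name `best_matching_line` and the example scores are hypothetical, and "satisfies the threshold condition" is assumed here to mean "strictly greater than the threshold", as in steps 520 and 530. How ties between equal maximum values are broken is not stated in the text; this sketch keeps the earliest-created line, which is an assumption.

```python
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def best_matching_line(current: T,
                       line_list: List[List[T]],
                       match_value: Callable[[T, T], float],
                       threshold: float) -> Optional[int]:
    """Return the index of the line whose last block scores highest, or None."""
    # Score the current block against the last block of every line.
    scored = [(match_value(current, line[-1]), index)
              for index, line in enumerate(line_list)]
    # Keep only the lines that satisfy the (assumed) threshold condition.
    candidates = [(value, index) for value, index in scored if value > threshold]
    if not candidates:
        return None  # the caller creates a new text line
    # max() returns the first maximal item, so ties go to the earliest line.
    _, best_index = max(candidates, key=lambda pair: pair[0])
    return best_index


line_list = [[1, 3], [2], [4]]
# Hypothetical matching values of block 5 against last blocks 3, 2 and 4.
scores = {3: 0.7, 2: 0.9, 4: 0.2}
print(best_matching_line(5, line_list, lambda cur, last: scores[last], 0.6))  # 1
```

Blocks 3 and 2 both exceed the threshold in this example, and the line ending in block 2 is chosen because its value is the larger of the two.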
As an optional embodiment, after step 120, in which the text image blocks are matched according to the line information corresponding to each text image block contained in the first text image and at least one text line is generated to obtain the line list corresponding to the first text image, the following step may also be performed:
Merge the first text images to obtain a typeset text image, where the first text image is any one of at least one first text image, and the at least one first text image may be obtained by performing region segmentation on a full text image.
In this embodiment of the present application, if the terminal device has acquired a full text image and has segmented it into regions to obtain multiple first text images, the terminal device may typeset the text image blocks of the individual first text images either simultaneously or one by one. After the text image blocks of every first text image have been typeset, the terminal device merges the first text images to obtain the complete typeset text image, thereby typesetting the entire text image.
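A rough sketch of typesetting the regions either one by one or simultaneously, then merging, is given below. The names `typeset_all` and `typeset_region` are hypothetical, and merging is modeled simply as concatenating the per-region line lists in region order, which is an assumption; the application does not specify here how the merged image is assembled.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

Region = TypeVar("Region")
LineList = List[list]


def typeset_all(regions: Sequence[Region],
                typeset_region: Callable[[Region], LineList],
                parallel: bool = False) -> LineList:
    """Typeset every segmented region, then merge the results in region order."""
    if parallel:
        # Executor.map yields results in the order of its inputs,
        # so the merge order does not depend on which region finishes first.
        with ThreadPoolExecutor() as pool:
            results = list(pool.map(typeset_region, regions))
    else:
        results = [typeset_region(region) for region in regions]
    merged: LineList = []
    for line_list in results:
        merged.extend(line_list)
    return merged


def one_line_per_block(region):
    """Stand-in typesetter: puts every block on its own line."""
    return [[block] for block in region]


regions = [[1, 2], [3]]
print(typeset_all(regions, one_line_per_block))                 # [[1], [2], [3]]
print(typeset_all(regions, one_line_per_block, parallel=True))  # [[1], [2], [3]]
```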
Refer to FIG. 6, which is a schematic structural diagram of a text image typesetting apparatus disclosed in an embodiment of the present application. As shown in FIG. 6, the text image typesetting apparatus includes a text detection module 610 and a text typesetting module 620.
The text detection module 610 is configured to perform text line detection on the first text image and determine the line information corresponding to each text image block contained in the first text image, where the line information includes the line height, the line-start position coordinates and the line-end position coordinates;
The text typesetting module 620 is configured to match the text image blocks according to the line information corresponding to each text image block contained in the first text image and generate at least one text line, so as to obtain the line list corresponding to the first text image, where the matching value between any two adjacent text image blocks in each text line satisfies the threshold condition.
As an optional embodiment, the text typesetting module 620 is further configured to:
Determine, according to the line information corresponding to each text image block contained in the first text image, the matching values between a first text image block and each of the other text image blocks, where the first text image block is any text image block in the first text image and the other text image blocks are the text image blocks in the first text image other than the first text image block;
Determine the maximum matching value among these matching values, and join the first text image block with the other text image block corresponding to the maximum matching value, so as to generate at least one text line.
Refer to FIG. 7, which is a schematic structural diagram of another text image typesetting apparatus disclosed in an embodiment of the present application. The text image typesetting apparatus shown in FIG. 7 is obtained by further optimizing the apparatus shown in FIG. 6. Compared with the apparatus shown in FIG. 6, the text image typesetting apparatus 600 shown in FIG. 7 may further include:
A text sorting module 630, configured to pre-sort the text image blocks contained in the first text image in ascending order of the abscissa of their line-start position coordinates, to obtain the image block sequence corresponding to the first text image.
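The pre-sorting performed by the text sorting module 630 amounts to an ascending sort on the x-coordinate of each block's line-start position. A minimal sketch follows; the dictionary layout with `id` and `start` keys and the function name `presort_blocks` are hypothetical, chosen only to make the example self-contained.

```python
from typing import Dict, List


def presort_blocks(blocks: List[Dict]) -> List[Dict]:
    """Return the blocks ordered by the abscissa of their line-start position.

    sorted() is stable, so blocks with equal abscissas keep their input order.
    """
    return sorted(blocks, key=lambda block: block["start"][0])


blocks = [
    {"id": "b", "start": (120, 40)},
    {"id": "a", "start": (10, 40)},
    {"id": "c", "start": (10, 80)},
]
sequence = presort_blocks(blocks)
print([block["id"] for block in sequence])  # ['a', 'c', 'b']
```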
As an optional embodiment, the text typesetting module 620 is further configured to:
Establish the line list corresponding to the first text image, and create a new text line in the line list based on the text image block ranked first in the first image block sequence corresponding to the first text image;
Determine the current text image block from the first image block sequence, and match the line information of the current text image block against the line information of the last text image block of each text line in the line list;
If the current text image block successfully matches the last text image block of a target text line, append the current text image block to the end of the target text line to update the last text image block of the target text line;
If the current text image block fails to match the last text image block of every text line, create a new text line in the line list based on the current text image block;
Take the next text image block in the first image block sequence as the new current text image block, and repeat the step of matching the line information of the current text image block against the line information of the last text image block of each text line in the line list, until the current text image block is the last text image block of the first image block sequence.
As an optional embodiment, the text typesetting module 620 is further configured to:
Determine, based on the line information of the current text image block and the line information of the last text image block of a first text line in the line list, the matching value between the current text image block and the last text image block of the first text line, where the first text line is any text line in the line list;
If the matching value is greater than the matching threshold, the current text image block successfully matches the last text image block of the first text line, and the first text line is taken as the target text line;
If the matching value is not greater than the matching threshold, the current text image block fails to match the last text image block of the first text line.
As an optional embodiment, the text typesetting module 620 is further configured to:
Match the current text image block against the line information of the last text image block of each text line in turn, following the order of the text lines in the line list, where the text lines in the line list are arranged by creation time from earliest to latest;
On detecting that the current text image block successfully matches the last text image block of the target text line, append the current text image block to the end of the target text line to update the last text image block of the target text line, and stop matching the current text image block any further.
As an optional embodiment, the text typesetting module 620 is further configured to:
If the matching value between the current text image block and the last text image block of only one target text line satisfies the threshold condition, append the current text image block to the end of that target text line to update the last text image block of the target text line;
If the matching values between the current text image block and the last text image blocks of at least two target text lines all satisfy the threshold condition, determine the maximum matching value among the matching values of the last text image blocks of those target text lines, and append the current text image block to the end of the target text line corresponding to the maximum matching value, to update the last text image block of that target text line.
Refer to FIG. 8, which is a schematic structural diagram of another text image typesetting apparatus disclosed in an embodiment of the present application. The text image typesetting apparatus shown in FIG. 8 is obtained by further optimizing the apparatus shown in FIG. 6. Compared with the apparatus shown in FIG. 6, the text image typesetting apparatus 600 shown in FIG. 8 may further include:
A text segmentation module 640, configured to perform region segmentation on a full text image to obtain at least one first text image, where the first text image is any one of the at least one first text image.
Refer to FIG. 9, which is a schematic structural diagram of an electronic device disclosed in an embodiment. As shown in FIG. 9, the electronic device 900 may include:
A memory 910 storing executable program code;
A processor 920 coupled to the memory 910;
The processor 920 invokes the executable program code stored in the memory 910 to execute any one of the text image typesetting methods disclosed in the embodiments of the present application.
It should be noted that the electronic device shown in FIG. 9 may also include components that are not shown, such as a power supply, input keys, a camera, a speaker, a screen, an RF circuit, a Wi-Fi module, a Bluetooth module and sensors, which are not described in detail in this embodiment.
An embodiment of the present application discloses a computer-readable storage medium storing a computer program, where the computer program causes a computer to execute any one of the text image typesetting methods disclosed in the embodiments of the present application.
An embodiment of the present application discloses a computer program product, which includes a non-transitory computer-readable storage medium storing a computer program, where the computer program is operable to cause a computer to execute any one of the text image typesetting methods disclosed in the embodiments of the present application.
应理解,说明书通篇中提到的“一个实施例”或“一实施例”意味着与实施例有关的特定特征、结构或特性包括在本申请的至少一个实施例中。因此,在整个说明书各处出现的“在一个实施例中”或“在一实施例中”未必一定指相同的实施例。此外,这些特定特征、结构或特性可以以任意适合的方式结合在一个或多个实施例中。本领域技术人员也应该知悉,说明书中所描述的实施 例均属于可选实施例,所涉及的动作和模块并不一定是本申请所必须的。It should be understood that reference throughout the specification to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic related to the embodiment is included in at least one embodiment of the present application. Thus, appearances of "in one embodiment" or "in an embodiment" in various places throughout the specification are not necessarily referring to the same embodiment. Furthermore, the particular features, structures or characteristics may be combined in any suitable manner in one or more embodiments. Those skilled in the art should also know that the embodiments described in the specification are all optional embodiments, and the actions and modules involved are not necessarily required by this application.
In the various embodiments of the present application, it should be understood that the sequence numbers of the above processes do not imply a necessary order of execution. The execution order of each process should be determined by its function and internal logic, and the sequence numbers should not constitute any limitation on the implementation of the embodiments of the present application.
The units described above as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-accessible memory. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a memory and includes several requests for causing a computer device (which may be a personal computer, a server, a network device, or the like, and specifically may be a processor in the computer device) to execute some or all of the steps of the methods of the embodiments of the present application.
Those of ordinary skill in the art will understand that all or some of the steps in the methods of the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium includes read-only memory (ROM), random access memory (RAM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), one-time programmable read-only memory (OTPROM), electrically erasable programmable read-only memory (EEPROM), compact disc read-only memory (CD-ROM) or other optical disc storage, magnetic disk storage, magnetic tape storage, or any other computer-readable medium that can be used to carry or store data.
The text image typesetting method, apparatus, wireless earphone, and storage medium disclosed in the embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. At the same time, those of ordinary skill in the art may, based on the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (11)

  1. A method for typesetting a text image, characterized in that the method comprises:
    performing text line detection on a first text image to determine line information corresponding to each text image block contained in the first text image, wherein the line information includes a line height, line-head position coordinates, and line-end position coordinates; and
    matching the text image blocks according to the line information corresponding to each text image block contained in the first text image, and generating at least one text line, so as to obtain a line list corresponding to the first text image, wherein the matching value between any two adjacent text image blocks in each text line satisfies a threshold condition.
  2. The method according to claim 1, characterized in that matching the text image blocks according to the line information corresponding to each text image block contained in the first text image, and generating at least one text line, comprises:
    determining, according to the line information corresponding to each text image block contained in the first text image, the matching values between a first text image block and each of the other text image blocks, wherein the first text image block is any text image block in the first text image, and the other text image blocks are the text image blocks in the first text image other than the first text image block; and
    determining the maximum matching value among the matching values, and adding the first text image block to the other text image block corresponding to the maximum matching value, so as to generate at least one text line.
  3. The method according to claim 1, characterized in that, after performing text line detection on the first text image to determine the line information corresponding to each text image block contained in the first text image, the method further comprises:
    pre-sorting the text image blocks in ascending order of the abscissa of the line-head position coordinates of each text image block contained in the first text image, to obtain an image block sequence corresponding to the first text image.
  4. The method according to claim 3, characterized in that matching the text image blocks according to the line information corresponding to each text image block contained in the first text image, and generating at least one text line, so as to obtain the line list corresponding to the first text image, comprises:
    establishing a line list corresponding to the first text image, and creating a new text line in the line list from the text image block ranked first in a first image block sequence corresponding to the first text image;
    determining a current text image block from the first image block sequence, and matching the line information of the current text image block against the line information of the last text image block of each text line in the line list;
    if the current text image block is successfully matched with the last text image block of a target text line, adding the current text image block to the end of the target text line, so as to update the last text image block of the target text line;
    if the current text image block is not successfully matched with the last text image block of any of the text lines, creating a new text line in the line list from the current text image block; and
    taking the next text image block in the first image block sequence as the new current text image block, and continuing to perform the step of matching the line information of the current text image block against the line information of the last text image block of each text line in the line list, until the current text image block is the last text image block of the first image block sequence.
  5. The method according to claim 4, characterized in that determining the current text image block from the first image block sequence, and matching the line information of the current text image block against the line information of the last text image block of each text line in the line list, comprises:
    determining, according to the line information of the current text image block and the line information of the last text image block of a first text line in the line list, the matching value between the current text image block and the last text image block of the first text line, wherein the first text line is any text line in the line list;
    if the matching value is greater than a matching threshold, determining that the current text image block is successfully matched with the last text image block of the first text line, and taking the first text line as the target text line; and
    if the matching value is not greater than the matching threshold, determining that the current text image block is not successfully matched with the last text image block of the first text line.
  6. The method according to claim 5, characterized in that, if the current text image block is successfully matched with the last text image block of the target text line, adding the current text image block to the end of the target text line, so as to update the last text image block of the target text line, comprises:
    if the matching value between the current text image block and the last text image block of only one target text line satisfies the threshold condition, adding the current text image block to the end of that target text line, so as to update the last text image block of that target text line; and
    if the matching values between the current text image block and the last text image blocks of at least two target text lines all satisfy the threshold condition, determining the maximum matching value among the matching values of the last text image blocks of those target text lines, and adding the current text image block to the end of the target text line corresponding to the maximum matching value, so as to update the last text image block of that target text line.
  7. The method according to claim 4, characterized in that determining the current text image block from the first image block sequence, and matching the line information of the current text image block against the line information of the last text image block of each text line in the line list, comprises:
    matching the current text image block against the line information of the last text image block of each text line in turn, following the order of the text lines in the line list, wherein the text lines in the line list are arranged by creation time from earliest to latest;
    and, if the current text image block is successfully matched with the last text image block of the target text line, adding the current text image block to the end of the target text line, so as to update the last text image block of the target text line, comprises:
    upon detecting that the current text image block is successfully matched with the last text image block of a target text line, adding the current text image block to the end of the target text line, so as to update the last text image block of the target text line, and stopping further matching of the current text image block.
  8. The method according to any one of claims 1 to 7, characterized in that, before performing text line detection on the first text image to determine the line information corresponding to each text image block contained in the first text image, the method further comprises:
    performing region segmentation on a full text image to obtain at least one first text image, wherein the first text image is any one of the at least one first text image.
  9. An apparatus for typesetting a text image, characterized in that the apparatus comprises:
    a text detection module, configured to perform text line detection on a first text image to determine line information corresponding to each text image block contained in the first text image, wherein the line information includes a line height, line-head position coordinates, and line-end position coordinates; and
    a text typesetting module, configured to match the text image blocks according to the line information corresponding to each text image block contained in the first text image, and generate at least one text line, so as to obtain a line list corresponding to the first text image, wherein the matching value between any two adjacent text image blocks in each text line satisfies a threshold condition.
  10. An electronic device, characterized by comprising a memory and a processor, wherein the memory stores a computer program which, when executed by the processor, causes the processor to implement the method according to any one of claims 1 to 8.
  11. A computer-readable storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, implements the method according to any one of claims 1 to 8.
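
For illustration only, the line-building procedure recited in claims 3 to 7 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the claims do not define how the matching value is computed, so `match_value` below (a vertical-overlap ratio) is purely hypothetical, as are the names `Block`, `build_line_list`, `threshold`, and `first_match`.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Block:
    """Line information of one text image block (field names are illustrative)."""
    height: float  # line height
    x0: float      # line-head x coordinate
    y0: float      # line-head y coordinate
    x1: float      # line-end x coordinate
    y1: float      # line-end y coordinate


def match_value(prev: Block, cur: Block) -> float:
    """ASSUMPTION: the claims do not define the matching value.

    Here it is the vertical overlap between the end of `prev` and the start
    of `cur`, divided by the larger line height, giving a score in [0, 1].
    """
    prev_top, prev_bottom = prev.y1 - prev.height / 2, prev.y1 + prev.height / 2
    cur_top, cur_bottom = cur.y0 - cur.height / 2, cur.y0 + cur.height / 2
    overlap = min(prev_bottom, cur_bottom) - max(prev_top, cur_top)
    return max(0.0, overlap) / max(prev.height, cur.height)


def build_line_list(blocks: List[Block], threshold: float = 0.5,
                    first_match: bool = False) -> List[List[Block]]:
    """Group blocks into text lines.

    first_match=False picks the line with the maximum matching value (claim 6);
    first_match=True stops at the first line that matches (claim 7).
    """
    # Claim 3: pre-sort by the abscissa of the line-head position.
    ordered = sorted(blocks, key=lambda b: b.x0)
    lines: List[List[Block]] = []  # kept in creation order
    for cur in ordered:
        best_line, best_score = None, threshold
        for line in lines:
            # Claim 4: compare only with the last block of each existing line.
            score = match_value(line[-1], cur)
            if score > best_score:  # claim 5: strictly greater than the threshold
                best_line, best_score = line, score
                if first_match:
                    break
        if best_line is None:
            lines.append([cur])     # no match: start a new text line
        else:
            best_line.append(cur)   # match: cur becomes the line's last block
    return lines
```

Under these assumptions, two blocks at roughly the same height end up in the same text line regardless of input order, while a block whose vertical band does not overlap the end of any existing line starts a new one. The `first_match` flag trades the best-score choice of claim 6 for the early exit of claim 7.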
PCT/CN2021/119226 2021-08-30 2021-09-18 Text image typesetting method and apparatus, electronic device, and storage medium WO2023029116A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111002561.1A CN115730563A (en) 2021-08-30 2021-08-30 Typesetting method and device for text image, electronic equipment and storage medium
CN202111002561.1 2021-08-30

Publications (1)

Publication Number Publication Date
WO2023029116A1 true WO2023029116A1 (en) 2023-03-09

Family

ID=85290618

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/119226 WO2023029116A1 (en) 2021-08-30 2021-09-18 Text image typesetting method and apparatus, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN115730563A (en)
WO (1) WO2023029116A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110173532A1 (en) * 2010-01-13 2011-07-14 George Forman Generating a layout of text line images in a reflow area
CN109697414A (en) * 2018-12-13 2019-04-30 北京金山数字娱乐科技有限公司 A kind of text positioning method and device
CN110619333A (en) * 2019-08-15 2019-12-27 平安国际智慧城市科技股份有限公司 Text line segmentation method, text line segmentation device and electronic equipment
CN111626250A (en) * 2020-06-02 2020-09-04 泰康保险集团股份有限公司 Line dividing method and device for text image, computer equipment and readable storage medium
CN111985465A (en) * 2020-08-17 2020-11-24 中移(杭州)信息技术有限公司 Text recognition method, device, equipment and storage medium
CN112507782A (en) * 2020-10-22 2021-03-16 广东省电信规划设计院有限公司 Text image recognition method and device

Also Published As

Publication number Publication date
CN115730563A (en) 2023-03-03

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE