CN116363659A - Handwriting multi-line character segmentation method, device and equipment - Google Patents

Handwriting multi-line character segmentation method, device and equipment

Info

Publication number
CN116363659A
Authority
CN
China
Prior art keywords
height
strokes
row
value
line
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310341510.4A
Other languages
Chinese (zh)
Inventor
庄建明
郑晓敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hong Yuxing Private LLC
Original Assignee
Hong Yuxing Private LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hong Yuxing Private LLC filed Critical Hong Yuxing Private LLC
Priority to CN202310341510.4A priority Critical patent/CN116363659A/en
Publication of CN116363659A publication Critical patent/CN116363659A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/14Image acquisition
    • G06V30/148Segmentation of character regions
    • G06V30/153Segmentation of character regions using recognition of characters or words
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/19Recognition using electronic means
    • G06V30/191Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19107Clustering techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/19Recognition using electronic means
    • G06V30/191Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19173Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/24Character recognition characterised by the processing or recognition method
    • G06V30/242Division of the character sequences into groups prior to recognition; Selection of dictionaries
    • G06V30/244Division of the character sequences into groups prior to recognition; Selection of dictionaries using graphical properties, e.g. alphabet type or font
    • G06V30/2455Discrimination between machine-print, hand-print and cursive writing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40Document-oriented image-based pattern recognition
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Character Input (AREA)

Abstract

The invention discloses a method, a device and equipment for segmenting handwritten multi-line characters. The segmentation method comprises: setting an estimated height, a width-height threshold and an offset matrix for the current text, wherein the estimated height is a first proportional value of the stroke height taken over all strokes, the width-height threshold is the larger of the average width and the average height of all strokes, and the offset matrix is the upright rectangle of the strokes of a comparison object that fall within the range from the minimum X-axis value of the current object minus the estimated height to the maximum X-axis value plus the estimated height; and performing one or more segmentation operations on the text, the segmentation operations including: deleting abnormal strokes; pre-segmentation; placing isolated points and reversed strokes in lines of their own; re-dividing oversized lines; reordering lines; spatial merging; splitting a line whose in-line word spacing is too large into multiple lines; and determining whether the whole text is a single word. The invention does not alter the original handwriting and segments handwritten multi-line characters accurately and efficiently, so that the handwritten text is laid out in a more standard and tidy manner.

Description

Handwriting multi-line character segmentation method, device and equipment
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a method, an apparatus, and a device for dividing handwritten multi-line characters.
Background
With the development of technology, people are increasingly accustomed to handwriting text on a handwriting pad or a touch screen. Unlike typing in a chat on an ordinary mobile device, handwriting input in the prior art simulates real-world handwriting: the user writes freely within a single space, and the pen strokes on the handwriting pad or touch screen are mapped directly into the text, as in drawing.
Thus, in order to better recognize each character in a text, the prior art provides handwritten character segmentation techniques, mainly for separating letters and numbers in handwritten text. Such techniques help computers recognize and interpret handwritten text and support machine learning and natural language processing applications. Their main principle is to analyze features of the handwritten characters such as shape, size, spelling and continuity using various data processing techniques, and to recognize each character based on these features.
However, the prior art lacks a method for tidying up the segmentation of the text, that is, a method that handles abnormal strokes, line spacing and the like while retaining the user's handwriting.
Disclosure of Invention
In view of the technical problems, the invention provides a method, a device and equipment for dividing handwritten multi-line characters, which can divide and sort the handwritten multi-line characters in a text and improve the accuracy of line division.
Other features and advantages of the present disclosure will be apparent from the following detailed description, or may be learned in part by the practice of the disclosure.
According to an aspect of the present disclosure, a method for dividing a handwritten multi-line character is provided, which is applied to recognition of a text, wherein the text includes a plurality of lines of handwritten characters, the characters are composed of one or more strokes, each of the strokes can be read, and the dividing method includes:
setting an estimated height, a width-height threshold and an offset matrix for the current text, wherein the estimated height is a first proportional value of the stroke height taken over all strokes, the width-height threshold is the larger of the average width and the average height of all strokes, and the offset matrix is the upright rectangle of the strokes of a comparison object that fall within the range from the minimum X-axis value of the current object minus the estimated height to the maximum X-axis value plus the estimated height;
performing one or more segmentation operations on the text based on the estimated height, the width-height threshold and the offset matrix, the segmentation operations including: deleting abnormal strokes; pre-segmentation; placing isolated points and reversed strokes in lines of their own; re-dividing oversized lines; reordering lines; spatial merging; splitting a line whose in-line word spacing is too large into multiple lines; and determining whether the whole text is a single word.
Further, deleting abnormal strokes includes deleting any stroke that meets one of the following:
the width and the height are both greater than 5 times the width-height threshold;
the height is greater than 7 times the width-height threshold;
the height is greater than 3.75 times the estimated height.
Further, the pre-segmentation includes assigning a stroke to the next row when one of the following holds:
the X-axis maximum of all strokes in the current row minus the X-axis maximum of the stroke is greater than the estimated height;
the Y-axis minimum of the stroke minus the Y-axis maximum of all strokes in the current row is greater than 2 times the estimated height;
the Y-axis minimum of the stroke is greater than the Y-axis maximum of the offset matrix.
Further, placing isolated points and reversed strokes in rows of their own includes:
a stroke whose points do not all have Y-axis values within the 95.449974% confidence interval of the normal distribution is placed in a row of its own.
Further, the oversized row re-division includes:
among the rows already formed, if a row's height is greater than 2 times the estimated height, the row is re-divided into as many rows as the number of classes found by density cluster analysis.
Further, the reordering includes: rows whose circumscribed rectangle has a smaller center value are placed first.
Further, the spatial merging includes: comparing two adjacent rows of characters, and merging the two rows when one of the following conditions holds:
the Y-axis minimum of the current row is greater than the Y-axis minimum of the comparison row, and the Y-axis maximum of the current row is less than the Y-axis maximum of the comparison row;
the height of one of the current row and the comparison row is less than 0.9 times the estimated height, and the ratio of the overlap height of the current row and the comparison row to the height of the two rows is greater than 0.85;
the overlap height of the current row and the comparison row is greater than 0.9 times the height of the two rows, and the overlap is greater than 0.9 times the combined height of the two rows at the overlapping position.
Further, splitting a row whose in-row word spacing is too large into a plurality of rows includes:
within the same row, if the distance between two characters is greater than 5 times the estimated height, the row is split into two rows, with the midpoint between the two characters as the dividing line.
Further, determining whether the whole text is a single word includes:
if the entire text is multi-line and the entire width is greater than 2 times its height, then it is not a word;
if the height of the text is greater than 2 times its width, it is a word.
According to a second aspect of the present disclosure, there is provided a handwritten multi-line character segmentation apparatus comprising: a presetting module configured to set an estimated height, a width-height threshold and an offset matrix for the current text, wherein the estimated height is a first proportional value of the stroke height taken over all strokes, the width-height threshold is the larger of the average width and the average height of all strokes, and the offset matrix is an upright rectangle formed by subtracting the estimated height from the minimum X-axis value of the current character and adding the estimated height to the maximum X-axis value; and an execution module configured to perform one or more segmentation operations on the text based on the estimated height, the width-height threshold and the offset matrix, the segmentation operations including: deleting abnormal strokes; pre-segmentation; placing isolated points and reversed strokes in lines of their own; re-dividing oversized lines; reordering lines; spatial merging; splitting a line whose in-line word spacing is too large into multiple lines; and determining whether the whole text is a single word.
According to a third aspect of the present disclosure, there is provided handwritten multi-line character segmentation equipment configured to perform: setting an estimated height, a width-height threshold and an offset matrix for the current text, wherein the estimated height is a first proportional value of the stroke height taken over all strokes, the width-height threshold is the larger of the average width and the average height of all strokes, and the offset matrix is an upright rectangle formed by subtracting the estimated height from the minimum X-axis value of the current character and adding the estimated height to the maximum X-axis value; and performing one or more segmentation operations on the text based on the estimated height, the width-height threshold and the offset matrix, the segmentation operations including: deleting abnormal strokes; pre-segmentation; placing isolated points and reversed strokes in lines of their own; re-dividing oversized lines; reordering lines; spatial merging; splitting a line whose in-line word spacing is too large into multiple lines; and determining whether the whole text is a single word.
The technical scheme of the present disclosure has the following beneficial effects:
According to the handwritten multi-line character segmentation method, abnormal strokes are first deleted so that they do not affect segmentation or the subsequent recognition result, and pre-segmentation roughly divides the strokes into a plurality of lines. Then, isolated points and reversed strokes are placed in lines of their own, oversized lines are re-divided, lines are reordered, lines are spatially merged, and lines whose in-line word spacing is too large are split into multiple lines, which improves the accuracy of line segmentation. Finally, whether a second segmentation mode needs to be added is decided according to the determination of whether the whole text is a single word.
The invention does not change the original handwriting, and accurately and efficiently segments the handwriting multi-line characters, so that the handwriting text has more standard and tidy typesetting.
Drawings
FIG. 1 is a flow chart of a method of segmentation of handwritten multi-line characters in an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a text to be processed in an embodiment of the specification;
FIG. 3 is a schematic diagram of the text of FIG. 2 after the abnormal strokes have been deleted;
FIG. 4 is a schematic diagram of another text to be processed in an embodiment of the specification;
FIG. 5 is a schematic diagram of the text of FIG. 4 after the segmentation operations;
FIG. 6 is a diagram showing one result of determining whether the whole text is a single word in an embodiment of the specification;
FIG. 7 is a diagram showing another result of determining whether the whole text is a single word in an embodiment of the specification;
FIG. 8 is a block diagram of a handwritten multi-line character segmentation apparatus in an embodiment of the specification;
FIG. 9 is a schematic diagram of a terminal device for implementing a handwritten multi-line character segmentation method in an embodiment of the present disclosure;
FIG. 10 is a schematic diagram of a computer readable storage medium for implementing a handwritten multi-line character segmentation method in an embodiment of the present description.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments may be embodied in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the present disclosure. One skilled in the relevant art will recognize, however, that the aspects of the disclosure may be practiced without one or more of the specific details, or with other methods, components, devices, steps, etc. In other instances, well-known technical solutions have not been shown or described in detail to avoid obscuring aspects of the present disclosure.
Furthermore, the drawings are only schematic illustrations of the present disclosure. The same reference numerals in the drawings denote the same or similar parts, and thus a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in software or in one or more hardware modules or integrated circuits or in different networks and/or processor devices and/or microcontroller devices.
As shown in fig. 1, an embodiment of the present disclosure provides a method for dividing handwritten multi-line characters, where an execution subject of the method may be a terminal device, and the terminal device may be a mobile phone, a tablet computer, a personal computer, or the like. The method is applied to the recognition of text comprising a plurality of lines of handwritten characters, said characters being composed of one or more strokes, each of said strokes being readable. The method specifically comprises the following steps S101 to S102:
In step S101, an estimated height, a width-height threshold and an offset matrix are set for the current text, wherein the estimated height is a first proportional value of the stroke height taken over all strokes, the width-height threshold is the larger of the average width and the average height of all strokes, and the offset matrix is the upright rectangle of the strokes of a comparison object that fall within the range from the minimum X-axis value of the current object minus the estimated height to the maximum X-axis value plus the estimated height.
For the estimated height, the height of a stroke may refer to the vertical extent covered by the stroke, i.e., the distance from its highest point to its lowest point. The stroke height taken over all strokes may be the average height of all strokes. The first proportional value may be 0.89, in which case the estimated height is 0.89 times that stroke height.
For the width-height threshold, the width and height of a stroke may refer to the horizontal width and vertical height covered by the stroke. The average width and the average height of all strokes are obtained by summing the widths and the heights of all strokes, respectively, and dividing each sum by the number of strokes. The width-height threshold is whichever of the average width and the average height is larger.
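The two scalar presets can be sketched in Python as follows. This is only an illustration under stated assumptions: each stroke is taken to be a list of (x, y) points, the stroke height over all strokes is taken to be the mean stroke height as the embodiment suggests, and the names `stroke_size`, `estimated_height` and `width_height_threshold` are hypothetical and do not appear in the patent.

```python
from statistics import mean

FIRST_RATIO = 0.89  # example first proportional value given in the embodiment


def stroke_size(stroke):
    """Return (width, height) of the bounding box of a stroke of (x, y) points."""
    xs = [p[0] for p in stroke]
    ys = [p[1] for p in stroke]
    return max(xs) - min(xs), max(ys) - min(ys)


def estimated_height(strokes, ratio=FIRST_RATIO):
    """Assumed reading: ratio times the mean stroke height."""
    return ratio * mean(stroke_size(s)[1] for s in strokes)


def width_height_threshold(strokes):
    """Larger of the mean stroke width and the mean stroke height."""
    sizes = [stroke_size(s) for s in strokes]
    return max(mean(w for w, _ in sizes), mean(h for _, h in sizes))


strokes = [
    [(0, 0), (10, 20)],   # width 10, height 20
    [(12, 5), (42, 15)],  # width 30, height 10
]
```

For the two sample strokes the mean height is 15 and the mean width is 20, so the threshold is the mean width.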
For the offset matrix, the minimum X-axis value may refer to the X-axis coordinate of the leftmost stroke of the current object in the horizontal direction, and the maximum X-axis value to the X-axis coordinate of the rightmost stroke of the current object. The comparison object (also called the reference object) is the object with which the current object is compared: the current object may be the current character, in which case the comparison object is another character compared with it, such as an adjacent character; the current object may also be the current line of characters. An upright rectangle is a rectangle all of whose corners are right angles.
Specifically, in the embodiment of the invention, the estimated height is subtracted from the minimum X-axis value of the current object to obtain the left boundary value of the current object, and the estimated height is added to the maximum X-axis value of the current object to obtain its right boundary value. It is then determined whether the current object and the reference object overlap horizontally: if the right boundary value of the current object is smaller than the left boundary value of the reference object, or the left boundary value of the current object is larger than the right boundary value of the reference object, there is no horizontal overlap; if the right boundary value of the current object is greater than or equal to the left boundary value of the reference object and the left boundary value of the current object is less than or equal to the right boundary value of the reference object, a horizontal overlap exists. If a horizontal overlap exists, the upright rectangle formed by all strokes in the reference object is moved to the right by the distance between the left boundary value of the current object and the left boundary value of the reference object, which yields the offset matrix. If there is no horizontal overlap, the offset matrix is 0. Calculating the offset matrix helps combine the strokes or lines of the handwritten text in the correct order, thereby achieving automatic text segmentation and recognition.
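The boundary expansion and the horizontal overlap test described above can be sketched as follows. The sketch covers only those two steps, not the subsequent shift of the reference rectangle; horizontal extents are assumed to be (x_min, x_max) pairs, and the function names are hypothetical.

```python
def expanded_range(x_min, x_max, est_height):
    """Widen a horizontal extent by the estimated height on both sides."""
    return x_min - est_height, x_max + est_height


def horizontally_overlaps(current, reference, est_height):
    """True when the widened current extent meets the reference extent.

    `current` and `reference` are (x_min, x_max) pairs; only the current
    object is widened, as in the description.
    """
    cur_left, cur_right = expanded_range(current[0], current[1], est_height)
    ref_left, ref_right = reference
    return cur_right >= ref_left and cur_left <= ref_right
```

With an estimated height of 10, a current object spanning 100 to 150 is widened to 90 to 160, so a reference object starting at 155 overlaps it while one starting at 161 does not.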
In step S102, one or more segmentation operations are performed on the text based on the estimated height, the width-height threshold and the offset matrix, the segmentation operations including: deleting abnormal strokes; pre-segmentation; placing isolated points and reversed strokes in lines of their own; re-dividing oversized lines; reordering lines; spatial merging; splitting a line whose in-line word spacing is too large into multiple lines; and determining whether the whole text is a single word.
Specific details of the segmentation operations are given in the following embodiments.
In one embodiment, deleting abnormal strokes includes deleting any stroke that meets one of the following:
the width and the height are both greater than 5 times the width-height threshold; the height is greater than 7 times the width-height threshold; the height is greater than 3.75 times the estimated height.
As shown in FIG. 2, the abnormal strokes are the two oblique lines passing through the characters; the operation of deleting abnormal strokes identifies and deletes them. In prior-art electronic handwriting records, each stroke has a corresponding record, so a stroke can be deleted directly once it is identified as abnormal. The text after the abnormal strokes are deleted is shown in FIG. 3.
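The three deletion criteria can be expressed as a small filter. This is a minimal sketch that assumes each stroke has already been reduced to a (width, height) pair; `is_abnormal` and `delete_abnormal` are hypothetical names.

```python
def is_abnormal(width, height, wh_threshold, est_height):
    """Apply the three criteria; a stroke is abnormal if any one holds."""
    return (
        (width > 5 * wh_threshold and height > 5 * wh_threshold)
        or height > 7 * wh_threshold
        or height > 3.75 * est_height
    )


def delete_abnormal(sizes, wh_threshold, est_height):
    """Keep only the (width, height) pairs that are not abnormal."""
    return [
        (w, h) for (w, h) in sizes
        if not is_abnormal(w, h, wh_threshold, est_height)
    ]


sizes = [(10, 20), (150, 120), (5, 80), (200, 30)]
kept = delete_abnormal(sizes, wh_threshold=20, est_height=20)
```

In this example the second stroke is removed because both dimensions exceed 5 times the threshold, and the third because its height exceeds 3.75 times the estimated height; a stroke that is merely very wide is kept.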
In one embodiment, the pre-segmentation may be triggered by conditions 1 to 3:
condition 1: the X-axis maximum of all strokes in the current row minus the X-axis maximum of the stroke is greater than the estimated height; condition 2: the Y-axis minimum of the stroke minus the Y-axis maximum of all strokes in the current row is greater than 2 times the estimated height; condition 3: the Y-axis minimum of the stroke is greater than the Y-axis maximum of the offset matrix.
As shown in FIG. 4, if a stroke satisfies both conditions 1 and 3, or both conditions 2 and 3, the stroke belongs to the next line.
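One possible reading of the pre-segmentation rule is sketched below. It rests on several assumptions that the text does not settle: boxes are (x_min, y_min, x_max, y_max) tuples, Y grows downward as in screen coordinates, condition 1 compares the row's right edge with the stroke's right edge, and condition 2 compares the stroke's top with the row's bottom. The function name and parameters are hypothetical.

```python
def belongs_to_next_line(stroke_box, row_box, offset_y_max, est_height):
    """Return True when (condition 1 and 3) or (condition 2 and 3) holds.

    Boxes are (x_min, y_min, x_max, y_max); offset_y_max is the Y-axis
    maximum of the offset matrix, supplied by the caller.
    """
    _, s_y_min, s_x_max, _ = stroke_box
    _, _, r_x_max, r_y_max = row_box
    cond1 = r_x_max - s_x_max > est_height      # stroke ends well left of the row's end
    cond2 = s_y_min - r_y_max > 2 * est_height  # stroke starts far below the row
    cond3 = s_y_min > offset_y_max              # stroke lies below the offset region
    return (cond1 and cond3) or (cond2 and cond3)


row = (0, 0, 300, 25)
new_line_stroke = (5, 40, 20, 60)     # back at the left margin, below the row
same_line_stroke = (310, 5, 320, 20)  # continues to the right of the row
```

Under these assumptions the first stroke starts a new line and the second stays in the current one.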
In an embodiment, placing isolated points and reversed strokes in rows of their own includes: a stroke whose points do not all have Y-axis values within the 95.449974% confidence interval of the normal distribution is placed in a row of its own.
Specifically, FIG. 5 shows the result of performing this operation on FIG. 4. Isolated points are one or more discrete stroke points that are not connected to other strokes; they typically arise in handwritten characters from irregular writing or hand tremor. A reversed stroke is a stroke written with the hand moving opposite to the conventional direction. For example, a straight line is conventionally written from top to bottom, so writing it from bottom to top is a reversed stroke; likewise, an arc is conventionally written clockwise, so writing it counterclockwise is a reversed stroke. Reversed strokes also affect character recognition.
If the Y-axis coordinate of a stroke is not within the normal-distribution confidence interval (95.449974%) of the Y-axis coordinates of all strokes in the current line, the stroke is considered not to belong to the current line and is treated as a line of its own.
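For a normal distribution, 95.449974% is the probability mass within two standard deviations of the mean, so the test above amounts to a two-sigma check. The sketch below assumes one representative Y value per stroke and uses the population standard deviation; both choices, and the name `split_outliers`, are assumptions rather than details given in the patent.

```python
from statistics import mean, pstdev


def split_outliers(stroke_ys, k=2.0):
    """Return (inside, outside) index lists for the mean +/- k*sigma interval."""
    mu = mean(stroke_ys)
    sigma = pstdev(stroke_ys)
    low, high = mu - k * sigma, mu + k * sigma
    inside = [i for i, y in enumerate(stroke_ys) if low <= y <= high]
    outside = [i for i, y in enumerate(stroke_ys) if not (low <= y <= high)]
    return inside, outside


ys = [10, 11, 9, 10, 12, 10, 11, 9, 10, 60]
inside, outside = split_outliers(ys)
```

Here only the last stroke falls outside the interval and would be moved to a line of its own. With very few strokes a single outlier inflates sigma enough to hide itself, which is a known limitation of this kind of test.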
In one embodiment, the oversized row re-division includes: among the rows already formed, if a row's height is greater than 2 times the estimated height, the row is re-divided into as many rows as the number of classes found by density cluster analysis.
Specifically, with continued reference to FIG. 5, FIG. 5 shows the result of performing the oversized row re-division on FIG. 4. If the height of a row is greater than twice the estimated height, the row is re-divided: density cluster analysis groups the strokes in the row into several classes, and the strokes of different classes are assigned to different lines, so that the row is split into as many lines as there are classes. This effectively prevents oversized lines from degrading recognition of the whole text.
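The patent does not name a specific density clustering algorithm. As a stand-in, the sketch below groups strokes by their vertical centers and starts a new cluster wherever the gap between neighboring centers exceeds a radius `eps`, which is a simplified one-dimensional form of density-based grouping. The parameter `eps` and both function names are hypothetical.

```python
def cluster_by_gap(center_ys, eps):
    """Group indices whose sorted Y centers are chained by gaps of at most eps."""
    order = sorted(range(len(center_ys)), key=lambda i: center_ys[i])
    clusters = [[order[0]]]
    for prev, cur in zip(order, order[1:]):
        if center_ys[cur] - center_ys[prev] > eps:
            clusters.append([cur])    # gap too large: start a new line
        else:
            clusters[-1].append(cur)  # dense enough: same line
    return clusters


def resplit_row(center_ys, row_height, est_height, eps):
    """Re-divide a row only when it is taller than twice the estimated height."""
    if not center_ys:
        return []
    if row_height <= 2 * est_height:
        return [list(range(len(center_ys)))]
    return cluster_by_gap(center_ys, eps)


ys = [10, 12, 11, 50, 52, 51]
lines = resplit_row(ys, row_height=45, est_height=20, eps=10)
```

The example row is taller than twice the estimated height and its stroke centers form two dense groups, so it is split into two lines.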
In one embodiment, the reordering comprises: rows whose circumscribed rectangle has a smaller center value are placed first.
With continued reference to FIG. 5, FIG. 5 shows the result of performing the reordering on FIG. 4. Rows with a smaller center value are arranged before rows with a larger one, so that the top-to-bottom structure of the whole text is clearer and the text reads more naturally.
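Reordering reduces to a sort on the center of each row's circumscribed rectangle. The sketch assumes the center value is the vertical center and that each row is represented by its (y_min, y_max) extent; `reorder_rows` is a hypothetical name.

```python
def reorder_rows(rows):
    """Sort rows by the vertical center of their circumscribed rectangle."""
    return sorted(rows, key=lambda r: (r[0] + r[1]) / 2)


rows = [(40, 60), (0, 20), (18, 42)]  # centers 50, 10, 30
ordered = reorder_rows(rows)
```

Python's `sorted` is stable, so rows with equal centers keep their original relative order.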
In one embodiment, with continued reference to FIG. 5, FIG. 5 shows the result of performing the spatial merging operation on FIG. 4, where the spatial merging includes: comparing two adjacent rows of characters, and merging the two rows when one of the following conditions holds:
the Y-axis minimum of the current row is greater than the Y-axis minimum of the comparison row, and the Y-axis maximum of the current row is less than the Y-axis maximum of the comparison row;
the height of one of the current row and the comparison row is less than 0.9 times the estimated height, and the ratio of the overlap height of the current row and the comparison row to the height of the two rows is greater than 0.85;
the overlap height of the current row and the comparison row is greater than 0.9 times the height of the two rows, and the overlap is greater than 0.9 times the combined height of the two rows at the overlapping position.
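A sketch of the first two merge conditions follows; the third is left out because its exact meaning is not clear from the text. Rows are assumed to be (y_min, y_max) extents, and the overlap ratio in the second condition is assumed to be taken relative to the shorter row, which is one interpretation among several. All names are hypothetical.

```python
def overlap_height(a, b):
    """Length of the vertical overlap of two (y_min, y_max) extents."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def should_merge(current, comparison, est_height):
    """Check condition 1 (containment) and condition 2 (short row, high overlap)."""
    contained = current[0] > comparison[0] and current[1] < comparison[1]
    shorter = min(current[1] - current[0], comparison[1] - comparison[0])
    short_and_overlapping = (
        0 < shorter < 0.9 * est_height
        and overlap_height(current, comparison) / shorter > 0.85
    )
    return contained or short_and_overlapping
```

For example, with an estimated height of 20, a row spanning 5 to 15 merges into one spanning 0 to 20 by containment, and a row spanning 0 to 10 merges with one spanning 1 to 30 because it is short and 90% of it overlaps.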
In one embodiment, with continued reference to FIG. 5, FIG. 5 shows the result of performing, on FIG. 4, the operation of splitting a row whose in-row word spacing is too large into a plurality of rows, which includes:
within the same row, if the distance between two characters is greater than 5 times the estimated height, the row is split into two rows, with the midpoint between the two characters as the dividing line.
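The gap rule can be sketched as a single left-to-right pass. Characters are assumed to be given as (x_min, x_max) spans within one row; `split_on_wide_gaps` and its `factor` parameter are hypothetical names.

```python
def split_on_wide_gaps(char_spans, est_height, factor=5):
    """Split a row wherever the gap between neighbors exceeds factor * est_height."""
    if not char_spans:
        return []
    spans = sorted(char_spans)  # left to right by x_min
    rows = [[spans[0]]]
    for prev, cur in zip(spans, spans[1:]):
        if cur[0] - prev[1] > factor * est_height:
            rows.append([cur])    # gap too wide: start a new row
        else:
            rows[-1].append(cur)
    return rows


spans = [(0, 10), (12, 22), (80, 90)]
parts = split_on_wide_gaps(spans, est_height=10)
```

The gap of 58 between the second and third characters exceeds 5 times the estimated height of 10, so the row is split there; the dividing line itself would sit at the midpoint of that gap, at x = 51.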
In one embodiment, FIG. 6 and FIG. 7 show two results of the operation of determining whether the whole text is a single word, which includes:
if the entire text is multi-line and the entire width is greater than 2 times its height, then it is not a word;
if the height of the text is greater than 2 times its width, it is a word.
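The two rules translate directly into a three-way decision. The sketch returns `None` when neither rule applies, since the text does not say what happens in that case; the function name and signature are hypothetical.

```python
def is_single_word(width, height, line_count):
    """Return False, True, or None (undetermined) for the whole text."""
    if line_count > 1 and width > 2 * height:
        return False  # multi-line and much wider than tall: not a single word
    if height > 2 * width:
        return True   # much taller than wide: a single word
    return None       # neither rule applies
```

A two-line text 300 wide and 100 tall is judged not to be a word, while a text 40 wide and 100 tall is judged to be one.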
Based on the same concept, as shown in FIG. 8, the exemplary embodiment of the disclosure further provides a handwritten multi-line character segmentation apparatus 800, which includes: a preset module 801 configured to set an estimated height, a width-height threshold and an offset matrix for the current text, wherein the estimated height is a first proportional value of the stroke height taken over all strokes, the width-height threshold is the larger of the average width and the average height of all strokes, and the offset matrix is an upright rectangle formed by subtracting the estimated height from the minimum X-axis value of the current character and adding the estimated height to the maximum X-axis value; and an execution module 802 configured to perform one or more segmentation operations on the text based on the estimated height, the width-height threshold and the offset matrix, the segmentation operations including: deleting abnormal strokes; pre-segmentation; placing isolated points and reversed strokes in lines of their own; re-dividing oversized lines; reordering lines; spatial merging; splitting a line whose in-line word spacing is too large into multiple lines; and determining whether the whole text is a single word.
With the above-described handwritten multi-line character segmentation apparatus 800, abnormal strokes are first deleted so that they do not affect segmentation or the subsequent recognition result, and pre-segmentation roughly divides the strokes into a plurality of lines. Then, isolated points and reversed strokes are placed in lines of their own, oversized lines are re-divided, lines are reordered, lines are spatially merged, and lines whose in-line word spacing is too large are split into multiple lines, which improves the accuracy of line segmentation. Finally, whether a second segmentation mode needs to be added is decided according to the determination of whether the whole text is a single word.
The handwritten multi-line character segmentation apparatus 800 does not change the original written handwriting, and performs accurate and efficient segmentation on the handwritten multi-line characters, so that the handwritten text has more standard and tidy typesetting.
The specific details of each module/unit in the above apparatus are already described in the method section embodiments, and the details not disclosed may refer to the method section embodiments, so that they will not be described in detail.
Based on the same idea, the embodiment of the present disclosure further provides a handwritten multi-line character segmentation apparatus, as shown in fig. 9.
The handwritten multi-line character segmentation apparatus may be a terminal apparatus or a server provided in the above-described embodiments.
The handwritten multi-line character segmentation apparatus may vary considerably with configuration or performance. It may include one or more processors 901 and a memory 902, and one or more applications or data may be stored in the memory 902. The memory 902 may include readable media in the form of volatile memory units, such as random access memory (RAM) units and/or cache memory units, and may further include read-only memory units. The application programs stored in the memory 902 may include one or more program modules (not shown), including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment. Further, the processor 901 may be arranged to communicate with the memory 902 and to execute, on the handwritten multi-line character segmentation apparatus, a series of computer-executable instructions held in the memory 902. The apparatus may also include one or more power supplies 903, one or more wired or wireless network interfaces 904, one or more I/O interfaces (input-output interfaces) 905, one or more external devices 906 (e.g., a keyboard, a hand-drawing pad, a Bluetooth device, etc.), one or more devices that enable a user to interact with the apparatus, and/or any devices (e.g., routers, modems, etc.) that enable the apparatus to communicate with one or more other computing devices. Such communication may occur through the I/O interface 905. The apparatus can also communicate with one or more networks (e.g., a Local Area Network (LAN)) via the wired or wireless network interface 904.
In particular, in this embodiment, the handwritten multi-line character segmentation apparatus includes a memory and one or more programs stored in the memory. The one or more programs may include one or more modules, each of which may include a series of computer-executable instructions for the apparatus, and the one or more programs, configured to be executed by the one or more processors, contain computer-executable instructions for:
setting an estimated height, a width-height threshold, and an offset matrix for the current text, wherein the estimated height is a first proportional value of the heights of all stroke number positions, the width-height threshold is the larger of the average width and the average height of all strokes, and the offset matrix is an upright rectangle containing the strokes of the comparison object that lie in the range from the minimum X-axis value of the current object minus the estimated height to its maximum X-axis value plus the estimated height;
performing one or more segmentation operations on the text based on the estimated height, the width-height threshold, and the offset matrix, the segmentation operations including: deleting abnormal strokes; pre-segmentation; placing isolated points and pen-pouring strokes in rows of their own; re-dividing oversized rows; reordering rows; space merging; dividing a row whose in-line word spacing is too large into a plurality of rows; and determining whether the whole is a single word.
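The three preset quantities can be sketched in Python as follows. This is only one possible reading of the text: the disclosure does not state the value of the "first proportional value", so `ratio` is left as a caller-supplied parameter, and treating the estimated height as the height found at that proportional position in the sorted stroke heights is an assumption. The `Stroke` class and all function names are hypothetical, and only the horizontal extent of the offset rectangle is computed because that is the part the text spells out.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Tuple


@dataclass
class Stroke:
    """Hypothetical stroke: a list of sampled (x, y) pen positions."""
    points: List[Tuple[float, float]]

    @property
    def x_range(self) -> Tuple[float, float]:
        xs = [p[0] for p in self.points]
        return min(xs), max(xs)

    @property
    def y_range(self) -> Tuple[float, float]:
        ys = [p[1] for p in self.points]
        return min(ys), max(ys)

    @property
    def width(self) -> float:
        lo, hi = self.x_range
        return hi - lo

    @property
    def height(self) -> float:
        lo, hi = self.y_range
        return hi - lo


def estimated_height(strokes: List[Stroke], ratio: float) -> float:
    """Height at the given proportional position of the sorted stroke heights.

    Assumption: 'ratio' stands in for the unspecified first proportional value.
    """
    heights = sorted(s.height for s in strokes)
    index = min(int(len(heights) * ratio), len(heights) - 1)
    return heights[index]


def width_height_threshold(strokes: List[Stroke]) -> float:
    """Larger of the average stroke width and the average stroke height."""
    avg_width = mean([s.width for s in strokes])
    avg_height = mean([s.height for s in strokes])
    return max(avg_width, avg_height)


def offset_x_range(current: Stroke, est_height: float) -> Tuple[float, float]:
    """Horizontal extent of the offset rectangle around the current object."""
    x_min, x_max = current.x_range
    return x_min - est_height, x_max + est_height
```

Leaving `ratio` explicit keeps the sketch from committing to a constant the source does not give; a caller would pick it to match whatever proportion the method embodiments specify.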
Based on the same idea, exemplary embodiments of the present disclosure further provide a computer readable storage medium having stored thereon a program product capable of implementing the method described in the present specification. In some possible implementations, various aspects of the disclosure may also be implemented in the form of a program product comprising program code for causing a terminal device to carry out the steps according to the various exemplary embodiments of the disclosure as described in the "exemplary methods" section of this specification, when the program product is run on the terminal device.
Referring to fig. 10, a program product 1000 for implementing the above-described method according to an exemplary embodiment of the present disclosure is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present disclosure is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java or C++ and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. Where a remote computing device is involved, it may be connected to the user's computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., via the Internet using an Internet service provider).
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, including several instructions to cause a computing device (may be a personal computer, a server, a terminal device, or a network device, etc.) to perform the method according to the exemplary embodiments of the present disclosure.
Furthermore, the above-described figures are only schematic illustrations of processes included in the method according to the exemplary embodiments of the present disclosure, and are not intended to be limiting. It will be readily appreciated that the processes shown in the above figures do not indicate or limit the temporal order of these processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, for example, among a plurality of modules.
It should be noted that although in the above detailed description several modules or units of a device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit in accordance with exemplary embodiments of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure that follow the general principles of the disclosure and include such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. A method of segmentation of handwritten multi-line characters for use in recognition of text, said text including a plurality of lines of handwritten characters, said characters being comprised of one or more strokes, each of said strokes being readable, said method comprising:
setting an estimated height, a width-height threshold, and an offset matrix for the current text, wherein the estimated height is a first proportional value of the heights of all stroke number positions, the width-height threshold is the larger of the average width and the average height of all strokes, and the offset matrix is an upright rectangle containing the strokes of the comparison object that lie in the range from the minimum X-axis value of the current object minus the estimated height to its maximum X-axis value plus the estimated height;
performing one or more segmentation operations on the text based on the estimated height, the width-height threshold, and the offset matrix, the segmentation operations including: deleting abnormal strokes; pre-segmentation; placing isolated points and pen-pouring strokes in rows of their own; re-dividing oversized rows; reordering rows; space merging; dividing a row whose in-line word spacing is too large into a plurality of rows; and determining whether the whole is a single word.
2. The method of claim 1, wherein deleting abnormal strokes comprises deleting any stroke that satisfies one of the following:
the width and the height are both greater than 5 times the width-height threshold;
the height is greater than 7 times the width-height threshold;
the height is greater than 3.75 times the estimated height.
3. The method of claim 1, wherein the pre-segmentation comprises dividing a stroke into the next row when one of the following holds:
the X-axis maximum of all strokes in the current row minus the X-axis maximum of the stroke is greater than the estimated height;
the Y-axis minimum of the stroke minus the Y-axis maximum of all strokes in the current row is greater than 2 times the estimated height;
the Y-axis minimum of the stroke is greater than the Y-axis maximum of the offset matrix.
4. The method of claim 1, wherein placing isolated points and pen-pouring strokes in rows of their own comprises:
placing in a row of its own each stroke that does not lie within the 95.449974% confidence interval of the normal distribution of the Y-axis values of the points contained in all of the strokes;
and wherein re-dividing oversized rows comprises:
among the rows already divided, if the height of a row is greater than 2 times the estimated height, dividing that row by the number of categories obtained from density-based cluster analysis.
5. The method of claim 1, wherein the reordering comprises: placing first the rows whose circumscribed rectangle has the smaller center value.
6. The method of claim 1, wherein the space merging comprises: comparing two adjacent rows of characters, and merging the two adjacent rows when one of the following conditions holds:
the Y-axis minimum of the current row is greater than the Y-axis minimum of the comparison row, and the Y-axis maximum of the current row is less than the Y-axis maximum of the comparison row;
the height of one of the current row and the comparison row is less than 0.9 times the estimated height, and the ratio of the overlapping height of the current row and the comparison row to the two rows is greater than 0.85;
the overlapping heights of the current row and the comparison row are larger than 0.9 times of the two rows, and the overlapping positions of the current row and the comparison row are larger than the combining height of the two rows at the overlapping positions of 0.9 times.
7. The method of claim 1, wherein dividing a row whose in-line word spacing is too large into a plurality of rows comprises:
within the same row, if the distance between two characters is greater than 5 times the estimated height, dividing the row into two rows, with the midpoint between the two characters as the dividing line.
8. The method of claim 1, wherein determining whether the whole is a single word comprises:
if the entire text has multiple rows and its overall width is greater than 2 times its height, determining that it is not a single word;
if the height of the text is greater than 2 times its width, determining that it is a single word.
9. A handwritten multi-line character segmentation apparatus, comprising:
a preset module configured to set an estimated height, a width-height threshold, and an offset matrix for the current text, wherein the estimated height is a first proportional value of the heights of all stroke number positions, the width-height threshold is the larger of the average width and the average height of all strokes, and the offset matrix is an upright rectangle formed by subtracting the estimated height from the minimum X-axis value of the current character and adding the estimated height to the maximum X-axis value;
an execution module configured to perform one or more segmentation operations on the text based on the estimated height, the width-height threshold, and the offset matrix, the segmentation operations including: deleting abnormal strokes; pre-segmentation; placing isolated points and pen-pouring strokes in rows of their own; re-dividing oversized rows; reordering rows; space merging; dividing a row whose in-line word spacing is too large into a plurality of rows; and determining whether the whole is a single word.
10. A handwritten multi-line character segmentation apparatus, comprising:
a processor; and a memory arranged to store computer-executable instructions that, when executed, cause the processor to perform:
setting an estimated height, a width-height threshold, and an offset matrix for the current text, wherein the estimated height is a first proportional value of the heights of all stroke number positions, the width-height threshold is the larger of the average width and the average height of all strokes, and the offset matrix is an upright rectangle formed by subtracting the estimated height from the minimum X-axis value of the current character and adding the estimated height to the maximum X-axis value;
performing one or more segmentation operations on the text based on the estimated height, the width-height threshold, and the offset matrix, the segmentation operations including: deleting abnormal strokes; pre-segmentation; placing isolated points and pen-pouring strokes in rows of their own; re-dividing oversized rows; reordering rows; space merging; dividing a row whose in-line word spacing is too large into a plurality of rows; and determining whether the whole is a single word.
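The numeric rules in claims 4, 7 and 8 can be illustrated with a short Python sketch. The figure 95.449974% is the probability mass within two standard deviations of the mean of a normal distribution, so the interval is computed here as mean ± 2σ. Everything else is an assumption made for illustration: the function names, the input shapes (each stroke as a list of Y values, each character as an X-axis span), the use of the population standard deviation, and the choice to treat a stroke as isolated only when all of its points fall outside the interval.

```python
from statistics import mean, pstdev
from typing import List, Optional, Tuple


def isolated_stroke_indices(stroke_ys: List[List[float]]) -> List[int]:
    """Indices of strokes lying wholly outside the two-sigma Y interval (claim 4).

    stroke_ys holds, for each stroke, the Y values of its points.
    """
    all_ys = [y for ys in stroke_ys for y in ys]
    mu, sigma = mean(all_ys), pstdev(all_ys)
    low, high = mu - 2 * sigma, mu + 2 * sigma
    return [
        i for i, ys in enumerate(stroke_ys)
        if all(y < low or y > high for y in ys)
    ]


def split_positions(char_spans: List[Tuple[float, float]],
                    est_height: float) -> List[float]:
    """X positions at which a row is cut (claim 7).

    char_spans holds the (x_min, x_max) span of each character in one row.
    A cut is placed midway between neighbours whose gap exceeds 5 times
    the estimated height.
    """
    spans = sorted(char_spans)
    cuts = []
    for (_, left_end), (right_start, _) in zip(spans, spans[1:]):
        if right_start - left_end > 5 * est_height:
            cuts.append((left_end + right_start) / 2)
    return cuts


def is_single_word(width: float, height: float,
                   row_count: int) -> Optional[bool]:
    """Apply the two rules of claim 8; None when neither rule applies."""
    if row_count > 1 and width > 2 * height:
        return False
    if height > 2 * width:
        return True
    return None
```

Returning `None` from `is_single_word` makes explicit that the two stated rules do not cover every width-to-height combination; how the remaining cases are handled is not specified in the claims.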
CN202310341510.4A 2023-03-31 2023-03-31 Handwriting multi-line character segmentation method, device and equipment Pending CN116363659A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310341510.4A CN116363659A (en) 2023-03-31 2023-03-31 Handwriting multi-line character segmentation method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310341510.4A CN116363659A (en) 2023-03-31 2023-03-31 Handwriting multi-line character segmentation method, device and equipment

Publications (1)

Publication Number Publication Date
CN116363659A true CN116363659A (en) 2023-06-30

Family

ID=86936288

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310341510.4A Pending CN116363659A (en) 2023-03-31 2023-03-31 Handwriting multi-line character segmentation method, device and equipment

Country Status (1)

Country Link
CN (1) CN116363659A (en)

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination