US5138668A - Character discrimination system employing height-to-width ratio and vertical extraction position information - Google Patents


Info

Publication number
US5138668A
Authority
US
United States
Prior art keywords
character
rectangular area
discrete
electronically
characters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
US07/742,449
Other languages
English (en)
Inventor
Keiko Abe
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sony Corp
Original Assignee
Sony Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Corp

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/158 Segmentation of character regions using character size, text spacings or pitch estimation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Definitions

  • This invention relates to a character recognition system, and more particularly to a character recognition system by which a character area can be extracted efficiently.
  • a process of recognizing a character in accordance with character information extracted from a character row signal using a rectangular area which is formed from the character row signal and circumscribes a complete character or a component of a discrete character which comprises the step of judging that the rectangular area is a component of a discrete character in accordance with a height-to-width ratio and a vertical extraction position of the rectangular area.
  • a process of recognizing a character in accordance with character information extracted from a character row signal using a rectangular area which is formed from the character row signal and circumscribes a complete character or a component of a discrete character, which comprises the steps of judging that the character area is a component of a discrete character in accordance with a height-to-width ratio and a vertical extraction position of the character area, judging, when it is judged that the first rectangular area is a component of a discrete character, that a second adjacent character area is another component of the discrete character, integrating the second character area with the first character area under the condition that the character pitch of the integrated character areas does not exceed an average character pitch, and discriminating the character information extracted from the integrated first and second character areas as character information of the components of the discrete character.
  • a character recognition system which recognizes a character in accordance with character information extracted from a character row signal using a rectangular area which is formed from the character row signal and circumscribes a complete character or a component of a discrete character, which comprises means for judging that the rectangular area is a component of a discrete character in accordance with a height-to-width ratio and a vertical extraction position of the rectangular area.
  • a character recognition system which recognizes a character in accordance with character information extracted from a character row signal using a rectangular area which is formed from the character row signal and circumscribes a complete character or a component of a discrete character, which comprises means for judging that the rectangular area is a component of a discrete character in accordance with a height-to-width ratio and a vertical extraction position of the rectangular area, means for judging, when it is judged that the first rectangular area is a component of a discrete character, that a second adjacent character area is another component of the discrete character, means for integrating the second character area with the first character area under the condition that the character pitch of the integrated character areas does not exceed an average character pitch, and means for discriminating the character information extracted from the integrated first and second character areas as character information of the components of the discrete character.
  • a component of any discrete character is discriminated in accordance with a height-to-width ratio and a vertical extraction position of the rectangular area. Accordingly, extraction accuracy of a discrete character is improved.
  • FIG. 1 is a block diagram of a character recognition system showing a preferred embodiment of the present invention;
  • FIG. 2 is a schematic diagram illustrating extraction of a character row;
  • FIGS. 3, 4A, 4B, 4C and 5 are schematic diagrams illustrating extraction of characters;
  • FIG. 6 is a flow chart showing the procedure of the extraction processing by the character recognition system shown in FIG. 1;
  • FIGS. 7, 8A, 8B and 8C are schematic diagrams illustrating the integration processing of a discrete character.
  • a character recognition system 1 is connected to a document image reader 2 and receives an image signal S1 produced by the document reader 2.
  • the character recognition system 1 includes a pre-processing section including a noise eliminating means 3 and a rotation correcting means 4, a character area processing section including a character row extracting means 5 and a character extracting means 6, and a character discriminator 7.
  • the noise eliminating means 3 receives an input image signal S1 from the document image reader 2 and removes from the input image signal S1 noise images such as isolated points included in the document image read by the document image reader 2 so that isolated points or the like are not erroneously recognized as part of a character or characters.
  • the noise eliminating means 3 thus delivers a noise-free image signal S2 to the rotation correcting means 4.
  • Upon reception of such noise-free image signal S2, the rotation correcting means 4 corrects rotation of the read document on a plane and delivers a corrected image signal S3 to the character row extracting means 5.
  • the character row extracting means 5 separates the printed document into a character area and other areas (e.g. photograph, drawing and so forth) and then extracts only image character data included in the character area. Then, the character row extracting means 5 confirms that character lines included in the character area are laterally arranged rows, extracts the character rows and delivers a signal S4 representative of the character rows.
  • the character extracting means 6 receives the character row signal S4 from the character row extracting means 5 and extracts from the character row signal S4 non-discrete ordinary characters and special characters in the form of em characters while it also extracts discrete characters making use of an integration technique as is required.
  • the character extracting means 6 delivers data of the thus extracted characters as input character data S5 to the character discriminator 7.
  • the character discriminator 7 has a dictionary of standard characters covering all of the characters to be discriminated and selects the standard character whose features are most similar to those of the input character data S5. The character discriminator 7 thus delivers data S6 of the recognized character as an output of the character recognition system 1.
  • Extraction of character rows by the character row extracting means 5 is executed in the following manner.
  • the position of each dot in a character area AR is represented by x-y coordinates wherein the x-axis extends in the horizontal direction and the y-axis extends in the vertical direction
  • the sums of dots of the logic "1" level in the form of dots constituting black characters included in the character area AR are taken in directions parallel to the x-axis and projected on the y-axis to produce a y-projection signal Sy which will be hereinafter referred to as y-projection.
  • the signal level of the y-projection signal Sy is "0" at any position between adjacent character rows AR1, AR2, . . . because there is no black character portion. To the contrary, at any position on the y-axis corresponding to the character rows AR1, AR2 and so forth, the signal level corresponds to a total number of dots on the line passing the point on the y-axis and extending parallel to the x-axis. Thus, the y-projection signal Sy is compared with a predetermined threshold level to consequently obtain character row extraction data CL which presents the logic "1" level in any region where the level of the signal Sy is higher than the threshold level.
  • the character row extracting means 5 uses such character row extraction data CL to extract those portions of the corrected image signal S3 delivered from the rotation correcting means 4 which correspond to timings at which the character row extraction data CL presents the logic "1" level.
  • the character row extracting means 5 thus delivers the extracted signal portions as a character row signal S4 indicative of the individual character rows AR1, AR2 and so forth.
  • any region where the character row extraction data CL presents the logic "1" level has a maximum height HL (HL1, HL2, . . . ) of the character row AR1, AR2, . . . , and any vertical position in the y-axis direction of a character included in any character row is within the maximum height HL of the relevant character row.
  • the character extracting means 6 receives the character row signal S4 and executes such a processing that it detects, with respect to both x-axis and y-axis directions, the positions and ranges where characters and components of discrete characters exist in the individual character rows AR1, AR2 and so forth to extract each of the positions and ranges as a rectangular area CHR surrounded by a circumscribing frame FRAME which circumscribes a complete character or a component of a character as illustrated in FIG. 3.
  • such extraction process is executed such that x-projection is performed in the y-axis direction corresponding to the character height direction and y-projection is performed in the x-axis direction corresponding to the character width direction to obtain an x-projection signal Sx and a y-projection signal Sy, and the x- and y-projection signals Sx and Sy are compared with a predetermined threshold level to detect positions of the circumscribing frames FRAME in both x- and y-axis directions.
  • the first rectangular area CHR H presents a height-to-width ratio h H /w H substantially equal to 1, as given by $$\frac{h_H}{w_H} \approx 1 \tag{1}$$
  • Such rectangular area CHR H is provided by ordinary characters of the non-discrete square or em-character type which have no such discontinuity as in a discrete character and have no special fixed feature with regard to its size and shape. Such ordinary characters will be hereinafter referred to as non-discrete ordinary characters, and most of the Japanese characters belong to such non-discrete ordinary characters.
  • the second rectangular area CHR B makes a component of a discrete character and presents a height-to-width ratio h B /w B greater than 1, as given by $$\frac{h_B}{w_B} > 1 \tag{2}$$
  • Such rectangular area CHR B is provided by components of discrete characters and vertically elongate special characters and has a tendency that the position thereof in the character height direction is substantially at the center with respect to the maximum height HL.
  • the third rectangular area CHR T has no particular fixed feature in its height-to-width ratio h T /w T , but the height h T and the width w T thereof are comparatively small, and the position thereof in the character height direction is not at the vertical center with respect to the maximum height HL.
  • Some special characters such as "'", ",", "." and so forth belong to the third type.
  • rectangular areas CHR (FIG. 3) obtained by an extracting operation of characters from each of the character rows AR1, AR2, and so forth present a random arrangement within the maximum height HL of the relevant character row.
  • rectangular areas CHR H of non-discrete ordinary characters, rectangular areas CHR B of some special characters or components of discrete characters, and rectangular areas CHR T of special characters may be arranged successively at random in the x-axis direction in each character row.
  • the character extracting means 6 properly discriminates between the types of such normal characters and special characters out of the arrangement of rectangular areas CHR obtained per line. Then, if there exists any discrete character, an average character pitch P given by the following equation $$P = \frac{1}{n}\sum_{i=1}^{n} P_i \tag{3}$$ (where the P_i are the pitches of the n square characters used for the calculation) is used for the purpose of exactly extracting such character.
  • the rectangles are processed as rectangles of a single square character by a so-called blurring processing.
  • the character extracting means 6 adopts a maximum height HL1 of the first character row as the average character pitch P as given by the equation (3) above. Then, in processing of any of the following character rows, the character extracting means 6 executes a calculation of the equation (3) using a square character pitch Pi of the preceding character row to find out an average character pitch P.
  • the above procedure is based on the reason that, in regard to the first character row, it is impossible to obtain an average character pitch by calculation.
  • the character extracting means 6 executes such an extraction processing program RTO as shown in FIG. 6 in order to extract characters from each of the character rows AR1, AR2 and so forth.
  • the character extracting means 6 executes a basic square extracting processing for each of the character rows AR1, AR2 and so forth in a step SP1 to generate a row of such rectangles as shown in FIGS. 4A, 4B and 4C wherein rectangular areas CHR are arranged sequentially in the x-axis direction with rectangular spaces d left therebetween.
  • the character extracting means 6 executes such an integration processing as shown in FIGS. 7 and 8A to 8C successively for the first, second and successive rectangular areas CHR constituting the relevant rectangle row.
  • the character extracting means 6 makes a decision as to whether or not the height-to-width ratio h/w of a rectangular area CHR, the first rectangular area of the relevant rectangle row in this instance, is greater than 1. When the result of such decision is negative, this signifies that the relevant area CHR belongs to a character which has a height-to-width ratio h/w substantially equal to 1 such as a non-discrete square character or a special character of a small size.
  • the character extracting means 6 terminates the extraction processing program in a step SP3 and delivers the character data of the thus processed rectangular area CHR as input character data S5 to the character discriminator 7.
  • the character extracting means 6 proceeds to a step SP4 and makes a decision as to whether or not the vertical position of the extracted rectangular area CHR is at the center.
  • the step SP4 is provided to judge whether a relevant rectangular area CHR belongs to a discrete character or a special character, and when the result of such decision is negative, this signifies that the rectangular area CHR is above or below a center line L CTR passing through the center position of the maximum height HL of the relevant character row as seen at a rectangular area CHR X1 or CHR X2 in FIG. 7.
  • the character extracting means 6 proceeds to the step SP3 to terminate the processing program and delivers the character data of the processed rectangular area CHR as input character data S5 to the character discriminator 7.
  • When the result of the decision in the step SP4 is affirmative, the rectangular area CHR is regarded as a component of a discrete character, and in this instance, the character extracting means 6 proceeds to a step SP5.
  • the character extracting means 6 is capable of exactly distinguishing a component of a discrete character from any other special character.
  • the procedure then advances to the step SP5.
  • a decision is made as to whether or not the height-to-width ratio of a next rectangular area adjacent to the rectangular area CHR is greater than 1. In case the result of such decision is negative, this signifies that the rectangular area such as a rectangular area CHR11 shown in FIG. 8A is followed by a rectangular area having a height-to-width ratio h/w substantially equal to 1 such as a rectangular area CHR12 shown in FIG. 8A.
  • next rectangular area CHR12 fails to satisfy the condition required for a component of any discrete character, and this signifies that the rectangular area CHR11 which satisfies the condition required for a component of a discrete character is followed by the rectangular area CHR12 which cannot be integrated with the preceding character area CHR11.
  • the character extracting means 6 terminates the processing program in the step SP3 and delivers the input character data S5 indicative of the rectangular area CHR11 to the character discriminator 7.
  • If the result of the aforementioned decision in the step SP5 is affirmative, this signifies that a rectangular area such as a rectangular area CHR22 or CHR24 shown in FIG. 8B following another rectangular area such as a rectangular area CHR21 or CHR23 shown in FIG. 8B satisfies a requirement for a component of a discrete character.
  • the character extracting means 6 thus proceeds to a step SP6.
  • In the step SP6, a decision is made as to whether or not the extraction position of the following adjacent rectangular area is at the center. This is a confirmation of a second condition that the following rectangular area makes a component of a discrete character.
  • If the result of such decision is negative in the step SP6, this signifies that the following adjacent rectangular area is not astride the center line L CTR as described hereinabove in connection with FIG. 7 and thus signifies that the relevant rectangular area belongs to a special character but not to a component of any discrete character.
  • the character extracting means 6 subsequently proceeds to the step SP3 to terminate the processing program and delivers character data indicative of the rectangular area CHR21 or CHR23 being processed for extraction at present in the case of FIG. 8B as input character data S5 to the character discriminator 7.
  • If the result obtained in the step SP6 is affirmative, this signifies that the extraction position of the following rectangular area is astride the center line L CTR as mentioned hereinabove in connection with FIG. 7 and thus signifies that the following adjacent rectangular area satisfies the second condition for a component of a discrete character.
  • In FIG. 8B, for example, the following rectangular area CHR22 or CHR24 adjacent to the rectangular area CHR21 or CHR23 which is being processed at present also satisfies the second condition for a component of a discrete character.
  • the character extracting means 6 proceeds to a step SP7 at which it makes a decision as to whether or not the rectangular area integrated with the following rectangular area presents a smaller pitch than the average character pitch. This is a confirmation of a third condition that the rectangular area which is now being processed makes a component of a discrete character.
  • the average character pitch P is calculated on the basis of the square or em characters included in the preceding line to the line which includes the rectangular area being processed now. Practically, however, a character whose pitch is greater than that of a square or em character is not included in a printed document. Therefore, such integration of character areas as would produce a character pitch greater than the average character pitch P must be inhibited.
  • If the result obtained in the step SP7 is negative, the character extracting means 6 proceeds to the step SP3 to terminate the processing program and delivers to the character discriminator 7 input character data S5 which represents that the rectangular area being processed now belongs to an independent character but not to a component of any discrete character.
  • If the result obtained in the step SP7 is affirmative, this signifies that the integrated character areas satisfy the third condition for a component of a discrete character. Therefore, the character extracting means 6 proceeds to a step SP8 to actually execute integration processing of the rectangular area with the following rectangular area and then returns to the aforementioned step SP5.
  • the character extracting means 6 integrates, in the case of FIG. 8B, for example, the rectangular area CHR21 or CHR23 being processed now with the following character area CHR22 or CHR24 on the ground that a discrete character which may be formed by integration of the rectangular area CHR21 or CHR23 being processed now with the following rectangular area CHR22 or CHR24 would have a character pitch smaller than the average character pitch P.
  • the character extracting means 6 executes the aforementioned decisions in the steps SP5, SP6 and SP7 with respect to a further following rectangular area adjacent to the integrated rectangular area and, when the results obtained in the steps are all affirmative, the integration processing is executed again in the step SP8. To the contrary, if any one of the results obtained is negative, the extraction processing program is terminated in the step SP3, and discrete character data composed of two or more integrated rectangular areas is delivered as input character data S5 to the character discriminator 7.
  • When the character extracting means 6 returns to the step SP5 after integration of the rectangular area CHR24 with the area CHR23 in FIG. 8B, a decision is made in the step SP5 with regard to a height-to-width ratio of a following rectangular area CHR25.
  • the height-to-width ratio of the rectangular area CHR25 is substantially equal to 1, and consequently, the character extracting means 6 obtains a negative result in the step SP5.
  • the character extracting means 6 proceeds to the step SP3 without executing an integrating processing of the rectangular area CHR25 and thus delivers the discrete character data of the rectangular areas CHR23 and CHR24 as input character data S5 to the character discriminator 7.
  • the extraction position is adopted, in addition to the height-to-width ratio, as a condition for deciding that the rectangular area to be processed is a component of a discrete character. Accordingly, the extraction accuracy of any discrete character can be further enhanced.
  • the integration is executed in such a manner that the height-to-width ratio and the extraction position of the next rectangular area are judged while maintaining the condition that the character pitch after such integration processing never exceeds the average character pitch P. Consequently, the extraction accuracy of a discrete character can be further improved.
  • the above embodiment has been described with regard to an exemplary case of calculating the average character pitch P in each character row and executing integration of a discrete character by utilizing the average character pitch P obtained in a preceding character row.
  • As a modification, all or some of the square characters in a character row may be used to calculate the average character pitch P for that character row, or else an average character pitch calculated from some of the square or em characters in a preceding character row may be used as the average character pitch data in the relevant character row.
  • the average character pitch data may be calculated with regard to square characters in a plurality of character rows.
  • the new character row may be regarded as the first character row, and the process of such exclusion may be executed by utilizing the maximum height in the new character row.
  • the characters to be excluded from calculation of an average character pitch may be selected with reference to the maximum height in the first character row of the inserted paragraph of the 8-point character rows, which will assure execution of desired extraction of discrete characters with a sufficiently high precision in practical use.
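
The character row extraction described above (a y-projection signal Sy compared against a threshold to give the row extraction data CL) can be sketched in a few lines of Python. This is only an illustrative sketch under assumptions that are not in the patent text: the image is taken to be a list of rows of 0/1 dots, and the function name `extract_rows` and the default threshold of 0 are hypothetical.

```python
def extract_rows(image, threshold=0):
    """Return (top, bottom) inclusive y-ranges of character rows.

    image: list of horizontal lines, each a list of 0/1 dots
           (1 = black dot). This representation is an assumption.
    """
    # y-projection Sy: count of black dots on each horizontal line.
    sy = [sum(line) for line in image]
    # Row extraction data CL: logic 1 where Sy exceeds the threshold.
    cl = [1 if s > threshold else 0 for s in sy]

    rows = []
    start = None
    for y, flag in enumerate(cl):
        if flag and start is None:
            start = y                    # a character row begins
        elif not flag and start is not None:
            rows.append((start, y - 1))  # the row ended on the previous line
            start = None
    if start is not None:                # row runs to the bottom edge
        rows.append((start, len(cl) - 1))
    return rows


image = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
    [1, 0, 1],
    [0, 0, 0],
]
rows = extract_rows(image)
# Maximum height HL of each row, in dots.
heights = [bottom - top + 1 for top, bottom in rows]
```

Each returned range corresponds to one character row AR1, AR2 and so forth, and its extent gives the maximum height HL of that row.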
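
The handling of the average character pitch P described above (the maximum height HL1 stands in for P on the first character row, and later rows use the square character pitches Pi of the preceding row) might look as follows. The sketch assumes that equation (3) is a plain arithmetic mean; the function name `average_pitch` and its parameters are hypothetical.

```python
def average_pitch(prev_row_square_pitches, first_row_max_height):
    """Average character pitch P used when integrating discrete characters.

    prev_row_square_pitches: pitches Pi of the square characters found in
        the preceding character row (empty for the first row).
    first_row_max_height: maximum height HL1 of the first character row,
        used as P when no preceding row is available.
    """
    if not prev_row_square_pitches:
        # First row: no average can be calculated yet, so HL1 is adopted.
        return float(first_row_max_height)
    # Assumed form of equation (3): arithmetic mean of the pitches Pi.
    return sum(prev_row_square_pitches) / len(prev_row_square_pitches)
```

Using the row height as the initial value relies on square characters being roughly as wide as the row is tall, which is the reason the description gives for the first-row fallback.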
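
The integration procedure of steps SP4 to SP8 can be summarized in code. The following is a minimal sketch, not the patented implementation, and it makes several assumptions: rectangles are given as left/top/width/height values with y increasing downward, "greater than 1" is tested with a hypothetical tolerance `margin`, "at the center" is read as "astride the center line", and the width of the merged rectangle is used as a stand-in for its character pitch. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: float  # left edge
    y: float  # top edge
    w: float  # width (assumed > 0)
    h: float  # height


def is_tall(r, margin=0.2):
    # Height-to-width ratio clearly greater than 1 (margin is an assumption).
    return r.h / r.w > 1.0 + margin


def is_centered(r, center_y):
    # The rectangle is astride the center line of the character row.
    return r.y <= center_y <= r.y + r.h


def merge(a, b):
    # Smallest rectangle circumscribing both areas.
    left, top = min(a.x, b.x), min(a.y, b.y)
    right = max(a.x + a.w, b.x + b.w)
    bottom = max(a.y + a.h, b.y + b.h)
    return Rect(left, top, right - left, bottom - top)


def extract_character(rects, i, center_y, avg_pitch):
    """Return (character_rect, index_of_next_unprocessed_rect)."""
    cur = rects[i]
    j = i + 1
    # Ratio decision, then SP4: is the area a discrete-character component?
    if not (is_tall(cur) and is_centered(cur, center_y)):
        return cur, j                      # SP3: deliver as it is
    while j < len(rects):
        nxt = rects[j]
        if not is_tall(nxt):               # SP5: ratio of the next area
            break
        if not is_centered(nxt, center_y): # SP6: position of the next area
            break
        merged = merge(cur, nxt)
        if merged.w > avg_pitch:           # SP7: pitch must not exceed P
            break
        cur, j = merged, j + 1             # SP8: integrate, back to SP5
    return cur, j                          # SP3


# Loosely modeled on the CHR23, CHR24, CHR25 situation of FIG. 8B.
row = [Rect(0, 2, 4, 10), Rect(6, 2, 4, 10), Rect(12, 1, 12, 12)]
char, nxt = extract_character(row, 0, center_y=7, avg_pitch=12)
```

With these sample values the first two tall, centered areas are merged into one character, and the third area, whose ratio is 1, stops the loop and is left to be processed as its own character.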

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Character Input (AREA)
US07/742,449 1988-05-19 1991-08-05 Character discrimination system employing height-to-width ratio and vertical extraction position information Expired - Fee Related US5138668A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP63-122272 1988-05-19
JP63122272A JP2822189B2 (ja) 1988-05-19 1988-05-19 Character recognition apparatus and method

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US07352129 Continuation 1989-05-15

Publications (1)

Publication Number Publication Date
US5138668A (en) 1992-08-11

Family

ID=14831855

Family Applications (1)

Application Number Title Priority Date Filing Date
US07/742,449 Expired - Fee Related US5138668A (en) 1988-05-19 1991-08-05 Character discrimination system employing height-to-width ratio and vertical extraction position information

Country Status (6)

Country Link
US (1) US5138668A (de)
JP (1) JP2822189B2 (de)
KR (1) KR890017630A (de)
DE (1) DE3916323A1 (de)
FR (1) FR2631723A1 (de)
GB (1) GB2218839B (de)

Cited By (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5321768A (en) * 1992-09-22 1994-06-14 The Research Foundation, State University Of New York At Buffalo System for recognizing handwritten character strings containing overlapping and/or broken characters
US5343537A (en) * 1991-10-31 1994-08-30 International Business Machines Corporation Statistical mixture approach to automatic handwriting recognition
WO1994028505A1 (en) * 1993-05-20 1994-12-08 Aha! Software Corporation Method and apparatus for grouping and manipulating electronic representations of handwriting, printing and drawings
US5396566A (en) * 1993-03-04 1995-03-07 International Business Machines Corporation Estimation of baseline, line spacing and character height for handwriting recognition
WO1995008158A1 (en) * 1993-09-17 1995-03-23 Fficiency Software, Inc. Universal symbolic handwriting recognition system
US5410611A (en) * 1993-12-17 1995-04-25 Xerox Corporation Method for identifying word bounding boxes in text
WO1995030965A1 (en) * 1994-05-10 1995-11-16 Motorola Inc. Method for recognizing handwritten input
US5535287A (en) * 1990-09-03 1996-07-09 Canon Kabushiki Kaisha Method of and apparatus for separating image
US5557691A (en) * 1992-06-30 1996-09-17 Fujitsu Limited Image processing system
US5563964A (en) * 1990-05-15 1996-10-08 Canon Kabushiki Kaisha Method and apparatus for processing a plurality of designated areas of an image
US5572602A (en) * 1993-02-25 1996-11-05 Fujitsu Limited Image extraction system for extracting patterns such as characters, graphics and symbols from image having frame formed by straight line portions
US5581633A (en) * 1993-06-11 1996-12-03 Fujitsu Limited Method and apparatus for segmenting a character and for extracting a character string based on a histogram
US5596350A (en) * 1993-08-02 1997-01-21 Apple Computer, Inc. System and method of reflowing ink objects
US5684891A (en) * 1991-10-21 1997-11-04 Canon Kabushiki Kaisha Method and apparatus for character recognition
US5717939A (en) * 1991-11-18 1998-02-10 Compaq Computer Corporation Method and apparatus for entering and manipulating spreadsheet cell data
US5729630A (en) * 1990-05-14 1998-03-17 Canon Kabushiki Kaisha Image processing method and apparatus having character recognition capabilities using size or position information
US5742704A (en) * 1993-04-30 1998-04-21 Fuji Xerox Co., Ltd. Image coding apparatus capable of coding in accordance with an image type
US5751850A (en) * 1993-06-30 1998-05-12 International Business Machines Corporation Method for image segmentation and classification of image elements for documents processing
US5757979A (en) * 1991-10-30 1998-05-26 Fuji Electric Co., Ltd. Apparatus and method for nonlinear normalization of image
US5757957A (en) * 1991-11-29 1998-05-26 Ricoh Company, Ltd. Apparatus and method for area separation for image, having improved separation accuracy
US5774582A (en) * 1995-01-23 1998-06-30 Advanced Recognition Technologies, Inc. Handwriting recognizer with estimation of reference lines
US5825920A (en) * 1991-01-28 1998-10-20 Hitachi, Ltd. Method and unit for binary processing in image processing unit and method and unit for recognizing characters
US5835632A (en) * 1995-03-08 1998-11-10 Canon Kabushiki Kaisha Image processing method and an image processing apparatus
US5907630A (en) * 1993-07-07 1999-05-25 Fujitsu Limited Image extraction system
US5911005A (en) * 1994-11-18 1999-06-08 Ricoh Company, Ltd. Character recognition method and system
US5991439A (en) * 1995-05-15 1999-11-23 Sanyo Electric Co., Ltd Hand-written character recognition apparatus and facsimile apparatus
US6005976A (en) * 1993-02-25 1999-12-21 Fujitsu Limited Image extraction system for extracting patterns such as characters, graphics and symbols from image having frame formed by straight line portions
US6256408B1 (en) * 1994-04-28 2001-07-03 International Business Machines Corporation Speed and recognition enhancement for OCR using normalized height/width position
US6587587B2 (en) 1993-05-20 2003-07-01 Microsoft Corporation System and methods for spacing, storing and recognizing electronic representations of handwriting, printing and drawings
US20080193052A1 (en) * 1999-05-25 2008-08-14 Silverbrook Research Pty Ltd Method of interpreting handwritten data inputted on a printed form
CN105095890A (zh) * 2014-04-25 2015-11-25 广州市动景计算机科技有限公司 图像中字符分割方法及装置
USD751573S1 (en) 2012-06-13 2016-03-15 Microsoft Corporation Display screen with animated graphical user interface
US10114889B2 (en) * 2012-06-27 2018-10-30 Beijing Qihoo Technology Company Limited System and method for filtering keywords

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2597006B2 (ja) * 1989-04-18 1997-04-02 シャープ株式会社 Rectangular coordinate extraction method
CA2037173C (en) * 1990-03-30 1996-01-09 Hirofumi Kameyama Character recognizing system
US5850476A (en) * 1995-12-14 1998-12-15 Xerox Corporation Automatic method of identifying drop words in a document image without performing character recognition
US5848191A (en) * 1995-12-14 1998-12-08 Xerox Corporation Automatic method of generating thematic summaries from a document image without performing character recognition
US5892842A (en) * 1995-12-14 1999-04-06 Xerox Corporation Automatic method of identifying sentence boundaries in a document image
KR102256667B1 (ko) 2020-03-23 2021-05-26 주식회사 신한디에스 Document recognition method and apparatus therefor

Citations (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB1144319A (en) * 1965-10-24 1969-03-05 Ibm Character recognition systems
GB1337159A (en) * 1970-09-25 1973-11-14 Ibm Character recognition apparatus
US3846752A (en) * 1972-10-02 1974-11-05 Hitachi Ltd Character recognition apparatus
GB1442273A (en) * 1973-02-21 1976-07-14 Nederlanden Staat Method and device for reading characters preferably digits
US4045773A (en) * 1974-11-13 1977-08-30 Hitachi, Ltd. Pattern segmenting system for a pattern recognizing device
US4162482A (en) * 1977-12-07 1979-07-24 Burroughs Corporation Pre-processing and feature extraction system for character recognition
US4193056A (en) * 1977-05-23 1980-03-11 Sharp Kabushiki Kaisha OCR for reading a constraint free hand-written character or the like
WO1980002761A1 (en) * 1979-06-01 1980-12-11 Dest Data Corp Apparatus and method for separation of optical character recognition data
US4284975A (en) * 1978-12-12 1981-08-18 Nippon Telegraph & Telephone Public Corp. On-line pattern recognition system for hand-written characters
US4317109A (en) * 1979-05-18 1982-02-23 Nippon Telegraph & Telephone Public Corp. Pattern recognition system for hand-written characters operating on an on-line real-time basis
US4365234A (en) * 1980-10-20 1982-12-21 Hendrix Electronics, Inc. Segmentation system and method for optical character scanning
US4377803A (en) * 1980-07-02 1983-03-22 International Business Machines Corporation Algorithm for the segmentation of printed fixed pitch documents
US4527283A (en) * 1980-02-26 1985-07-02 Tokyo Keiki Company Limited Character information separating apparatus for printed character reading systems
US4562594A (en) * 1983-09-29 1985-12-31 International Business Machines Corp. (Ibm) Method and apparatus for segmenting character images
US4594732A (en) * 1983-03-01 1986-06-10 Nec Corporation Letter pitch detection system
US4610025A (en) * 1984-06-22 1986-09-02 Champollion Incorporated Cryptographic analysis system
US4635290A (en) * 1983-12-20 1987-01-06 Nec Corporation Sectioning apparatus and method for optical character reader systems
GB2182796A (en) * 1985-09-27 1987-05-20 Sony Corp Character recognition system
US4932065A (en) * 1988-11-16 1990-06-05 Ncr Corporation Universal character segmentation scheme for multifont OCR images
US4933977A (en) * 1987-11-05 1990-06-12 Glory Kogyo Kabushiki Kaisha Method for identifying plural connected figures
US4959868A (en) * 1984-10-31 1990-09-25 Canon Kabushiki Kaisha Image processing system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS6118079A (ja) * 1984-07-05 1986-01-25 Fujitsu Ltd Pattern separation device
JPS61117670A (ja) * 1984-11-13 1986-06-05 Fujitsu Ltd Character segmentation processing system
JPH0782525B2 (ja) * 1985-07-09 1995-09-06 松下電器産業株式会社 Character recognition device
JPS6316392A (ja) * 1986-07-08 1988-01-23 Matsushita Electric Ind Co Ltd Character recognition device

Cited By (43)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5729630A (en) * 1990-05-14 1998-03-17 Canon Kabushiki Kaisha Image processing method and apparatus having character recognition capabilities using size or position information
US5563964A (en) * 1990-05-15 1996-10-08 Canon Kabushiki Kaisha Method and apparatus for processing a plurality of designated areas of an image
US5535287A (en) * 1990-09-03 1996-07-09 Canon Kabushiki Kaisha Method of and apparatus for separating image
US5825920A (en) * 1991-01-28 1998-10-20 Hitachi, Ltd. Method and unit for binary processing in image processing unit and method and unit for recognizing characters
US5684891A (en) * 1991-10-21 1997-11-04 Canon Kabushiki Kaisha Method and apparatus for character recognition
US5757979A (en) * 1991-10-30 1998-05-26 Fuji Electric Co., Ltd. Apparatus and method for nonlinear normalization of image
US5343537A (en) * 1991-10-31 1994-08-30 International Business Machines Corporation Statistical mixture approach to automatic handwriting recognition
US5717939A (en) * 1991-11-18 1998-02-10 Compaq Computer Corporation Method and apparatus for entering and manipulating spreadsheet cell data
US5757957A (en) * 1991-11-29 1998-05-26 Ricoh Company, Ltd. Apparatus and method for area separation for image, having improved separation accuracy
US5557691A (en) * 1992-06-30 1996-09-17 Fujitsu Limited Image processing system
US5321768A (en) * 1992-09-22 1994-06-14 The Research Foundation, State University Of New York At Buffalo System for recognizing handwritten character strings containing overlapping and/or broken characters
US6005976A (en) * 1993-02-25 1999-12-21 Fujitsu Limited Image extraction system for extracting patterns such as characters, graphics and symbols from image having frame formed by straight line portions
US5572602A (en) * 1993-02-25 1996-11-05 Fujitsu Limited Image extraction system for extracting patterns such as characters, graphics and symbols from image having frame formed by straight line portions
US5396566A (en) * 1993-03-04 1995-03-07 International Business Machines Corporation Estimation of baseline, line spacing and character height for handwriting recognition
US5742704A (en) * 1993-04-30 1998-04-21 Fuji Xerox Co., Ltd. Image coding apparatus capable of coding in accordance with an image type
US7203903B1 (en) 1993-05-20 2007-04-10 Microsoft Corporation System and methods for spacing, storing and recognizing electronic representations of handwriting, printing and drawings
US5517578A (en) * 1993-05-20 1996-05-14 Aha! Software Corporation Method and apparatus for grouping and manipulating electronic representations of handwriting, printing and drawings
US6651221B1 (en) 1993-05-20 2003-11-18 Microsoft Corporation System and methods for spacing, storing and recognizing electronic representations of handwriting, printing and drawings
US6587587B2 (en) 1993-05-20 2003-07-01 Microsoft Corporation System and methods for spacing, storing and recognizing electronic representations of handwriting, printing and drawings
WO1994028505A1 (en) * 1993-05-20 1994-12-08 Aha! Software Corporation Method and apparatus for grouping and manipulating electronic representations of handwriting, printing and drawings
US5581633A (en) * 1993-06-11 1996-12-03 Fujitsu Limited Method and apparatus for segmenting a character and for extracting a character string based on a histogram
US5751850A (en) * 1993-06-30 1998-05-12 International Business Machines Corporation Method for image segmentation and classification of image elements for documents processing
US5907630A (en) * 1993-07-07 1999-05-25 Fujitsu Limited Image extraction system
US5596350A (en) * 1993-08-02 1997-01-21 Apple Computer, Inc. System and method of reflowing ink objects
US5454046A (en) * 1993-09-17 1995-09-26 Penkey Corporation Universal symbolic handwriting recognition system
WO1995008158A1 (en) * 1993-09-17 1995-03-23 Fficiency Software, Inc. Universal symbolic handwriting recognition system
US5410611A (en) * 1993-12-17 1995-04-25 Xerox Corporation Method for identifying word bounding boxes in text
US6256408B1 (en) * 1994-04-28 2001-07-03 International Business Machines Corporation Speed and recognition enhancement for OCR using normalized height/width position
US5600735A (en) * 1994-05-10 1997-02-04 Motorola, Inc. Method of recognizing handwritten input
AU672558B2 (en) * 1994-05-10 1996-10-03 Motorola, Inc. Method for recognizing handwritten input
WO1995030965A1 (en) * 1994-05-10 1995-11-16 Motorola Inc. Method for recognizing handwritten input
US5911005A (en) * 1994-11-18 1999-06-08 Ricoh Company, Ltd. Character recognition method and system
US5774582A (en) * 1995-01-23 1998-06-30 Advanced Recognition Technologies, Inc. Handwriting recognizer with estimation of reference lines
US5835632A (en) * 1995-03-08 1998-11-10 Canon Kabushiki Kaisha Image processing method and an image processing apparatus
US5991439A (en) * 1995-05-15 1999-11-23 Sanyo Electric Co., Ltd Hand-written character recognition apparatus and facsimile apparatus
US20080193052A1 (en) * 1999-05-25 2008-08-14 Silverbrook Research Pty Ltd Method of interpreting handwritten data inputted on a printed form
US20090302107A1 (en) * 1999-05-25 2009-12-10 Silverbrook Research Pty Ltd Method For Online Purchasing Using Printed Form
US7958157B2 (en) * 1999-05-25 2011-06-07 Silverbrook Research Pty Ltd Method of interpreting handwritten data inputted on a printed form
US8010414B2 (en) 1999-05-25 2011-08-30 Silverbrook Research Pty Ltd Method for online purchasing using printed form
USD751573S1 (en) 2012-06-13 2016-03-15 Microsoft Corporation Display screen with animated graphical user interface
US10114889B2 (en) * 2012-06-27 2018-10-30 Beijing Qihoo Technology Company Limited System and method for filtering keywords
CN105095890A (zh) * 2014-04-25 2015-11-25 广州市动景计算机科技有限公司 Method and device for segmenting characters in an image
CN105095890B (zh) * 2014-04-25 2019-02-26 广州市动景计算机科技有限公司 Method and device for segmenting characters in an image

Also Published As

Publication number Publication date
GB2218839B (en) 1992-04-29
GB2218839A (en) 1989-11-22
KR890017630A (ko) 1989-12-16
JPH01292486A (ja) 1989-11-24
JP2822189B2 (ja) 1998-11-11
DE3916323A1 (de) 1989-11-30
FR2631723B1 (de) 1995-04-28
FR2631723A1 (fr) 1989-11-24
GB8911303D0 (en) 1989-07-05

Similar Documents

Publication Publication Date Title
US5138668A (en) Character discrimination system employing height-to-width ratio and vertical extraction position information
US4850025A (en) Character recognition system
US5058182A (en) Method and apparatus for handwritten character recognition
US4903312A (en) Character recognition with variable subdivisions of a character region
US5410611A (en) Method for identifying word bounding boxes in text
JP2694101B2 (ja) Method and apparatus for pattern recognition and validation
EP0483391B1 (de) Automatic signature verification
US6327384B1 (en) Character recognition apparatus and method for recognizing characters
US5212739A (en) Noise tolerant optical character recognition system
US9239946B2 (en) Method and apparatus for detecting and processing specific pattern from image
US4998285A (en) Character recognition apparatus
US5640466A (en) Method of deriving wordshapes for subsequent comparison
US7340110B2 (en) Device and method for correcting skew of an object in an image
JPH05242292A (ja) Separation method
WO1991017519A1 (en) Row-by-row segmentation and thresholding for optical character recognition
US5502777A (en) Method and apparatus for recognizing table and figure having many lateral and longitudinal lines
US5703963A (en) Character recognition apparatus that subdivides a character into subregions to obtain feature vectors
US5535287A (en) Method of and apparatus for separating image
US5253303A (en) Character recognizing method and apparatus thereof
JPH0749926A (ja) Character recognition device
JP3104355B2 (ja) Feature extraction device
US5272765A (en) System for processing character images
JP3006823B2 (ja) Character and word recognition method
JP2612383B2 (ja) Character recognition processing method
JPH11120291A (ja) Pattern recognition system

Legal Events

Date Code Title Description
FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

FPAY Fee payment

Year of fee payment: 4

FPAY Fee payment

Year of fee payment: 8

FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Free format text: PAYER NUMBER DE-ASSIGNED (ORIGINAL EVENT CODE: RMPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

REMI Maintenance fee reminder mailed
LAPS Lapse for failure to pay maintenance fees
FP Lapsed due to failure to pay maintenance fee

Effective date: 20040811

STCH Information on status: patent discontinuation

Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362