WO2015078383A1 - Handwritten character recognition method and system - Google Patents

Handwritten character recognition method and system

Info

Publication number
WO2015078383A1
Authority
WO
WIPO (PCT)
Prior art keywords
stroke
template
character
standard
matching
Prior art date
Application number
PCT/CN2014/092366
Other languages
English (en)
French (fr)
Inventor
江淑红
吴波
Original Assignee
夏普株式会社
江淑红
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 夏普株式会社, 江淑红
Priority to JP2016532526A (patent JP6275840B2)
Publication of WO2015078383A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/32 Digital ink
    • G06V30/36 Matching; Classification
    • G06V30/373 Matching; Classification using a special pattern or subpattern alphabet
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/768 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using context analysis, e.g. recognition aided by known co-occurring patterns
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/28 Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet
    • G06V30/293 Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet of characters other than Kanji, Hiragana or Katakana

Definitions

  • The present invention relates to human-computer interaction techniques and, more particularly, to a handwritten character recognition method and system.
  • Handwritten character recognition technology is one part of natural human-computer interaction.
  • Most handwritten character recognition systems perform recognition only after the user has written all strokes of a character. Such systems are slow when characters with many strokes are input, and writing every stroke of a complex character is also difficult for the user.
  • Some predictive handwritten character recognition methods have been proposed that can recognize a character after only some of its strokes have been input. Some of these methods rely on a predictive character database: given one or more input strokes, the system offers character candidates containing those strokes, based on the usage frequency of all characters in the database that contain them. Another class of methods gives a prediction after the user has input the radical of a character. Such methods likewise provide character candidates based on usage frequency.
  • Chinese Patent Application No. 201210284415, entitled "Fast Input Method Based on Non-Complete Recognition", proposes a handwritten character recognition method that gives prediction candidates based on a prediction rate. This method produces a complete template for each Chinese character and a series of derived sub-templates.
  • A derived sub-template can be the radical of a Chinese character or any other incomplete part of it.
  • Each sub-template has an "integrity weight" based on its degree of completeness relative to the complete Chinese character.
  • The present invention proposes a partial-stroke predictive handwritten character recognition method and system that can give accurate prediction candidates and can recognize the intended character after the user has handwritten only some of its strokes.
  • According to one aspect, a handwritten character recognition method is proposed that first receives a handwritten trajectory input by a user and then matches the handwritten trajectory against at least one stroke template to determine the matching degree of the stroke template. Finally, according to the matching degree, the standard character corresponding to the matched stroke template is output.
  • The stroke template is a matching template of a standard character.
  • The matching templates of at least one standard character comprise a complete stroke template of the at least one standard character and an incomplete stroke template of an incomplete stroke standard character of the at least one standard character, wherein the incomplete stroke standard character corresponding to the incomplete stroke template does not constitute part or all of any other standard character.
  • The matching and outputting steps are performed each time a stroke input is received.
  • The at least one stroke template to be matched against the handwritten trajectory is all matching templates of all standard characters.
  • The step of matching the handwritten trajectory against the at least one stroke template to determine the matching degree of the stroke template further comprises weighting the matching degree according to the difference between the number of strokes already input and the number of strokes of the standard character corresponding to the stroke template being matched.
  • The standard character corresponding to the best-matching stroke template is also displayed in the background of the handwritten trajectory.
  • The incomplete stroke template is generated by the following steps:
  • each standard character is represented by an index number sequence obtained by combining, in the stroke order of the standard character, the index numbers corresponding to its strokes/radicals;
  • a unique index number sequence of the at least one standard character is determined, the unique index number sequence being the partial index number sequence from the starting index number to the differing index number in the index number sequence of the at least one standard character.
  • The unique index number sequence further includes the index numbers after the differing index number.
  • When the stroke/radical corresponding to the differing index number comprises at least two strokes, the incomplete stroke standard characters corresponding to the unique index number sequence further include incomplete stroke standard characters consisting of the strokes/radicals corresponding to the partial index number sequence from the starting index number to the index number immediately preceding the differing index number, plus a stroke-by-stroke portion of the stroke/radical corresponding to the differing index number.
  • Among the incomplete stroke templates obtained, those for which the stroke/radical corresponding to the differing index number is one of a set of specific strokes/radicals are discarded.
  • The specific stroke/radical may be one of the stroke shown in Figure PCTCN2014092366-appb-000001 and "丶".
  • The incomplete stroke templates of the at least one standard character further comprise templates generated by the following steps:
  • the generated incomplete stroke templates and the stroke template corresponding to the intermediate standard character are used as incomplete stroke templates of each of the at least two standard characters.
  • the standard character may be a character of one of Chinese, Japanese, and Korean.
  • A handwritten character recognition system is also proposed, comprising: a handwriting input unit for receiving a handwritten trajectory input by a user; a template repository storing matching templates of standard characters; a template matching unit configured to match the handwritten trajectory against at least one template to determine the matching degree of the template; and an output unit configured to output, according to the matching degree determined by the template matching unit, the standard character corresponding to the matched template, wherein the matching templates of at least one standard character include a complete stroke template of the at least one standard character and an incomplete stroke template of an incomplete stroke standard character of the at least one standard character, and wherein the incomplete stroke standard character corresponding to the incomplete stroke template does not constitute part or all of any other standard character.
  • Each character is represented by a series of "stroke/radical index numbers", and the "unique index (stroke/radical)" of each character is recorded as the unique stroke or radical that represents the character. An incomplete stroke template of the character is then generated according to the "unique index".
  • The handwritten character recognition method and system according to the present invention give prediction candidates based on the "unique index" of characters, so the prediction candidates for each character are not confused with other characters. Moreover, a character can be recognized after the user has entered one or more of its strokes. This reduces effort and improves input efficiency.
  • FIG. 1 is a schematic block diagram showing a handwritten character recognition system according to an embodiment of the present invention.
  • FIG. 2 shows a flow chart of a method for generating an incomplete stroke template in accordance with an embodiment of the present invention.
  • FIG. 3 shows an example of basic strokes and radicals in accordance with an embodiment of the present invention.
  • Figure 4 illustrates a portion of a "stroke index table" in accordance with an embodiment of the present invention
  • FIG. 5 shows an example of a character stroke information sorting table according to an embodiment of the present invention
  • FIG. 6 shows the same index number and unique index number in the character stroke information sorting table shown in FIG. 5;
  • FIG. 7 shows training samples of the complete stroke character "鞭";
  • FIG. 8 shows training samples of incomplete stroke templates of "鞭" obtained from the complete stroke template of "鞭" shown in FIG. 7;
  • FIG. 9 is a flowchart showing a handwritten character recognition method according to an embodiment of the present invention;
  • FIG. 10 shows an example of an operation result of a handwritten character recognition method according to an embodiment of the present invention;
  • FIG. 10(a) shows a handwritten trajectory input by a user;
  • FIG. 10(b) shows an incomplete stroke template that closely matches the handwritten trajectory;
  • FIG. 10(c) shows the recognized character output in the background of the handwritten trajectory;
  • FIG. 11 shows an example of incomplete stroke templates and a complete stroke template of the character "恶" according to an embodiment of the present invention;
  • FIG. 12 illustrates an example of constructing intermediate characters in accordance with an embodiment of the present invention
  • Figure 13 shows a schematic stroke sample.
  • FIG. 1 is a schematic block diagram showing a handwritten character recognition system 100 in accordance with an embodiment of the present invention.
  • the system 100 includes four units: a handwriting input unit 110, a template repository 120, a template matching unit 130, and an output unit 140. The function of each unit is described below.
  • Handwriting input unit 110: this unit receives handwritten input data from the user.
  • When the user writes on a touch-screen electronic device, the unit may be a sensor that detects contact between the user's finger and the touchpad.
  • When the user inputs on the electronic device using, for example, infrared light, the unit may be an infrared sensor.
  • Template repository 120: this unit stores the matching templates of each standard character.
  • Template matching unit 130: this unit determines the matching degree of the matching templates by matching the handwritten input data against the incomplete stroke templates and/or the complete stroke templates.
  • Output unit 140: this unit outputs the recognition result to the user.
  • In one embodiment, the unit outputs the standard character corresponding to the template with the highest matching degree.
  • Optionally, the unit also optimizes all matching results and sorts the candidates according to predetermined rules. For example, on an electronic device with a limited display size, such as a mobile phone, the output unit 140 outputs only the ten results with the highest matching degree. In one embodiment, the output may be ordered by matching degree.
  • FIG. 1 also shows that the system may further include a pre-processing unit 150.
  • Pre-processing unit 150: this unit applies smoothing, linear normalization and/or non-linear normalization to the handwritten input data. Its purpose is to make the handwritten input data smoother to facilitate matching in the template matching unit 130.
  • a key technique of the present invention is the definition of an incomplete stroke template.
  • For characters such as those of Chinese, Japanese and Korean, each character consists of strokes and/or radicals. Different characters may include the same strokes and radicals, but every character necessarily contains a unique stroke or radical sequence that differs from those of other characters.
  • This unique stroke or radical is used to define the incomplete stroke templates of each character. That is, an incomplete stroke template of a character is constructed such that the corresponding incomplete stroke character includes the unique stroke/radical sequence, so that the incomplete stroke character does not constitute part or all of any other character and is thus distinguishable from other characters.
  • FIG. 2 shows a flow diagram of an incomplete stroke template generation method 200 in accordance with an embodiment of the present invention.
  • In step S210, basic strokes and radicals are defined, as shown in FIG. 3.
  • In step S220, a "stroke index table" is defined for the defined basic strokes and radicals, assigning an index number to each stroke/radical.
  • FIG. 4 shows a portion of the "stroke index table".
  • In step S230, each character is represented by an index number sequence obtained by combining, in the stroke order of the character, the index numbers corresponding to its strokes/radicals; sorting the index numbers of each character yields the stroke information of the characters. A table that sorts the stroke information of each character is thus defined.
  • FIG. 5 schematically shows part of the table.
  • In step S240, the unique stroke/radical of each character is found and a "unique index number sequence" is obtained. That is, the index number at which each character differs from other characters is first identified.
  • FIG. 6 shows the same character stroke information sorting table as FIG. 5, with the shared portions marked.
  • The dashed box on the left contains the strokes common to the six characters shown.
  • The middle dashed box contains the strokes common to the first five characters.
  • The dashed box on the right indicates the common radical of "晋" and "戬".
  • The last strokes/radicals of the second, third, fifth and sixth characters (index numbers 211, 226, 233 and 201, respectively) are the unique strokes/radicals of the corresponding characters. A "unique index table" is thus obtained. That is, for "垩", the index number sequence is "1-3-3-7-6-1-211", where "211" indicates its unique stroke/radical. For "晋", the index number sequence is "1-3-3-7-6-1-236", but there is no unique stroke/radical. For "戬", the index number sequence is "1-3-3-7-6-1-236-233", where "233" indicates its unique stroke/radical. For "严", the index number sequence is "1-3-3-7-6-201", where "201" indicates its unique stroke/radical.
  • The unique index number sequence of a character is then determined; it is the partial index number sequence from the starting index number to the differing index number in the character's index number sequence.
  • As another example, suppose the index number sequence of "鞭" is "302-104-1-3-10-1-1-5-8", where the third index number "1" indicates its unique stroke/radical.
  • The unique index number sequences of "鞭" can be "302-104-1", "302-104-1-3", "302-104-1-3-10", "302-104-1-3-10-1", "302-104-1-3-10-1-1", "302-104-1-3-10-1-1-5" and "302-104-1-3-10-1-1-5-8".
  • In other words, besides "302-104-1", the unique index number sequences also include the index numbers after the unique index number.
  • In step S250, incomplete stroke characters of each standard character are generated from the obtained "unique index number sequence".
  • In one embodiment, the stroke/radical corresponding to the unique index number may comprise at least two strokes.
  • In that embodiment, the incomplete stroke characters corresponding to the unique index number sequence include incomplete stroke characters consisting of the strokes/radicals corresponding to the partial index number sequence from the starting index number to the index number immediately preceding the unique index number, plus a stroke-by-stroke portion of the stroke/radical corresponding to the unique index number.
  • In step S260, according to the incomplete stroke characters of each character, incomplete stroke samples are obtained from the complete stroke samples of the standard character, thereby obtaining the incomplete stroke templates of the standard character.
  • FIG. 7 shows training samples of the complete stroke character "鞭".
  • According to step S240, the index number sequence of "鞭" is, for example, "302-104-1-3-10-1-1-5-8", where the third index number "1" indicates its unique stroke/radical. Therefore, according to step S250, the index number sequences of the incomplete characters of "鞭" are "302-104-1", "302-104-1-3", "302-104-1-3-10", "302-104-1-3-10-1", "302-104-1-3-10-1-1" and "302-104-1-3-10-1-1-5".
  • FIG. 8 shows incomplete stroke samples of "鞭" obtained from the complete stroke samples of "鞭" shown in FIG. 7.
  • FIG. 9 shows a flow chart of a handwritten character recognition method 900 in accordance with an embodiment of the present invention.
  • In step S910, a handwritten trajectory input by the user is received.
  • In this step, the user inputs the handwritten trajectory using the handwriting input unit.
  • In step S920, the handwritten trajectory is pre-processed.
  • The purpose of the pre-processing is to make the handwritten trajectory smoother for subsequent operations.
  • In step S930, the input handwritten trajectory is matched against the stroke templates to determine the matching degree of the stroke templates.
  • In step S940, the standard character corresponding to the matched stroke template is output; that is, the standard character corresponding to the handwritten trajectory is recognized.
  • The system stores incomplete stroke templates, so steps S920-S940 can be performed each time a stroke input is received in order to adjust the recognition result. For example, when the user wants to input "恶", after "亚" has been written step S930 determines that the characters corresponding to the stroke templates, in descending order of matching degree, are for example "亚", "巫", "恶", "丑", "正", and step S940 outputs them in that order.
  • After the user then writes the next two strokes, step S930 determines that the characters corresponding to the stroke templates, in descending order of matching degree, are for example "恶", "悉", "恐", "恋", "晋", and step S940 outputs them in that order.
  • In one embodiment, when the user inputs strokes, the top ten candidates in the recognition result for the current input may be output, and the most likely recognition result, that is, the character with the highest matching degree, is displayed in the background of the input strokes.
  • For example, the user inputs incomplete strokes of the Chinese character "鞭", as shown in FIG. 10(a).
  • The handwritten character recognition system matches the current input strokes against all incomplete stroke templates of all characters and finds that the incomplete stroke template of "鞭" is the most likely candidate, as shown in FIG. 10(b).
  • Finally, the first recognition result "鞭" is displayed in the background of the handwritten trajectory, as shown in FIG. 10(c).
  • In some embodiments, to avoid confusing incomplete stroke characters with complete stroke characters, some incomplete stroke characters that resemble complete stroke characters are discarded, namely those whose difference from a complete stroke character is one of a set of specific strokes/radicals.
  • For example, the unique radical of the character "恶" is 226 (心), which means it has three incomplete stroke characters [images not reproduced]. However, the first incomplete stroke character may be confused with "亚", because the distinguishing stroke may have been detected erroneously, owing to noise or smudges on the input screen, while "亚" was being written. Therefore, for the character "恶", the first incomplete stroke character is discarded and only the other two are kept.
  • FIG. 11 shows two incomplete stroke samples and a complete stroke sample extracted from "恶".
  • The specific stroke/radical may be one of [stroke image not reproduced] and "丶".
  • In some embodiments, intermediate characters are constructed from the common part of similar characters. For example, for "awake", "⁇" and "⁇", [image not reproduced] is the common part of the three characters. The common part can therefore be treated as an intermediate character of the three characters, and incomplete stroke templates are generated for the intermediate character, such as the template extracted from the sample shown in the corresponding figure. The common part [image not reproduced] then serves as an incomplete stroke character of "awake", "⁇" and "⁇".
  • FIG. 12 shows an example of constructing intermediate characters.
  • As shown there, the common part in the index number sequences of "awake", "⁇" and "⁇" is "293-236". Incomplete stroke templates are generated for this common part, and the incomplete stroke templates thus generated, together with the stroke template of the common part itself, are used as incomplete stroke templates of "awake", "⁇" and "⁇".
  • a table for identifying the maximum number of strokes for each standard character is also defined. Then, when calculating the matching degree, different weights of the number of strokes are given, so that the recognition result can be adjusted according to the difference between the number of input strokes and the number of strokes of the matching template.
  • the handwritten character recognition method and system proposed in accordance with an embodiment of the present invention can recognize characters such as Chinese, Japanese, and Korean.
  • the performance of the predictive handwritten character recognition system according to the present invention is evaluated using the recognition rate and the number of input strokes.
  • A system with a high recognition rate that requires only a few input strokes is a high-performance system. Conversely, a system with a low recognition rate, or one that requires many input strokes, is a low-performance system.
  • the "number of strokes saved" is the difference between the total number of strokes and the number of strokes actually input by the user, and the total number of strokes is the total number of strokes of the standard character.
  • the handwritten character recognition method and system according to an embodiment of the present invention can be applied to, for example, an electronic whiteboard, a tablet PC, a desktop PC, a mobile phone, a PDA, and other electronic devices that support handwriting input.
  • the user can input on the screen with a finger, a stylus, etc., and the handwritten character recognition method and system according to an embodiment of the present invention outputs the recognition result on the screen accordingly.
  • One embodiment is a computer program product having a computer-readable medium encoded with computer program logic that, when executed on a computing device, provides the operations implementing the above technical solution.
  • When executed on at least one processor of a computing system, the computer program logic causes the processor to perform the operations (methods) described in the embodiments of the present invention.
  • Such an arrangement of the present invention is typically provided as software, code and/or other data structures arranged or encoded on a computer-readable medium such as an optical medium (e.g., CD-ROM), floppy disk or hard disk, or on another medium such as firmware or microcode on one or more ROM, RAM or PROM chips, or as an application-specific integrated circuit (ASIC), or as a downloadable software image or shared database in one or more modules.
  • Software or firmware or such a configuration may be installed on the computing device such that one or more processors in the computing device perform the techniques described in this embodiment of the invention.
  • a software process that operates in conjunction with a computing device such as a group of data communication devices or other entities, may also provide the device in accordance with the present invention.
  • the device according to the invention may also be distributed between multiple software processes on multiple data communication devices, or all software processes running on a small set of dedicated computers, or all software processes running on a single computer.
  • embodiments of the invention may be implemented as software programs, software and hardware on a computer device, or as separate software and/or separate circuits.
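The stroke-count weighting and the "number of strokes saved" measure described earlier in this list can be sketched in Python as follows. The source gives no weighting formula, so the reciprocal penalty, the `penalty` parameter and the function names `weighted_score` and `strokes_saved` are assumptions made purely for illustration.

```python
def weighted_score(score: float, input_strokes: int,
                   template_strokes: int, penalty: float = 0.1) -> float:
    """Scale a matching degree down as the stroke-count difference grows.

    The reciprocal form is an assumed weighting, not the patent's formula.
    """
    return score / (1.0 + penalty * abs(input_strokes - template_strokes))


def strokes_saved(total_strokes: int, input_strokes: int) -> int:
    """Total strokes of the standard character minus strokes actually written."""
    return total_strokes - input_strokes
```

With this assumed weighting, a template whose stroke count is close to the number of strokes already written keeps more of its raw matching degree than a template that is many strokes longer.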

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Character Discrimination (AREA)

Abstract

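The present invention relates to a handwritten character recognition method and device that can recognize the intended character after only some of its strokes have been input. According to the method, a handwritten trajectory input by a user is first received, the trajectory is then matched against stroke templates to determine a matching degree, and finally the standard characters corresponding to the matched stroke templates are output according to the matching degree. The stroke templates include complete stroke templates and incomplete stroke templates of characters, and the incomplete stroke character corresponding to an incomplete stroke template does not constitute part or all of any other character. The method is easy to implement and has good application prospects. It can be used when complex handwritten characters are input through a touchpad on various electronic devices, reducing effort and improving input efficiency.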

Description

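Handwritten Character Recognition Method and System
Technical Field
The present invention relates to human-computer interaction technology and, more particularly, to a handwritten character recognition method and system.
Background Art
With the development of computer vision applications, demand for natural human-computer interaction has grown and the requirements placed on it have risen; handwritten character recognition is one part of natural human-computer interaction. Most handwritten character recognition systems perform recognition only after the user has written all strokes of a character. Such systems are slow when characters with many strokes are input, and writing every stroke of a complex character is also difficult for the user. To address this problem, several predictive handwritten character recognition methods have been proposed that can recognize a character after only some of its strokes have been input. Some of these methods rely on a predictive character database: given one or more input strokes, the system offers candidates containing those strokes, based on the usage frequency of all characters in the database that contain them. Another class of methods gives a prediction after the user has input the radical of a character. These methods likewise provide candidates based on usage frequency.
Chinese Patent Application No. 201210284415, entitled "Fast Input Method Based on Non-Complete Recognition", proposes a handwritten character recognition method that presents prediction candidates based on a prediction rate. The method generates, for each Chinese character, a complete template and a series of derived sub-templates. A derived sub-template may be the radical of the character or any other incomplete part of it. Each sub-template has an "integrity weight" based on its degree of completeness relative to the complete character. Template matching is performed between the input strokes and all sub-templates of each Chinese character; the matching rate obtained is then multiplied by the integrity weight of the corresponding sub-template; the maximum of the weighted matching rates is taken as the prediction rate of the complete character; finally, prediction candidates are presented based on the prediction rate.
However, in existing predictive handwritten character recognition methods, presenting candidates on the basis of usage frequency is imprecise. For example, when the user inputs a rarely used character, its low usage frequency means that the correct result does not appear among the prediction candidates.
The problem with Chinese Patent Application No. 201210284415 is that it gives no clear rule for measuring the "integrity weight". The "integrity weight" is essential to giving correct prediction candidates in that method, yet the application does not describe how the weight is measured and determined. Defining a reasonable rule for measuring the weight is vague and complicated, and this strongly affects prediction quality. In addition, the application mentions that the complete template of each Chinese character can be divided into several first-level derived templates; a first-level derived template can be divided into further second-level derived templates; and some first-level and second-level derived templates can be combined to produce new derived templates, and so on. In this way the number of templates for all Chinese characters becomes very large, so that a large amount of storage is needed to hold all templates of a large character set, and matching against all templates becomes cumbersome and complicated.
Therefore, a method that can accurately predict handwritten input is needed.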
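Summary of the Invention
The present invention proposes a partial-stroke predictive handwritten character recognition method and system that can give accurate prediction candidates and can recognize the intended character after the user has handwritten only some of its strokes.
According to one aspect of the invention, a handwritten character recognition method is proposed. The method first receives a handwritten trajectory input by a user, then matches the handwritten trajectory against at least one stroke template to determine the matching degree of the stroke template, and finally outputs, according to the matching degree, the standard character corresponding to the matched stroke template. The stroke template is a matching template of a standard character. The matching templates of at least one standard character include a complete stroke template of the at least one standard character and an incomplete stroke template of an incomplete stroke standard character of the at least one standard character, wherein the incomplete stroke standard character corresponding to the incomplete stroke template of the at least one standard character does not constitute part or all of any other standard character.
Preferably, the matching and outputting steps are performed each time a stroke input is received.
Preferably, the at least one stroke template to be matched against the handwritten trajectory is all matching templates of all standard characters.
Preferably, the step of matching the handwritten trajectory against the at least one stroke template to determine the matching degree of the stroke template further comprises: weighting the matching degree according to the difference between the number of strokes already input and the number of strokes of the standard character corresponding to the stroke template being matched.
Preferably, the standard character corresponding to the stroke template with the highest matching degree is also displayed in the background of the handwritten trajectory.
Preferably, the incomplete stroke template is generated by the following steps:
defining basic strokes and radicals;
assigning an index number to each stroke/radical among the basic strokes and radicals;
representing each standard character by an index number sequence obtained by combining, in the stroke order of the standard character, the index numbers corresponding to its strokes/radicals;
sorting the index number sequences of the standard characters;
identifying the index number at which the at least one standard character differs from the other standard characters;
determining a unique index number sequence of the at least one standard character, the unique index number sequence being the partial index number sequence from the starting index number to the differing index number in the index number sequence of the at least one standard character;
generating the incomplete stroke standard character corresponding to the unique index number sequence as an incomplete stroke standard character of the at least one standard character;
obtaining, from the complete stroke samples of the at least one standard character and according to its incomplete stroke standard character, incomplete stroke samples of that standard character corresponding to the incomplete stroke standard character; and
obtaining the incomplete stroke template of the at least one standard character from its incomplete stroke samples.
Preferably, the unique index number sequence further includes the index numbers after the differing index number.
Preferably, when the stroke/radical corresponding to the differing index number comprises at least two strokes, the incomplete stroke standard characters corresponding to the unique index number sequence further include incomplete stroke standard characters consisting of the strokes/radicals corresponding to the partial index number sequence from the starting index number to the index number immediately preceding the differing index number, plus a stroke-by-stroke portion of the stroke/radical corresponding to the differing index number.
Preferably, among the incomplete stroke templates obtained, those for which the stroke/radical corresponding to the differing index number is one of a set of specific strokes/radicals are discarded.
Preferably, the specific stroke/radical may be one of
Figure PCTCN2014092366-appb-000001
and "丶".
Preferably, the incomplete stroke templates of the at least one standard character further include templates generated by the following steps:
determining an identical stroke portion of at least two standard characters, the identical stroke portion being an incomplete stroke portion of the at least two standard characters;
treating the identical stroke portion as an intermediate standard character and generating incomplete stroke templates for the intermediate standard character; and
using the generated incomplete stroke templates, together with the stroke template corresponding to the intermediate standard character, as incomplete stroke templates of each of the at least two standard characters.
Preferably, the standard character may be a character of one of Chinese, Japanese and Korean.
According to another aspect of the invention, a handwritten character recognition system is also proposed, comprising: a handwriting input unit for receiving a handwritten trajectory input by a user; a template repository storing matching templates of standard characters; a template matching unit configured to match the handwritten trajectory against at least one template to determine the matching degree of the template; and an output unit configured to output, according to the matching degree determined by the template matching unit, the standard character corresponding to the matched template, wherein the matching templates of at least one standard character include a complete stroke template of the at least one standard character and an incomplete stroke template of an incomplete stroke standard character of the at least one standard character, and wherein the incomplete stroke standard character corresponding to the incomplete stroke template of the at least one standard character does not constitute part or all of any other standard character.
According to embodiments of the invention, for characters such as those of Chinese, Japanese or Korean, each character is represented by a series of "stroke/radical index numbers", and the "unique index (stroke/radical)" of each character is recorded as the unique stroke or radical that represents the character. Incomplete stroke templates of the character are then generated according to the "unique index". Thus, when the user inputs one or more strokes of a character, template matching between the input strokes and the incomplete stroke templates yields an accurate recognition result.
Unlike the prior art, the handwritten character recognition method and system according to the invention give prediction candidates based on the "unique index" of characters, so the prediction candidates for each character are not confused with other characters. Moreover, a character can be recognized after the user has input one or more of its strokes. This reduces effort and improves input efficiency.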
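Brief Description of the Drawings
The above and other objects, features and advantages of the invention will become clearer from the following description of preferred embodiments with reference to the drawings, in which:
FIG. 1 is a schematic block diagram of a handwritten character recognition system according to an embodiment of the invention;
FIG. 2 is a flowchart of a method for generating incomplete stroke templates according to an embodiment of the invention;
FIG. 3 shows examples of basic strokes and radicals according to an embodiment of the invention;
FIG. 4 shows part of a "stroke index table" according to an embodiment of the invention;
FIG. 5 shows an example of a character stroke information sorting table according to an embodiment of the invention;
FIG. 6 shows the shared index numbers and the unique index numbers in the character stroke information sorting table of FIG. 5;
FIG. 7 shows training samples of the complete stroke character "鞭";
FIG. 8 shows training samples of incomplete stroke templates of "鞭" obtained from the complete stroke template of "鞭" shown in FIG. 7;
FIG. 9 is a flowchart of a handwritten character recognition method according to an embodiment of the invention;
FIG. 10 shows an example of an operation result of the handwritten character recognition method according to an embodiment of the invention, in which FIG. 10(a) shows a handwritten trajectory input by a user, FIG. 10(b) shows an incomplete stroke template that closely matches the handwritten trajectory, and FIG. 10(c) shows the recognized character output in the background of the handwritten trajectory;
FIG. 11 shows an example of incomplete stroke templates and a complete stroke template of the character "恶" according to an embodiment of the invention;
FIG. 12 shows an example of constructing intermediate characters according to an embodiment of the invention; and
FIG. 13 shows an illustrative stroke sample.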
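Detailed Description
Exemplary embodiments of the invention are described in detail below with reference to the drawings. In the following description, some specific embodiments are given for descriptive purposes only; they should not be understood as limiting the invention in any way and are merely examples of it. Conventional structures or constructions are omitted where they might obscure the understanding of the invention.
FIG. 1 is a schematic block diagram of a handwritten character recognition system 100 according to an embodiment of the invention. The system 100 includes four units: a handwriting input unit 110, a template repository 120, a template matching unit 130 and an output unit 140. The function of each unit is described below.
◆  Handwriting input unit 110: this unit receives handwritten input data from the user. For example, when the user writes on a touch-screen electronic device, the unit may be a sensor that detects contact between the user's finger and the touchpad. When the user inputs on the electronic device using, for example, infrared light, the unit may be an infrared sensor.
◆  Template repository 120: this unit stores the matching templates of each standard character. According to an embodiment of the invention, the repository stores complete stroke templates and incomplete stroke templates of the standard characters. The incomplete stroke standard characters of each standard character are defined first. Some complex standard characters may have several incomplete stroke standard characters. However, an incomplete stroke standard character of one standard character never constitutes part or all of any other standard character. The incomplete stroke standard characters and the complete stroke standard character of each standard character are then trained to obtain its complete stroke template and incomplete stroke templates. The method for generating incomplete stroke templates is described in detail later with reference to FIG. 2.
◆  Template matching unit 130: this unit determines the matching degree of the matching templates by matching the handwritten input data against the incomplete stroke templates and/or the complete stroke templates.
◆  Output unit 140: this unit outputs the recognition result to the user. According to one embodiment, the unit outputs the standard character corresponding to the template with the highest matching degree. Optionally, the unit also optimizes all matching results and sorts the candidates according to predetermined rules. For example, on an electronic device with a limited display size, such as a mobile phone, the output unit 140 outputs only the ten results with the highest matching degree. In one embodiment, the output may be ordered by matching degree.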
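As a minimal sketch of the candidate ranking described for output unit 140, the following Python function sorts (character, matching degree) pairs and keeps the best N distinct characters. The function name `top_candidates`, the score scale and the per-character deduplication are assumptions for illustration; the source only states that the ten best results may be output in order of matching degree.

```python
from typing import Dict, Iterable, List, Tuple


def top_candidates(matches: Iterable[Tuple[str, float]],
                   limit: int = 10) -> List[str]:
    """Return up to `limit` distinct characters, best matching degree first."""
    best: Dict[str, float] = {}
    for char, score in matches:
        # A character may own several templates (complete and incomplete);
        # keep only its highest matching degree.
        if char not in best or score > best[char]:
            best[char] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return [char for char, _ in ranked[:limit]]
```

Deduplicating per character is a design choice of this sketch: because one character can match through several of its templates, listing it once keeps the short candidate list useful on a small screen.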
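FIG. 1 also shows that the system may further include a pre-processing unit 150.
◆  Pre-processing unit 150: this unit applies smoothing, linear normalization and/or non-linear normalization to the handwritten input data. Its purpose is to make the handwritten input data smoother to facilitate matching in the template matching unit 130.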
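The pre-processing named above can be illustrated with a short Python sketch covering smoothing and linear normalization only (non-linear normalization is not shown). The moving-average filter, the unit bounding box and the names `smooth` and `normalize` are assumptions chosen for illustration; the source does not specify which smoothing or normalization algorithm unit 150 uses.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def smooth(stroke: List[Point], window: int = 3) -> List[Point]:
    """Moving-average smoothing; the window shrinks at the stroke ends."""
    half = window // 2
    out = []
    for i in range(len(stroke)):
        lo, hi = max(0, i - half), min(len(stroke), i + half + 1)
        xs = [p[0] for p in stroke[lo:hi]]
        ys = [p[1] for p in stroke[lo:hi]]
        out.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return out


def normalize(strokes: List[List[Point]], size: float = 1.0) -> List[List[Point]]:
    """Linearly map all strokes into a size x size box, keeping the aspect ratio."""
    points = [p for s in strokes for p in s]
    if not points:
        return [list(s) for s in strokes]
    min_x = min(p[0] for p in points)
    min_y = min(p[1] for p in points)
    # Use the larger of width and height; fall back to 1.0 for a single dot.
    span = max(max(p[0] for p in points) - min_x,
               max(p[1] for p in points) - min_y) or 1.0
    scale = size / span
    return [[((x - min_x) * scale, (y - min_y) * scale) for x, y in s]
            for s in strokes]
```

Normalizing all strokes together, rather than stroke by stroke, preserves the relative position of the strokes, which matters when a partial character is compared with a template.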
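The key technique of the invention is the definition of incomplete stroke templates. For characters such as those of Chinese, Japanese and Korean (i.e., Chinese characters, Japanese characters and so on), each character consists of strokes and/or radicals. Different characters may include the same strokes and radicals. However, every character necessarily contains a unique stroke or radical sequence that differs from those of other characters. According to an embodiment of the invention, this unique stroke or radical is used to define the incomplete stroke templates of each character. That is, an incomplete stroke template of a character is constructed such that the corresponding incomplete stroke character includes the unique stroke/radical sequence, so that the incomplete stroke character does not constitute part or all of any other character and is thus distinguishable from other characters.
Chinese characters in the GB2312 character set are taken as an example. FIG. 2 shows a flowchart of an incomplete stroke template generation method 200 according to an embodiment of the invention.
First, in step S210, basic strokes and radicals are defined, as shown in FIG. 3.
As defined by Unicode, there are 36 basic strokes in total. Some strokes are discarded, leaving 26 basic strokes, shown in part A of FIG. 3. The strokes/radicals in part B of FIG. 3 are user-defined, and those in part C of FIG. 3 are the basic radicals of GB2312. FIG. 3 shows 152 radicals in total, which in this embodiment serve as the basic strokes/radicals defined for GB2312 Chinese characters. The radicals usable in this application are of course not limited to those shown in FIG. 3; other sets of radicals may be used as needed. For example, the radicals needed for Chinese characters differ from those needed for Korean or Japanese.
For indexing, in step S220 a "stroke index table" is defined for the defined basic strokes and radicals, assigning an index number to each stroke/radical. FIG. 4 shows part of the "stroke index table". Once every basic stroke and radical has been indexed, each character can be represented by a series of index numbers obtained by combining, in stroke order, the index numbers corresponding to its strokes/radicals.
Then, in step S230, each character is represented by an index number sequence obtained by combining, in the stroke order of the character, the index numbers corresponding to its strokes/radicals, and sorting the index numbers of each character yields the stroke information of the characters. A table that sorts the stroke information of each character is thus defined. FIG. 5 schematically shows part of the table.
Then, in step S240, the unique stroke/radical of each character is found and a "unique index number sequence" is obtained. That is, the index number at which each character differs from other characters is first identified. FIG. 6 shows the same character stroke information sorting table as FIG. 5, with the shared portions marked.
As the table in FIG. 6 shows, the dashed box on the left contains the strokes common to the six characters shown. The middle dashed box contains the strokes common to the first five characters. In addition, the dashed box on the right indicates the common radical of "晋" and "戬". Finally, the last strokes/radicals of the second, third, fifth and sixth characters (index numbers 211, 226, 233 and 201, respectively) are the unique strokes/radicals of the corresponding characters. A "unique index table" is thus obtained. That is, for "垩", the index number sequence is "1-3-3-7-6-1-211", where "211" indicates its unique stroke/radical. For "晋", the index number sequence is "1-3-3-7-6-1-236", but there is no unique stroke/radical. For "戬", the index number sequence is "1-3-3-7-6-1-236-233", where "233" indicates its unique stroke/radical. For "严", the index number sequence is "1-3-3-7-6-201", where "201" indicates its unique stroke/radical.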
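The search for the differing index number in step S240 can be sketched in Python as a prefix comparison. This is a minimal illustration, not the patent's stated algorithm: the function name `unique_position` is hypothetical, and the sequences for "亚" and "恶" are assumed (the text gives only the unique radical 226 for "恶" and no sequence for the first character).

```python
from typing import Dict, Optional, Tuple

Sequence = Tuple[int, ...]


def unique_position(char: str, table: Dict[str, Sequence]) -> Optional[int]:
    """1-based position of the first index number at which `char` no longer
    shares a prefix with any other character, or None if there is none."""
    seq = table[char]
    others = [s for c, s in table.items() if c != char]
    for k in range(1, len(seq) + 1):
        if all(other[:k] != seq[:k] for other in others):
            return k
    return None


TABLE = {
    "亚": (1, 3, 3, 7, 6, 1),           # assumed sequence, not given in the text
    "垩": (1, 3, 3, 7, 6, 1, 211),
    "恶": (1, 3, 3, 7, 6, 1, 226),      # assumed from the unique radical 226
    "晋": (1, 3, 3, 7, 6, 1, 236),
    "戬": (1, 3, 3, 7, 6, 1, 236, 233),
    "严": (1, 3, 3, 7, 6, 201),
}
```

Under these assumptions, `unique_position("晋", TABLE)` yields `None` because the whole sequence of "晋" is a prefix of the sequence of "戬", which agrees with the statement that "晋" has no unique stroke/radical.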
然后,确定字符的独有索引号序列,独有索引号序列是一个字符的索引号序列中从起始索引号到不同索引号的部分索引号序列。从图6可以看出,六个字符中仅“垩”、“恶”和“戬”和“严”具有与其他字符不同的独有索引号。
作为另一示例,假设“鞭”的索引号序列例如为“302-104-1-3-10-1-1-5-8”,其中第三个索引号“1”表示其独有笔画/部首。“鞭”的独有索引号序列可以是“302-104-1”、“302-104-1-3”、“302-104-1-3-10”、“302-104-1-3-10-1”、“302-104-1-3-10-1-1”\“302-104-1-3-10-1-1-5”和“302-104-1-3-10-1-1-5-8”。换言之,除了“302-104-1”,独有索引号序列还包括独有索引号之后的索引号。
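A minimal Python sketch of step S240 follows. It treats "differing" as "not a leading part of any other character's sequence", which is a simplifying assumption about how the comparison is done. The function name `unique_prefixes` is hypothetical, and the neighbouring sequence used for "鞭" is invented purely so that the example has something to compare against.

```python
from typing import List, Sequence, Tuple

Seq = Tuple[int, ...]


def unique_prefixes(target: Seq, others: Sequence[Seq]) -> List[Seq]:
    """Return every prefix of `target`, shortest first, that is not a
    prefix of (or equal to) any sequence in `others`. An empty list means
    the character has no unique stroke/radical."""
    for k in range(1, len(target) + 1):
        prefix = target[:k]
        if not any(other[:k] == prefix for other in others):
            # The k-th index number is the differing one; every longer
            # prefix is unique as well.
            return [target[:n] for n in range(k, len(target) + 1)]
    return []


# Sequences quoted in the description of Fig. 6.
e = (1, 3, 3, 7, 6, 1, 211)         # 垩
jin = (1, 3, 3, 7, 6, 1, 236)       # 晋
jian = (1, 3, 3, 7, 6, 1, 236, 233) # 戬
yan = (1, 3, 3, 7, 6, 201)          # 严

print(unique_prefixes(jin, [e, jian, yan]))   # 晋 is a leading part of 戬
print(unique_prefixes(e, [jin, jian, yan]))

# "鞭" as in the description; the neighbour below is hypothetical.
bian = (302, 104, 1, 3, 10, 1, 1, 5, 8)
hypothetical_neighbour = (302, 104, 5)
print(len(unique_prefixes(bian, [hypothetical_neighbour])))
```

Under these assumptions "晋" yields no unique sequence, in line with the statement above that it has no unique stroke/radical, and "鞭" yields seven sequences starting at "302-104-1".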
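In step S250, the incomplete-stroke characters of each standard character are generated from the obtained "unique index-number sequences".
In one embodiment, the stroke/radical corresponding to the unique index number may comprise at least two strokes. In this embodiment, the incomplete-stroke characters corresponding to the unique index-number sequence include incomplete-stroke characters formed by the strokes/radicals corresponding to the partial index-number sequence from the starting index number to the index number immediately preceding the unique index number, plus a stroke-by-stroke portion of the stroke/radical corresponding to the unique index number.
In step S260, based on the incomplete-stroke characters of each character, incomplete-stroke samples are derived from the complete-stroke samples of the standard character, from which the incomplete-stroke templates of that standard character are obtained.
Fig. 7 shows training samples of the complete-stroke character "鞭". According to step S240, the index-number sequence of "鞭" is obtained as, for example, "302-104-1-3-10-1-1-5-8", where the third index number "1" denotes its unique stroke/radical. Therefore, according to step S250, the index-number sequences of the incomplete characters of "鞭" are determined to be "302-104-1", "302-104-1-3", "302-104-1-3-10", "302-104-1-3-10-1", "302-104-1-3-10-1-1", and "302-104-1-3-10-1-1-5". Fig. 8 shows the incomplete-stroke samples of "鞭" derived from the complete-stroke samples shown in Fig. 7.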
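Steps S250 and S260 amount to cutting a complete-stroke sample at selected stroke counts. The Python sketch below illustrates this under stated assumptions: the per-component stroke counts are not given in the description and are assumed here (index 302 as a 9-stroke radical, index 104 as a 2-stroke radical, index 226 as a 4-stroke radical, all other index numbers as single strokes), and both function names are hypothetical.

```python
from typing import List, Sequence, TypeVar

Stroke = TypeVar("Stroke")


def incomplete_stroke_counts(component_strokes: Sequence[int],
                             unique_pos: int,
                             stroke_by_stroke: bool = False) -> List[int]:
    """Stroke counts at which a complete sample is cut.

    component_strokes[i] is the number of pen strokes of the i-th index
    number; unique_pos is the 0-based position of the unique index number.
    With stroke_by_stroke=True, partial versions of a multi-stroke unique
    radical are included as well. The complete character is excluded."""
    cut_points: List[int] = []
    written = 0
    for pos, n in enumerate(component_strokes):
        if pos == unique_pos and stroke_by_stroke:
            cut_points.extend(range(written + 1, written + n))
        written += n
        if pos >= unique_pos:
            cut_points.append(written)
    return [c for c in cut_points if c < written]


def incomplete_samples(sample: Sequence[Stroke], counts: Sequence[int]) -> List[List[Stroke]]:
    """Truncate one complete-stroke sample at each cut point."""
    return [list(sample[:c]) for c in counts]


# "鞭": 302-104-1-3-10-1-1-5-8, unique index number at position 2 (0-based).
bian_counts = incomplete_stroke_counts([9, 2, 1, 1, 1, 1, 1, 1, 1], unique_pos=2)
print(bian_counts)

# "恶": six single strokes followed by radical 226, cut stroke by stroke.
e_counts = incomplete_stroke_counts([1, 1, 1, 1, 1, 1, 4], unique_pos=6,
                                    stroke_by_stroke=True)
print(e_counts)
```

Under these assumed stroke counts, "鞭" yields six cut points, matching the six incomplete index-number sequences listed for it, and "恶" yields three.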
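Fig. 9 is a flowchart of a handwritten character recognition method 900 according to an embodiment of the present invention.
As shown in Fig. 9, in step S910 a handwriting trajectory input by the user is received. In this step, the user inputs the handwriting trajectory through the handwriting input unit.
Then, in step S920, the handwriting trajectory is preprocessed. The purpose of preprocessing is to make the handwriting trajectory smoother, which facilitates subsequent operations.
Then, in step S930, the input handwriting trajectory is matched against the stroke templates to determine the matching degree of each stroke template. Finally, in step S940, the standard character corresponding to the matched stroke template is output; that is, the standard character corresponding to the handwriting trajectory is recognized.
According to an embodiment of the present invention, the system stores incomplete-stroke templates. Steps S920-S940 can therefore be executed each time a stroke is received, in order to update the recognition result. For example, when the user wants to input the character "恶", after "亚" has been written, step S930 determines that the characters corresponding to the stroke templates, in descending order of matching degree, are, for example, "亚", "巫", "恶", "丑", "正", so that step S940 outputs "亚", "巫", "恶", "丑", "正" in that order. After the user then writes the next two strokes, step S930 determines that the characters corresponding to the stroke templates, in descending order of matching degree, are, for example, "恶", "悉", "恐", "恋", "晋", so that step S940 outputs "恶", "悉", "恐", "恋", "晋" in that order.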
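The per-stroke loop over steps S930 and S940 can be sketched in Python as follows. The matcher is a toy stand-in, not the matching method of the invention: each stroke is reduced to a hypothetical direction code ("h" for horizontal, "v" for vertical), and the matching degree is the fraction of agreeing positions. The template set and all names are assumptions for illustration.

```python
from typing import Dict, Iterable, Iterator, List, Sequence, Tuple

# (standard character, stroke codes of one of its templates); toy data.
TEMPLATES: List[Tuple[str, Tuple[str, ...]]] = [
    ("二", ("h", "h")),
    ("三", ("h", "h", "h")),
    ("十", ("h", "v")),
    ("土", ("h", "v", "h")),
]


def match_degree(strokes: Sequence[str], template: Sequence[str]) -> float:
    """Toy matching degree: share of positions with equal stroke codes."""
    longest = max(len(strokes), len(template))
    if longest == 0:
        return 0.0
    same = sum(a == b for a, b in zip(strokes, template))
    return same / longest


def recognize_incrementally(stroke_stream: Iterable[str],
                            templates: Sequence[Tuple[str, Tuple[str, ...]]],
                            top_n: int = 10) -> Iterator[List[str]]:
    """After every received stroke, re-match and yield ranked candidates."""
    written: List[str] = []
    for stroke in stroke_stream:
        written.append(stroke)
        best: Dict[str, float] = {}
        for ch, codes in templates:
            degree = match_degree(written, codes)
            best[ch] = max(degree, best.get(ch, 0.0))
        yield sorted(best, key=lambda c: -best[c])[:top_n]


for candidates in recognize_incrementally(["h", "v", "h"], TEMPLATES):
    print(candidates)
```

The candidate list changes as strokes accumulate, which mirrors how the ranking for "恶" shifts in the example above once the strokes after "亚" have been written.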
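In one embodiment, while the user is writing strokes, the top ten candidates of the recognition result for the strokes entered so far may be output, and the most likely recognition result, i.e., the character with the highest matching degree, may be displayed in the background of the input strokes. For example, the user inputs incomplete strokes of the Chinese character "鞭", as shown in Fig. 10(a). The handwritten character recognition system according to an embodiment of the present invention matches the currently input strokes against all incomplete-stroke templates of all characters and finds that the incomplete-stroke template of "鞭" is the most likely candidate, as shown in Fig. 10(b). Finally, the first recognition result "鞭" is displayed in the background of the handwriting trajectory, as shown in Fig. 10(c).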
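In some embodiments, to avoid confusing incomplete-stroke characters with full-stroke characters, some incomplete-stroke characters that resemble full-stroke characters are discarded, namely those whose difference from a full-stroke character is a specific stroke/radical. For example, for the character "恶", the unique radical is 226 (心), which means that it has three incomplete-stroke characters, namely (inline image PCTCN2014092366-appb-000002) (inline image PCTCN2014092366-appb-000003). However, the first incomplete-stroke character (inline image PCTCN2014092366-appb-000004) may be confused with "亚", because the distinguishing stroke (inline image PCTCN2014092366-appb-000005) may have been falsely detected, owing to noise or smudges on the input screen, while "亚" was being written. Therefore, for the character "恶", the first incomplete-stroke character is discarded and only the other two are kept. Fig. 11 shows the two incomplete-stroke samples and the complete-stroke sample extracted from "恶". According to an embodiment of the present invention, the specific stroke/radical may be one of (inline image PCTCN2014092366-appb-000006) and "丶".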
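According to some embodiments of the present invention, in order to predict characters effectively, some intermediate characters are constructed as the common part of similar characters. For example, for "醒", "醌", and "醍", (inline image PCTCN2014092366-appb-000007) is the common part of the three characters. This means that the common part can be treated as an intermediate character of the three characters, and incomplete-stroke templates can be generated for that intermediate character, for example templates extracted from the samples shown in Fig. 13. (inline image PCTCN2014092366-appb-000008) is treated as an incomplete-stroke character of "醒", "醌", and "醍". In this way, when the user writes a character, "醒", "醌", and "醍" can be recognized as soon as the character shown in Fig. 13 has been entered. Fig. 12 shows an example of constructing an intermediate character. As shown in Fig. 12, the common part of the index-number sequences of "醒", "醌", and "醍" is "293-236". Incomplete-stroke templates are generated for this common part, and the incomplete-stroke templates so generated, together with the stroke template of the common part itself, are treated as incomplete-stroke templates of "醒", "醌", and "醍".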
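Finding the common part that serves as an intermediate character can be sketched in Python as a longest-common-prefix computation over index-number sequences. Only the common part "293-236" comes from the description; the trailing index numbers 901 to 904 are placeholders invented for this example, and both function names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Seq = Tuple[int, ...]


def common_prefix(sequences: Sequence[Seq]) -> Seq:
    """Longest leading run of index numbers shared by all sequences."""
    prefix: List[int] = []
    for column in zip(*sequences):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return tuple(prefix)


def attach_intermediate(sequences: Dict[str, Seq]) -> Dict[str, Seq]:
    """Map every character of the group to the shared intermediate
    character, whose templates then count as incomplete-stroke templates
    of each character in the group."""
    shared = common_prefix(list(sequences.values()))
    return {ch: shared for ch in sequences} if shared else {}


# Tails 901..904 are hypothetical placeholders, not real index numbers.
GROUP: Dict[str, Seq] = {
    "醒": (293, 236, 901),
    "醌": (293, 236, 902),
    "醍": (293, 236, 903, 904),
}

print(attach_intermediate(GROUP))
```

Because the intermediate character is shared, a match against its template contributes all three characters as candidates at once.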
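According to an embodiment of the present invention, a table identifying the maximum number of strokes of each standard character is also defined. Then, when the matching degree is computed, different weights are assigned according to the number of strokes, so that the recognition result can be adjusted based on the gap between the number of input strokes and the number of strokes of the matched template.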
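The description does not give the weighting formula, so the Python sketch below uses an assumed one: the raw matching degree is divided by a penalty that grows with the stroke-count gap. The function name, the `alpha` parameter, and the raw matching degrees in the example are hypothetical.

```python
def weighted_degree(raw_degree: float, n_input: int, n_template: int,
                    alpha: float = 0.5) -> float:
    """Down-weight a raw matching degree by the gap between the number of
    strokes written so far and the number of strokes of the template.
    The 1 / (1 + alpha * gap) form is an assumption, not the source's."""
    gap = abs(n_input - n_template)
    return raw_degree / (1.0 + alpha * gap)


# Hypothetical situation: 8 strokes written; a 6-stroke complete template
# scores slightly higher on shape alone than an 8-stroke incomplete one.
print(weighted_degree(0.70, n_input=8, n_template=6))
print(weighted_degree(0.65, n_input=8, n_template=8))
```

With this weighting the 8-stroke template overtakes the 6-stroke one, which is the kind of adjustment the stroke-count gap is meant to produce.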
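The handwritten character recognition method and system proposed according to embodiments of the present invention can recognize characters of Chinese, Japanese, Korean, and the like.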
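Below, the recognition rate and the number of input strokes are used to evaluate the performance of the predictive handwritten character recognition system according to the present invention. A system with a high recognition rate that requires only a few input strokes is a high-performance system. Conversely, a system with a low recognition rate, or one that requires many input strokes, is a low-performance system.
To quantify prediction performance, the prediction ratio of each character is defined as:
PR = number of strokes saved / total number of strokes      (1)
where the "number of strokes saved" is the difference between the "total number of strokes" and the number of strokes the user actually entered, and the "total number of strokes" is the total stroke count of the standard character.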
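Equation (1) can be written as a small Python function. The function name and the example input are assumptions: "鞭" has 18 strokes, and the figure of 12 strokes entered before recognition is hypothetical, not a result reported in the description.

```python
def prediction_ratio(total_strokes: int, strokes_entered: int) -> float:
    """PR = (total strokes - strokes actually entered) / total strokes."""
    if total_strokes <= 0:
        raise ValueError("total_strokes must be positive")
    if not 0 <= strokes_entered <= total_strokes:
        raise ValueError("strokes_entered must lie in [0, total_strokes]")
    return (total_strokes - strokes_entered) / total_strokes


# Hypothetical case: an 18-stroke character recognized after 12 strokes.
print(prediction_ratio(18, 12))
```

A PR of 0 means the whole character had to be written; values closer to 1 mean the character was predicted after only a few strokes.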
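Table 1 shows that, for characters with many strokes, the handwritten character recognition system and method according to the present invention can save a large number of input strokes and considerable input time. Here, "top-1 correct" means that the first candidate character is the correct recognition result, "top-2 correct" means that the first or second candidate character is the correct recognition result, and so on.
Table 1  Prediction ratios of complex characters in GB2312
(Table image PCTCN2014092366-appb-000010)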
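The handwritten character recognition method and system according to embodiments of the present invention can be applied to devices such as electronic whiteboards, tablet PCs, desktop PCs, mobile phones, PDAs, and other electronic devices that support handwriting input. On these devices, the user can write on the screen with a finger, a stylus, or the like, and the handwritten character recognition method and system according to embodiments of the present invention output the recognition result on the screen accordingly.
Other arrangements of the embodiments of the invention disclosed herein include software programs that perform the steps and operations of the method embodiments outlined above. More specifically, a computer program product is one embodiment that has a computer-readable medium encoded with computer program logic which, when executed on a computing device, provides the associated operations and thereby the technical solution described above. When executed on at least one processor of a computing system, the computer program logic causes the processor to perform the operations (methods) described in the embodiments of the invention. Such arrangements of the invention are typically provided as software, code, and/or other data structures arranged or encoded on a computer-readable medium such as an optical medium (e.g., CD-ROM), a floppy disk, or a hard disk; as firmware or microcode on other media such as one or more ROM, RAM, or PROM chips; as an application-specific integrated circuit (ASIC); or as downloadable software images, shared databases, and the like in one or more modules. The software, firmware, or such configurations can be installed on a computing device to cause one or more processors in the computing device to perform the techniques described in the embodiments of the invention. Software processes operating in conjunction with computing devices, such as a group of data communication devices or other entities, may also provide the device according to the invention. The device according to the invention may also be distributed among multiple software processes on multiple data communication devices, or among all software processes running on a group of small dedicated computers, or among all software processes running on a single computer.
It should be understood that, strictly speaking, embodiments of the invention may be implemented as a software program on a computer device, as software and hardware, or as software alone and/or circuitry alone.
It should be noted that the above description presents the technical solution of the invention by way of example only, and does not mean that the invention is limited to the steps and unit structures described above. Where possible, steps and unit structures may be adjusted or omitted as needed. Accordingly, certain steps and units are not essential elements for implementing the general inventive concept of the invention. The essential technical features of the invention are therefore limited only by the minimum requirements for implementing the general inventive concept, and not by the specific examples above.
The invention has thus been described with reference to preferred embodiments. It should be understood that those skilled in the art may make various other changes, substitutions, and additions without departing from the spirit and scope of the invention. Therefore, the scope of the invention is not limited to the specific embodiments described above, but is defined by the appended claims.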

Claims (18)

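  1. A handwritten character recognition method, comprising the steps of:
    receiving a handwriting trajectory input by a user;
    matching the handwriting trajectory against at least one stroke template to determine the matching degree of the stroke template; and
    outputting, according to the matching degree, the standard character corresponding to the matched stroke template,
    wherein the stroke template is a matching template of a standard character, and
    wherein the matching templates of at least one standard character include a complete-stroke template of the at least one standard character and an incomplete-stroke template of an incomplete-stroke standard character of the at least one standard character, wherein the incomplete-stroke standard character corresponding to the incomplete-stroke template of the at least one standard character does not form part or all of any other standard character.
  2. The method according to claim 1, wherein the matching and outputting steps are performed each time a stroke input is received.
  3. The method according to claim 1, wherein the at least one stroke template to be matched against the handwriting trajectory is all matching templates of all standard characters.
  4. The method according to claim 1, wherein the step of matching the handwriting trajectory against at least one stroke template to determine the matching degree of the stroke template further comprises:
    weighting the matching degree according to the difference between the number of strokes already input and the number of strokes of the standard character corresponding to the stroke template being matched.
  5. The method according to claim 1, further comprising:
    displaying, in the background of the handwriting trajectory, the standard character corresponding to the stroke template with the highest matching degree.
  6. The method according to any one of claims 1-5, wherein the incomplete-stroke template is generated by the following steps:
    defining basic strokes and radicals;
    assigning an index number to each stroke/radical among the basic strokes and radicals;
    representing each standard character by the index-number sequence obtained by combining, in the stroke order of the standard character, the index numbers corresponding to each of its strokes/radicals;
    sorting the index-number sequences of the standard characters;
    identifying the index number by which the at least one standard character differs from other standard characters;
    determining the unique index-number sequence of the at least one standard character, the unique index-number sequence being the partial index-number sequence, within the index-number sequence of the at least one standard character, from the starting index number up to the differing index number;
    generating an incomplete-stroke standard character corresponding to the unique index-number sequence as an incomplete-stroke standard character of the at least one standard character;
    obtaining, from complete-stroke samples of the at least one standard character, incomplete-stroke samples of that standard character corresponding to the incomplete-stroke standard character; and
    obtaining the incomplete-stroke template of the at least one standard character from the incomplete-stroke samples of the at least one standard character.
  7. The method according to claim 6, wherein the unique index-number sequence further includes index numbers that follow the differing index number.
  8. The method according to claim 7, wherein, when the stroke/radical corresponding to the differing index number comprises at least two strokes, the incomplete-stroke standard characters corresponding to the unique index-number sequence further include:
    an incomplete-stroke standard character comprising the strokes/radicals corresponding to the partial index-number sequence from the starting index number to the index number immediately preceding the differing index number, plus a stroke-by-stroke portion of the stroke/radical corresponding to the differing index number.
  9. The method according to claim 6, wherein, among the obtained incomplete-stroke templates, those for which the stroke/radical corresponding to the differing index number is one of specific strokes/radicals are discarded.
  10. The method according to claim 9, wherein the specific strokes/radicals include (inline image PCTCN2014092366-appb-100001) and "丶".
  11. The method according to claim 6, wherein the incomplete-stroke templates of the at least one standard character further include templates generated by the following steps:
    determining an identical stroke portion of at least two standard characters, the identical stroke portion being an incomplete-stroke portion of the at least two standard characters;
    treating the identical stroke portion as an intermediate standard character, and generating incomplete-stroke templates for the intermediate standard character; and
    using the generated incomplete-stroke templates and the stroke template corresponding to the intermediate standard character as incomplete-stroke templates of each of the at least two standard characters.
  12. The method according to any one of claims 1-11, wherein the standard characters are characters of one of the following: Chinese, Japanese, and Korean.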
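  13. A handwritten character recognition system, comprising:
    a handwriting input unit for receiving a handwriting trajectory input by a user;
    a template repository storing matching templates of standard characters;
    a template matching unit configured to match the handwriting trajectory against at least one matching template to determine a matching degree; and
    an output unit configured to output, according to the matching degree determined by the template matching unit, the standard character corresponding to the matched matching template,
    wherein the matching templates of at least one standard character include a complete-stroke template of the at least one standard character and an incomplete-stroke template of an incomplete-stroke standard character of the at least one standard character, wherein the incomplete-stroke standard character corresponding to the incomplete-stroke template of the at least one standard character does not form part or all of any other standard character.
  14. The handwritten character recognition system according to claim 13, wherein the template matching unit is configured to perform its operation each time the handwriting input unit receives a stroke input.
  15. The handwritten character recognition system according to claim 13, wherein the template matching unit is configured to match the handwriting trajectory against all matching templates of all standard characters.
    The output unit is configured to output more than one standard character in order of matching degree.
  16. The handwritten character recognition system according to claim 13, wherein the template matching unit is further configured to weight the matching degree according to the difference between the number of strokes already input and the number of strokes of the standard character corresponding to the matching template.
  17. The handwritten character recognition system according to claim 13, wherein the output unit is further configured to display, in the background of the handwriting trajectory, the standard character with the highest matching degree.
  18. The handwritten character recognition system according to claim 13, further comprising: a preprocessor configured to preprocess the handwriting trajectory received by the handwriting input unit and to output the preprocessed data to the template matching unit.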
PCT/CN2014/092366 2013-11-27 2014-11-27 手写字符识别方法和系统 WO2015078383A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2016532526A JP6275840B2 (ja) 2013-11-27 2014-11-27 手書き文字の識別方法

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310616121.4A CN104680196A (zh) 2013-11-27 2013-11-27 手写字符识别方法和系统
CN201310616121.4 2013-11-27

Publications (1)

Publication Number Publication Date
WO2015078383A1 true WO2015078383A1 (zh) 2015-06-04

Family

ID=53198379

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/092366 WO2015078383A1 (zh) 2013-11-27 2014-11-27 手写字符识别方法和系统

Country Status (3)

Country Link
JP (1) JP6275840B2 (zh)
CN (1) CN104680196A (zh)
WO (1) WO2015078383A1 (zh)

Also Published As

Publication number Publication date
JP2016537728A (ja) 2016-12-01
CN104680196A (zh) 2015-06-03
JP6275840B2 (ja) 2018-02-07

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14865417

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2016532526

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14865417

Country of ref document: EP

Kind code of ref document: A1