CN102402695A - Method and equipment for recognizing multilevel word combination - Google Patents

Publication number
CN102402695A
Authority
CN
China
Prior art keywords
vocabulary
candidate
multistage
combination
confidence
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2010102802367A
Other languages
Chinese (zh)
Other versions
CN102402695B (en)
Inventor
郑大念
孙俊
于浩
直井聪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Application filed by Fujitsu Ltd
Priority to CN201010280236.7A
Publication of CN102402695A
Application granted
Publication of CN102402695B
Expired - Fee Related

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Character Discrimination (AREA)

Abstract

The invention provides a method and equipment for recognizing multilevel word combination. The multilevel word combination comprises words at a plurality of levels, and different words of an upper level correspond to subsets of candidate sets of different words of a lower level. The method includes: respectively and independently recognizing words at each level, and determining a recognition result of the multilevel word combination according to a recognition result of the words at each level. The method and the equipment for recognizing the multilevel word combination have higher fault tolerance and can realize high recognition rate and low reject rate.

Description

Method and equipment for recognizing multilevel word combination
Technical field
The present application relates to character recognition processing and, more particularly, to a method and equipment for recognizing a multilevel word combination.
Background art
In some specialized OCR (optical character recognition) applications, such as address recognition (province + city + county) and product recognition (manufacturer + kind + type), the object to be recognized is a multilevel word combination whose levels run, for example, semantically from coarse/large to fine/small. In the prior art, candidate words are collected for each level, and the candidate with the highest confidence at each level is then selected separately to form the final recognized combination. This traditional approach has difficulty achieving a high recognition rate and a low rejection rate at the same time. In addition, when a word that is not at the lowest level is entered incorrectly or cannot be recognized, the traditional approach cannot correct it automatically.
Summary of the invention
A brief summary of the invention is given below to provide a basic understanding of some of its aspects. This summary is not an exhaustive overview of the invention. It is not intended to identify key or critical parts of the invention, nor to limit its scope. Its sole purpose is to present some concepts in simplified form as a prelude to the more detailed description given later.
According to one aspect of the invention, a method for recognizing a multilevel word combination is provided. The multilevel word combination comprises words at a plurality of levels, and different upper-level words correspond to different subsets of the lower-level word candidate set. The method comprises: recognizing the word at each level independently; and determining the recognition result of the multilevel word combination according to the recognition results of the words at each level.
According to another aspect of the invention, equipment for recognizing a multilevel word combination is provided. The multilevel word combination comprises words at a plurality of levels, and different upper-level words correspond to different subsets of the lower-level word candidate set. The equipment comprises: a word recognition unit configured to recognize the word at each level independently; and a word combination recognition unit configured to determine the recognition result of the multilevel word combination according to the recognition results of the words at each level.
The method and equipment for recognizing a multilevel word combination according to the invention have stronger fault tolerance and can achieve a high recognition rate and a low rejection rate in multilevel word combination recognition.
Brief description of the drawings
The above and other objects, features and advantages of the invention will be more readily understood from the following description of embodiments of the invention taken in conjunction with the accompanying drawings. The components in the drawings are intended only to illustrate the principle of the invention. In the drawings, identical or similar technical features or components are denoted by identical or similar reference numerals. In the drawings:
Fig. 1 is a flowchart of a method for recognizing a multilevel word combination according to an embodiment of the invention;
Fig. 2 is a flowchart of one implementation of the step, in the method according to an embodiment of the invention, of determining the recognition result of the multilevel word combination according to the recognition results of the words at each level;
Fig. 3 is a block diagram of equipment for recognizing a multilevel word combination according to the invention;
Fig. 4 is a schematic block diagram of a computer that can be used to implement the method and equipment according to embodiments of the invention; and
Figs. 5a-5i are examples of screen displays in which input addresses are recognized using the equipment for recognizing a multilevel word combination according to an embodiment of the invention.
Detailed description of the embodiments
Embodiments of the invention are described below with reference to the accompanying drawings. Elements and features described in one drawing or one embodiment of the invention may be combined with elements and features shown in one or more other drawings or embodiments. It should be noted that, for the sake of clarity, the drawings and the description omit the representation and description of components and processing that are unrelated to the invention or are known to those of ordinary skill in the art.
Part of the following description uses handwritten address recognition as an example of recognizing a multilevel word combination, but it should be understood that the invention is not limited to this. The invention is also applicable to recognizing multilevel word combinations similar to addresses, for example product recognition.
An embodiment of the method for recognizing a multilevel word combination according to the invention is described below with reference to Fig. 1. A multilevel word combination comprises words at a plurality of levels, and different upper-level words correspond to different subsets of the lower-level word candidate set. Taking an address of the form "province + city + county" as the object to be recognized, "Huaibin County, Xinyang, Henan" is one multilevel word combination. It comprises words at three levels: "Henan" is the first-level word, "Xinyang" is the second-level word, and "Huaibin County" is the third-level word. Each level has a candidate set for the word at that level. For example, the candidate set of first-level words includes "Henan", "Gansu" and "Shanxi", and the candidate set of second-level words includes "Xinyang", "Zhengzhou", "Jiuquan", "Lanzhou", "Taiyuan" and "Datong". Different first-level words correspond to different subsets of the lower-level candidate set: the subset of the second-level candidate set corresponding to "Henan" includes "Xinyang" and "Zhengzhou", while the subset corresponding to "Gansu" includes "Jiuquan" and "Lanzhou". Note that how the input is divided into levels is not considered here. The levels may be assumed to have been divided in advance, or to be separated by obvious gaps that make them easy to divide.
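The relationship between upper-level words and subsets of the lower-level candidate set can be pictured as a nested mapping. The following is a minimal sketch, not the patent's data structure: the names `HIERARCHY`, `level_candidate_sets` and `enumerate_combinations` are hypothetical, and the toy hierarchy only contains place names that appear in the description.

```python
# Hypothetical toy hierarchy: each upper-level word maps to the subset of
# lower-level candidates that may follow it (province -> city -> counties).
HIERARCHY = {
    "Henan": {"Xinyang": ["Huaibin County"], "Zhengzhou": []},
    "Gansu": {"Jiuquan": ["Guazhou County"], "Lanzhou": []},
}


def level_candidate_sets(hierarchy):
    """Return the candidate sets for the three levels as a tuple of sets."""
    provinces = set(hierarchy)
    cities = {city for cities_of in hierarchy.values() for city in cities_of}
    counties = {
        county
        for cities_of in hierarchy.values()
        for counties_of in cities_of.values()
        for county in counties_of
    }
    return provinces, cities, counties


def enumerate_combinations(hierarchy):
    """List every valid (province, city, county) combination in the hierarchy."""
    return [
        (province, city, county)
        for province, cities_of in hierarchy.items()
        for city, counties_of in cities_of.items()
        for county in counties_of
    ]
```

Enumerating combinations from such a mapping yields only combinations that are consistent across levels, which is what allows a good lower-level match to compensate for a poor upper-level one.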
As shown in Fig. 1, in step 102 the word at each level may be recognized independently. In step 104, the recognition result of the multilevel word combination may be determined according to the recognition results of the words at each level.
In one example, in step 102 a candidate word confidence list may be calculated for each level. The confidence list for a level contains the confidence of every candidate word in that level's candidate set. Taking an address of the form "province + city + county" as the object to be recognized: for the first-level word to be recognized, the confidence of each candidate word in the first-level candidate set with respect to that word is calculated, forming the first-level candidate word confidence list. Likewise, the second-level and third-level candidate word confidence lists are formed by calculating, for the second-level and third-level words to be recognized, the confidence of each candidate word in the corresponding candidate set. Here, the candidate word confidence list of each level may serve as the recognition result for the word at that level. Calculating a confidence is within the ability of those skilled in the art and is not described in detail here.
In one example, in step 104 a candidate multilevel word combination confidence list may be determined from the candidate word confidence lists of the individual levels. This list may contain the confidence of every candidate combination in the multilevel word combination candidate set, which may include all possible multilevel word combinations. Taking an address of the form "province + city + county" as the object to be recognized, the candidate set may include combinations such as "Huaibin County, Xinyang, Henan", "Guazhou County, Jiuquan" and "Haiyang, Yantai, Shandong". In one example, the average of the confidences of the candidate words at each level of a candidate combination may be taken as the confidence of that combination. In one example, the candidate combination with the highest confidence may be taken as the recognition result.
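The averaging rule described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and all confidence values are hypothetical, and each per-level confidence list is represented as a plain dictionary from candidate word to confidence.

```python
from statistics import mean


def combination_confidences(combinations, level_confidences):
    """Score each candidate combination as the mean of its per-level confidences.

    combinations: iterable of tuples holding one candidate word per level.
    level_confidences: list of dicts, one per level, mapping word -> confidence.
    """
    return {
        combo: mean(level_confidences[i][word] for i, word in enumerate(combo))
        for combo in combinations
    }


def best_combination(scores):
    """Return the candidate combination with the highest confidence."""
    return max(scores, key=scores.get)


# Made-up confidences: the city level is almost unreadable, yet the
# consistent combination still wins on the overall average.
level_confidences = [
    {"Henan": 0.9, "Gansu": 0.3},
    {"Xinyang": 0.2, "Jiuquan": 0.4},
    {"Huaibin County": 0.8, "Guazhou County": 0.1},
]
combinations = [
    ("Henan", "Xinyang", "Huaibin County"),
    ("Gansu", "Jiuquan", "Guazhou County"),
]
scores = combination_confidences(combinations, level_confidences)
winner = best_combination(scores)
```

Scoring whole combinations rather than picking the best word per level is what lets a weak level be outvoted by the other levels.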
In one example, because the lowest-level word is considered to carry the most information, in step 104 the N candidate combinations with the highest confidence may be taken from the multilevel word combination candidate set, and the one among them whose lowest-level candidate word has the highest confidence may be determined as the recognition result, where N is an integer greater than or equal to 1. Taking an address of the form "province + city + county" as the object to be recognized, suppose N is 2 and the two candidate combinations with the highest confidence are "Laixi, Qingdao" and "Laizhou, Yantai, Shandong". The confidences of the lowest-level candidate words "Laixi" and "Laizhou" in these two combinations are then compared. If the confidence of "Laixi" is higher, "Laixi, Qingdao" is determined as the recognition result; if the confidence of "Laizhou" is higher, "Laizhou, Yantai, Shandong" is determined as the recognition result.
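The top-N selection followed by a lowest-level comparison can be sketched as below. The function name, the tuple layout (province, city, county) and all numbers are assumptions made for illustration only.

```python
def pick_by_lowest_level(scores, lowest_level_conf, n=2):
    """Among the n best-scoring combinations, return the one whose
    lowest-level (last) word has the highest confidence.

    scores: dict mapping combination tuple -> combination confidence.
    lowest_level_conf: dict mapping lowest-level word -> confidence.
    """
    if n < 1:
        raise ValueError("n must be an integer greater than or equal to 1")
    top_n = sorted(scores, key=scores.get, reverse=True)[:n]
    return max(top_n, key=lambda combo: lowest_level_conf[combo[-1]])


# Hypothetical values; tuples are ordered province, city, county.
scores = {
    ("Shandong", "Qingdao", "Laixi"): 0.62,
    ("Shandong", "Yantai", "Laizhou"): 0.60,
    ("Shandong", "Yantai", "Haiyang"): 0.30,
}
lowest_level_conf = {"Laixi": 0.55, "Laizhou": 0.70, "Haiyang": 0.90}
result = pick_by_lowest_level(scores, lowest_level_conf, n=2)
```

With these numbers "Haiyang" has the best county-level confidence but is never considered, because its combination is not among the two best overall.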
In one example, in step 104, calculating the candidate multilevel word combination confidence list from the candidate word confidence lists of the individual levels may comprise looking up, in those lists, the confidence of the candidate word at each level of each candidate combination in the candidate set, and taking the weighted average of those confidences as the confidence of that candidate combination. When the word at each level consists of characters, the weight of a level may be the number of characters in that level's candidate word divided by the number of characters in the whole candidate combination. Take the candidate combination "Heilongjiang Harbin Wuchang" (黑龙江哈尔滨五常) as an example. The combination contains 8 characters, and "Heilongjiang" (黑龙江), "Harbin" (哈尔滨) and "Wuchang" (五常) contain 3, 3 and 2 characters respectively, so the corresponding weights are 3/8, 3/8 and 1/4.
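The character-count weighting can be checked with a short sketch. The character counts 3, 3 and 2 and the weights 3/8, 3/8 and 1/4 come from the example above; the Chinese strings are an assumed reconstruction of the machine-translated place names, and the confidence values are made up.

```python
from fractions import Fraction


def char_count_weights(words):
    """Weight of each level = characters in its word / characters in the combination."""
    total = sum(len(word) for word in words)
    return [Fraction(len(word), total) for word in words]


def weighted_confidence(words, confidences):
    """Weighted average of per-level confidences, weighted by character count."""
    weights = char_count_weights(words)
    return sum(float(w) * c for w, c in zip(weights, confidences))


# Assumed original strings with 3, 3 and 2 characters respectively.
combo = ("黑龙江", "哈尔滨", "五常")
weights = char_count_weights(combo)
# Hypothetical per-level confidences for province, city and county.
score = weighted_confidence(combo, [0.8, 0.4, 0.6])
```

Longer words thus contribute more to the combination confidence than shorter ones, in proportion to how many recognized characters support them.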
In one example, when the word at each level consists of characters, calculating the candidate word confidence list of a level may comprise calculating the confidence of each character in the word at that level, and averaging the character confidences to obtain the confidence of each candidate word at that level.
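Averaging character confidences into word confidences might look like the sketch below. How the character confidences themselves are obtained is left open by the description, so the inputs here are simply assumed to be given; the function name and the values are hypothetical.

```python
def level_confidence_list(candidate_char_scores):
    """Build one level's candidate word confidence list.

    candidate_char_scores: dict mapping each candidate word to the list of
    confidences of its characters (assumed to come from a character recognizer).
    Returns (word, confidence) pairs sorted from most to least confident.
    """
    scored = {
        word: sum(char_scores) / len(char_scores)
        for word, char_scores in candidate_char_scores.items()
        if char_scores  # skip candidates with no character scores
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Hypothetical character confidences for two second-level candidates.
city_list = level_confidence_list(
    {"Xinyang": [0.5, 0.7], "Zhengzhou": [0.2, 0.4]}
)
```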
One implementation of step 104 in the method for recognizing a multilevel word combination according to an embodiment of the invention is described below with reference to Fig. 2.
As shown in Fig. 2, in step 202 the N candidate combinations with the highest confidence may be selected from the multilevel word combination candidate set and output as candidate recognition results, where N is an integer greater than or equal to 1. In step 204, it may be judged whether the confidence of the lowest-level candidate word in the single candidate combination with the highest confidence is greater than or equal to a predetermined threshold. If it is (YES in step 204), then in step 206 that candidate combination is determined as the recognition result of the multilevel word combination. If it is less than the predetermined threshold (NO in step 204), the input is rejected.
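The flow of Fig. 2 can be sketched as a single function. This is a minimal illustration, not the patented implementation: the function name, the default N and the sample scores are assumptions, and the default threshold of 0.40 is borrowed from the screen-display example later in the description.

```python
def recognize_or_reject(scores, lowest_level_conf, threshold=0.40, n=3):
    """Return (result, candidates); result is None when the input is rejected.

    scores: dict mapping combination tuple -> combination confidence.
    lowest_level_conf: dict mapping lowest-level word -> confidence.
    """
    # Step 202: output the n most confident combinations as candidates.
    candidates = sorted(scores, key=scores.get, reverse=True)[:n]
    if not candidates:
        return None, []
    # Step 204: test the lowest-level word of the single best combination.
    best = candidates[0]
    if lowest_level_conf[best[-1]] >= threshold:
        return best, candidates  # Step 206: accept the best combination.
    return None, candidates  # Reject; candidates remain for manual review.


# Hypothetical scores for two candidate combinations.
scores = {
    ("Henan", "Xinyang", "Huaibin County"): 0.58,
    ("Gansu", "Jiuquan", "Guazhou County"): 0.21,
}
accepted, _ = recognize_or_reject(
    scores, {"Huaibin County": 0.43, "Guazhou County": 0.10}
)
rejected, shortlist = recognize_or_reject(
    scores, {"Huaibin County": 0.39, "Guazhou County": 0.10}
)
```

Returning the candidate list even on rejection keeps the top-N output of step 202 available to a human operator.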
In one example, after rejection, the candidate recognition results output in step 202 may be judged manually.
The equipment 300 for recognizing a multilevel word combination according to the invention is described below with reference to Fig. 3.
As shown in Fig. 3, the equipment 300 may comprise: a word recognition unit 302 configured to recognize the word at each level independently; and a word combination recognition unit 304 configured to determine the recognition result of the multilevel word combination according to the recognition results of the words at each level.
In one example, the word recognition unit 302 may calculate a candidate word confidence list for each level, the list for a level containing the confidence of every candidate word in that level's candidate set. The word combination recognition unit 304 may determine the candidate multilevel word combination confidence list from the candidate word confidence lists of the individual levels. The candidate multilevel word combination confidence list contains the confidence of every candidate combination in the multilevel word combination candidate set.
In another example, the word combination recognition unit 304 selects the N candidate combinations with the highest confidence from the multilevel word combination candidate set and outputs them as candidate recognition results, and judges whether the confidence of the lowest-level candidate word in the single candidate combination with the highest confidence is greater than or equal to a predetermined threshold. If so, that candidate combination is determined as the recognition result of the multilevel word combination; if not, the input is rejected. N is an integer greater than or equal to 1.
In another example, the word combination recognition unit 304 takes the N candidate combinations with the highest confidence from the multilevel word combination candidate set and determines the one among them whose lowest-level candidate word has the highest confidence as the recognition result of the multilevel word combination, where N is an integer greater than or equal to 1.
Optionally, the word combination recognition unit looks up, in the candidate word confidence lists of the individual levels, the confidence of the candidate word at each level of each candidate combination in the candidate set, and takes the weighted average of those confidences as the confidence of that candidate combination.
Optionally, the word at each level consists of characters, and the word recognition unit calculates the confidence of each character in the word at each level and averages the character confidences to obtain the confidence of each candidate word at that level.
Optionally, the word combination recognition unit uses, as the weight of each level, the number of characters in that level's candidate word divided by the number of characters in the whole candidate combination, and calculates the weighted average of the confidences of the candidate words at each level of each candidate combination.
For the operation and function of each component of the equipment 300, reference may be made to the method for recognizing a multilevel word combination described with reference to Figs. 1 and 2; the details are not repeated here.
Fig. 4 shows a schematic block diagram of a computer that can be used to implement the method and equipment according to embodiments of the invention. In Fig. 4, a central processing unit (CPU) 401 performs various processing according to a program stored in a read-only memory (ROM) 402 or a program loaded from a storage section 408 into a random access memory (RAM) 403. The RAM 403 also stores, as needed, the data required when the CPU 401 performs the various processing. The CPU 401, the ROM 402 and the RAM 403 are connected to one another via a bus 404. An input/output interface 405 is also connected to the bus 404.
The following components are connected to the input/output interface 405: an input section 406 (including a keyboard, a mouse and the like), an output section 407 (including a display such as a cathode ray tube (CRT) or liquid crystal display (LCD), a speaker and the like), the storage section 408 (including a hard disk and the like), and a communication section 409 (including a network interface card such as a LAN card, a modem and the like). The communication section 409 performs communication processing via a network such as the Internet. A drive 410 may also be connected to the input/output interface 405 as needed. A removable medium 411, such as a magnetic disk, an optical disc, a magneto-optical disc or a semiconductor memory, is mounted on the drive 410 as needed, so that a computer program read from it is installed into the storage section 408 as needed.
When the above series of processing is implemented in software, the program constituting the software is installed from a network such as the Internet or from a storage medium such as the removable medium 411.
Those skilled in the art will understand that this storage medium is not limited to the removable medium 411 shown in Fig. 4, which stores the program and is distributed separately from the equipment in order to provide the program to the user. Examples of the removable medium 411 include a magnetic disk (including a floppy disk (registered trademark)), an optical disc (including a compact disc read-only memory (CD-ROM) and a digital versatile disc (DVD)), a magneto-optical disc (including a MiniDisc (MD) (registered trademark)) and a semiconductor memory. Alternatively, the storage medium may be the ROM 402, a hard disk included in the storage section 408, or the like, in which the program is stored and which is distributed to the user together with the equipment containing it.
Figs. 5a-5i are examples of screen displays in which input multilevel addresses are recognized using the equipment for recognizing a multilevel word combination according to an embodiment of the invention. The predetermined confidence threshold is set to 0.40, and each multilevel address comprises three levels: province, city and county.
In Fig. 5a, every level of the input multilevel address is of good quality, so the input address is recognized correctly. In Fig. 5b, every level of the input address is of poor quality, but the confidence of the lowest level, that is, the county-level address, is 0.43, which is greater than the predetermined threshold of 0.40, so the input address is still recognized correctly. In Fig. 5c, the intermediate level, that is, the city-level address, is of such poor quality that it is almost unrecognizable, but because the confidence of the input county-level address is greater than the predetermined threshold, the equipment still correctly recognizes the intended multilevel address. The method for recognizing a multilevel word combination according to embodiments of the invention is therefore robust to low-quality multilevel address input. When a non-lowest-level (first-level or second-level) address in the input is difficult to recognize, for example because it is wrong or missing, the method can still recognize the input multilevel address by relying on the confidence of the multilevel word combination as a whole.
In Fig. 5d, no county-level address is input, so only the province-level and city-level addresses can be recognized. In Fig. 5e, a county-level address is mistakenly input as the intermediate-level address, and the equipment still correctly recognizes the intended multilevel address. In Fig. 5f, a wrong city-level address is input, and the equipment again correctly recognizes the intended multilevel address. The method and equipment for recognizing a multilevel word combination according to embodiments of the invention therefore have a higher tolerance for erroneous input.
In Figs. 5g-5i, the input multilevel address is not recognized correctly. Because the confidence of the lowest-level address is less than the predetermined threshold, the method and equipment according to embodiments of the invention reject the input multilevel address.
The concrete address recognition examples above show that the method and equipment for recognizing a multilevel word combination according to embodiments of the invention have stronger fault tolerance and can achieve a high recognition rate and a low rejection rate.
Some embodiments of the invention have been described in detail above. As those of ordinary skill in the art will appreciate, all or any of the steps or components of the method and equipment of the invention can be implemented in hardware, firmware, software or a combination thereof, in any computing device (including a processor, a storage medium and the like) or network of computing devices. Those of ordinary skill in the art can do this using their basic programming skills once they understand the content of the invention, so no detailed explanation is needed here.
In addition, where the above description involves possible external operations, any display device and any input device connected to any computing device, together with the corresponding interfaces and control programs, will obviously be used. In general, the relevant hardware and software in a computer, computer system or computer network, together with the hardware, firmware, software or combination thereof that implements the operations of the foregoing method, constitute the equipment of the invention and its components.
Based on this understanding, the object of the invention can therefore also be achieved by running a program or a set of programs on any information processing device, which may be a well-known general-purpose device. The object of the invention can thus also be achieved merely by providing a program product containing program code that implements the method or equipment. In other words, such a program product also constitutes the invention, and so does a medium that stores or transmits such a program product. Obviously, the storage or transmission medium can be of any type known to those skilled in the art or developed in the future, so it is unnecessary to enumerate the various storage or transmission media here.
In the equipment and method of the invention, the components or steps can obviously be decomposed, combined, and/or recombined after decomposition. Such decomposition and/or recombination should be regarded as equivalent solutions of the invention. It should also be pointed out that the steps of the above series of processing can naturally be performed chronologically in the order described, but need not be. Some steps may be performed in parallel or independently of one another. Furthermore, in the above description of specific embodiments of the invention, features described and/or illustrated for one embodiment can be used in the same or a similar way in one or more other embodiments, combined with features of other embodiments, or substituted for features of other embodiments.
It should be emphasized that the term "comprise/include", as used herein, refers to the presence of a feature, element, step or component, but does not exclude the presence or addition of one or more other features, elements, steps or components.
Although the invention and its advantages have been described in detail, it should be understood that various changes, substitutions and alterations can be made without departing from the spirit and scope of the invention as defined by the appended claims. Moreover, the scope of the application is not limited to the specific embodiments of the processes, equipment, means, methods and steps described in the specification. One of ordinary skill in the art will readily appreciate from the disclosure that processes, equipment, means, methods or steps, whether existing now or developed in the future, that perform substantially the same function or achieve substantially the same result as the corresponding embodiments described herein may be used according to the invention. Accordingly, the appended claims are intended to include such processes, equipment, means, methods or steps within their scope.
With regard to the implementations including the above embodiments, the following remarks are also disclosed.
Remarks
1. A method for recognizing a multi-level word combination, the multi-level word combination comprising words at a plurality of levels, wherein different upper-level words correspond to different subsets of the lower-level word candidate set, the method comprising:
recognizing the word at each level separately and independently; and
determining the recognition result of the multi-level word combination according to the recognition results of the words at each level.
2. The method of remark 1, wherein recognizing the word at each level comprises calculating a candidate word confidence list for each level, the candidate word confidence list for each level comprising the confidence of each candidate word in the candidate set of that level; and
determining the recognition result of the multi-level word combination according to the recognition results of the words at each level comprises determining a candidate multi-level word combination confidence list according to the candidate word confidence lists of the levels, the candidate multi-level word combination confidence list comprising the confidence of each candidate multi-level word combination in the multi-level word combination candidate set.
3. The method of remark 2, wherein determining the recognition result of the multi-level word combination according to the recognition results of the words at each level comprises: selecting, from the multi-level word combination candidate set, the N candidate multi-level word combinations with the highest confidence and outputting them as candidate recognition results; and judging whether the confidence of the lowest-level candidate word in the candidate multi-level word combination with the highest confidence is greater than or equal to a predetermined threshold; if so, determining that candidate multi-level word combination as the recognition result of the multi-level word combination; if not, rejecting recognition, wherein N is an integer greater than or equal to 1.
4. The method of remark 2, wherein determining the recognition result of the multi-level word combination according to the recognition results of the words at each level comprises: selecting, from the N candidate multi-level word combinations with the highest confidence in the multi-level word combination candidate set, the candidate multi-level word combination whose corresponding lowest-level candidate word has the highest confidence, and determining it as the recognition result of the multi-level word combination, wherein N is an integer greater than or equal to 1.
5. The method of remark 2, wherein determining the candidate multi-level word combination confidence list according to the candidate word confidence lists of the levels comprises: looking up, in the candidate word confidence list of each level, the confidence of the candidate word at that level corresponding to each candidate multi-level word combination in the multi-level word combination candidate set; and calculating the weighted average of the confidences of the per-level candidate words corresponding to each candidate multi-level word combination as the confidence of that candidate multi-level word combination.
6. The method of remark 5, wherein the word at each level is composed of characters, and calculating the candidate word confidence list for each level comprises calculating the confidence of each character contained in the word at each level and averaging the confidences of the characters contained in the word as the confidence of the candidate word at that level.
7. The method of remark 6, wherein calculating the weighted average of the confidences of the per-level candidate words corresponding to each candidate multi-level word combination as the confidence of that candidate multi-level word combination comprises: using, as the weight of each level, the number of characters contained in the candidate word at that level corresponding to the candidate multi-level word combination, and obtaining the weighted average by dividing the weighted sum of the per-level candidate word confidences by the number of characters contained in the candidate multi-level word combination.
8. Equipment for recognizing a multi-level word combination, the multi-level word combination comprising words at a plurality of levels, wherein different upper-level words correspond to different subsets of the lower-level word candidate set, the equipment comprising:
a word recognition unit configured to recognize the word at each level separately and independently; and
a word combination recognition unit configured to determine the recognition result of the multi-level word combination according to the recognition results of the words at each level.
9. The equipment of remark 8, wherein the word recognition unit calculates a candidate word confidence list for each level, the candidate word confidence list for each level comprising the confidence of each candidate word in the candidate set of that level; and
the word combination recognition unit determines a candidate multi-level word combination confidence list according to the candidate word confidence lists of the levels, the candidate multi-level word combination confidence list comprising the confidence of each candidate multi-level word combination in the multi-level word combination candidate set.
10. The equipment of remark 9, wherein the word combination recognition unit selects, from the multi-level word combination candidate set, the N candidate multi-level word combinations with the highest confidence and outputs them as candidate recognition results, and judges whether the confidence of the lowest-level candidate word in the candidate multi-level word combination with the highest confidence is greater than or equal to a predetermined threshold; if so, that candidate multi-level word combination is determined as the recognition result of the multi-level word combination; if not, recognition is rejected, wherein N is an integer greater than or equal to 1.
11. The equipment of remark 9, wherein the word combination recognition unit selects, from the N candidate multi-level word combinations with the highest confidence in the multi-level word combination candidate set, the candidate multi-level word combination whose corresponding lowest-level candidate word has the highest confidence, and determines it as the recognition result of the multi-level word combination, wherein N is an integer greater than or equal to 1.
12. The equipment of remark 9, wherein the word combination recognition unit looks up, in the candidate word confidence list of each level, the confidence of the candidate word at that level corresponding to each candidate multi-level word combination in the multi-level word combination candidate set, and calculates the weighted average of the confidences of the per-level candidate words corresponding to each candidate multi-level word combination as the confidence of that candidate multi-level word combination.
13. The equipment of remark 12, wherein the word at each level is composed of characters, and the word recognition unit calculates the confidence of each character contained in the word at each level and averages the confidences of the characters contained in the word as the confidence of the candidate word at that level.
14. The equipment of remark 13, wherein the word combination recognition unit uses, as the weight of each level, the number of characters contained in the candidate word at that level corresponding to each candidate multi-level word combination, and obtains the weighted average of the per-level candidate word confidences by dividing their weighted sum by the number of characters contained in the candidate multi-level word combination.
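The confidence calculations of remarks 5 to 7 (and of the corresponding equipment remarks 12 to 14) can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the implementation disclosed in the specification: the function names `word_confidence`, `combination_confidence`, and `rank_combinations`, the dictionary-based confidence lists, and the romanized two-level place names with their confidence values are all hypothetical, and each Latin letter stands in for one recognized character.

```python
def word_confidence(char_confidences):
    """Confidence of one candidate word: plain mean of its character confidences."""
    return sum(char_confidences) / len(char_confidences)


def combination_confidence(words, level_confidences):
    """Character-count-weighted mean of the per-level word confidences.

    words: one candidate combination, one word per level (top level first).
    level_confidences: one dict per level, mapping candidate word -> confidence.
    """
    total_chars = sum(len(word) for word in words)
    weighted_sum = sum(
        len(word) * confidences[word]
        for word, confidences in zip(words, level_confidences)
    )
    return weighted_sum / total_chars


def rank_combinations(candidate_combinations, level_confidences):
    """Candidate combination confidence list, highest confidence first."""
    scored = [
        (combo, combination_confidence(combo, level_confidences))
        for combo in candidate_combinations
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical data: each level was recognized independently of the other.
level_confidences = [
    {"Tokyo": 0.5, "Kanagawa": 0.9},      # upper level
    {"Shinjuku": 0.8, "Kawasaki": 0.7},   # lower level
]
# Only combinations allowed by the hierarchy are candidates.
candidate_combinations = [("Tokyo", "Shinjuku"), ("Kanagawa", "Kawasaki")]
ranked = rank_combinations(candidate_combinations, level_confidences)
```

With these hypothetical values, ("Kanagawa", "Kawasaki") ranks first even though "Shinjuku" has the highest lower-level confidence on its own, because the weighting by character count lets the confidently recognized upper-level word contribute in proportion to its length.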

Claims (10)

1. the method for the multistage vocabulary combination of identification is characterized in that, said multistage vocabulary combination comprises a plurality of other vocabulary of level, and different higher level's vocabulary is corresponding to the subclass of different subordinate's vocabulary Candidate Sets, and said method comprises:
Every grade of vocabulary of independent respectively identification;
Confirm the recognition result of multistage vocabulary combination according to the recognition result of every grade of vocabulary.
2. the method for claim 1 is wherein discerned every grade of vocabulary and is comprised and calculate every grade of candidate's vocabulary degree of confidence tabulation, and said every grade of vocabulary degree of confidence tabulation comprises the degree of confidence of each the candidate's vocabulary in the Candidate Set of every grade of vocabulary;
The recognition result that calculates multistage vocabulary combination according to the recognition result of every grade of vocabulary comprises according to every grade of candidate's vocabulary degree of confidence tabulation confirms the tabulation of the multistage vocabulary combination of candidate degree of confidence, and the tabulation of the multistage vocabulary combination of said candidate degree of confidence comprises the degree of confidence of the multistage vocabulary combination of each candidate in the multistage vocabulary combination Candidate Set.
3. The method of claim 2, wherein determining the recognition result of the multi-level word combination according to the recognition results of the words at each level comprises: selecting, from the multi-level word combination candidate set, the N candidate multi-level word combinations with the highest confidence and outputting them as candidate recognition results; and judging whether the confidence of the lowest-level candidate word in the candidate multi-level word combination with the highest confidence is greater than or equal to a predetermined threshold; if so, determining that candidate multi-level word combination as the recognition result of the multi-level word combination; if not, rejecting recognition, wherein N is an integer greater than or equal to 1.
4. The method of claim 2, wherein determining the recognition result of the multi-level word combination according to the recognition results of the words at each level comprises: selecting, from the N candidate multi-level word combinations with the highest confidence in the multi-level word combination candidate set, the candidate multi-level word combination whose corresponding lowest-level candidate word has the highest confidence, and determining it as the recognition result of the multi-level word combination, wherein N is an integer greater than or equal to 1.
5. The method of claim 2, wherein determining the candidate multi-level word combination confidence list according to the candidate word confidence lists of the levels comprises: looking up, in the candidate word confidence list of each level, the confidence of the candidate word at that level corresponding to each candidate multi-level word combination in the multi-level word combination candidate set; and calculating the weighted average of the confidences of the per-level candidate words corresponding to each candidate multi-level word combination as the confidence of that candidate multi-level word combination.
6. Equipment for recognizing a multi-level word combination, characterized in that the multi-level word combination comprises words at a plurality of levels, different upper-level words correspond to different subsets of the lower-level word candidate set, and the equipment comprises:
a word recognition unit configured to recognize the word at each level separately and independently; and
a word combination recognition unit configured to determine the recognition result of the multi-level word combination according to the recognition results of the words at each level.
7. The equipment of claim 6, wherein the word recognition unit calculates a candidate word confidence list for each level, the candidate word confidence list for each level comprising the confidence of each candidate word in the candidate set of that level; and
the word combination recognition unit determines a candidate multi-level word combination confidence list according to the candidate word confidence lists of the levels, the candidate multi-level word combination confidence list comprising the confidence of each candidate multi-level word combination in the multi-level word combination candidate set.
8. The equipment of claim 7, wherein the word combination recognition unit selects, from the multi-level word combination candidate set, the N candidate multi-level word combinations with the highest confidence and outputs them as candidate recognition results, and judges whether the confidence of the lowest-level candidate word in the candidate multi-level word combination with the highest confidence is greater than or equal to a predetermined threshold; if so, that candidate multi-level word combination is determined as the recognition result of the multi-level word combination; if not, recognition is rejected, wherein N is an integer greater than or equal to 1.
9. The equipment of claim 7, wherein the word combination recognition unit selects, from the N candidate multi-level word combinations with the highest confidence in the multi-level word combination candidate set, the candidate multi-level word combination whose corresponding lowest-level candidate word has the highest confidence, and determines it as the recognition result of the multi-level word combination, wherein N is an integer greater than or equal to 1.
10. The equipment of claim 7, wherein the word combination recognition unit looks up, in the candidate word confidence list of each level, the confidence of the candidate word at that level corresponding to each candidate multi-level word combination in the multi-level word combination candidate set, and calculates the weighted average of the confidences of the per-level candidate words corresponding to each candidate multi-level word combination as the confidence of that candidate multi-level word combination.
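The two decision strategies of claims 3 and 4 (and of equipment claims 8 and 9) can be sketched as follows. This is a hedged illustration only: the function names `decide_with_rejection` and `decide_by_lowest_level`, the data layout, and all sample values are hypothetical. The sketch assumes that the candidate combination confidence list is non-empty and already sorted from highest to lowest confidence, that each combination is a tuple whose last element is the lowest-level word, and that N is at least 1.

```python
def decide_with_rejection(ranked, lowest_level_conf, threshold, n=1):
    """Top-N output with rejection (in the manner of claims 3 and 8).

    ranked: list of (combination, confidence), highest confidence first.
    lowest_level_conf: dict mapping lowest-level candidate word -> confidence.
    Returns (result, candidates); result is None when recognition is rejected.
    """
    candidates = ranked[:n]
    best_combo, _ = candidates[0]
    # Only the lowest-level word of the top-ranked combination is tested.
    if lowest_level_conf[best_combo[-1]] >= threshold:
        return best_combo, candidates
    return None, candidates


def decide_by_lowest_level(ranked, lowest_level_conf, n=1):
    """Among the top-N combinations, pick the one whose lowest-level word
    has the highest confidence (in the manner of claims 4 and 9)."""
    candidates = ranked[:n]
    best_combo, _ = max(
        candidates, key=lambda item: lowest_level_conf[item[0][-1]]
    )
    return best_combo


# Hypothetical combination confidence list and lowest-level confidence list.
ranked = [(("A", "x"), 0.80), (("B", "y"), 0.75), (("A", "z"), 0.40)]
lowest_level_conf = {"x": 0.5, "y": 0.9, "z": 0.3}

accepted, _ = decide_with_rejection(ranked, lowest_level_conf, threshold=0.5, n=2)
rejected, shortlist = decide_with_rejection(ranked, lowest_level_conf, threshold=0.6, n=2)
reranked = decide_by_lowest_level(ranked, lowest_level_conf, n=2)
```

In this sketch the first strategy trades recognition rate against reject rate through the threshold, while the second never rejects and instead lets the lowest-level confidence break the choice among the N best combinations. With N equal to 1 the second strategy simply returns the top-ranked combination.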
CN201010280236.7A 2010-09-09 2010-09-09 Method and equipment for recognizing multilevel word combination Expired - Fee Related CN102402695B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010280236.7A CN102402695B (en) 2010-09-09 2010-09-09 Method and equipment for recognizing multilevel word combination

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201010280236.7A CN102402695B (en) 2010-09-09 2010-09-09 Method and equipment for recognizing multilevel word combination

Publications (2)

Publication Number Publication Date
CN102402695A (en) 2012-04-04
CN102402695B (en) 2014-05-14

Family

ID=45884883

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010280236.7A Expired - Fee Related CN102402695B (en) 2010-09-09 2010-09-09 Method and equipment for recognizing multilevel word combination

Country Status (1)

Country Link
CN (1) CN102402695B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001054054A1 (en) * 2000-01-19 2001-07-26 California Institute Of Technology Word recognition using silhouette bar codes
CN101359373A (en) * 2007-08-03 2009-02-04 富士通株式会社 Method and device for recognizing degraded character
CN101645134A (en) * 2005-07-29 2010-02-10 富士通株式会社 Integral place name recognition method and integral place name recognition device

Also Published As

Publication number Publication date
CN102402695B (en) 2014-05-14

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20140514

Termination date: 20210909