CN106940798A - Method and terminal for correcting text recognition - Google Patents

Method and terminal for correcting text recognition

Info

Publication number
CN106940798A
Authority
CN
China
Prior art keywords
word
modified
character
character features
unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201710135955.1A
Other languages
Chinese (zh)
Inventor
江克俊
刘海强
曹晓杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Jinli Communication Equipment Co Ltd
Original Assignee
Shenzhen Jinli Communication Equipment Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Jinli Communication Equipment Co Ltd
Priority to CN201710135955.1A
Publication of CN106940798A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/146 Aligning or centring of the image pick-up or image-field
    • G06V30/1475 Inclination or skew detection or correction of characters or of image to be recognised
    • G06V30/1478 Inclination or skew detection or correction of characters or of image to be recognised of characters or characters lines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G06V10/235 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition based on user input or interaction

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Character Discrimination (AREA)

Abstract

The embodiments of the invention disclose a method and a terminal for correcting text recognition. The method includes: generating a correction request that includes text to be corrected, the text to be corrected being selected from a text recognition result generated by using optical character recognition (OCR) to recognize the image text on a pre-stored picture; obtaining the character features of the image text in the pre-stored picture that corresponds to the text to be corrected; comparing the character features of the image text with the character features of pre-stored characters to obtain several candidate texts, where the similarity between the character features of each candidate text and those of the corresponding image text is higher than a preset standard value; displaying the candidate texts for the user to select; and, if it is detected that the user has selected a candidate text from the candidate texts, replacing the text to be corrected in the text recognition result with the selected candidate text. In this way the embodiments can reduce the user's operations and improve correction efficiency.

Description

Method and terminal for correcting text recognition
Technical field
The present invention relates to the field of electronic technology, and in particular to a method and a terminal for correcting text recognition.
Background
OCR (Optical Character Recognition) is an image recognition technology that recognizes text by optical means. OCR has been widely applied in automatic identification research. For example, when building an online library, paper books are scanned and stored as electronic files, which are then displayed as text after OCR text recognition. OCR often produces recognition errors, especially by confusing similar-looking characters. After a recognition error occurs, the user usually has to edit it manually, i.e. type the correct text on a keyboard. This way of correcting requires cumbersome operations from the user and makes correction inefficient.
Summary of the invention
The embodiments of the present invention provide a method and a terminal for correcting text recognition, which can reduce the user's operations and improve correction efficiency.
In a first aspect, an embodiment of the present invention provides a method for correcting text recognition, the method including:
generating a correction request that includes text to be corrected, the text to be corrected being selected from a text recognition result generated by using optical character recognition to recognize the image text on a pre-stored picture; obtaining the character features of the image text in the pre-stored picture that corresponds to the text to be corrected; comparing the character features of the image text with the character features of pre-stored characters to obtain several candidate texts, where the similarity between the character features of each candidate text and those of the corresponding image text is higher than a preset standard value; displaying the candidate texts for the user to select; and, if it is detected that the user has selected a candidate text from the candidate texts, replacing the text to be corrected in the text recognition result with the selected candidate text.
In another aspect, an embodiment of the present invention provides a terminal, the terminal including: a lead-out unit, an acquiring unit, a comparison unit, a display unit and a replacement unit.
The lead-out unit is configured to generate a correction request that includes text to be corrected, the text to be corrected being selected from a text recognition result generated by using OCR to recognize the image text on a pre-stored picture. The acquiring unit is configured to obtain the character features of the image text in the pre-stored picture that corresponds to the text to be corrected. The comparison unit is configured to compare the character features of the image text with the character features of the pre-stored characters and to obtain several candidate texts from the pre-stored characters, where the similarity between the character features of the candidate texts and those of the corresponding image text is higher than the similarity between the remaining pre-stored characters and the character features of the corresponding image text. The display unit is configured to display the candidate texts for the user to select. The replacement unit is configured to replace the text to be corrected in the text recognition result with the selected candidate text if it is detected that the user has selected a candidate text from the candidate texts.
In the method for correcting text recognition disclosed in the embodiments of the present invention, the text to be corrected is obtained from the correction request; the character features of the image text in the pre-stored picture that corresponds to the text to be corrected are then obtained; these character features are compared with the character features of the pre-stored characters to obtain several candidate texts; the candidate texts are displayed for the user to select; and the candidate text selected by the user replaces the text to be corrected in the text recognition result. Because candidate texts are displayed for the user to select, the user no longer needs to type the corrected text on a keyboard, which reduces the user's correction operations and improves correction efficiency.
Brief description of the drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the drawings needed for the description of the embodiments are briefly introduced below. The drawings described below show some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a method for correcting text recognition provided by an embodiment of the present invention;
Fig. 2a is a first display interface of the current screen provided by an embodiment of the present invention;
Fig. 2b is a second display interface of the current screen provided by an embodiment of the present invention;
Fig. 3a is a third display interface of the current screen provided by an embodiment of the present invention;
Fig. 3b is a fourth display interface of the current screen provided by an embodiment of the present invention;
Fig. 4 is a schematic flowchart of a sub-process in Fig. 1;
Fig. 5 is a schematic block diagram of a first embodiment of a terminal provided by an embodiment of the present invention;
Fig. 6 is a schematic block diagram of the comparison unit in Fig. 5 provided by an embodiment of the present invention;
Fig. 7 is a schematic block diagram of a second embodiment of a terminal provided by an embodiment of the present invention;
Fig. 8 is a schematic block diagram of a third embodiment of a terminal provided by an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the scope of protection of the present invention.
It should be understood that, when used in this specification and the appended claims, the terms "comprising" and "including" indicate the presence of the described features, integers, steps, operations, elements and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or sets thereof. It should also be understood that the terms used in this specification are only for the purpose of describing specific embodiments and are not intended to limit the present invention. As used in this specification and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms unless the context clearly indicates otherwise.
The method for correcting text recognition provided by the embodiments of the present invention runs on a terminal. The terminal includes, but is not limited to, any electronic device capable of human-computer interaction with a user, such as a smartphone (for example an Android phone, an iOS phone or a Windows Phone phone), a tablet computer, a palmtop computer, a notebook computer or a mobile internet device. These devices are examples only and the list is not exhaustive. Note, however, that the terminal provided by the embodiments of the present invention has an optical character recognition (Optical Character Recognition) function and can perform OCR recognition.
Because errors often occur during OCR recognition, the method for correcting text recognition provided by the embodiments of the present invention corrects errors in the text recognition result produced by OCR, reducing the user's correction operations and improving correction efficiency.
Refer to Fig. 1, which is a schematic flowchart of a method for correcting text recognition provided by an embodiment of the present invention. As shown in the figure, the method includes S101 to S105:
S101: generate a correction request, the correction request including text to be corrected.
Specifically, the text may consist of Chinese characters, letters or digits; the present invention places no specific restriction on this. The text to be corrected is selected from the text recognition result generated by using optical character recognition to recognize the image text on a pre-stored picture. That is, the image text on the pre-stored picture is first recognized with OCR to generate a text recognition result, which is displayed on the screen; the text to be corrected is then selected from the text recognition result and a correction request is generated. The text to be corrected may be selected according to a user operation or selected automatically, among other ways.
S102: obtain the character features of the image text in the pre-stored picture that corresponds to the text to be corrected.
Specifically, the text to be corrected is selected from the text recognition result produced by OCR for the pre-stored picture, and the text recognition result corresponds one-to-one with the image text in the pre-stored picture, so the pre-stored picture contains image text that corresponds to the text to be corrected. For example, if the text to be corrected is "惊弓1鸟", the corresponding image text is the image of "惊弓之鸟" (a four-character idiom whose third character, "之", was recognized as "1").
It should be noted that character features are the recognition factors used in OCR technology. Character features include, but are not limited to, stroke features; stroke features include, but are not limited to, factors such as the regularity of the strokes, the relative positions of the strokes and the regional distribution of the strokes.
It should also be noted that the text to be corrected may be a single character or a phrase. If the text to be corrected is a single character, the character features of the image text corresponding to that character are obtained; if the text to be corrected is a phrase, the character features of each character in the image text corresponding to the phrase are obtained.
S103: compare the character features of the image text with the character features of pre-stored characters to obtain several candidate texts, where the similarity between the character features of each candidate text and those of the corresponding image text is higher than a preset standard value.
Specifically, OCR recognizes text according to its character features, so candidate texts can be found by comparing the character features of the obtained image text with the character features of the pre-stored characters. Candidate texts are the pre-stored characters that most resemble the obtained image text, i.e. the pre-stored characters whose character-feature similarity is higher than the preset standard value. For example, when the image text is "之", the candidate texts are "1", "I", "i" and so on.
It should be understood that if the text to be corrected is a single character, the candidate texts obtained are single characters; if the text to be corrected is a phrase, the candidate texts obtained are phrases.
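The threshold-based selection in S103 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the patent does not specify how character features are encoded or how similarity is computed, so the feature vectors, the use of cosine similarity, and the names `cosine_similarity` and `find_candidates` are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_candidates(image_features, prestored, threshold):
    """Return pre-stored characters whose similarity exceeds the threshold.

    prestored maps each pre-stored character to its feature vector.
    Candidates are ordered from most to least similar.
    """
    scored = []
    for char, features in prestored.items():
        score = cosine_similarity(image_features, features)
        if score > threshold:  # "higher than the preset standard value"
            scored.append((char, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [char for char, _ in scored]


# Hypothetical three-dimensional feature vectors, for illustration only.
prestored = {"l": [0.9, 0.1, 0.4], "1": [0.8, 0.0, 0.5], "o": [0.0, 1.0, 0.0]}
print(find_candidates([0.9, 0.1, 0.45], prestored, threshold=0.95))
```

Sorting by similarity is a presentation choice made here so that the most likely candidate appears first; the patent only requires that candidates exceed the preset standard value.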
S104: display the candidate texts for the user to select.
S105: if it is detected that the user has selected a candidate text from the candidate texts, replace the text to be corrected in the text recognition result with the selected candidate text.
It should be noted that, as shown in Fig. 2a, in this embodiment the current screen preferably includes a first display area 31 and a second display area 32. The first display area 31 is used to display the text recognition result, and the second display area 32 is used to display the candidate texts.
Specifically, refer to Fig. 2a and Fig. 2b. As shown in Fig. 2a, when the text to be corrected is "惊弓1鸟", the candidate texts displayed are "惊弓1鸟", "惊弓i鸟" and "惊弓之鸟". Fig. 2b shows that, after the user selects "惊弓之鸟" from the candidate texts, "惊弓1鸟" in the text recognition result in the first display area 31 is replaced with "惊弓之鸟".
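The replacement in S105 can be sketched as a position-based string substitution. This is an illustrative sketch only: the function name `apply_correction`, the idea of tracking the selection as a start index and length, and the Latin-script sample strings are assumptions, not details from the patent.

```python
def apply_correction(result, start, length, chosen):
    """Replace the selected span of the recognition result with the chosen candidate.

    Replacing by position, rather than by searching for the text, leaves
    other identical occurrences in the result untouched.
    """
    return result[:start] + chosen + result[start + length:]


# Hypothetical recognition result in which "clock" was misread as "c1ock".
result = "the c1ock and the c1ock"
start = result.index("c1ock")  # the user selected the first occurrence
print(apply_correction(result, start, len("c1ock"), "clock"))
```

Only the occurrence the user selected is replaced; the second "c1ock" would need its own correction request.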
It should be understood that, although the text recognition result is produced by OCR and consists of the characters that OCR judged most similar to the image text, differences between fonts or between different users' handwriting mean that the most similar character identified by OCR is not necessarily the correct one, which leads to OCR recognition errors. The method for correcting text recognition provided by the embodiments of the present invention obtains the character features of the image text corresponding to the text to be corrected and compares them with the character features of the pre-stored characters to obtain several candidate texts for the user to select. Because the candidate texts closely resemble the image text, the user only needs to select the correct candidate text to complete the correction, without typing the correct text on a keyboard. This reduces the correction operations for OCR text recognition and improves correction efficiency. On the one hand, the embodiments can let the user select the text to be corrected through a selection operation and then generate the correction request; on the other hand, the embodiments can also select the text to be corrected automatically and generate the correction request, which can effectively improve correction efficiency.
Preferably, S101, generating a correction request, includes:
detecting whether there is an operation in which the user selects text from the text recognition result; if such an operation exists, generating a correction request that includes the selected text, the selected text being the text to be corrected; if no such operation exists, not generating a correction request, and the flow ends.
Specifically, the operation of selecting text may be the user tapping, double-tapping or sliding on a touch screen to select text, or the user selecting text with a physical input device such as a mouse. As shown in Fig. 3a and Fig. 3b, Fig. 3a shows the text recognition result generated after OCR recognizes the pre-stored picture, displayed on the touch screen. Fig. 3b shows the user tapping "惊弓1鸟" on the touch screen, which generates a correction request with "惊弓1鸟" as the text to be corrected.
It should be noted that, in other embodiments, S101, generating a correction request, includes:
detecting whether a preset character exists in the text recognition result, a preset character being a character set in advance whose error frequency during optical character recognition is higher than a particular value;
if a preset character exists, generating a correction request that includes the preset character, the preset character being the text to be corrected; if no preset character exists, not generating a correction request, and the flow ends.
Specifically, preset characters can be regarded as characters with a high error frequency in OCR recognition; in this embodiment the particular value is preferably calculated from a number of data samples. For example, "1", "i" and "之" are set as preset characters; when a preset character is detected in the text recognition result, a correction request is generated automatically.
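The automatic variant of S101 can be sketched as below. This is a sketch under assumptions: the patent says only that preset characters have an error frequency above a particular value computed from data samples, so the per-character error and occurrence counts, the request format, and the names `build_preset` and `generate_requests` are hypothetical.

```python
def build_preset(error_counts, total_counts, particular_value):
    """Collect characters whose OCR error rate exceeds the particular value.

    error_counts and total_counts map a character to how often it was
    misrecognized and how often it occurred in the sample data.
    """
    return {
        ch
        for ch, total in total_counts.items()
        if total > 0 and error_counts.get(ch, 0) / total > particular_value
    }


def generate_requests(result, preset):
    """Return one correction request per preset character found in the result.

    An empty list means no preset character was found, so no request is
    generated and the flow ends.
    """
    return [
        {"position": i, "text": ch}
        for i, ch in enumerate(result)
        if ch in preset
    ]


# Hypothetical sample statistics: "1" was wrong 30 times out of 100.
preset = build_preset({"1": 30, "a": 1}, {"1": 100, "a": 100}, 0.2)
print(generate_requests("c1ock", preset))
```

Each request records the position of the suspicious character so that a later replacement step can target exactly that occurrence.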
Similarly, in other feasible embodiments, it is also possible to detect whether a preset character exists and, if so, prompt the user whether to make a correction; if the user chooses to correct, a correction request is generated, with the preset character as the text to be corrected.
Preferably, the method for correcting text recognition provided by the embodiments of the present invention further includes:
dividing several levels according to the stroke count of the text to be corrected, where different levels correspond to different preset standard values.
Specifically, this is because the complexity of a character affects the OCR recognition result. For example, simple characters with few strokes are more prone to OCR recognition errors. In this embodiment, levels are therefore preferably divided according to the stroke count of the character to improve correction accuracy. For example, the preset standard value for a level with few strokes is lower than the preset standard value for a level with many strokes, so that more candidate texts are available for characters with few strokes.
Therefore, before S103 compares the character features of the image text with the character features of the pre-stored characters to obtain several candidate texts, the above method for correcting text recognition further includes:
recognizing the stroke count of the text to be corrected, and obtaining the corresponding preset standard value according to the stroke count.
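The level lookup described above can be sketched as a small table lookup. The patent gives no concrete level boundaries or standard values, so the numbers in `LEVEL_UPPER_BOUNDS` and `LEVEL_THRESHOLDS` and the function name `threshold_for` are illustrative assumptions; the only property taken from the text is that levels with fewer strokes get a lower preset standard value.

```python
import bisect

# Hypothetical levels: up to 5 strokes, 6 to 10 strokes, more than 10 strokes.
LEVEL_UPPER_BOUNDS = [5, 10]
# Hypothetical preset standard values, one per level; fewer strokes means a
# lower value so that more candidates pass the comparison.
LEVEL_THRESHOLDS = [0.80, 0.85, 0.90]


def threshold_for(stroke_count):
    """Map a stroke count to the preset standard value of its level."""
    level = bisect.bisect_left(LEVEL_UPPER_BOUNDS, stroke_count)
    return LEVEL_THRESHOLDS[level]


for strokes in (3, 8, 15):
    print(strokes, threshold_for(strokes))
```

The value returned here would then be used as the threshold when comparing character features in S103.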
It should be noted that the way S103 compares the character features of the image text with the character features of the pre-stored characters to obtain several candidate texts differs depending on whether the text to be corrected is a single character or a phrase.
Specifically, if the text to be corrected is a single character, the candidate texts are obtained by comparing the character features of the image text with the character features of the pre-stored characters to find the pre-stored characters whose similarity is higher than the preset standard value; these pre-stored characters are the candidate texts.
If the text to be corrected is a phrase, refer to Fig. 4. In this embodiment, comparing the character features of the image text with the character features of the pre-stored characters in S103 to obtain several candidate texts preferably includes:
S401: compare the character features of the image text corresponding to each character in the text to be corrected with the character features of the pre-stored characters, to obtain the candidate characters corresponding to each character.
For example, if the text to be corrected is "惊弓1鸟", the character features of the image text corresponding to each of its characters, namely "惊", "弓", "之" and "鸟", are compared with the character features of the pre-stored characters to obtain the candidate characters corresponding to each character. In this embodiment, the candidate characters obtained are preferably the pre-stored characters whose similarity to the character features of the corresponding image text is higher than a similarity preset value. The similarity preset value may be the same as the preset standard value or different from it.
S402: combine the candidate characters corresponding to each character in the text to be corrected to form several phrases.
It should be understood that, by the rules of permutation and combination, the more candidate characters each character has, the more combinations of candidate characters are obtained.
S403: obtain the similarity between each character in the phrases and the character features of the corresponding image text.
For example, if combining the candidate characters yields 6 phrases, the similarity between each character in each of the 6 phrases and the character features of the corresponding image text is calculated.
S404: calculate the similarity of each phrase, where the similarity of a phrase is the average of the similarities between each of its characters and the character features of the corresponding image text.
Specifically, if a phrase contains 4 characters, the average of the similarities between these 4 characters and the character features of their corresponding image text is calculated, and this average is the similarity of the phrase.
S405: set the phrases whose similarity is higher than the preset standard value as the candidate texts.
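Steps S402 to S405 can be sketched as follows. This is an illustrative sketch, not the patent's implementation: it assumes S401 has already produced, for each character position, a mapping from candidate character to similarity score, and the function name `phrase_candidates` and the sample scores are hypothetical.

```python
from itertools import product
from statistics import mean


def phrase_candidates(per_char_candidates, threshold):
    """Combine per-character candidates into phrases and filter by mean similarity.

    per_char_candidates holds one dict per character position, mapping a
    candidate character to its similarity with the image text at that position.
    """
    results = []
    # S402: every combination of one candidate per position forms a phrase.
    for combo in product(*(c.items() for c in per_char_candidates)):
        phrase = "".join(ch for ch, _ in combo)
        # S403 and S404: the phrase similarity is the mean of its characters' similarities.
        score = mean([s for _, s in combo])
        # S405: keep phrases above the preset standard value.
        if score > threshold:
            results.append((phrase, score))
    results.sort(key=lambda pair: pair[1], reverse=True)
    return results


# Hypothetical per-position scores for a three-character phrase.
per_char = [{"c": 0.99}, {"1": 0.95, "l": 0.90, "i": 0.70}, {"o": 0.98}]
for phrase, score in phrase_candidates(per_char, threshold=0.9):
    print(phrase, round(score, 4))
```

With 1, 3 and 1 candidates per position the sketch forms 1 × 3 × 1 = 3 phrases, which illustrates why the number of combinations grows quickly as each position gains candidates.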
As can be seen from the above, if the text to be corrected is a phrase, this approach splits the phrase for recognition and then recombines the results, which yields a more comprehensive set of candidate texts and thus improves correction accuracy. In other feasible embodiments, if the text to be corrected is a phrase, the phrase may instead be recognized directly as a whole to obtain candidate texts, i.e. the character features of candidate texts in phrase form are compared directly with the character features of the corresponding image text in phrase form.
Refer to Fig. 5, which is a schematic block diagram of a first embodiment of a terminal provided by an embodiment of the present invention. As shown in the figure, the terminal 50 includes a lead-out unit 501, an acquiring unit 502, a comparison unit 503, a display unit 504 and a replacement unit 505.
The lead-out unit 501 is configured to generate a correction request, the correction request including text to be corrected.
Specifically, the text may consist of Chinese characters, letters or digits. The text to be corrected is selected from the text recognition result generated by using optical character recognition to recognize the image text on a pre-stored picture.
The acquiring unit 502 is configured to obtain the character features of the image text in the pre-stored picture that corresponds to the text to be corrected.
Specifically, the text to be corrected is selected from the text recognition result produced by OCR for the pre-stored picture, and the text recognition result corresponds one-to-one with the image text in the pre-stored picture, so the pre-stored picture contains image text that corresponds to the text to be corrected. Character features are the recognition factors used in OCR technology; they include factors such as the regularity of the strokes, the relative positions of the strokes and the regional distribution of the strokes.
It should also be noted that the text to be corrected may be a single character or a phrase. If the text to be corrected is a single character, the character features of the image text corresponding to that character are obtained; if the text to be corrected is a phrase, the character features of each character in the image text corresponding to the phrase are obtained.
The comparison unit 503 is configured to compare the character features of the image text with the character features of the pre-stored characters to obtain several candidate texts, where the similarity between the character features of each candidate text and those of the corresponding image text is higher than a preset standard value.
Specifically, OCR recognizes text according to its character features, so candidate texts can be found by comparing the character features of the obtained image text with the character features of the pre-stored characters. Candidate texts are the pre-stored characters that most resemble the obtained image text, i.e. the pre-stored characters whose character-feature similarity is higher than the preset standard value. It should be understood that if the text to be corrected is a single character, the candidate texts obtained are single characters; if the text to be corrected is a phrase, the candidate texts obtained are phrases.
The display unit 504 is configured to display the candidate texts for the user to select.
The replacement unit 505 is configured to replace the text to be corrected in the text recognition result with the selected candidate text if it is detected that the user has selected a candidate text from the candidate texts.
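How the five units of terminal 50 cooperate can be sketched as a small class that wires them together. This is a structural sketch under assumptions: the patent describes the units functionally, so modelling each unit as a callable, the class name `Terminal`, the method name `correct`, and the stand-in lambdas in the usage example are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Terminal:
    # Each field stands in for one unit of terminal 50.
    generate_request: Callable[[str], Optional[str]]  # lead-out unit 501
    get_features: Callable[[str], List[float]]        # acquiring unit 502
    compare: Callable[[List[float]], List[str]]       # comparison unit 503
    choose: Callable[[List[str]], Optional[str]]      # display unit 504 plus the user's selection

    def correct(self, result: str) -> str:
        """Run one correction pass over a text recognition result."""
        target = self.generate_request(result)
        if target is None:  # no correction request was generated
            return result
        candidates = self.compare(self.get_features(target))
        chosen = self.choose(candidates)
        if chosen is None:  # the user selected nothing
            return result
        # Replacement unit 505: swap the text to be corrected for the selection.
        return result.replace(target, chosen, 1)


# Stand-in units for illustration only.
terminal = Terminal(
    generate_request=lambda r: "c1ock" if "c1ock" in r else None,
    get_features=lambda text: [1.0],
    compare=lambda features: ["clock", "c1ock"],
    choose=lambda candidates: candidates[0],
)
print(terminal.correct("the c1ock"))
```

Keeping each unit behind a plain callable makes it easy to swap the user-driven and automatic ways of generating a correction request without touching the rest of the flow.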
It should be noted that, as shown in Fig. 2a, in this embodiment the current screen is preferably divided into a first display area and a second display area. The first display area is used to display the text recognition result, and the second display area is used to display the candidate texts.
Preferably, the terminal 50 provided by the embodiments of the present invention further includes: a division unit 508, an identification unit 509 and a harvest unit 510.
The division unit 508 is configured to divide several levels according to the stroke count of the text to be corrected, where different levels correspond to different preset standard values.
Specifically, this is because the complexity of a character affects the OCR recognition result. For example, simple characters with few strokes are more prone to OCR recognition errors. In this embodiment, levels are therefore preferably divided according to the stroke count of the character to improve correction accuracy. For example, the preset standard value for a level with few strokes is lower than the preset standard value for a level with many strokes, so that more candidate texts are available for characters with few strokes.
Therefore, before the comparison unit 503 compares the character features of the image text with the character features of the pre-stored characters to obtain several candidate texts, the identification unit 509 recognizes the stroke count of the text to be corrected, and the harvest unit 510 obtains the corresponding preset standard value according to the stroke count.
It should be noted that the way the comparison unit 503 compares the character features of the image text with the character features of the pre-stored characters to obtain several candidate texts differs depending on whether the text to be corrected is a single character or a phrase.
Specifically, if the text to be corrected is a single character, the candidate texts are obtained by comparing the character features of the image text with the character features of the pre-stored characters to find the pre-stored characters whose similarity is higher than the preset standard value; these pre-stored characters are the candidate texts.
If the text to be corrected is a phrase, refer to Fig. 6. In this embodiment, the comparison unit 503 preferably includes: a discriminating unit 601, an assembled unit 602, a take-up unit 603, a computing unit 604 and a setting unit 605.
The discriminating unit 601 is configured to compare the character features of the image text corresponding to each character in the text to be corrected with the character features of the pre-stored characters, to obtain the candidate characters corresponding to each character.
For example, if the text to be corrected is "惊弓1鸟", the character features of the image text corresponding to each of its characters, namely "惊", "弓", "之" and "鸟", are compared with the character features of the pre-stored characters to obtain the candidate characters corresponding to each character. In this embodiment, the candidate characters obtained are preferably the pre-stored characters whose similarity to the character features of the corresponding image text is higher than a similarity preset value. The similarity preset value may be the same as the preset standard value or different from it.
Assembled unit 602, for being combined according to the corresponding candidate character of each word in word to be modified with shape Into some phrases.
It should be appreciated that according to the rule of permutation and combination, if the quantity of the corresponding candidate character of each word is more, acquisition The combination of candidate character also will be more.
Take-up unit 603, for obtaining each word and the character features of corresponding pictograph in some phrases Similarity.
For example have 6 phrases after candidate character is combined, then calculate each word in 6 phrases in each phrase with The similarity of the character features of corresponding pictograph.
Computing unit 604, the similarity for calculating each phrase in some phrases, the similarity of each phrase is to obtain Each word and the average value of the similarity of the character features of corresponding pictograph in each phrase taken.
If specifically, there is 4 words in 1 phrase, calculate this 4 words respectively with corresponding pictograph The average value of the similarity of character features, the average value is the similarity of 1 phrase.
Setting unit 605, for being set to the similarity of phrase in some phrases higher than the phrase of preset standard value to wait Selection word.
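The phrase pipeline of units 601 to 605 can be sketched as below, assuming the per-character candidates and their similarities have already been obtained. The function name `phrase_candidates` and the input layout (one dictionary of candidate character to similarity per character position) are hypothetical. The number of combinations is the product of the per-position candidate counts, which is why more candidates per character produce more phrases.

```python
from itertools import product
from statistics import mean
from typing import Dict, List, Tuple


def phrase_candidates(
    per_char_candidates: List[Dict[str, float]],
    threshold: float,
) -> List[Tuple[str, float]]:
    """Combine per-position candidates into phrases and keep those whose
    average per-character similarity is higher than the threshold.

    per_char_candidates[i] maps each candidate for position i to its
    similarity with the image text at that position.
    """
    if not per_char_candidates:
        return []
    results = []
    # Cartesian product: one candidate per character position.
    for combo in product(*(c.items() for c in per_char_candidates)):
        phrase = "".join(char for char, _ in combo)
        score = mean(sim for _, sim in combo)  # phrase similarity
        if score > threshold:
            results.append((phrase, score))
    return sorted(results, key=lambda pair: pair[1], reverse=True)
```

For a four-character phrase with 3 candidates per position this enumerates 3^4 = 81 phrases, so a real implementation would probably cap the candidates per position; that cap is not discussed in the source.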
Referring again to Fig. 5, lead-out unit 501 preferably includes first detection unit 506 and first generation unit 507.
First detection unit 506 is configured to detect whether there is an operation in which the user selects a word from the text recognition result. First generation unit 507 is configured to generate the amendment request if such a selection operation exists; the amendment request includes the selected word, and the selected word is the word to be modified. If no selection operation exists, no amendment request is generated and the flow ends.
Specifically, the operation of selecting a word may be the user tapping, double-tapping or sliding on the touch screen to select the word, or the user selecting the word with a physical input device such as a mouse.
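A minimal sketch of generating the amendment request from a user selection follows. The `AmendmentRequest` type, its fields, and the representation of a selection as a `(start, end)` offset pair are assumptions; the source only states that the request includes the selected word.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AmendmentRequest:
    """Hypothetical request structure; only the word itself is required by the source."""
    word_to_modify: str
    start: int  # offset of the word in the text recognition result (assumed field)


def request_from_selection(
    recognition_result: str,
    selection: Optional[Tuple[int, int]],
) -> Optional[AmendmentRequest]:
    """Return a request for the selected span, or None if nothing valid is selected."""
    if selection is None:
        return None  # no selection operation: no request, flow ends
    start, end = selection
    if not (0 <= start < end <= len(recognition_result)):
        return None  # treat an empty or out-of-range span as no selection
    return AmendmentRequest(recognition_result[start:end], start)
```

Storing the offset alongside the word is a design choice made here so that a later replacement can target the exact occurrence when the same word appears more than once.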
In other embodiments, refer to Fig. 7, which is a schematic block diagram of a second embodiment of a terminal provided by an embodiment of the present invention. In this embodiment, terminal 70 includes lead-out unit 701, acquiring unit 702, comparison unit 703, display unit 704, replacement unit 705, division unit 708, identification unit 709 and harvest unit 710. For details, refer to the corresponding description of the terminal in the first embodiment, which is not repeated here. It should be noted, however, that in this embodiment lead-out unit 701 includes second detection unit 706 and second generation unit 707.
Second detection unit 706 is configured to detect whether a default word is present in the text recognition result, a default word being a pre-set word whose error frequency during optical character recognition is higher than a particular value;
Second generation unit 707 is configured to generate the amendment request if a default word is present; the amendment request includes the default word, and the default word is the word to be modified. If no default word is present, no amendment request is generated and the flow ends.
Specifically, a default word can be regarded as a word with a high error frequency in OCR recognition. In this embodiment, the particular value is preferably calculated from a number of data samples. Similarly, in other feasible embodiments, second detection unit 706 may detect whether a default word is present and prompt the user as to whether to correct it; if the user chooses to correct it, the amendment request is generated, and the default word is the word to be modified.
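The default-word detection can be sketched as follows. The source does not describe how error frequencies are measured or stored, so the per-character error and occurrence counts, and the names `build_default_words` and `find_default_words`, are assumptions for illustration.

```python
from typing import Dict, List, Set, Tuple


def build_default_words(
    error_counts: Dict[str, int],
    total_counts: Dict[str, int],
    particular_value: float,
) -> Set[str]:
    """Return characters whose observed OCR error frequency in the (assumed)
    sample data is higher than the particular value."""
    return {
        char
        for char, total in total_counts.items()
        if total > 0 and error_counts.get(char, 0) / total > particular_value
    }


def find_default_words(
    recognition_result: str, default_words: Set[str]
) -> List[Tuple[int, str]]:
    """Return (offset, character) for each default word in the recognition result.
    An empty list means no amendment request is generated."""
    return [
        (i, char)
        for i, char in enumerate(recognition_result)
        if char in default_words
    ]
```

Each hit could either trigger an amendment request directly or, as in the alternative embodiment, first prompt the user to confirm.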
Refer to Fig. 8, which is a schematic block diagram of a third embodiment of a terminal provided by an embodiment of the present invention. As shown, the terminal in this embodiment may include one or more processors 801, one or more input devices 802, one or more output devices 803, and memory 804. The processor 801, input device 802, output device 803 and memory 804 are connected by a bus.
Input device 802 is used to receive information input by user operation. In a specific implementation, input device 802 of the embodiment of the present invention may include a keyboard, mouse, photoelectric input device, sound input device, touch input device, scanner, etc.
Output device 803 is used to output information to the user. In a specific implementation, output device 803 of the embodiment of the present invention may include a display, loudspeaker, printer, etc.
Memory 804 is used to store program data with various functions. In a specific implementation, memory 804 of the embodiment of the present invention may be system memory, such as volatile memory (e.g. RAM), non-volatile memory (e.g. ROM, flash memory), or a combination of the two. In a specific implementation, memory 804 of the embodiment of the present invention may also be external storage outside the system, such as a magnetic disk, optical disc or magnetic tape.
Processor 801 is used to call the program data stored in memory 804 to execute the instructions stored in memory 804 and to perform the following operations:
Generate an amendment request, the amendment request including the word to be modified; obtain the character features of the image text corresponding to the word to be modified in the pre-stored picture; compare the character features of the image text with the character features of the pre-stored words to obtain several candidate characters, the similarity between the character features of the several candidate characters and the character features of the corresponding image text being higher than the preset standard value; display the several candidate characters for the user to select; and, if it is detected that the user has selected a candidate character from the several candidate characters, replace the word to be modified in the text recognition result with the selected candidate character.
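The final replacement step can be illustrated with a small sketch. The function name `apply_correction` and the use of a character offset to locate the word are assumptions; the source only specifies that the selected candidate replaces the word to be modified in the recognition result.

```python
from typing import Optional


def apply_correction(
    recognition_result: str,
    start: int,
    word_to_modify: str,
    selected_candidate: Optional[str],
) -> str:
    """Replace the word to be modified (located at the assumed offset `start`)
    with the candidate the user selected. If the user selected nothing,
    the recognition result is returned unchanged."""
    if selected_candidate is None:
        return recognition_result
    end = start + len(word_to_modify)
    if recognition_result[start:end] != word_to_modify:
        raise ValueError("word_to_modify not found at the given offset")
    return recognition_result[:start] + selected_candidate + recognition_result[end:]
```

Replacing by offset rather than by a global string substitution avoids changing other, correctly recognized occurrences of the same word.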
When generating the amendment request, processor 801 also performs the following operations:
Detect whether there is an operation in which the user selects a word from the text recognition result; if such a selection operation exists, generate the amendment request, which includes the selected word, the selected word being the word to be modified; if no selection operation exists, do not generate an amendment request.
In other embodiments, when generating the amendment request, processor 801 also performs the following operations: detect whether a default word is present in the text recognition result, a default word being a pre-set word whose error frequency during optical character recognition is higher than a particular value; if a default word is present, generate the amendment request, which includes the default word, the default word being the word to be modified; if no default word is present, do not generate an amendment request.
Processor 801 also performs the following operation: divide several grades according to the stroke quantity of the word to be modified, with different grades corresponding to different preset standard values.
Therefore, before comparing the character features of the image text with the character features of the pre-stored words to obtain several candidate characters, processor 801 also performs the following operations:
Recognize the stroke quantity of the word to be modified; and obtain the corresponding preset standard value according to the stroke quantity.
If the word to be modified is a phrase, the comparison by processor 801 of the character features of the image text with the character features of the pre-stored words to obtain several candidate characters includes:
Comparing the character features of the image text corresponding to each character in the word to be modified with the character features of the pre-stored words, respectively, to obtain the candidate characters corresponding to each character; combining the candidate characters corresponding to each character in the word to be modified into several phrases; obtaining the similarity between each character in the several phrases and the character features of the corresponding image text; calculating the similarity of each phrase in the several phrases, the similarity of a phrase being the average of the similarities between each character in that phrase and the character features of the corresponding image text; and setting, as candidate characters, the phrases among the several phrases whose similarity is higher than the preset standard value.
It should be understood that, in the embodiments of the present invention, processor 801 may be a central processing unit (Central Processing Unit, CPU). Processor 801 may also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor, etc.
It should be noted that the steps in the method of the embodiments of the present invention may be reordered, merged and deleted according to actual needs.
The units in the terminal of the embodiments of the present invention may be combined, divided and deleted according to actual needs.
It will be clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the terminal and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here. In the several embodiments provided in this application, it should be understood that the disclosed terminal and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division of the units is only a division by logical function, and other divisions are possible in actual implementation: several units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in other forms.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit. If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, removable hard disk, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), magnetic disk or optical disc.
The above is only a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any person skilled in the art can readily conceive of various equivalent modifications or replacements within the technical scope disclosed by the present invention, and these modifications or replacements shall all fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A method for correcting text recognition, characterised by comprising:
Generating an amendment request, the amendment request including a word to be modified, the word to be modified being selected from a text recognition result generated by recognizing image text on a pre-stored picture using optical character recognition technology;
Obtaining the character features of the image text corresponding to the word to be modified in the pre-stored picture;
Comparing the character features of the image text with the character features of pre-stored words to obtain several candidate characters, the similarity between the character features of the several candidate characters and the character features of the corresponding image text being higher than a preset standard value;
Displaying the several candidate characters for the user to select;
If it is detected that the user has selected a candidate character from the several candidate characters, replacing the word to be modified in the text recognition result with the selected candidate character.
2. The method according to claim 1, characterised in that generating the amendment request comprises:
Detecting whether there is an operation in which the user selects a word from the text recognition result;
If such a selection operation exists, generating the amendment request, the amendment request including the selected word, the selected word being the word to be modified.
3. The method according to claim 1, characterised in that generating the amendment request comprises:
Detecting whether a default word is present in the text recognition result, the default word being a pre-set word whose error frequency during optical character recognition is higher than a particular value;
If a default word is present, generating the amendment request, the amendment request including the default word, the default word being the word to be modified.
4. The method according to claim 1, characterised in that the method further comprises:
Dividing several grades according to the stroke quantity of the word to be modified, different grades corresponding to different preset standard values;
Before comparing the character features of the image text with the character features of the pre-stored words to obtain the several candidate characters, the method further comprises:
Recognizing the stroke quantity of the word to be modified;
Obtaining the corresponding preset standard value according to the stroke quantity.
5. The method according to claim 1, characterised in that, if the word to be modified includes a phrase, comparing the character features of the image text with the character features of the pre-stored words to obtain the several candidate characters comprises:
Comparing the character features of the image text corresponding to each character in the word to be modified with the character features of the pre-stored words, respectively, to obtain the candidate characters corresponding to each character;
Combining the candidate characters corresponding to each character in the word to be modified to form several phrases;
Obtaining the similarity between each character in the several phrases and the character features of the corresponding image text;
Calculating the similarity of each phrase in the several phrases, the similarity of a phrase being the average of the similarities between each character in that phrase and the character features of the corresponding image text;
Setting, as the candidate characters, the phrases among the several phrases whose similarity is higher than the preset standard value.
6. A terminal, characterised in that the terminal comprises:
A lead-out unit, configured to generate an amendment request, the amendment request including a word to be modified, the word to be modified being selected from a text recognition result generated by recognizing image text on a pre-stored picture using OCR;
An acquiring unit, configured to obtain the character features of the image text corresponding to the word to be modified in the pre-stored picture;
A comparison unit, configured to compare the character features of the image text with the character features of pre-stored words to obtain several candidate characters among the pre-stored words, the similarity between the character features of the several candidate characters and the character features of the corresponding image text being higher than the similarity between the remaining pre-stored words and the character features of the corresponding image text;
A display unit, configured to display the several candidate characters for the user to select;
A replacement unit, configured to replace the word to be modified in the text recognition result with the selected candidate character if it is detected that the user has selected a candidate character from the several candidate characters.
7. The terminal according to claim 6, characterised in that the lead-out unit comprises:
A first detection unit, configured to detect whether there is an operation in which the user selects a word from the text recognition result;
A first generation unit, configured to generate the amendment request if such a selection operation exists, the amendment request including the selected word, the selected word being the word to be modified.
8. The terminal according to claim 6, characterised in that the lead-out unit comprises:
A second detection unit, configured to detect whether a default word is present in the text recognition result, the default word being a pre-set word whose error frequency during optical character recognition is higher than a particular value;
A second generation unit, configured to generate the amendment request if a default word is present, the amendment request including the default word, the default word being the word to be modified.
9. The terminal according to claim 6, characterised in that the terminal further comprises:
A division unit, configured to divide several grades according to the stroke quantity of the word to be modified, different grades corresponding to different preset standard values;
An identification unit, configured to recognize the stroke quantity of the word to be modified;
A harvest unit, configured to obtain the corresponding preset standard value according to the stroke quantity.
10. The terminal according to claim 6, characterised in that, if the word to be modified includes a phrase, the comparison unit comprises:
A discriminating unit, configured to compare the character features of the image text corresponding to each character in the word to be modified with the character features of the pre-stored words, respectively, to obtain the candidate characters corresponding to each character;
An assembled unit, configured to combine the candidate characters corresponding to each character in the word to be modified to form several phrases;
A take-up unit, configured to obtain the similarity between each character in the several phrases and the character features of the corresponding image text;
A computing unit, configured to calculate the similarity of each phrase in the several phrases, the similarity of a phrase being the average of the similarities between each character in that phrase and the character features of the corresponding image text;
A setting unit, configured to set, as the candidate characters, the phrases among the several phrases whose similarity is higher than the preset standard value.
CN201710135955.1A 2017-03-08 2017-03-08 The modification method and terminal of a kind of Text region Withdrawn CN106940798A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710135955.1A CN106940798A (en) 2017-03-08 2017-03-08 The modification method and terminal of a kind of Text region

Publications (1)

Publication Number Publication Date
CN106940798A true CN106940798A (en) 2017-07-11

Family

ID=59468950

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710135955.1A Withdrawn CN106940798A (en) 2017-03-08 2017-03-08 The modification method and terminal of a kind of Text region

Country Status (1)

Country Link
CN (1) CN106940798A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101211243A (en) * 2006-12-25 2008-07-02 卡西欧计算机株式会社 Handwritten character input device
CN101881999A (en) * 2010-06-21 2010-11-10 安阳师范学院 Oracle video input system and implementation method
CN101887519A (en) * 2010-08-16 2010-11-17 同方知网(北京)技术有限公司 Character recognition and modification method
CN104461057A (en) * 2014-12-26 2015-03-25 安徽寰智信息科技股份有限公司 Character input method based on lip shape image recognition
CN106326888A (en) * 2016-08-16 2017-01-11 北京旷视科技有限公司 Image recognition method and device

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107633250A (en) * 2017-09-11 2018-01-26 畅捷通信息技术股份有限公司 A kind of Text region error correction method, error correction system and computer installation
CN107633250B (en) * 2017-09-11 2023-04-18 畅捷通信息技术股份有限公司 Character recognition error correction method, error correction system and computer device
CN107728783A (en) * 2017-09-25 2018-02-23 联想(北京)有限公司 Artificial intelligence process method and its system
CN107728783B (en) * 2017-09-25 2021-05-18 联想(北京)有限公司 Artificial intelligence processing method and system
CN109345451A (en) * 2017-09-27 2019-02-15 牛毅 System and mobile terminal for realizing image Chinese character splicing word formation
CN110276331A (en) * 2019-06-28 2019-09-24 四川长虹电器股份有限公司 The inspection modification method of Identification of Images result
CN110287910A (en) * 2019-06-28 2019-09-27 北京百度网讯科技有限公司 For obtaining the method and device of information
CN111260623A (en) * 2020-01-14 2020-06-09 广东小天才科技有限公司 Picture evaluation method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20170711