CN101667251B - OCR recognition method and device with auxiliary positioning function - Google Patents
- Publication number
- CN101667251B (application CN200810215861.6A)
- Authority
- CN
- China
- Prior art keywords
- image
- text
- user
- text area
- word
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Character Input (AREA)
- Character Discrimination (AREA)
- Telephone Function (AREA)
Abstract
The invention relates to an OCR recognition method with an auxiliary positioning function. The method comprises the following steps: shooting a target and capturing an image containing characters; searching the image area and detecting one or more text areas; selecting a specific text area; and recognizing the characters in the selected text area. With the method and device of the invention, the text areas in the image are obtained automatically and the user picks the text areas of interest interactively, so that applications such as character recognition and translation can be carried out. The invention can be applied to common text scenes, such as the automatic recognition and translation of guideboards, public notices, and newspapers, and is particularly suitable for mobile terminals with a camera function. The invention is convenient for the user because it needs no complicated auxiliary operations or interactions; it narrows the search range in the image, obtains the text areas of interest to the user automatically, reduces the computation time of the system, and can improve positioning accuracy.
Description
Technical field
The present invention relates to the fields of image processing and pattern recognition, and in particular to text detection and localization and to character recognition in video and natural scenes.
Background technology
OCR technology is increasingly used on devices with an image scanning (or camera) function, such as mobile intelligent terminals and PDAs. Because the background of a video image is often complex, the text localization step that precedes OCR still poses technical difficulties. The localization result can deviate, so that the characters to be recognized are not detected easily and accurately, or a single text area is wrongly split into several related sub-areas, which harms the continuity of the OCR result and increases the computational cost. Combined with a low character recognition rate, this makes the final result (for example, a translation) unsatisfactory. Some form of auxiliary positioning is therefore needed to improve the accuracy of both text localization and recognition.
The basic process of current image (or video) text recognition is as follows. The captured text image (or a frame of a video) is first preprocessed (enhancement, filtering, and so on) and its layout is analyzed and interpreted, so that the text areas are detected and located. Character recognition is then performed on each text area, and the recognition result may be further post-processed and corrected. Of these steps, text area localization directly affects both the final recognition result and the computational efficiency of the whole system.
Existing mobile phones with an OCR function scan text with the camera and translate between Chinese and English. To use them, the user must first aim the phone's camera at the center of the text, with a perpendicular distance of more than 10 centimeters between phone and text; focus with the navigation key on the phone; and make sure that the text to be recognized is taller than the displayed focus symbol "+". For vertically set Chinese text, the user must select "vertical text" in the menu. A highlighted band appears in the operating interface to locate the text area to be recognized, and the text inside this band is recognized and translated. The method uses the highlighted band to assist in positioning the text area, but it requires the user to aim the camera at the center of the text, to hold the phone perpendicular to the text at a certain distance, and to make a special setting for vertical text areas. It places many restrictions on the user's operation, the system cannot locate text areas automatically, and operation takes a long time.
[CN 1804858 A] describes an auxiliary positioning system for text to be recognized, for a mobile terminal with a camera that implements an OCR function. The method displays a crosshair cursor on the screen. The user moves the cursor so that its origin lies on the text area to be recognized, which assists positioning. The user can also align the baseline of the character area with the horizontal axis of the crosshair, so that the baseline is perpendicular to the vertical axis; this prevents skewed shots and improves the recognition rate. Because the method relies on aligning the crosshair axes with the baseline of the character area to prevent skewed text, the user must adjust the cursor position carefully, only one text area can be located at a time, and the overall positioning and recognition time is long.
[CN 1685358 A] proposes a method for automatically locating text areas in an image. Its steps are: converting the digital image into a binary image; locating possible text areas; and selecting the actual text areas. The text area localization step is characterized by applying a morphological mask to perform morphological operations on the binary image and then generating closed blocks in the image according to certain rules, thereby locating the text areas. The method searches the entire image to locate text areas, so its computational load is large and it produces some false and missed localizations.
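The binarize, morphology, closed-block sequence described for [CN 1685358 A] can be illustrated with a minimal Python sketch. It is not the cited patent's actual mask or rule set: the threshold, the horizontal dilation radius, and the function names `binarize`, `dilate_horizontal`, and `component_boxes` are all assumptions chosen for illustration.

```python
from collections import deque

def binarize(gray, threshold=128):
    """Mark dark pixels as foreground (1); the threshold is an assumed value."""
    return [[1 if value < threshold else 0 for value in row] for row in gray]

def dilate_horizontal(binary, radius=1):
    """Grow foreground sideways so neighbouring characters merge into one block."""
    return [
        [1 if any(row[max(0, i - radius): i + radius + 1]) else 0
         for i in range(len(row))]
        for row in binary
    ]

def component_boxes(binary):
    """Return (x, y, width, height) for each 4-connected foreground block."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if not binary[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            left = right = x0
            top = bottom = y0
            while queue:
                x, y = queue.popleft()
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((left, top, right - left + 1, bottom - top + 1))
    return boxes
```

Because every pixel of the image is visited, the cost grows with the image area, which is the drawback the paragraph above attributes to whole-image search.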
[US 7171046] proposes a method for recognizing text in captured images. Its steps are: capturing an image containing text information with a portable device; detecting the text areas in the image in real time; adjusting the detected text areas; applying OCR to recognize the text; supplementing related external information, such as travel and traffic information; improving the OCR result with dictionary techniques; and outputting the recognized text and the supplementary information, or translating it further. A text detection and recognition system using this method is implemented in a portable device. The method requires the text localization result to be adjusted manually before recognition, which needs direct user intervention and makes it inconvenient to use.
Summary of the invention
The object of the present invention is to provide an OCR recognition method and device with an auxiliary positioning function.
According to one aspect of the present invention, an OCR recognition method with an auxiliary positioning function comprises the steps of:
Shooting a target and capturing an image containing text;
Searching the image area and detecting one or more text areas;
Selecting a specific text area;
Recognizing the text in the selected text area.
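The four steps above can be sketched as a small Python pipeline. The patent does not prescribe a particular detector, selection interface, or OCR engine, so `detect`, `choose`, and `recognize` are hypothetical callables supplied by the caller, and the `Rect` tuple layout is likewise an assumption.

```python
from typing import Callable, List, Sequence, Tuple

Rect = Tuple[int, int, int, int]  # assumed layout: (x, y, width, height) in pixels

def recognize_selected(
    image: object,
    detect: Callable[[object], List[Rect]],
    choose: Callable[[Sequence[Rect]], List[Rect]],
    recognize: Callable[[object, Rect], str],
) -> List[str]:
    """Run detection, user selection, and OCR on an already captured image."""
    candidates = detect(image)       # search the image for candidate text areas
    selected = choose(candidates)    # the user picks the specific text areas
    # OCR runs only on the selected areas, not on the whole image.
    return [recognize(image, area) for area in selected]
```

Keeping the three stages as separate callables mirrors the point of the method: recognition cost is paid only for the areas the user actually chose.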
According to another aspect of the present invention, an OCR recognition method with an auxiliary positioning function comprises the steps of:
Clicking one or more points on the screen that lie within a text area;
Shooting the image area that includes the clicked points;
Performing text area detection and localization on the captured image to obtain candidate text areas;
Performing OCR on the text in the candidate text areas.
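One way to relate clicked points to detected text areas is a simple hit test, sketched below in Python. The rectangle layout and the names `contains` and `areas_at_clicks` are assumptions for illustration; the patent does not specify how the match is computed.

```python
from typing import List, Sequence, Tuple

Rect = Tuple[int, int, int, int]  # assumed layout: (x, y, width, height)
Point = Tuple[int, int]           # (x, y) in the same pixel coordinates

def contains(rect: Rect, point: Point) -> bool:
    """True if the point lies inside the rectangle (right and bottom edges excluded)."""
    x, y, w, h = rect
    px, py = point
    return x <= px < x + w and y <= py < y + h

def areas_at_clicks(areas: Sequence[Rect], clicks: Sequence[Point]) -> List[Rect]:
    """Keep only the detected text areas that contain at least one clicked point."""
    return [area for area in areas if any(contains(area, c) for c in clicks)]
```

With several clicks, each area is returned at most once, in detection order.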
According to another aspect of the present invention, an OCR recognition device with an auxiliary positioning function comprises:
An image acquisition unit, for obtaining a text image or video containing text;
A text detection and positioning unit, for detecting and locating the text areas in the image;
A character recognition unit, for recognizing the text in the selected area;
A display unit, for displaying the captured text image and the results of user input, text detection and localization, and character recognition;
A storage unit, for storing the data required for each unit to operate.
With the method and device of the invention, the text areas in the image are obtained automatically and the user picks the text areas of interest interactively, so that applications such as character recognition and translation can be carried out. The invention can be applied to common text scenes, such as the automatic recognition and translation of guideboards, bulletins, and newspapers, and is particularly suitable for mobile terminals with a camera function. The invention is convenient for the user because it needs no complicated auxiliary operations or interactions; it narrows the search range in the image, obtains the text areas of interest to the user automatically, reduces the computation time of the system, and can improve positioning accuracy.
Brief description of the drawings
Fig. 1 shows the OCR recognition device with an auxiliary positioning function;
Fig. 2 is a flowchart of the OCR recognition method in which the user selects a text area;
Fig. 3 is a flowchart of the OCR recognition method in which the user clicks an area containing text.
Embodiment
The device of the invention consists of an interaction unit, an operation processing unit, and a storage unit. The interaction unit acquires the text image or video to be recognized, receives and displays information about user operations such as clicking and selecting, sends the received user input to the operation processing unit, and receives and displays information from the operation processing unit. It comprises an image acquisition unit, a display unit, and a user input detection unit. The operation processing unit performs text area detection and localization on the text image and user input received from the interaction unit and recognizes the text in the text areas. It comprises a text detection and positioning unit and a character recognition unit. Here, a text area is the circumscribed rectangle that contains one or more character blocks.
The image acquisition unit acquires the text image or video to be recognized; examples are a camera, a mobile phone with a camera function, or a notebook computer;
The display unit displays the text image or video to be recognized, information about the user's click selections, the text area detection and localization results, and the character recognition results;
The user input detection unit receives information about user operations such as clicking and selecting;
The detection and positioning unit performs text area detection and localization based on the information received from the interaction unit and outputs the position coordinates of the corresponding text areas to the character recognition unit;
The character recognition unit recognizes the text in the text areas, based on the text image and position coordinates received from the interaction unit and the detection and positioning unit, and outputs the result to the display unit;
The storage unit stores the information required for each unit to operate, including the text image to be recognized, information about user operations such as clicking and selecting, the text area localization results, the character recognition results, and other information needed by the device and method.
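A text area is defined above as the circumscribed rectangle of one or more character blocks. A minimal Python sketch of that definition follows; the `(x, y, width, height)` layout and the function name `circumscribed_rect` are assumptions, not part of the patent.

```python
from typing import Sequence, Tuple

Rect = Tuple[int, int, int, int]  # assumed layout: (x, y, width, height)

def circumscribed_rect(blocks: Sequence[Rect]) -> Rect:
    """Smallest axis-aligned rectangle that encloses every character block."""
    if not blocks:
        raise ValueError("at least one character block is required")
    left = min(x for x, _, _, _ in blocks)
    top = min(y for _, y, _, _ in blocks)
    right = max(x + w for x, _, w, _ in blocks)
    bottom = max(y + h for _, y, _, h in blocks)
    return (left, top, right - left, bottom - top)
```

These are the position coordinates that the detection and positioning unit would hand to the character recognition unit.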
In practice, the OCR recognition method based on user selection of text areas comprises: starting the camera mode, then shooting and capturing an image containing text, which may be a low-resolution image; searching the image area, performing text area detection and localization, and automatically marking the candidate text areas obtained; letting the user select text areas from the candidates offered, by clicking or moving the focus; and performing OCR on the text in the selected candidate text areas.
In practice, the OCR recognition method based on user clicks on areas containing text comprises: starting the camera mode, in which the user clicks the screen to shoot and capture a text image, which may be a low-resolution image; performing text area detection and localization on the image area that includes the clicked points; presenting the marked text areas to the user, who selects the text areas to be recognized by clicking or moving the focus; and performing OCR on the text in the selected candidate text areas, or on the text in all candidate text areas.
Embodiments of the invention are described in detail below with reference to the drawings. For clarity and brevity, detailed descriptions of well-known functions and structures are omitted.
The embodiments described in this specification are only some of the specific embodiments of the invention and do not mean that the invention can be practiced only in this form.
In this specification, including the claims, the term "unit" means something composed of modules, and "component" means an entity related to the system of the invention: hardware, a combination of hardware and software, software, or software in execution. For example, a component may be, but is not limited to, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, or a computer. As an example, an application running on a mobile terminal can be a component. A component may in turn contain one or more components.
The terms "comprises", "comprising", and similar terms denote non-exclusive inclusion, so that a method or device comprising a list of components includes not only those components but also other components not listed.
Figure mono-is the first advantageous embodiment of the present invention, possesses the OCR recognition device of auxiliary positioning function, and the input equipment of this device is video capture device, output device be can display graphics interface display device, Identification display is touch-screen in the present embodiment.
Image acquisition units 111 major functions are to gather text image or video, such as camera, mobile phone with camera function, notebook etc., when starting OCR recognition device, user can enable image acquisition units 111, image acquisition units 111 is exported after obtaining image or video on display unit 112, and photograph taking and selection that user controls image acquisition units 111 by user's input detection unit 113 enter OCR identification or again take.
The image that image acquisition units 111 is obtained, by text detection positioning unit 121, text filed in detected image, or by user's input detection unit 113, click the region of containing word, by text detection positioning unit 121, detection includes text filed in click place image, and the result detecting is exported on display unit 112, conventionally text filed testing result represents with surrounding text filed rectangle, user by graphical interfaces to rectangle position, size, the editor of shape revises text filed testing result.
By user's input detection unit 113, select the text filed of candidates, user can have and clicks or the mode of moving focal point, selects text filedly, and can select a plurality of text filed; What detect is text filed through word recognition unit 122 identifications, be converted into the machine code of corresponding language, such as Unicode, and on the graphical interfaces of display unit 112, show corresponding recognition result, user can by graphical interfaces to recognition result delete accordingly, add, the operation such as modification, further can carry out the translation of relational language.
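The two selection modes mentioned above, clicking and moving the focus, with support for several selected areas, can be modeled with a small state holder. The class below is a hypothetical sketch; `AreaSelector` and its methods are not taken from the patent.

```python
class AreaSelector:
    """Tracks a movable focus and a set of selected candidate text areas."""

    def __init__(self, areas):
        self.areas = list(areas)
        self.focus = 0          # index of the currently focused candidate
        self.selected = set()   # indices of the selected candidates

    def move_focus(self, step=1):
        """Move the focus forward or backward, wrapping around the candidate list."""
        if self.areas:
            self.focus = (self.focus + step) % len(self.areas)

    def toggle(self, index=None):
        """Select or deselect a candidate: the focused one, or a clicked index."""
        i = self.focus if index is None else index
        if not 0 <= i < len(self.areas):
            raise IndexError("no candidate text area at index %d" % i)
        if i in self.selected:
            self.selected.remove(i)
        else:
            self.selected.add(i)

    def selection(self):
        """Return the selected areas in their original detection order."""
        return [self.areas[i] for i in sorted(self.selected)]
```

A click maps to `toggle(index)` for the area that was hit, while navigation keys map to `move_focus` followed by `toggle()`.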
Figure bis-is process flow diagrams that user selects text filed OCR recognition methods, and the step comprising is as follows:
1) under image pickup mode (S201), user presses shutter and starts focusing automatically, and camera carries out the operation of focusing automatically, takes and capture the image (S202) that contains word, and this character image can be low-resolution image;
2) character image obtaining is above carried out to the search in global image region, detect text filed (S203), and automatically by the text filed user of being prompted to of the candidate who detects (S204), wherein adopt low-resolution image to carry out detection and location, by the experiment test on 6350 width images, the different resolution image of contrast 400*300 and 1024*768, approximately only have the latter's 20% the former operation time, improved the travelling speed of device;
3) candidate who has detected to user's prompting is text filed, and user, by the mode of click or moving focal point, selects text filed (S205), and supports to select a plurality of text filed;
4) that according to user, selects is text filed, word is wherein carried out to OCR identification (S206), and can further translate.
Wherein step S201 and S202 can carry out in image acquisition units 111, step S203 can carry out in detection and location unit 121, step S204 can carry out in display unit 112, step S205 can carry out in user's input detection unit 113, and step S206 can carry out in word recognition unit 122.
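The 400*300 versus 1024*768 comparison in step 2 can be put in perspective with a little arithmetic. The Python sketch below computes the pixel-count ratio of the two resolutions and, as an assumption that goes beyond the patent text, shows how a rectangle found on the low-resolution image could be mapped to a higher-resolution frame if recognition were run there. The function names are hypothetical.

```python
def pixel_ratio(low, high):
    """Ratio of pixel counts between two (width, height) resolutions."""
    return (low[0] * low[1]) / (high[0] * high[1])

def scale_rect(rect, src_size, dst_size):
    """Map an (x, y, width, height) rectangle from one image size to another."""
    sx = dst_size[0] / src_size[0]
    sy = dst_size[1] / src_size[1]
    x, y, w, h = rect
    return (round(x * sx), round(y * sy), round(w * sx), round(h * sy))

# 400*300 has roughly 15% of the pixels of 1024*768, which is of the same
# order as the "about 20%" computation time reported in step 2.
ratio = pixel_ratio((400, 300), (1024, 768))
```

The measured figure comes from the source; the pixel ratio only indicates that a reduction of this size is plausible, since detection cost does not have to scale exactly with pixel count.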
Figure tri-is process flow diagrams that user clicks the OCR recognition methods that contains character area, and the step comprising is as follows:
1) under image pickup mode (S301), user comprises a text filed point or a plurality of point (S302) by clicking on screen, press shutter and start focusing automatically, camera carries out the operation of focusing automatically, take and capture the image (S303) that contains word, this character image can be low-resolution image, and shutter can click screen by user and start, and user can click a plurality of regions of screen simultaneously;
2) according to user, click the position coordinates of screen, the image obtaining is above processed: can start search from whole image-region, detect and contain the text filed of click coordinate; Also can carry out text filed detection (S304) to including the image-region centered by several clicks place, and automatically by the text filed user of being prompted to of the candidate who detects (S305);
3) candidate who has detected to user's prompting is text filed, and user, by the mode of click or moving focal point, selects text filed (S306), and supports to select a plurality of text filed;
4) text filed (S306) selecting according to user, or detection and location text filed (S305) that arrive, device carries out OCR identification (S307) to word wherein, and can further translate.
Wherein step S301 and S303 can carry out in image acquisition units 111, step S304 can carry out in detection and location unit 121, step S305 can carry out in display unit 112, step S302 and S306 can carry out in user's input detection unit 113, and step S307 can carry out in word recognition unit 122.
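The second option in step 2, detecting text only in an image area centered on a clicked point, needs a search window that stays inside the image. The Python sketch below shows one way to compute it; the window size is an assumed parameter, since the patent does not state how large the area around a click should be, and `window_around` is a hypothetical name.

```python
def window_around(click, image_size, window_size):
    """Return an (x, y, width, height) search window centered on a click.

    The window is shifted as needed so that it never leaves the image, and it
    is shrunk to the image size if the requested window is larger than the image.
    """
    img_w, img_h = image_size
    win_w = min(window_size[0], img_w)
    win_h = min(window_size[1], img_h)
    x = min(max(click[0] - win_w // 2, 0), img_w - win_w)
    y = min(max(click[1] - win_h // 2, 0), img_h - win_h)
    return (x, y, win_w, win_h)
```

Restricting detection to such windows is what narrows the search range compared with scanning the whole image; with several clicks, one window per click would be searched.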
Claims (7)
1. An OCR recognition method with an auxiliary positioning function, comprising the steps of:
The user starting the camera mode;
Clicking one or more points on the screen that lie within a text area;
Starting the shutter and focusing automatically;
Shooting and capturing an image containing text;
According to the position coordinates of the clicks on the screen, either searching from the whole image area to detect the text areas that contain the click coordinates, or performing text area detection on image areas centered on the clicked points;
Presenting the detected candidate text areas to the user;
The user selecting one or more text areas from the candidate text areas by clicking or moving the focus;
Performing OCR on the selected text areas.
2. The method according to claim 1, characterized in that the captured image is a low-resolution image.
3. The method according to claim 1, characterized in that it further comprises translating the recognized text.
4. An OCR recognition device with an auxiliary positioning function, comprising:
An image acquisition unit, for obtaining a text image or video containing text in camera mode;
A user input detection unit, by which the user clicks one or more points on the screen that lie within a text area, causing the image acquisition unit to start the shutter, focus automatically, and shoot and capture an image containing text;
A text detection and positioning unit, for, according to the position coordinates of the clicks on the screen, either searching from the whole image area to detect the text areas that contain the click coordinates, or performing text area detection on image areas centered on the clicked points;
Wherein the user input detection unit also presents the detected candidate text areas to the user, so that the user selects one or more text areas from the candidate text areas by clicking or moving the focus; the recognition device further comprises:
A character recognition unit, for performing OCR on the selected text areas;
A display unit, for displaying the captured text image and the results of user input, text detection and localization, and character recognition;
A storage unit, for storing the data required for each unit to operate.
5. The recognition device according to claim 4, characterized in that it further comprises a graphical interface for displaying the text detection, character recognition, and translation results on the display device.
6. The recognition device according to claim 4, characterized in that the recognition device is a mobile phone, a PDA, an intelligent terminal, a camera, or a translator.
7. The recognition device according to claim 4, characterized in that the display unit is an LCD display or a touch screen.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN200810215861.6A CN101667251B (en) | 2008-09-05 | 2008-09-05 | OCR recognition method and device with auxiliary positioning function |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN200810215861.6A CN101667251B (en) | 2008-09-05 | 2008-09-05 | OCR recognition method and device with auxiliary positioning function |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101667251A CN101667251A (en) | 2010-03-10 |
CN101667251B (en) | 2014-07-23 |
Family
ID=41803867
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN200810215861.6A (Active) CN101667251B (en) | 2008-09-05 | 2008-09-05 | OCR recognition method and device with auxiliary positioning function |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101667251B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109190629A (en) * | 2018-08-28 | 2019-01-11 | 传化智联股份有限公司 | A kind of electronics waybill generation method and device |
Families Citing this family (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103186589A (en) * | 2011-12-30 | 2013-07-03 | 牟颖 | Method for quickly judging authenticity of medicament and alarming through mobile phone |
EP2803013A1 (en) * | 2012-01-09 | 2014-11-19 | Qualcomm Incorporated | Ocr cache update |
KR20140043644A (en) * | 2012-10-02 | 2014-04-10 | 엘지전자 주식회사 | Mobile terminal and control method for the mobile terminal |
CN103902994A (en) * | 2012-12-28 | 2014-07-02 | 联想(北京)有限公司 | Processing method and electric equipment |
CN104252475B (en) * | 2013-06-27 | 2018-03-27 | 腾讯科技(深圳)有限公司 | Position the method and device of text in picture information |
CN103488630B (en) * | 2013-09-29 | 2016-06-08 | 小米科技有限责任公司 | The processing method of a kind of image, device and terminal |
CN104598289B (en) * | 2013-10-31 | 2018-04-27 | 联想(北京)有限公司 | A kind of recognition methods and a kind of electronic equipment |
US9436682B2 (en) * | 2014-06-24 | 2016-09-06 | Google Inc. | Techniques for machine language translation of text from an image based on non-textual context information from the image |
CN105740863A (en) * | 2014-12-08 | 2016-07-06 | 阿里巴巴集团控股有限公司 | Information processing method and device |
CN105516590B (en) * | 2015-12-11 | 2019-04-16 | Oppo广东移动通信有限公司 | A kind of image processing method and device |
CN105760867A (en) * | 2016-02-19 | 2016-07-13 | 深圳市润农科技有限公司 | Method and device for extracting two-dimension code and/or text in image |
CN105739832A (en) * | 2016-03-10 | 2016-07-06 | 联想(北京)有限公司 | Information processing method and electronic equipment |
CN107305446B (en) * | 2016-04-25 | 2020-08-14 | 北京字节跳动网络技术有限公司 | Method and device for acquiring keywords in pressure sensing area |
CN105955626B (en) * | 2016-04-29 | 2019-04-09 | 广东小天才科技有限公司 | Photographing search method and device |
WO2018107566A1 (en) * | 2016-12-16 | 2018-06-21 | 华为技术有限公司 | Processing method and mobile device |
CN107885449B (en) * | 2017-11-09 | 2020-01-03 | 广东小天才科技有限公司 | Photographing search method and device, terminal equipment and storage medium |
CN108628858A (en) * | 2018-04-20 | 2018-10-09 | 广东科学技术职业学院 | The operating method and system of textual scan identification translation on line based on mobile terminal |
CN109583443B (en) * | 2018-11-15 | 2022-10-18 | 四川长虹电器股份有限公司 | Video content judgment method based on character recognition |
CN111381683A (en) * | 2018-12-28 | 2020-07-07 | 薛康泰华 | Photographing recognition input method and software |
CN109887349B (en) * | 2019-04-12 | 2021-05-11 | 广东小天才科技有限公司 | Dictation auxiliary method and device |
CN110245572A (en) * | 2019-05-20 | 2019-09-17 | 平安科技(深圳)有限公司 | Region content identification method, device, computer equipment and storage medium |
CN112668886A (en) * | 2020-12-29 | 2021-04-16 | 深圳前海微众银行股份有限公司 | Method, device and equipment for monitoring risks of rental business and readable storage medium |
CN116152814A (en) * | 2022-12-20 | 2023-05-23 | 华为技术有限公司 | Image recognition method and related equipment |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1484165A (en) * | 2002-07-26 | 2004-03-24 | 富士通株式会社 | File information input apparatus, input method, input program and recording medium |
CN1685358A (en) * | 2002-07-31 | 2005-10-19 | 里昂中央理工学院 | Method and system for automatically locating text areas in an image |
CN1932802A (en) * | 2005-09-16 | 2007-03-21 | 三星电子株式会社 | Host device having extraction function of text and extraction method thereof |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1606030A (en) * | 2004-11-12 | 2005-04-13 | 无敌科技(西安)有限公司 | Electronic photography translation paraphrasing method and apparatus |
Also Published As
Publication number | Publication date |
---|---|
CN101667251A (en) | 2010-03-10 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN101667251B (en) | OCR recognition method and device with auxiliary positioning function | |
CN107885430B (en) | Audio playing method and device, storage medium and electronic equipment | |
US9542612B2 (en) | Using extracted image text | |
KR100942346B1 (en) | Mobile device and transmission system | |
EP2041678B1 (en) | Recognizing text in images | |
US7953295B2 (en) | Enhancing text in images | |
US8988543B2 (en) | Camera based method for text input and keyword detection | |
US6473523B1 (en) | Portable text capturing method and device therefor | |
US7403657B2 (en) | Method and apparatus for character string search in image | |
US20100331041A1 (en) | System and method for language-independent manipulations of digital copies of documents through a camera phone | |
EP1783681A1 (en) | Retrieval system and retrieval method | |
KR20170061631A (en) | Method and device for region identification | |
JP2014504400A (en) | How to crop a text image | |
CN105808542B (en) | Information processing method and information processing apparatus | |
CN1292377C (en) | Method for selecting treating object in character identification of portable terminal and portable terminal | |
CN110781195B (en) | System, method and device for updating point of interest information | |
CN113010738B (en) | Video processing method, device, electronic equipment and readable storage medium | |
KR20040010364A (en) | Document information input program, document information input apparatus and document information input method | |
CN107491778B (en) | Intelligent device screen extraction method and system based on positioning image | |
JPH10254901A (en) | Method and device for retrieving image | |
CN112183149B (en) | Graphic code processing method and device | |
JP2007011762A (en) | Area extraction apparatus and area extraction method | |
CN114155547A (en) | Chart identification method, device, equipment and storage medium | |
CN105975621B (en) | Method and device for identifying search engine in browser page | |
CN107438160A (en) | A kind of preview image scales the method and device into line character inquiry manually |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |