CN101667251B - OCR recognition method and device with auxiliary positioning function


Publication number
CN101667251B
CN101667251B CN200810215861.6A CN200810215861A CN101667251B CN 101667251 B CN101667251 B CN 101667251B CN 200810215861 A CN200810215861 A CN 200810215861A CN 101667251 B CN101667251 B CN 101667251B
Authority
CN
China
Prior art keywords
image
text
user
text region
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN200810215861.6A
Other languages
Chinese (zh)
Other versions
CN101667251A (en)
Inventor
陈又新
李斌
王�华
王炎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Samsung Telecommunications Technology Research Co Ltd
Samsung Electronics Co Ltd
Original Assignee
Beijing Samsung Telecommunications Technology Research Co Ltd
Samsung Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Samsung Telecommunications Technology Research Co Ltd, Samsung Electronics Co Ltd
Priority to CN200810215861.6A
Publication of CN101667251A
Application granted
Publication of CN101667251B
Legal status: Active
Anticipated expiration

Landscapes

  • Character Input (AREA)
  • Character Discrimination (AREA)
  • Telephone Function (AREA)

Abstract

The invention relates to an OCR recognition method with an auxiliary positioning function. The method comprises the following steps: photographing a target and capturing an image containing text; searching the image area and detecting one or more text areas; selecting a specific text area; and recognizing the text in the selected text area. With the method and device of the invention, a user can automatically obtain the text areas in an image and interactively select the text areas of interest, for applications such as character recognition and translation. The invention can be applied to ordinary text scenes, such as the automatic recognition and translation of signposts, public notices and newspapers, and is particularly suitable for mobile terminals with a camera function. The invention is convenient for the user, requires no complicated auxiliary operations or interaction, narrows the search range in the image, automatically obtains the text areas of interest to the user, reduces the computation time of the system, and can improve positioning accuracy.

Description

OCR recognition method and device with auxiliary positioning function
Technical field
The present invention relates to the fields of image processing and pattern recognition, and in particular to text detection and localization, and character recognition, in video and natural scenes.
Background
OCR technology is increasingly applied on devices with an image scanning (or shooting) function, such as mobile intelligent terminals and PDAs. However, because the background of a video image is often fairly complex, the text localization step that precedes OCR still poses technical difficulties. Localization results can deviate, so that the characters to be recognized cannot be detected easily and accurately, or a single text region is wrongly split into several related text sub-regions, which affects the continuity of the OCR result and the computational cost. Combined with a low character recognition rate, this makes the final result (for example, a translation) unsatisfactory. Some form of auxiliary positioning is therefore needed to improve the accuracy of text localization and recognition.
The basic process of current image (or video) text recognition is as follows. The captured text image (or a frame of a video) is first preprocessed (enhancement, filtering, etc.) and its layout is analyzed and understood, so that text regions are detected and located. Character recognition is then performed on each text region, and the recognition result may be further corrected by post-processing. Of these steps, text region localization directly affects both the final recognition result and the computational efficiency of the whole system.
An existing OCR-capable mobile phone scans text with its camera and translates between Chinese and English. In use, the user must first aim the phone's camera at the center of the text, with the phone at a vertical distance of more than 10 centimeters from the text; the user focuses with the navigation key on the phone; the height of the text to be recognized must be greater than the height of the displayed focus symbol "+"; and for vertically typeset Chinese text, "vertical text" must be selected in the menu. A highlighted band appears in the operating interface to locate the text region to be recognized, and the text within this band is recognized and translated. This method uses the highlighted band to assist in locating the text region to be recognized. It requires the user to aim the camera at the center of the text and to hold the phone perpendicular to the text at a certain distance, and it requires a special setting for vertical text regions. It places many restrictions on the user's operation, the system cannot locate text regions automatically, and operation takes a long time.
[CN 1804858 A] describes an auxiliary positioning system for text to be recognized, used to implement OCR on a mobile terminal with a camera. A crosshair cursor appears on the screen; by moving the cursor, the user places its origin on the text region to be recognized, which assists positioning. The user can also align the bottom edge of the character region with the horizontal axis of the crosshair, so that the bottom edge is perpendicular to the vertical axis, to prevent tilted shots and improve the recognition rate. This method uses a crosshair to assist in locating the text region to be recognized. Keeping the horizontal and vertical axes of the crosshair respectively parallel and perpendicular to the bottom edge of the character region prevents skewed text, but it requires the user to adjust the cursor position carefully and can locate only one text region at a time, so the overall positioning and recognition time is long.
[CN 1685358 A] proposes a method for automatically locating text regions in an image. Its steps are: converting a digital image into a binary image; locating possible text regions; and selecting the actual text regions. It is characterized in that, in the text region locating step, a morphological mask is applied to perform morphological operations on the binary image, and closed blocks are then generated in the image according to a set of rules, thereby locating the text regions. This method searches the entire image area to locate text regions, so the amount of computation is large and some false and missed detections occur.
[US 7171046] proposes a method for recognizing text in a captured image. Its steps are: capturing an image containing text information with a portable device; detecting text regions in the image in real time; adjusting the text detection result and applying OCR to recognize the text; supplementing related external information, such as travel and traffic information; and using dictionary techniques to improve the OCR result, outputting the recognized text and the supplementary information, or translating it further. An image text detection and recognition system using this method is implemented in a portable device. This method requires the text region localization result to be adjusted manually before recognition, which requires direct user intervention and is inconvenient for the user.
Summary of the invention
The object of the present invention is to provide an OCR recognition method and device with an auxiliary positioning function.
According to one aspect of the present invention, an OCR recognition method with an auxiliary positioning function comprises the steps of:
photographing a target and capturing an image containing text;
searching the image area and detecting one or more text regions;
selecting a specific text region;
recognizing the text in the selected specific text region.
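The four steps above can be sketched as a small pipeline. This is only an illustrative sketch: the `Region` type and the names `capture_image`, `detect_text_regions`, `select` and `recognize` are hypothetical, and the detector and recognizer are passed in as callables because the specification does not prescribe particular algorithms.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Region:
    """Axis-aligned rectangle (hypothetical representation of a text region)."""
    x: int
    y: int
    w: int
    h: int


def run_ocr_pipeline(
    capture_image: Callable[[], object],
    detect_text_regions: Callable[[object], List[Region]],
    select: Callable[[Sequence[Region]], List[Region]],
    recognize: Callable[[object, Region], str],
) -> List[str]:
    """Capture, detect candidate regions, let the user select, then recognize."""
    image = capture_image()                    # step 1: photograph and capture
    candidates = detect_text_regions(image)    # step 2: search and detect
    chosen = select(candidates)                # step 3: select specific regions
    return [recognize(image, r) for r in chosen]  # step 4: recognize the text
```

Passing the four stages as callables keeps the sketch independent of any camera or OCR library; stub functions are enough to exercise the control flow.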
According to another aspect of the present invention, an OCR recognition method with an auxiliary positioning function comprises the steps of:
clicking one or more points on the screen that lie in a text region;
photographing an image area that includes the clicked points;
performing text region detection and localization on the captured image to obtain candidate text regions;
performing OCR on the text in the candidate text regions.
According to another aspect of the present invention, an OCR recognition device with an auxiliary positioning function comprises:
an image acquisition unit for obtaining a text image or video containing text;
a text detection and positioning unit for detecting and locating the text regions in the image;
a text recognition unit for recognizing the text in the selected regions;
a display unit for displaying the captured text image, the user's input, and the results of text detection and localization and of text recognition;
a storage unit for storing the data required for the operation of each unit.
With the method and device of the present invention, the user can automatically obtain the text regions in an image and interactively select the text regions of interest, for applications such as text recognition and translation. The invention can be applied to common text scenes, such as the automatic recognition and translation of signposts, notices and newspapers, and is particularly suitable for mobile terminals with a camera function. The invention is convenient for the user, requires no complicated auxiliary operations or interaction, narrows the search range in the image, automatically obtains the text regions of interest to the user, reduces the system's computation time, and can improve localization accuracy.
Brief description of the drawings
Fig. 1 shows an OCR recognition device with an auxiliary positioning function;
Fig. 2 is a flowchart of the OCR recognition method in which the user selects a text region;
Fig. 3 is a flowchart of the OCR recognition method in which the user clicks an area containing text.
Embodiments
The device of the present invention consists of an interaction unit, an operation processing unit and a storage unit. The interaction unit captures the text image or video to be recognized, receives and displays information about user operations such as clicking and selecting, sends the received user input to the operation processing unit, and receives and displays information from the operation processing unit; it comprises an image acquisition unit, a display unit and a user input detection unit. The operation processing unit performs text region detection and localization on the text image and user input supplied by the interaction unit, and recognizes the text in the text regions; it comprises a text detection and positioning unit and a text recognition unit. Here, a text region means a bounding rectangle containing one or more character blocks.
The image acquisition unit captures the text image or video to be recognized; examples are a camera, or a mobile phone or notebook computer with a camera function;
The display unit displays the text image or video to be recognized, information about the user's click selections, the text region detection and localization results, and the text recognition results;
The user input detection unit receives information about user operations such as clicking and selecting;
The detection and positioning unit performs text region detection and localization according to the information received from the interaction unit, and outputs the position coordinates of the located text regions to the text recognition unit;
The text recognition unit recognizes the text in the text regions according to the text image and the position coordinates received from the interaction unit and the detection and positioning unit, and outputs the result to the display unit;
The storage unit stores the information required for the operation of each unit, including the text image to be recognized, information about user operations such as clicking and selecting, text region localization results, text recognition results, and other information needed by the device and method.
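A text region is defined above as a bounding rectangle containing one or more character blocks. A minimal sketch of that definition follows, assuming each character block is itself given as an axis-aligned box; the `(left, top, right, bottom)` tuple layout and the function name are illustrative choices, not taken from the specification.

```python
from typing import Iterable, Tuple

# Assumed layout: (left, top, right, bottom) in pixel coordinates.
Box = Tuple[int, int, int, int]


def bounding_rectangle(blocks: Iterable[Box]) -> Box:
    """Return the smallest axis-aligned rectangle enclosing all character blocks."""
    blocks = list(blocks)
    if not blocks:
        raise ValueError("a text region needs at least one character block")
    lefts, tops, rights, bottoms = zip(*blocks)
    return (min(lefts), min(tops), max(rights), max(bottoms))
```

For example, the blocks `(10, 20, 30, 40)` and `(25, 15, 60, 38)` give the text region `(10, 15, 60, 40)`.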
In implementation, the OCR recognition method based on user selection of text regions comprises: starting camera mode, and photographing and capturing an image containing text, which may be a low-resolution image; searching the image area, performing text region detection and localization, and automatically presenting the candidate text regions obtained; from the candidates presented, the user selects text regions by clicking or by moving the focus; and performing OCR on the text in the selected candidate text regions.
In implementation, the OCR recognition method based on the user clicking an area containing text comprises: starting camera mode; the user clicks the screen to photograph and capture a text image, which may be a low-resolution image; performing text region detection and localization on the image area that includes the clicked points; presenting the marked text regions to the user, who selects the text regions to be recognized by clicking or moving the focus; and performing OCR on the text in the selected candidate text regions, or on the text in the candidate text regions.
Embodiments of the present invention are described in detail below with reference to the accompanying drawings. In the following description, detailed descriptions of well-known functions and structures are omitted for clarity and conciseness.
The embodiments described in this specification are only some of the specific embodiments in which the invention may be applied, and do not imply that implementation of the invention is limited to these forms.
In this specification, including the claims, the term "unit" refers to something made up of modules, and "component" refers to an entity related to the system of the present invention: hardware, a combination of hardware and software, software, or software in execution. For example, a component may be, but is not limited to, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, or a computer. As an example, an application running on a mobile terminal may be a component. A component may also contain one or more other components.
The terms "comprise", "include" and similar terms denote non-exclusive inclusion, so a method or device that comprises a list of components includes not only those components but may also include other components not listed.
Fig. 1 shows the first preferred embodiment of the present invention, an OCR recognition device with an auxiliary positioning function. The input device of this device is a video capture device and the output device is a display device capable of showing a graphical interface; in this embodiment the display is a touch screen.
The main function of the image acquisition unit 111 is to capture text images or video; examples are a camera, or a mobile phone or notebook computer with a camera function. When the OCR recognition device is started, the user can enable the image acquisition unit 111, which outputs the captured image or video to the display unit 112. Through the user input detection unit 113, the user controls photo capture by the image acquisition unit 111 and chooses whether to proceed to OCR or to retake the photo.
For an image obtained by the image acquisition unit 111, the text detection and positioning unit 121 detects the text regions in the image; alternatively, the user clicks an area containing text through the user input detection unit 113, and the text detection and positioning unit 121 detects the text regions in the image that include the clicked points. The detection results are output to the display unit 112. A detected text region is usually represented by a rectangle enclosing it, and the user can revise the detection result through the graphical interface by editing the rectangle's position, size and shape.
Candidate text regions are selected through the user input detection unit 113; the user can select text regions by clicking or by moving the focus, and can select several text regions. The detected text regions are recognized by the text recognition unit 122 and converted into the machine code of the corresponding language, such as Unicode, and the recognition result is shown on the graphical interface of the display unit 112. Through the graphical interface the user can delete, add to or modify the recognition result, and can further have it translated into the relevant language.
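The two selection modes described above (clicking a candidate, or moving a focus over the candidates) and the support for selecting several regions can be illustrated with a small state holder. This is a sketch under assumptions: the class name `RegionSelector`, the toggle-on-click behavior and the cyclic focus movement are illustrative choices that the specification does not spell out.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

# Assumed layout: (left, top, right, bottom), right and bottom exclusive.
Box = Tuple[int, int, int, int]


@dataclass
class RegionSelector:
    """Tracks which candidate text regions the user has selected."""
    candidates: List[Box]
    focus: int = 0
    selected: Set[int] = field(default_factory=set)

    def click(self, x: int, y: int) -> None:
        """Toggle every candidate rectangle that contains the clicked point."""
        for i, (left, top, right, bottom) in enumerate(self.candidates):
            if left <= x < right and top <= y < bottom:
                self.selected ^= {i}

    def move_focus(self, step: int) -> None:
        """Move the focus cyclically, e.g. in response to a navigation key."""
        if self.candidates:
            self.focus = (self.focus + step) % len(self.candidates)

    def confirm_focus(self) -> None:
        """Add the currently focused candidate to the selection."""
        if self.candidates:
            self.selected.add(self.focus)

    def selection(self) -> List[Box]:
        """Return the selected regions in candidate order."""
        return [self.candidates[i] for i in sorted(self.selected)]
```

Keeping the selection as a set of indices lets both input modes feed the same state, so a touch screen and a keypad-only device could share the downstream recognition step.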
Fig. 2 is a flowchart of the OCR recognition method in which the user selects a text region. It comprises the following steps:
1) In camera mode (S201), the user presses the shutter to start autofocus, the camera performs the autofocus operation, and an image containing text is photographed and captured (S202); this text image may be a low-resolution image;
2) The whole area of the text image obtained above is searched to detect text regions (S203), and the detected candidate text regions are automatically presented to the user (S204). Detection and localization use a low-resolution image: in an experimental test on 6350 images comparing 400*300 and 1024*768 resolutions, the computation time for the former was only about 20% of that for the latter, which improves the running speed of the device;
3) The detected candidate text regions are presented to the user, who selects text regions by clicking or moving the focus (S205); selecting several text regions is supported;
4) OCR is performed on the text in the text regions selected by the user (S206), and the result can be further translated.
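Step 2) compares detection on 400*300 and 1024*768 images. The sketch below works out the pixel-count ratio of the two resolutions and shows how a rectangle found on the low-resolution image could be mapped to full-resolution coordinates. The mapping step is an assumption added for illustration; the specification does not state at which resolution recognition is carried out, nor how the reported timing was measured.

```python
from typing import Tuple

Size = Tuple[int, int]           # (width, height)
Box = Tuple[int, int, int, int]  # assumed layout: (left, top, right, bottom)


def pixel_ratio(low: Size, high: Size) -> float:
    """Fraction of pixels in the low-resolution image relative to the high one."""
    return (low[0] * low[1]) / (high[0] * high[1])


def scale_box(box: Box, from_size: Size, to_size: Size) -> Box:
    """Map a rectangle from one image resolution to another (hypothetical helper)."""
    sx = to_size[0] / from_size[0]
    sy = to_size[1] / from_size[1]
    left, top, right, bottom = box
    return (round(left * sx), round(top * sy), round(right * sx), round(bottom * sy))


low, high = (400, 300), (1024, 768)
ratio = pixel_ratio(low, high)  # 120000 / 786432, roughly 0.15
full_res_box = scale_box((100, 75, 200, 150), low, high)
```

The low-resolution image holds roughly 15% of the pixels of the high-resolution one, which is of the same order as the roughly 20% computation time reported in step 2).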
Steps S201 and S202 can be performed in the image acquisition unit 111, step S203 in the detection and positioning unit 121, step S204 in the display unit 112, step S205 in the user input detection unit 113, and step S206 in the text recognition unit 122.
Fig. 3 is a flowchart of the OCR recognition method in which the user clicks an area containing text. It comprises the following steps:
1) In camera mode (S301), the user clicks one or more points on the screen that lie in a text region (S302); the shutter is pressed to start autofocus, the camera performs the autofocus operation, and an image containing text is photographed and captured (S303). This text image may be a low-resolution image; the shutter can be triggered by the user clicking the screen, and the user can click several areas of the screen at once;
2) The image obtained above is processed according to the position coordinates of the user's clicks on the screen: the search can start from the whole image area and detect the text regions that contain the click coordinates; alternatively, text region detection can be performed on image areas centered on the clicked points (S304). The detected candidate text regions are automatically presented to the user (S305);
3) The detected candidate text regions are presented to the user, who selects text regions by clicking or moving the focus (S306); selecting several text regions is supported;
4) The device performs OCR (S307) on the text in the text regions selected by the user (S306), or in the text regions obtained by detection and localization (S305), and the result can be further translated.
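Step 2) above offers two ways of using the click coordinates: filter the regions found by a whole-image search down to those containing a click, or restrict detection to windows centered on the clicks. Both can be sketched as follows. The function names and the window half-sizes are hypothetical; the specification does not give the size of the area searched around a click.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # assumed layout: (left, top, right, bottom), right/bottom exclusive
Point = Tuple[int, int]          # (x, y)


def regions_containing_clicks(regions: Sequence[Box], clicks: Sequence[Point]) -> List[Box]:
    """Option A: keep only the detected regions that contain at least one click."""
    def contains(box: Box, p: Point) -> bool:
        left, top, right, bottom = box
        return left <= p[0] < right and top <= p[1] < bottom

    return [box for box in regions if any(contains(box, p) for p in clicks)]


def search_window(click: Point, image_size: Tuple[int, int],
                  half_w: int, half_h: int) -> Box:
    """Option B: a detection window centered on the click, clamped to the image."""
    x, y = click
    width, height = image_size
    return (max(0, x - half_w), max(0, y - half_h),
            min(width, x + half_w), min(height, y + half_h))
```

Option B narrows the search range before detection runs, which matches the stated aim of reducing computation; option A still searches the whole image but discards regions the user did not point at.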
Steps S301 and S303 can be performed in the image acquisition unit 111, step S304 in the detection and positioning unit 121, step S305 in the display unit 112, steps S302 and S306 in the user input detection unit 113, and step S307 in the text recognition unit 122.

Claims (7)

1. An OCR recognition method with an auxiliary positioning function, comprising the steps of:
a user starting camera mode;
clicking one or more points on the screen that lie in a text region;
starting the shutter and focusing automatically;
photographing and capturing an image containing text;
according to the position coordinates of the clicks on the screen, either searching from the whole image area to detect the text regions containing the click coordinates, or performing text region detection on image areas centered on the clicked points;
presenting the detected candidate text regions to the user;
the user selecting one or more text regions from the candidate text regions by clicking or moving the focus;
performing OCR on the selected text regions.
2. The method according to claim 1, characterized in that the captured image is a low-resolution image.
3. The method according to claim 1, characterized by further comprising: translating the recognized text.
4. An OCR recognition device with an auxiliary positioning function, comprising:
an image acquisition unit for obtaining a text image or video containing text in camera mode;
a user input detection unit with which the user clicks one or more points on the screen that lie in a text region, causing the image acquisition unit to start the shutter and focus automatically, and to photograph and capture an image containing text;
a text detection and positioning unit for, according to the position coordinates of the clicks on the screen, either searching from the whole image area to detect the text regions containing the click coordinates, or performing text region detection on image areas centered on the clicked points;
wherein the user input detection unit also presents the detected candidate text regions to the user, so that the user selects one or more text regions from the candidate text regions by clicking or moving the focus; the recognition device further comprising:
a text recognition unit for performing OCR on the selected text regions;
a display unit for displaying the captured text image, the user's input, and the results of text detection and localization and of text recognition;
a storage unit for storing the data required for the operation of each unit.
5. The recognition device according to claim 4, characterized by further comprising a graphical interface for displaying the text detection, text recognition and translation results on the display device.
6. The recognition device according to claim 4, characterized in that the recognition device is a mobile phone, PDA, intelligent terminal, camera or translator.
7. The recognition device according to claim 4, characterized in that the display unit is an LCD display or a touch screen.
CN200810215861.6A 2008-09-05 2008-09-05 OCR recognition method and device with auxiliary positioning function Active CN101667251B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN200810215861.6A CN101667251B (en) 2008-09-05 2008-09-05 OCR recognition method and device with auxiliary positioning function

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN200810215861.6A CN101667251B (en) 2008-09-05 2008-09-05 OCR recognition method and device with auxiliary positioning function

Publications (2)

Publication Number Publication Date
CN101667251A CN101667251A (en) 2010-03-10
CN101667251B true CN101667251B (en) 2014-07-23

Family

ID=41803867

Family Applications (1)

Application Number Title Priority Date Filing Date
CN200810215861.6A Active CN101667251B (en) 2008-09-05 2008-09-05 OCR recognition method and device with auxiliary positioning function

Country Status (1)

Country Link
CN (1) CN101667251B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109190629A (en) * 2018-08-28 2019-01-11 传化智联股份有限公司 A kind of electronics waybill generation method and device

Families Citing this family (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103186589A (en) * 2011-12-30 2013-07-03 牟颖 Method for quickly judging authenticity of medicament and alarming through mobile phone
EP2803013A1 (en) * 2012-01-09 2014-11-19 Qualcomm Incorporated Ocr cache update
KR20140043644A (en) * 2012-10-02 2014-04-10 엘지전자 주식회사 Mobile terminal and control method for the mobile terminal
CN103902994A (en) * 2012-12-28 2014-07-02 联想(北京)有限公司 Processing method and electric equipment
CN104252475B (en) * 2013-06-27 2018-03-27 腾讯科技(深圳)有限公司 Position the method and device of text in picture information
CN103488630B (en) * 2013-09-29 2016-06-08 小米科技有限责任公司 The processing method of a kind of image, device and terminal
CN104598289B (en) * 2013-10-31 2018-04-27 联想(北京)有限公司 A kind of recognition methods and a kind of electronic equipment
US9436682B2 (en) * 2014-06-24 2016-09-06 Google Inc. Techniques for machine language translation of text from an image based on non-textual context information from the image
CN105740863A (en) * 2014-12-08 2016-07-06 阿里巴巴集团控股有限公司 Information processing method and device
CN105516590B (en) * 2015-12-11 2019-04-16 Oppo广东移动通信有限公司 A kind of image processing method and device
CN105760867A (en) * 2016-02-19 2016-07-13 深圳市润农科技有限公司 Method and device for extracting two-dimension code and/or text in image
CN105739832A (en) * 2016-03-10 2016-07-06 联想(北京)有限公司 Information processing method and electronic equipment
CN107305446B (en) * 2016-04-25 2020-08-14 北京字节跳动网络技术有限公司 Method and device for acquiring keywords in pressure sensing area
CN105955626B (en) * 2016-04-29 2019-04-09 广东小天才科技有限公司 Photographing search method and device
WO2018107566A1 (en) * 2016-12-16 2018-06-21 华为技术有限公司 Processing method and mobile device
CN107885449B (en) * 2017-11-09 2020-01-03 广东小天才科技有限公司 Photographing search method and device, terminal equipment and storage medium
CN108628858A (en) * 2018-04-20 2018-10-09 广东科学技术职业学院 The operating method and system of textual scan identification translation on line based on mobile terminal
CN109583443B (en) * 2018-11-15 2022-10-18 四川长虹电器股份有限公司 Video content judgment method based on character recognition
CN111381683A (en) * 2018-12-28 2020-07-07 薛康泰华 Photographing recognition input method and software
CN109887349B (en) * 2019-04-12 2021-05-11 广东小天才科技有限公司 Dictation auxiliary method and device
CN110245572A (en) * 2019-05-20 2019-09-17 平安科技(深圳)有限公司 Region content identification method, device, computer equipment and storage medium
CN112668886A (en) * 2020-12-29 2021-04-16 深圳前海微众银行股份有限公司 Method, device and equipment for monitoring risks of rental business and readable storage medium
CN116152814A (en) * 2022-12-20 2023-05-23 华为技术有限公司 Image recognition method and related equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1484165A (en) * 2002-07-26 2004-03-24 富士通株式会社 File information input apparatus, input method, input program and recording medium
CN1685358A (en) * 2002-07-31 2005-10-19 里昂中央理工学院 Method and system for automatically locating text areas in an image
CN1932802A (en) * 2005-09-16 2007-03-21 三星电子株式会社 Host device having extraction function of text and extraction method thereof

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1606030A (en) * 2004-11-12 2005-04-13 无敌科技(西安)有限公司 Electronic photography translation paraphrasing method and apparatus

Also Published As

Publication number Publication date
CN101667251A (en) 2010-03-10

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant