WO2007082534A1 - Mobile unit comprising a camera and an optical character recognition system, optionally for converting imaged text into comprehensible speech - Google Patents

Mobile unit comprising a camera and an optical character recognition system, optionally for converting imaged text into comprehensible speech

Info

Publication number
WO2007082534A1
WO2007082534A1 (application PCT/DK2006/000527)
Authority
WO
WIPO (PCT)
Prior art keywords
text
mobile unit
computer
image
database
Prior art date
Application number
PCT/DK2006/000527
Other languages
English (en)
Inventor
Flemming Ast
Lars Ballieu Christensen
John Robert Christensen
Original Assignee
Flemming Ast
Lars Ballieu Christensen
John Robert Christensen
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Flemming Ast, Lars Ballieu Christensen, John Robert Christensen filed Critical Flemming Ast
Publication of WO2007082534A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/60: Type of objects
    • G06V20/62: Text, e.g. of license plates, overlay texts or captions on TV images
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10: Character recognition

Definitions

  • the present invention relates to a mobile unit with a computer and a camera, the computer being configured to receive captured images as digital data from the camera, to extract text in the captured images by an optical character recognition (OCR) routine, and to convert the text from an image format into a text format, for example for subsequent conversion into comprehensible speech.
  • OCR optical character recognition
  • Dyslexia can involve different degrees of impaired ability to read and write.
  • dyslexic persons may be able to read and even write, though having difficulty with correct spelling.
  • Modern aids, such as spell checkers in computers, have helped many dyslexic people to live without severe difficulties.
  • the result may be an inability to move around and travel without the assistance of a person who can read. This lack of ability to read and write often creates frustration and reduced self-esteem, with aggressive or insecure behaviour in daily life.
  • a mobile unit with a computer and a camera
  • the computer being configured to receive captured images as digital data from the camera and to extract text in the captured images by an optical character recognition (OCR) routine and to convert the text from an image format into a text format.
  • the mobile unit further comprises a text database with text words, and the computer is configured to compare the converted text with words in the text database and only to accept the converted text as resembling the imaged text in case of agreement with words in the database.
  • the text is translated into synthetic speech using a text-to-speech engine.
  • the invention is preferably implemented in a mobile telephone having a camera and a generator of synthetic speech.
  • the invention is of more general character and can be implemented in other mobile units, such as a PDA without mobile telephone.
  • the advantage of the mobile unit according to the invention is the additional routine of checking potential text extracted from images against words in a database.
  • the term words also includes one-letter words and parts of longer words, such as syllables.
  • once text has been extracted from an image, some of the characters in the text may have been recognised erroneously, rendering the final text meaningless.
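  The database check described above can be sketched as a minimal Python illustration; the small word set and the function name are hypothetical stand-ins for the unit's text database and comparison routine:

```python
# Hypothetical stand-in for the mobile unit's text database.
WORD_DATABASE = {"stop", "main", "street", "exit", "north"}

def accept_ocr_text(converted_text: str) -> bool:
    """Accept the converted text as resembling the imaged text only if
    every word agrees with a word in the database."""
    tokens = converted_text.lower().split()
    if not tokens:
        return False
    return all(token in WORD_DATABASE for token in tokens)
```

  A correctly recognised sign such as "Main Street" is accepted, while a misrecognition such as "st0p" (digit zero in place of the letter o) fails the comparison and is rejected.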
  • the camera according to the invention is implemented in a mobile telephone. This implies that a dyslexic person does not need to carry additional equipment apart from the mobile phone, which is carried along most of the time anyway. Also, use of a mobile phone for photographing text or signs would not be recognised as something remarkable.
  • the mobile unit according to the invention may comprise a synthetic speech generator for submitting the extracted text to the user by a synthetic voice. The synthetic speech can be listened to through earphones, which already are widely used in connection with mobile phones.
  • the earphones may be wireless, for example by utilising Bluetooth technology.
  • the dyslexic may use an important aid without the risk of being revealed as being disabled.
  • the computer may comprise routines that check whether the converted text as a whole makes sense, for example, whether the grammar is correct and whether the words are related to each other.
  • names of products or companies may imply words that are not found in the database, but which the user nevertheless is familiar with.
  • the mobile unit may be configured to request the user to indicate whether a phrase shall be accepted nevertheless, despite a missing counterpart in the database.
  • the mobile unit according to the invention may be configured, in case of missing acceptance, to amend the initially converted text slightly in order to make it fit existing words, letters, sequences of words, combined words, and/or parts of sentences in the database.
  • the initially converted text and the amended converted text may be presented to the user as options among which the user may choose the apparently most correct version.
  • the mobile unit may present several possibilities and request the user to indicate whether there is a version which seems to resemble the text in the image. The selection is subsequently stored in the database.
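  This amendment step can be illustrated with a short sketch: the initially converted word is slightly amended to fit existing database words, and the candidates are presented as options. The word list and cutoff value are illustrative assumptions; `difflib` from the Python standard library performs the approximate matching:

```python
import difflib

# Hypothetical excerpt of the unit's text database.
WORD_DATABASE = ["exit", "north", "station", "market"]

def propose_amendments(ocr_word: str, n: int = 3) -> list:
    """Return slightly amended versions of an unaccepted OCR word that fit
    existing words in the database, as options for the user to choose from."""
    return difflib.get_close_matches(ocr_word.lower(), WORD_DATABASE, n=n, cutoff=0.6)
```

  For example, `propose_amendments("statlon")` yields `["station"]`, which the user may confirm; the confirmed selection could then be stored as described above.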
  • the mobile unit may be configured to base such proposals or to base the amendment or acceptance of the text on earlier choices by the user.
  • the photographed text may contain special words such as technical terms. This may cause problems if these terms are specialised in certain fields where the meaning of the terms differs from the normal meaning of the word. For example, a word such as "leg" means a different thing in medical treatment than in mechanical fittings.
  • the database may contain technical dictionaries with such special terms. The user may indicate whether the special dictionaries shall be used during the comparison with the converted text. Special words or names may, among others, belong to traffic signs, which are among the important items for the dyslexic to read. Therefore, in a further embodiment, the unit also comprises a database with traffic sign text, and the mobile unit is configured, upon specific request from a user of the mobile unit, to compare the extracted text from a captured image with the traffic sign text in the database.
  • the mobile unit is configured upon conversion of the extracted text to request an action from a user of the mobile unit for storing the converted text in a database or data memory in text format.
  • the mobile unit is configured, in case of missing acceptance of the converted text, to request an indication from a user as to whether the converted text is to be stored in the database or data memory.
  • the optics is typically of a quality that causes image distortions near the image edge. This implies that the photographed text may be curved, making recognition by the software more difficult. Therefore, in a further embodiment, the camera is configured with an image distortion correction routine that corrects the image in such a way that distortions are reduced; especially, curved parts of the images are straightened out.
  • the camera may be configured to perform the distortion correction with an algorithm performed on every taken image.
  • the algorithm may be constructed such that the correction is performed in dependence of the performance of the optics. If the optics is known, the algorithm can be adjusted to perform the correction in a specific way related to the specific type of optics.
  • the image resolution is not very high, which is partly due to the number of pixels in the CCD chip of the camera and partly due to the limited amount of memory available, which causes the software of the camera to store images in a lower resolution format. If necessary, the number of pixels may be increased artificially in a software routine.
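  One simple way to increase the number of pixels artificially is nearest-neighbour replication; the patent does not specify the interpolation, so the following is only an illustrative sketch that upscales a grey-level image stored as nested lists:

```python
def upscale_nearest(image, factor):
    """Artificially increase the pixel count by replicating each pixel
    `factor` times horizontally and vertically (nearest-neighbour)."""
    upscaled = []
    for row in image:
        wide_row = [pixel for pixel in row for _ in range(factor)]
        # Repeat the widened row `factor` times vertically.
        upscaled.extend(list(wide_row) for _ in range(factor))
    return upscaled
```

  A 2x2 capture becomes a 4x4 image with `factor=2`; more elaborate schemes (bilinear, bicubic) would smooth the result but the principle is the same.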
  • the camera according to the invention is configured, by suitable software routines, for example by using Fourier analysis and/or high pass filtering, to compensate for low resolution in the image due to defocusing. If application of the corresponding software routine does not result in a satisfactory image, a new image has to be taken.
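  A minimal illustration of such high-pass based sharpening, shown here on a one-dimensional intensity profile for brevity (the 2-D case applies the same idea per row and column); the moving-average low-pass and the gain `alpha` are assumptions, not the patent's actual routine:

```python
def sharpen(profile, alpha=1.0):
    """Unsharp masking: subtract a low-pass (moving-average) version of the
    profile to isolate the high-frequency detail, then add the detail back
    amplified by `alpha`, steepening optical edges."""
    blurred = []
    for i in range(len(profile)):
        window = profile[max(0, i - 1): i + 2]
        blurred.append(sum(window) / len(window))
    return [p + alpha * (p - b) for p, b in zip(profile, blurred)]
```

  Applied to a soft edge, the step between neighbouring samples becomes steeper than in the input, which helps the OCR module separate strokes from background.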
  • the invention may be based on a Windows® CE platform.
  • This widely used platform for handheld units, for example mobile telephones or PDAs (personal digital assistants), resembles the Windows® programs on stationary computers, which in turn are widely used as well.
  • an implementation of the OCR program in a mobile unit according to the invention is a further help for dyslexic people, inasmuch as they are not forced to learn a new platform, which is a much more tedious task for dyslexic people than for others.
  • images may contain text passages that are partly obscured by objects such as dirt or rain on text boards or on traffic signs containing the text.
  • the camera according to the invention includes a routine that corrects image obstructions due to such kind of objects.
  • Images may be captured, where the text is not truly horizontal but deviates by a certain angle from the horizontal.
  • Commercially available software programs are configured to recognise letters nevertheless. This is also implemented in the invention. However, when letters in the image deviate by angles of more than 40 degrees from the horizontal, proper letter recognition often fails, because in this case the software is configured to assume a vertical text instead.
  • the camera may be programmed to rotate the entire captured image successively by a certain angle, for example 30 degrees or 45 degrees, if a proper extraction fails. After each rotation about this predetermined angle, a new attempt for extraction is performed, until the image has been rotated by 360 degrees. Alternatively, the image is rotated in one, two or three 90 degrees steps.
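  This rotation fallback can be sketched as a retry loop; `ocr` below is a placeholder for the unit's recognition routine (returning `None` on failure), and the row-major nested-list image representation is an assumption for illustration:

```python
def rotate_90(image):
    """Rotate a row-major image 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]

def recognise_with_rotation(image, ocr):
    """Attempt OCR; on failure, rotate the whole image in 90-degree steps
    and retry, giving up after a full 360-degree turn."""
    for _ in range(4):
        text = ocr(image)
        if text is not None:
            return text
        image = rotate_90(image)
    return None
```

  The same loop structure would work for smaller predetermined angles (30 or 45 degrees) by swapping in a finer rotation and raising the retry count accordingly.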
  • the mobile unit according to the invention is combined with route planner and/or navigation software, for example the commercial product TomTom®.
  • the implemented program comprises routines that use the recognised text in a captured image, such as text indicative of a location, for example a road sign and a house number, in combination with the route planner and/or navigation software.
  • the dyslexic person may image a road sign and a building number and as a result receive a synthetic voice message explaining the way from the actual location to a certain other location, for example the home of the person.
  • the route planner in a mobile unit according to the invention may be configured to show the location and the route on a map, or even to explain the route by means of buildings which the dyslexic finds on the route.
  • a GPS (Global Positioning System) location routine may be used for finding possible location names at the actual GPS location, for example in a digital name database or in a digital map. By comparing the imaged name converted into text with the possible location names, the correct location name may be found quickly.
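  As a sketch of this comparison, the imaged name (converted to text) can be matched approximately against candidate names near the GPS position; the names and the cutoff threshold are hypothetical, and Python's standard `difflib` stands in for whatever matcher the unit would use:

```python
import difflib

def resolve_location_name(ocr_name, candidates):
    """Find the candidate location name (e.g. from a digital name database
    or map near the GPS position) that best matches the OCR result."""
    lowered = [name.lower() for name in candidates]
    match = difflib.get_close_matches(ocr_name.lower(), lowered, n=1, cutoff=0.7)
    if not match:
        return None
    return candidates[lowered.index(match[0])]
```

  A slightly misrecognised street or town name thus still resolves to the correct entry, while an unrelated string is rejected rather than guessed.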
  • the mobile unit may comprise a GPS (global positioning system) such that the dyslexic person can be guided to the desired location.
  • the dyslexic may have photographed - for example from a separate tourist brochure - a number of names of locations to visit on a tour.
  • the text recognition routine stores the location names in a memory, after which, on request, the location names are matched by a built-in route planner in the mobile unit such that a route is planned automatically and presented to the dyslexic on a map display or by synthetic speech.
  • the GPS system in the mobile unit keeps track of the actual location of the dyslexic and the route planner guides the dyslexic along the planned route and back to the point of origin for the tour or to another final point of interest.
  • the mobile unit according to the invention may in the database comprise a number of dictionaries with different languages.
  • One additional function may be the translation of imaged text, for example as disclosed in US patent application No. 2001/0056342.
  • while the mobile unit is of high advantage for dyslexic people, it may also be of interest for non-disabled people.
  • the mobile unit may comprise a route planner but no synthetic speech. This would still be of high interest in certain cases as illustrated in the following.
  • a street sign, for example with Chinese characters, may be imaged and the text extracted by using a Chinese character setup and a Chinese dictionary.
  • the location may be indicated in the display and a route proposed.
  • a synthetic speech generator may be an additional convenient, but not absolutely necessary feature. The advantage would be that a person not familiar with Chinese characters would still be able to find his way through a town in China.
  • the mobile unit according to the invention comprises a microphone to record voice messages from the user.
  • the mobile unit may comprise a routine for phonetic translation. Words and phrases are stored as audio files in a database and are translated into other languages. This means that the person using the apparatus according to the invention may speak into the microphone and have this speech translated into another language, either as an audio data file with the message spoken in the other language or as a text file. The phonetic translation may be performed simultaneously with the speaking of the person.
  • the apparatus according to the invention can be used to simplify daily arrangements.
  • a system may be arranged where brochures and other information may be ordered by sending an SMS (Short Message Service) with a certain code from a mobile telephone to a pre-selected telephone number.
  • this may be simplified by imaging the code, for example from an advertisement, and sending the converted code as characters/digits by SMS to the pre-selected telephone number.
  • the device of the present invention is useful not only for dyslexic persons, as exemplified above, but is also intended for use by persons who lack the ability to read or recognise text or symbols for other reasons, such as visually impaired persons, persons with various brain damages, illiterates, young children, etc.
  • the present invention further relates to a computer programme product, which when installed on a data storage means in a mobile unit with a computer and a camera, the data storage means being readable by the computer, will configure the mobile unit to comprise the technical features of the mobile unit of the present invention as disclosed above.
  • the computer programme product may be stored on a computer-readable data carrier.
  • the present invention relates furthermore to the method of operating a mobile unit in accordance with the present invention.
  • the method of operating a mobile unit having a computer and a camera comprises the steps of receiving, by means of the computer, a captured image as digital data from the camera, extracting text in the captured image by an optical character recognition (OCR) routine, converting the text from an image format into a text format, comparing the converted text with words in the text database comprised in the mobile unit, and accepting the converted text as resembling the imaged text in case of agreement with words in the database.
  • the method may further comprise steps pertaining to the operation of the technical features included in the mobile unit as disclosed herein with reference to the present invention.
  • FIG. 1 is a flow diagram describing the overall functioning of the mobile unit in a concrete embodiment according to the invention
  • FIG. 2 is an illustration of compensation for blurred images
  • FIG. 3 is an illustration of image rotation
  • FIG. 4 is an illustration of the cleanup effect
  • FIG. 5 is an illustration of correction of curvature
  • FIG. 6 is an illustration of angular compensation.
  • the mobile unit according to the invention may be configured to have several modes.
  • One of the modes is a combined image capture, text conversion and speech mode, in the following called the ATR/TTS mode, where the abbreviations refer to automated text recognition (ATR) and the text-to-speech process (TTS process).
  • FIG. 1 is a flow diagram illustrating the overall functioning of the mobile unit according to the invention in a concrete embodiment of the invention within the ATR/TTS mode.
  • the ATR/TTS process is divided into three major phases:
  • Steps 1A, 1B, 1C, 1D as illustrated in FIG. 1 belong to the Launch Phase.
  • the ATR/TTS process is launched by one of four potential user events: - The user has clicked on the camera release whilst the device is in ATR/TTS mode (step 1A);
  • step 1B: the user has clicked the scalable on-screen release whilst the device is in ATR/TTS mode
  • step 1C: the user has activated the on-screen release using a voice command whilst the device is in ATR/TTS mode
  • step 1D: the user has opened an image from the device store using the image browser.
  • the ATR/TTS process has successfully recognised text within the image, has produced a speech file and has played the speech file (step 18); or
  • the ATR/TTS process has failed to recognise text within the image and has played a pre-recorded error message back to the user (step 19).
  • the user interface used to control the ATR/TTS application is menu-driven.
  • the menu may be based on images and/or sound indications such that each function in the menu has its own sound, for example a voice message reading the name of function.
  • the user interface furthermore, may comprise icon-based menu items that can be activated by clicking on the display or by using a corresponding set of voice-commands entered into the mobile unit by the user through a microphone.
  • the icon-based menu interface is scalable and may be resized to accommodate user preferences.
  • the Recognition Phase comprises steps 2-15 and step 19.
  • Steps 2 → 3: The ATR/TTS application will immediately attempt to recognise text within the image using the OCR module (Optical Character Recognition - step 2). On success (step 3), the ATR/TTS process will resume at the Clean-up (step 16) and Text-to-Speech Phase (step 17). Otherwise (step 3), the ATR/TTS process will continue at step 4 to improve the image.
  • Steps 4 → 5 → 6: Once the initial recognition has failed, the ATR/TTS process will attempt to improve the image quality using a variety of image manipulation techniques and technologies.
  • the image is processed by the simulated auto focus module (step 4) resulting in a clearer image, which also is illustrated in FIG. 2.
  • This software routine may, for instance, use high pass filtering and Fourier transformation in order to make optical edges sharper.
  • if the number of pixels in the image does not fulfil the requirements for handling by the OCR module, the number of pixels is subsequently increased artificially in order to match the requirements of the OCR module (step 5 and step 6).
  • Steps 7 → 8 → 9 → 10: Once the image quality has been improved, the ATR/TTS process will make another attempt to recognise text within the image (step 7). On success (step 8), the ATR/TTS process will resume at the Clean-up (step 16) and Text-to-Speech Phase (step 17). Otherwise (step 8), the ATR/TTS process will continue through a succession of 90° image rotations (step 9), each time attempting to recognise text within the image (step 7), until the image has been rotated by a total of 90°, 180° and 270° (step 10). This is illustrated in more detail in FIG. 3, where the original image is not correctly oriented and the first rotation results in an image that is upside down.
  • Step 11: If the simulated auto focus and the increase in image resolution fail to render an image that can be successfully processed, the ATR/TTS process will attempt to increase the contrast between the text and the background, to the extent of dividing the image into two parts using a scaled binary threshold value: (1) text; and (2) everything else. This is illustrated further in FIG. 4, where spots in the image are removed to make the image clearer and increase the contrast. Furthermore, the ATR/TTS process will attempt to compensate for (a) any optical curving, which is illustrated in FIG. 5, and/or (b) any other optical distortions caused as a result of the image not being captured at an angle of 90°, which is illustrated in FIG. 6.
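  The scaled binary threshold of step 11 can be illustrated on a small grey-level image: here the threshold is scaled to the image's own intensity range (a simplifying assumption, since the exact scaling is not detailed), and each pixel is classified as text or background:

```python
def binarise(image):
    """Divide the image into two parts -- (1) text, (2) everything else --
    using a threshold scaled to the image's own intensity range.
    Dark pixels are classed as text (1), bright pixels as background (0)."""
    pixels = [pixel for row in image for pixel in row]
    threshold = (min(pixels) + max(pixels)) / 2
    return [[1 if pixel < threshold else 0 for pixel in row] for row in image]
```

  Scaling the threshold to the observed range rather than using a fixed value keeps the separation meaningful for both dim and brightly lit captures.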
  • Steps 12 → 13 → 14 → 15: Once the image quality has been improved, the ATR/TTS process will make another attempt to recognise text within the image (step 12). On success (step 13), the ATR/TTS process will resume at the Clean-up and Text-to-Speech Phase. Otherwise (step 13), the ATR/TTS process will continue through a succession of 90° image rotations (step 14), each time attempting to recognise text within the image (step 12), until the image has been rotated by a total of 90°, 180° and 270° (step 15).
  • Step 19 If this does not result in successful recognition, the ATR/TTS process is terminated with an error message (step 19).
  • the Clean-up and Text-To-Speech Phase comprises steps 16-18.
  • Step 16 Once text has been recognised in one of the recognition attempts, the text is passed on for clean-up (step 16).
  • the clean-up task will remove non-printable characters and other characters not part of the current language setting; furthermore, the text will be matched against a database with common words and parts of words, and names of locations to increase the quality of the text.
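  A minimal sketch of this clean-up task, assuming a simple whitelist behaviour (the description does not specify exactly how unmatched tokens are treated): non-printable characters are stripped, and each remaining token is checked against the database of common words:

```python
def clean_up(text, common_words):
    """Remove non-printable characters, then keep tokens that either match
    the database of common words (and word parts) or are plain alphanumeric
    strings such as house numbers."""
    printable = "".join(ch for ch in text if ch.isprintable() or ch.isspace())
    tokens = printable.split()
    kept = [t for t in tokens if t.lower() in common_words or t.isalnum()]
    return " ".join(kept)
```

  The cleaned string is then what would be handed to the text-to-speech engine in step 17.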
  • in step 16, the text is converted from the image format into a text format, which can be used in other applications, for example as shown in step 17, where the cleaned-up text is subsequently passed on to the text-to-speech engine and the resulting synthetic speech is stored in an audio file (step 17).
  • Step 18 Finally, the audio file is played back and control passed back to the user (step 18).
  • the text file may be used in other applications as well, for example for translation into other languages or for interaction with a route planner in order to display locations on a map, to show routes on a map, or by synthetic voice to guide a person around in the environment.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Character Discrimination (AREA)

Abstract

The invention relates to a mobile unit comprising a computer and a camera, the computer being configured to receive captured images as digital data from the camera, to extract text in the captured images by an optical character recognition (OCR) process, and to convert the text from an image format into a text format. The mobile unit further comprises a text database containing text words, and the computer is configured to compare the converted text with words in the text database and to accept the converted text only in case of agreement with words in the database. The converted text may be converted into synthetic speech.
PCT/DK2006/000527 2006-01-17 2006-09-27 Mobile unit comprising a camera and an optical character recognition system, optionally for converting imaged text into comprehensible speech WO2007082534A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US75938706P 2006-01-17 2006-01-17
US60/759,387 2006-01-17

Publications (1)

Publication Number Publication Date
WO2007082534A1 (fr) 2007-07-26

Family

ID=37547713

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/DK2006/000527 WO2007082534A1 (fr) 2006-01-17 2006-09-27 Mobile unit comprising a camera and an optical character recognition system, optionally for converting imaged text into comprehensible speech

Country Status (1)

Country Link
WO (1) WO2007082534A1 (fr)

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010105186A1 (fr) * 2009-03-13 2010-09-16 Qualcomm Incorporated Human assisted techniques for providing local maps and location-specific annotated data
FR2945695A1 (fr) * 2009-05-13 2010-11-19 Raoul Parienti Portable telephone apparatus with a touch screen, particularly adapted for visually impaired users
WO2012168405A3 (fr) * 2011-06-07 2013-01-31 Marcus Regensburger Method for determining acoustic speech signals
US8938211B2 (en) 2008-12-22 2015-01-20 Qualcomm Incorporated Providing and utilizing maps in location determination based on RSSI and RTT data
US9080882B2 (en) 2012-03-02 2015-07-14 Qualcomm Incorporated Visual OCR for positioning
US20150324640A1 (en) * 2009-02-10 2015-11-12 Kofax, Inc. Systems, methods and computer program products for determining document validity
EP2399385A4 (fr) * 2009-02-18 2016-08-24 Google Inc Automatically capturing information, such as capturing information using a document-aware device
US9569701B2 (en) 2015-03-06 2017-02-14 International Business Machines Corporation Interactive text recognition by a head-mounted device
US9747504B2 (en) 2013-11-15 2017-08-29 Kofax, Inc. Systems and methods for generating composite images of long documents using mobile video data
US9760788B2 (en) 2014-10-30 2017-09-12 Kofax, Inc. Mobile document detection and orientation based on reference object characteristics
US9769354B2 (en) 2005-03-24 2017-09-19 Kofax, Inc. Systems and methods of processing scanned data
US9767354B2 (en) 2009-02-10 2017-09-19 Kofax, Inc. Global geographic information retrieval, validation, and normalization
US9779296B1 (en) 2016-04-01 2017-10-03 Kofax, Inc. Content-based detection and three dimensional geometric reconstruction of objects in image and video data
US9819825B2 (en) 2013-05-03 2017-11-14 Kofax, Inc. Systems and methods for detecting and classifying objects in video captured using mobile devices
US9946954B2 (en) 2013-09-27 2018-04-17 Kofax, Inc. Determining distance between an object and a capture device based on captured image data
US9996741B2 (en) 2013-03-13 2018-06-12 Kofax, Inc. Systems and methods for classifying objects in digital images captured using mobile devices
US10146803B2 (en) 2013-04-23 2018-12-04 Kofax, Inc Smart mobile application development platform
US10146795B2 (en) 2012-01-12 2018-12-04 Kofax, Inc. Systems and methods for mobile image capture and processing
US10242285B2 (en) 2015-07-20 2019-03-26 Kofax, Inc. Iterative recognition-guided thresholding and data extraction
US10467465B2 (en) 2015-07-20 2019-11-05 Kofax, Inc. Range and/or polarity-based thresholding for improved data extraction
US10657600B2 (en) 2012-01-12 2020-05-19 Kofax, Inc. Systems and methods for mobile image capture and processing
US10803350B2 (en) 2017-11-30 2020-10-13 Kofax, Inc. Object detection and image cropping using a multi-detector approach
WO2022020167A1 (fr) * 2020-07-22 2022-01-27 Optum, Inc. Systems and methods for automatically correcting the orientation of a document image
RU2784678C1 (ru) * 2021-11-27 2022-11-29 Альберт Владимирович Федотов Children's device for reading text aloud

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20010056342A1 (en) * 2000-02-24 2001-12-27 Piehn Thomas Barry Voice enabled digital camera and language translator
EP0774729B1 (fr) * 1995-11-15 2002-09-11 Hitachi, Ltd. Système de reconnaissance et traduction de caractères
US20020191847A1 (en) * 1998-05-06 2002-12-19 Xerox Corporation Portable text capturing method and device therefor
US20050221856A1 (en) * 2001-12-10 2005-10-06 Takashi Hirano Cellular terminal image processing system, cellular terminal, and server
US20050286493A1 (en) * 2004-06-25 2005-12-29 Anders Angelhag Mobile terminals, methods, and program products that generate communication information based on characters recognized in image data
US20060006235A1 (en) * 2004-04-02 2006-01-12 Kurzweil Raymond C Directed reading mode for portable reading machine

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
LIANG J ET AL: "CAMERA-BASED ANALYSIS OF TEXT AND DOCUMENTS: A SURVEY", INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, SPRINGER, HEIDELBERG, DE, vol. 7, no. 2/3, July 2005 (2005-07-01), pages 84 - 104, XP001233445, ISSN: 1433-2833 *
YING ZHANG ET AL: "AUTOMATIC SIGN TRANSLATION", ICSLP 2002 : 7TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING. DENVER, COLORADO, SEPT. 16 - 20, 2002, INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING. (ICSLP), ADELAIDE : CAUSAL PRODUCTIONS, AU, vol. VOL. 4 OF 4, 16 September 2002 (2002-09-16), pages 645, XP007011704, ISBN: 1-876346-40-X *
ZANDIFAR A ET AL: "A video based interface to textual information for the visually impaired", MULTIMODAL INTERFACES, 2002. PROCEEDINGS. FOURTH IEEE INTERNATIONAL CONFERENCE ON 14-16 OCT. 2002, PISCATAWAY, NJ, USA,IEEE, 14 October 2002 (2002-10-14), pages 325 - 330, XP010624336, ISBN: 0-7695-1834-6 *

Cited By (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9769354B2 (en) 2005-03-24 2017-09-19 Kofax, Inc. Systems and methods of processing scanned data
US8938211B2 (en) 2008-12-22 2015-01-20 Qualcomm Incorporated Providing and utilizing maps in location determination based on RSSI and RTT data
US20150324640A1 (en) * 2009-02-10 2015-11-12 Kofax, Inc. Systems, methods and computer program products for determining document validity
US9767354B2 (en) 2009-02-10 2017-09-19 Kofax, Inc. Global geographic information retrieval, validation, and normalization
US9576272B2 (en) * 2009-02-10 2017-02-21 Kofax, Inc. Systems, methods and computer program products for determining document validity
US10013766B2 (en) 2009-02-18 2018-07-03 Google Inc. Automatically capturing information such as capturing information using a document-aware device
EP2399385A4 (fr) * 2009-02-18 2016-08-24 Google Inc Automatically capturing information, such as capturing information using a document-aware device
US8938355B2 (en) 2009-03-13 2015-01-20 Qualcomm Incorporated Human assisted techniques for providing local maps and location-specific annotated data
CN102341672A (zh) * 2009-03-13 2012-02-01 Qualcomm Incorporated Human-assisted techniques for providing local maps and location-specific annotated data
US20100235091A1 (en) * 2009-03-13 2010-09-16 Qualcomm Incorporated Human assisted techniques for providing local maps and location-specific annotated data
WO2010105186A1 (fr) * 2009-03-13 2010-09-16 Qualcomm Incorporated Human-assisted techniques for providing local maps and location-specific annotated data
FR2945695A1 (fr) * 2009-05-13 2010-11-19 Raoul Parienti Portable telephone apparatus with touchscreen, particularly suited to visually impaired users
WO2012168405A3 (fr) * 2011-06-07 2013-01-31 Marcus Regensburger Method for determining acoustic speech signals
US10146795B2 (en) 2012-01-12 2018-12-04 Kofax, Inc. Systems and methods for mobile image capture and processing
US10664919B2 (en) 2012-01-12 2020-05-26 Kofax, Inc. Systems and methods for mobile image capture and processing
US10657600B2 (en) 2012-01-12 2020-05-19 Kofax, Inc. Systems and methods for mobile image capture and processing
US9080882B2 (en) 2012-03-02 2015-07-14 Qualcomm Incorporated Visual OCR for positioning
US9996741B2 (en) 2013-03-13 2018-06-12 Kofax, Inc. Systems and methods for classifying objects in digital images captured using mobile devices
US10127441B2 (en) 2013-03-13 2018-11-13 Kofax, Inc. Systems and methods for classifying objects in digital images captured using mobile devices
US10146803B2 (en) 2013-04-23 2018-12-04 Kofax, Inc. Smart mobile application development platform
US9819825B2 (en) 2013-05-03 2017-11-14 Kofax, Inc. Systems and methods for detecting and classifying objects in video captured using mobile devices
US9946954B2 (en) 2013-09-27 2018-04-17 Kofax, Inc. Determining distance between an object and a capture device based on captured image data
US9747504B2 (en) 2013-11-15 2017-08-29 Kofax, Inc. Systems and methods for generating composite images of long documents using mobile video data
US9760788B2 (en) 2014-10-30 2017-09-12 Kofax, Inc. Mobile document detection and orientation based on reference object characteristics
US9569701B2 (en) 2015-03-06 2017-02-14 International Business Machines Corporation Interactive text recognition by a head-mounted device
US10242285B2 (en) 2015-07-20 2019-03-26 Kofax, Inc. Iterative recognition-guided thresholding and data extraction
US10467465B2 (en) 2015-07-20 2019-11-05 Kofax, Inc. Range and/or polarity-based thresholding for improved data extraction
US9779296B1 (en) 2016-04-01 2017-10-03 Kofax, Inc. Content-based detection and three dimensional geometric reconstruction of objects in image and video data
US10803350B2 (en) 2017-11-30 2020-10-13 Kofax, Inc. Object detection and image cropping using a multi-detector approach
US11062176B2 (en) 2017-11-30 2021-07-13 Kofax, Inc. Object detection and image cropping using a multi-detector approach
WO2022020167A1 (fr) * 2020-07-22 2022-01-27 Optum, Inc. Systems and methods for automatically correcting document image orientation
US11495014B2 (en) 2020-07-22 2022-11-08 Optum, Inc. Systems and methods for automated document image orientation correction
US11776248B2 (en) 2020-07-22 2023-10-03 Optum, Inc. Systems and methods for automated document image orientation correction
RU2784678C1 (ru) * 2021-11-27 2022-11-29 Альберт Владимирович Федотов Children's device for reading text aloud

Similar Documents

Publication Publication Date Title
WO2007082534A1 (fr) Mobile unit comprising a camera and an optical character recognition system, optionally for conversion of imaged text into comprehensible speech
US9430467B2 (en) Mobile speech-to-speech interpretation system
CN1116770C (zh) Automated hotel attendant using speech recognition
KR100220960B1 (ko) Character recognition translation system and speech recognition translation system
US9298704B2 (en) Language translation of visual and audio input
US8694323B2 (en) In-vehicle apparatus
CA2280331C (fr) Web-based platform for interactive voice response (IVR)
JP4356745B2 (ja) Machine translation system, machine translation method, and program
US7127397B2 (en) Method of training a computer system via human voice input
US20050050165A1 (en) Internet access via smartphone camera
JP2004534268A (ja) System and method for preprocessing information used by an automated attendant
US20040210444A1 (en) System and method for translating languages using portable display device
EP1083545A2 (fr) Speech recognition of proper names in a navigation system
US20120130704A1 (en) Real-time translation method for mobile device
KR20070113665A (ko) Method and apparatus for setting a destination in a navigation terminal
WO2001045088A1 (fr) Electronic translator for facilitating communication
WO2005066882A1 (fr) Character recognition device, mobile communication system, mobile terminal device, fixed station device, character recognition method, and character recognition program
EP1979858A1 (fr) Mobile unit with a camera, having optical character recognition, optionally for conversion of imaged text into comprehensible speech
US20090081630A1 (en) Text to Training Aid Conversion System and Service
JP2009187349A (ja) Text correction support system, text correction support method, and text correction support program
JP2012168349A (ja) Speech recognition system and search system using the same
US20060129398A1 (en) Method and system for obtaining personal aliases through voice recognition
KR102300589B1 (ko) Sign language interpretation system
CN1217808A (zh) Automatic speech recognition
CN111078992B (zh) Dictation content generation method and electronic device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase
Ref country code: DE
122 Ep: pct application non-entry in european phase
Ref document number: 06776004
Country of ref document: EP
Kind code of ref document: A1