US20160188990A1 - Method and system for recognizing characters - Google Patents
- Publication number
- US20160188990A1 (U.S. application Ser. No. 14/636,929)
- Authority
- US
- United States
- Prior art keywords
- character
- input image
- characters
- graphical representation
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/18—Extraction of features or characteristics of the image
- G06V30/18162—Extraction of features or characteristics of the image related to a structural representation of the pattern
- G06V30/18181—Graphical representation, e.g. directed attributed graph
- G06V30/19—Recognition using electronic means
- G06V30/19007—Matching; Proximity measures
- G06V30/19013—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
- G06V30/1902—Shifting or otherwise transforming the patterns to accommodate for positional errors
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/62—Text, e.g. of license plates, overlay texts or captions on TV images
- G06V20/625—License plates
- G06K9/18
- G06K9/4604
- G06K9/6201
- G06T7/0083
Definitions
- the present subject matter relates, in general, to a recognition method and a recognition system, and more particularly, but not exclusively, to a method and system for recognizing characters.
- OCR optical character recognition
- the characters being recognized are optically scanned and binarized into a form suitable for analysis, and then transformed into an electronic form for further processing.
- Various feature recognition techniques are known for examining and recognizing characters based on predetermined patterns stored in the memory of an electronic device. However, these known techniques depend heavily on standardized fonts or approximations thereof. Further, there exist problems in identifying characters due to magnification, reduction, or rotation; varying lighting and non-uniform illumination conditions during image acquisition; resolution limitations; perspective distortion; arbitrary orientation; and poor character quality.
- the present disclosure relates to a recognition method.
- the method comprises receiving an input image comprising one or more characters from an image sensor. The received input image is then processed to extract one or more nodes and edges of each character in the input image, and a graphical representation of each character is generated based on the one or more edges. The method further comprises comparing the graphical representation of each character with a predetermined graphical representation of each reference character stored in a reference repository. Based on the comparison, the characters in the input image are recognized.
- the present disclosure relates to a recognition system comprising an image sensor and a processor coupled with the image sensor.
- the system further comprises a memory communicatively coupled to the processor, wherein the memory stores processor-executable instructions, which, on execution, cause the processor to receive an input image comprising one or more characters from the image sensor.
- the processor is configured to extract one or more nodes and edges of each character from the input image and generate a graphical representation of each character based on the one or more edges.
- the system further comprises a comparison unit coupled with the processor and configured to compare the graphical representation of each character with the predetermined graphical representation of each reference character stored in a reference repository.
- the system comprises a recognition unit coupled with the comparison unit and configured to recognize the reference character as one of the characters in the input image based on the comparing.
- the present disclosure relates to a non-transitory computer readable medium including instructions stored thereon that when processed by at least one processor cause a system to perform the act of receiving an image comprising one or more characters. Further, the instructions cause the processor to perform the acts of extracting one or more nodes and edges of each character from the input image and generating a graphical representation of each character based on the one or more edges. Furthermore, the instructions cause the processor to perform the acts of comparing the generated graphical representation of each character with the predetermined graphical representation of each reference character stored in a reference repository and recognizing the reference character as one of the characters in the input image based on the comparing.
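The receive → extract → compare → recognize flow summarized above can be sketched in a few lines. This is a minimal illustration only: the graph encoding (edges as unordered pairs of labelled nodes) and every helper name below are assumptions for exposition, not the patent's actual data structures.

```python
# Illustrative sketch of the recognition flow: extract a graph from a
# character, then compare it against each reference character's graph.

def extract_graph(character_strokes):
    """Extract nodes and edges from a pre-segmented character.

    Here a 'character' is already given as a list of strokes, each a pair
    of node labels; a real system would derive these from pixel data.
    """
    nodes = sorted({n for edge in character_strokes for n in edge})
    edges = frozenset(frozenset(e) for e in character_strokes)
    return nodes, edges

def recognize(character_strokes, reference_repository):
    """Compare the character's graph against each reference graph."""
    _, edges = extract_graph(character_strokes)
    for label, ref_edges in reference_repository.items():
        if edges == ref_edges:
            return label
    return None  # unmatched; would fall through to probability matching

# Toy reference repository: 'T' as a horizontal bar meeting a vertical stem.
reference_repository = {
    "T": frozenset({frozenset({"a", "b"}), frozenset({"b", "c"}),
                    frozenset({"b", "d"})}),
}

result = recognize([("a", "b"), ("b", "c"), ("b", "d")], reference_repository)
```

An unmatched character returns `None` here; in the disclosure it would instead be handed to the matching-probability validation step.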
- FIG. 1 a illustrates a block diagram of a recognition system for recognizing characters in accordance with some embodiments of the present disclosure;
- FIGS. 1 b -1 k illustrate graphical representations and adjacency matrices of exemplary characters in accordance with some embodiments of the present disclosure;
- FIG. 2 illustrates a flowchart of a method of recognizing characters in accordance with some embodiments of the present disclosure; and
- FIG. 3 is a block diagram of an exemplary computer system for implementing embodiments consistent with the present disclosure.
- The term “exemplary” is used herein to mean “serving as an example, instance, or illustration.” Any embodiment or implementation of the present subject matter described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other embodiments.
- the present disclosure relates to a method and a system for recognizing characters.
- the input image comprising one or more characters to be recognized is received and processed to extract one or more nodes and edges of each character in the input image.
- a graphical representation and adjacency matrix of each character is generated and compared with a predetermined graphical representation and adjacency matrix to determine a match.
- a matching probability is determined based on which one or more characters in the input image is recognized and displayed as output.
- the proposed recognition method and system recognizes characters with greater accuracy and speed. Further, the present disclosure is simple, cost-effective and reduces the complexity involved in automatic recognition of characters.
- FIG. 1 a illustrates a block diagram of the recognition system 100 for recognizing characters automatically in accordance with some embodiments of the present disclosure.
- the recognition system 100 is configured to recognize characters printed, handwritten or captured by an image capturing device or image sensor.
- the recognition system 100 may be used to automatically recognize characters on the number plates of automobiles or other vehicles in motion.
- the exemplary recognition system 100 includes a central processing unit (“CPU” or “processor”) 102 , a memory 104 and an I/O interface 106 .
- the I/O interface 106 is coupled with the processor 102 and an I/O device.
- the I/O device is configured to receive inputs via the I/O interface 106 and transmit outputs for displaying in the I/O device via the I/O interface 106 .
- the recognition system 100 further comprises an image sensor 108 coupled with the processor 102 and configured to capture an input image comprising one or more characters for recognition.
- the recognition system 100 also comprises data 110 and modules 112 .
- the data 110 and the modules 112 may be stored within the memory 104 .
- the data 110 may include input image 114 , reference repository 116 , image of skeletonized character 118 , matching probability data 120 , adjacency matrix 122 , recognized characters 124 and other data 126 .
- the data 110 may be stored in the memory 104 in the form of various data structures. Additionally, the aforementioned data can be organized using data models, such as relational or hierarchical data models.
- the other data 126 may be used to store data, including temporary data and temporary files, generated by the modules 112 for performing the various functions of the recognition system 100 .
- the modules 112 may include, for example, a pre-processing and feature extracting unit (hereinafter referred to as pre-processing unit) 128 , a comparison unit 130 , and a character validation unit 132 .
- the modules 112 may also comprise other modules 134 to perform various miscellaneous functionalities of the recognition system 100 . It will be appreciated that such aforementioned modules may be represented as a single module or a combination of different modules.
- the image sensor 108 captures the input image 114 comprising one or more characters in a capture frame like a rectangular mask.
- the image sensor 108 may be for example, a still/video camera, a camera in a mobile device or any other known image capturing devices.
- the input image 114 comprises one or more frames captured over a predetermined time, for example ‘n’ frames per second. Each frame of the input image 114 is hereinafter referred to as the input image 114 .
- the input image 114 may be an image with resolution limitations, background images that cross the text or characters or create graphic noise, changing light intensity, or perspective distortion due to capture from an arbitrary viewpoint.
- the image may have characters with different fonts, sizes, shapes, styles and thicknesses, shadows, embossed text, geometric distortions, and resolution limitations, or may be of poor quality.
- the input image 114 is pre-processed by the pre-processing unit 128 to remove image resolution limitations and to extract one or more predetermined features used for recognition of characters in the input image 114 .
- the pre-processing unit 128 pre-processes the input image 114 by removing background or graphical noise from the input image 114 using any known filters and converting the filtered input image 114 in RGB format into a corresponding grayscale format.
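The RGB-to-grayscale step above can be sketched as follows. The patent does not specify a conversion formula, so the ITU-R BT.601 luminance weights (0.299 R + 0.587 G + 0.114 B) used here are an assumption; any standard luminance conversion would serve.

```python
# Minimal sketch of the grayscale-conversion step of the pre-processing
# unit, on an image represented as nested lists of (r, g, b) tuples.

def rgb_to_grayscale(image):
    """Convert an RGB image to grayscale using BT.601 luminance weights."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in image
    ]

rgb = [[(255, 255, 255), (255, 0, 0)],
       [(0, 0, 0), (0, 0, 255)]]
gray = rgb_to_grayscale(rgb)  # white -> 255, black -> 0
```

Noise filtering (the "known filters" the disclosure mentions) would typically precede or follow this step, e.g. with a median filter.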
- the pre-processing unit 128 detects the boundary of the number plate in the input image 114 by using any known boundary detection techniques to locate the images of the characters.
- the number plates may be generally rectangular in shape with dark characters on a light/bright background, for example black fonts on a white background, and may have characters of the same or different sizes.
- the number plates may also have a fixed ratio of width to height relationship.
- the pre-processing unit 128 analyses the input image 114 , determines the extent of intensity variation of each row, and selects the adjacent rows that exhibit the maximum variation as the rows containing the characters of the number plate.
- the pre-processing unit 128 further determines the borders of the input image 114 by detecting the edges of the input image 114 using any edge detection technique.
- the pre-processing unit 128 determines the horizontal and vertical edges, using the Hough transformation technique for example, and extracts the number plate based on the determined edges. Further, the pre-processing unit 128 processes the extracted number-plate image for skew/slant correction using known skew/slant correction techniques. Upon skew/slant correction of the input image 114 , the pre-processing unit 128 detects the orientation of the input image 114 and resizes the extracted number-plate image based on the detected orientation. In one embodiment, the pre-processing unit 128 detects the orientation of the number-plate image and aligns the input image 114 using a bounding box. Upon correcting the orientation, the dimensions of the input image 114 are altered by the pre-processing unit 128 to a predetermined size.
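The bounding-box alignment mentioned above can be illustrated in its simplest form: find the tight bounding box of the foreground pixels in a binarized plate image and crop to it. Rotation and skew correction themselves are omitted; this sketch only shows how a bounding box locates and re-frames the region of interest.

```python
# Sketch of bounding-box cropping on a binary image given as nested lists
# of 0/1 values, where 1 marks a foreground (character) pixel.

def bounding_box_crop(binary):
    """Crop a binary image to the bounding box of its foreground pixels."""
    rows = [i for i, row in enumerate(binary) if any(row)]
    cols = [j for j in range(len(binary[0])) if any(row[j] for row in binary)]
    if not rows:
        return []  # blank image: nothing to crop
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in binary[top:bottom + 1]]

plate = [[0, 0, 0, 0],
         [0, 1, 1, 0],
         [0, 1, 0, 0],
         [0, 0, 0, 0]]
cropped = bounding_box_crop(plate)
```

Resizing the crop to the predetermined dimensions would then follow, as described in the next paragraphs.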
- the pre-processing unit 128 is further configured to locate the characters of interest, for example the one or more characters in the number plate, and extract the corresponding images. In one embodiment, the pre-processing unit 128 identifies the location of the number plate and extracts an image of each alphanumeric character from the identified location. The pre-processing unit 128 extracts the image of each character by segmenting the input image 114 using any known segmentation technique. In one embodiment, the pre-processing unit 128 converts the grayscale input image 114 into a corresponding binary image using a predetermined adaptive threshold value.
- Upon conversion, the pre-processing unit 128 extracts one or more characters from the binarized input image 114 by computing horizontal projections on the binary image to obtain horizontal image segments, and then computing vertical projections on the horizontal image segments to obtain the one or more image segments of the one or more characters of the input image 114 . Upon segmentation, the pre-processing unit 128 generates skeletonized image segments of each character to extract one or more features representing a general form of the characters, as shown in FIG. 1 b.
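The projection-based segmentation described above can be sketched for the vertical direction: sum each column of a binarized text line and split at empty columns to isolate each character. The horizontal projection that isolates the text line itself works the same way on row sums. This is an illustration of the general technique, not the patent's exact procedure.

```python
# Sketch of vertical-projection segmentation on a binary text line given
# as nested lists of 0/1 values (1 = foreground pixel).

def segment_columns(binary):
    """Return (start, end) column ranges of non-empty runs, end exclusive."""
    width = len(binary[0])
    projection = [sum(row[j] for row in binary) for j in range(width)]
    segments, start = [], None
    for j, count in enumerate(projection):
        if count and start is None:
            start = j                     # entering a character
        elif not count and start is not None:
            segments.append((start, j))   # leaving a character
            start = None
    if start is not None:
        segments.append((start, width))   # character touches the right edge
    return segments

line = [[1, 1, 0, 1],
        [1, 0, 0, 1]]
segments = segment_columns(line)  # two character segments
```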
- the pre-processing unit 128 normalizes the dimensions of the character contained in the image segments, using techniques such as nearest-neighbor or weighted-average down-sampling, to generate one or more normalized characters.
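Nearest-neighbor normalization, one of the two techniques named above, can be sketched as follows; the 2×2 target size here is arbitrary, chosen only to keep the example small.

```python
# Sketch of nearest-neighbour resampling so that every segmented character
# is compared at the same fixed dimensions.

def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize of a 2-D nested list to out_h x out_w."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[i * in_h // out_h][j * in_w // out_w] for j in range(out_w)]
        for i in range(out_h)
    ]

char = [[1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 1, 1],
        [0, 0, 1, 1]]
normalized = resize_nearest(char, 2, 2)
```

Weighted-average down-sampling would instead average each source block, trading sharp strokes for smoother values.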
- the pre-processing unit 128 further extracts appropriate descriptors from the one or more normalized characters, performs structural analysis on the extracted descriptors, and extracts region-based shape features representing the general form of the character.
- Upon obtaining each skeletonized character, the pre-processing unit 128 generates a graphical representation and an adjacency matrix of each skeletonized character 118 .
- the pre-processing unit 128 converts each skeletonized character 118 into a graphical representation comprising one or more nodes and edges using a bounding box and edge detection techniques known in the art. Upon conversion, the pre-processing unit 128 extracts the one or more nodes and the edges using a bounding box in clockwise direction for example as illustrated in FIG. 1 c .
- A wave graph of each character is then generated using the extracted edges, and the corresponding graph ending position angle is determined.
- the pre-processing unit 128 generates a sine wave graph of each character using each extracted edge only once and determines the sine wave ending position angle for each character.
- the pre-processing unit 128 plots each character onto the sine wave graph by plotting the extracted edges of each character on the sine wave graph and determining the sine wave ending position angle of the plotted edges.
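The patent does not precisely define the sine wave graph or its ending position angle. One plausible reading, sketched below, is that each extracted edge is plotted exactly once as a fixed phase step along a sine curve, with the "sine wave ending position angle" being the phase reached after the last edge, modulo 360 degrees. Both the 90-degree step per edge and the toy edge list are assumptions for illustration only, not the disclosed construction.

```python
import math

STEP_DEGREES = 90.0  # assumed phase step contributed by each plotted edge

def ending_position_angle(edges):
    """Phase angle (degrees) on the sine curve after plotting each edge once."""
    return (len(edges) * STEP_DEGREES) % 360.0

def ending_height(edges):
    """Height of the sine curve at the ending position angle."""
    return math.sin(math.radians(ending_position_angle(edges)))

# Hypothetical edge list for a character with a 3-edge graph.
edges_of_A = [("n1", "n2"), ("n2", "n3"), ("n2", "n4")]
angle = ending_position_angle(edges_of_A)
```

Under this reading, two characters with different edge counts end at different phases, giving a cheap first-pass discriminator before the adjacency-matrix comparison.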
- the pre-processing unit 128 determines adjacency matrix 122 for each character based on the nodes and edges extracted in the clockwise direction.
- a sample illustration of the sine wave graph of the letter ‘A’ and the digit ‘4’ is shown in FIGS. 1 b -1 k.
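The adjacency-matrix determination described above can be sketched directly: index the nodes in the order they were extracted (clockwise around the bounding box, per the disclosure) and mark each extracted edge. The node labels and the toy character below are assumptions for illustration.

```python
# Sketch of building the adjacency matrix 122 from extracted nodes and edges.

def adjacency_matrix(nodes, edges):
    """Binary adjacency matrix with rows/columns in extraction (clockwise) order."""
    index = {node: k for k, node in enumerate(nodes)}
    size = len(nodes)
    matrix = [[0] * size for _ in range(size)]
    for a, b in edges:
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1  # character strokes are undirected
    return matrix

# Nodes listed clockwise from the top-left of the bounding box.
nodes = ["top", "right", "bottom", "left"]
edges = [("top", "right"), ("right", "bottom"), ("left", "top")]
matrix = adjacency_matrix(nodes, edges)
```

Fixing the node order to the clockwise extraction order is what makes two matrices directly comparable entry by entry.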
- the recognition system 100 recognizes one or more characters of the input image 114 by comparing them with previously stored reference characters in the reference repository 116 .
- the reference repository is predetermined and stored in the memory 104 .
- the pre-processing unit 128 pre-processes images of one or more characters including for example, digits, capital alphabet letters, small alphabet letters and special characters by removing noise and converts the image in RGB format into its corresponding grayscale format.
- the images are then normalized, re-sampled to create images of equal dimensions, and skeletonized to extract region-based shape feature representing the general form of the characters. Further, the location of the characters in the input image is detected and identified.
- the images are then converted into graphical representation comprising one or more nodes and edges based on which graph wave ending position angle and adjacency matrix are determined and stored for each reference characters.
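The resulting reference repository can be sketched as a simple keyed store: for each reference character, the precomputed ending position angle and adjacency matrix are saved for later comparison. The entries below are toy values for illustration, not real measurements.

```python
# Sketch of the reference repository 116: precomputed features per character.

def build_reference_repository(entries):
    """entries: iterable of (character, ending_angle, adjacency_matrix)."""
    return {char: {"angle": angle, "matrix": matrix}
            for char, angle, matrix in entries}

repository = build_reference_repository([
    ("A", 270.0, [[0, 1], [1, 0]]),   # toy feature values
    ("4", 180.0, [[0, 1], [1, 1]]),
])
```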
- the recognition system 100 recognizes characters in the input image 114 of the number plate based on the graphical representation and the adjacency matrix 122 of each character in the input image 114 .
- the recognition system 100 recognizes characters in the input image 114 by comparing each character in the input image 114 with each of the reference character in the reference repository 116 .
- the comparison unit 130 compares the sine wave ending position angle of each skeletonized character 118 in the ‘n’ frames of the input image 114 with the sine wave ending position angle of one or more reference characters in the reference repository 116 , and further compares the adjacency matrix 122 of the character in the ‘n’ frames of the input image 114 with the adjacency matrix of the reference character.
- If all the characters of the ‘n’ frames of the input image 114 match the reference characters, then the reference characters are identified and recognized as the characters in the input image 114 . If at least one character of the ‘n’ frames of the input image 114 does not match the reference characters, then the unmatched character is recognized based on the matching probability data 120 of the unmatched character.
- the character validation unit 132 determines the matching probability data 120 of the at least one unmatched character and recognizes the at least one unmatched character of the input image 114 based on the matching probability data 120 . If the validation unit 132 determines that the matching probability data 120 exceeds a predetermined matching probability threshold stored in the other data 126 , then the validation unit 132 selects the reference character as the recognized character in the input image 114 and displays the selected character as output. If the validation unit 132 determines that the matching probability data 120 of the unmatched character does not exceed the predetermined matching probability threshold, then the validation unit 132 discards the matching probability data 120 of the unmatched character and the entire process is repeated.
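The validation step above can be sketched as follows. The patent does not define the matching probability measure, so here it is taken as the fraction of adjacency-matrix entries that agree between the unmatched input character and a reference character; that choice, and the 0.8 threshold, are assumptions for illustration.

```python
# Sketch of the character validation unit's threshold test.

THRESHOLD = 0.8  # assumed predetermined matching probability threshold

def matching_probability(matrix_a, matrix_b):
    """Fraction of entries on which two equal-sized adjacency matrices agree."""
    cells = [(a, b) for row_a, row_b in zip(matrix_a, matrix_b)
             for a, b in zip(row_a, row_b)]
    return sum(a == b for a, b in cells) / len(cells)

def validate(input_matrix, reference_matrix):
    """Accept the reference character only above the probability threshold."""
    probability = matching_probability(input_matrix, reference_matrix)
    return probability > THRESHOLD, probability

ok, p = validate([[0, 1, 0], [1, 0, 1], [0, 1, 0]],
                 [[0, 1, 0], [1, 0, 1], [0, 0, 0]])
```

A rejected candidate (probability at or below the threshold) corresponds to the "discard and repeat" branch of the disclosure.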
- FIG. 2 illustrates a flowchart of a method of recognizing characters in accordance with some embodiments of the present disclosure.
- the method 200 comprises one or more blocks implemented by the processor 102 for recognizing characters.
- the method 200 may be described in the general context of computer executable instructions.
- computer executable instructions can include routines, programs, objects, components, data structures, procedures, modules, and functions, which perform particular functions or implement particular abstract data types.
- the order in which the method 200 is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method 200 . Additionally, individual blocks may be deleted from the method 200 without departing from the spirit and scope of the subject matter described herein. Furthermore, the method 200 can be implemented in any suitable hardware, software, firmware, or combination thereof.
- the image sensor 108 captures the input image 114 comprising one or more characters in a capture frame like a rectangular mask.
- the image sensor 108 may be for example, a still/video camera, a camera in a mobile device or any other known image capturing devices.
- the input image 114 may be an image with resolution limitations, background images that cross the text or characters or create graphic noise, changing light intensity, or perspective distortion due to capture from an arbitrary viewpoint.
- the image may have characters with different fonts, sizes, shapes, styles and thicknesses, shadows, embossed text, geometric distortions, and resolution limitations, or may be of poor quality.
- the input image 114 is pre-processed by the pre-processing unit 128 to remove image resolution limitations and to extract one or more predetermined features used for recognition of characters in the input image 114 .
- the pre-processing unit 128 pre-processes the input image 114 by removing background or graphical noise from the input image 114 using any known filters and converting the filtered input image 114 in RGB format into a corresponding grayscale format.
- the pre-processing unit 128 detects the boundary of the number plate in the input image 114 by using any known boundary detection techniques to locate the images of the characters.
- the number plates may be generally rectangular in shape with dark characters on a light/bright background, for example black fonts on white background and may have characters of same or different size.
- the number plates may also have a fixed ratio of width to height relationship.
- the pre-processing unit 128 analyses the input image 114 , determines the extent of intensity variation of each row and selects the adjacent rows that exhibit the maximum variation to contain the characters of the number plate.
- the pre-processing unit 128 further determines the borders of the input image 114 by detecting the edges of the input image 114 using any edge detection technique.
- the pre-processing unit 128 determines the horizontal and vertical edges, using the Hough transformation technique for example, and extracts the number plate based on the determined edges. Further, the pre-processing unit 128 processes the extracted number-plate image for skew/slant correction using known skew/slant correction techniques. Upon skew/slant correction of the input image 114 , the pre-processing unit 128 detects the orientation of the input image 114 and resizes the extracted number-plate image based on the detected orientation. In one embodiment, the pre-processing unit 128 detects the orientation of the number-plate image and aligns the input image 114 using a bounding box. Upon correcting the orientation, the dimensions of the input image 114 are altered by the pre-processing unit 128 to a predetermined size.
- the pre-processing unit 128 is further configured to locate the characters of interest, for example the one or more characters in the number plate, and extract the corresponding images. In one embodiment, the pre-processing unit 128 identifies the location of the number plate and extracts an image of each alphanumeric character from the identified location. The pre-processing unit 128 extracts the image of each character by segmenting the input image 114 using any known segmentation technique. In one embodiment, the pre-processing unit 128 converts the grayscale input image 114 into a corresponding binary image using a predetermined adaptive threshold value.
- Upon conversion, the pre-processing unit 128 extracts one or more characters from the binarized input image 114 by computing horizontal projections on the binary image to obtain horizontal image segments, and then computing vertical projections on the horizontal image segments to obtain the one or more image segments of the one or more characters of the input image 114 . Upon segmentation, the pre-processing unit 128 generates skeletonized image segments of each character to extract one or more features representing a general form of the characters, as shown in FIG. 1 c.
- the pre-processing unit 128 normalizes the dimensions of the character contained in the image segments, using techniques such as nearest-neighbor or weighted-average down-sampling, to generate one or more normalized characters.
- the pre-processing unit 128 further extracts appropriate descriptors from the one or more normalized characters, performs structural analysis on the extracted descriptors, and extracts region-based shape features representing the general form of the character.
- Upon obtaining each skeletonized character, the pre-processing unit 128 generates a graphical representation and an adjacency matrix of each skeletonized character 118 . In one embodiment, the pre-processing unit 128 converts each skeletonized character 118 into a graphical representation comprising one or more nodes and edges using a bounding box and edge detection techniques known in the art. Upon conversion, the pre-processing unit 128 extracts the one or more nodes and the edges using a bounding box in the clockwise direction, for example as illustrated in FIG. 1 d .
- A wave graph of each character is then generated using the extracted edges, and the corresponding graph ending position angle is determined.
- the pre-processing unit 128 generates a sine wave graph of each character using each extracted edge only once and determines the sine wave ending position angle for each character.
- the pre-processing unit 128 determines adjacency matrix 122 for each character based on the nodes and edges extracted in the clockwise direction. A sample illustration of sine wave graph of letter ‘A’ and digit ‘4’ is shown in FIGS. 1 b - 1 k.
- the recognition system 100 recognizes characters in the input image 114 of the number plate based on the graphical representation and the adjacency matrix 122 of each character in the input image 114 . In one embodiment, the recognition system 100 recognizes characters in the input image 114 by comparing each character in the input image 114 with each of the reference character in the reference repository 116 .
- the comparison unit 130 compares the sine wave ending position angle of each skeletonized character 118 in the ‘n’ frames of the input image 114 with the sine wave ending position angle of one or more reference characters in the reference repository 116 , and further compares the adjacency matrix 122 of the character in the ‘n’ frames of the input image 114 with the adjacency matrix of the reference character. If all the characters of the ‘n’ frames of the input image 114 match the reference characters, then the reference characters are identified and recognized as characters in the input image 114 . If at least one character of the ‘n’ frames of the input image 114 does not match the reference characters, then the unmatched character is recognized based on the matching probability data 120 of the unmatched character.
- the character validation unit 132 determines the matching probability data 120 of the at least one unmatched character and recognizes the at least one unmatched character of the input image 114 based on the matching probability data 120 . If the validation unit 132 determines that the matching probability data 120 exceeds a predetermined matching probability threshold data stored in the other data 126 , then the validation unit 132 selects the reference character as recognized character in the input image 114 and displays the selected character as output. If the validation unit 132 determines that the matching probability data 120 of the unmatched character does not exceed the predetermined matching probability threshold data, then the validation unit 132 discards the matching probability data 120 of the unmatched character and repeats the entire process.
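Since the method compares characters across the ‘n’ captured frames, one simple way to combine per-frame recognitions, sketched here, is a majority vote over each character position. This voting rule is an illustration only: the disclosure instead re-runs the matching when the probability threshold is not met, and the plate strings below are hypothetical.

```python
from collections import Counter

# Sketch of aggregating per-frame recognition results by majority vote.

def vote_per_position(frame_results):
    """frame_results: list of per-frame strings, one character per position."""
    plate = []
    for position in zip(*frame_results):
        best, _ = Counter(position).most_common(1)[0]  # most frequent read
        plate.append(best)
    return "".join(plate)

frames = ["KA01AB", "KA01AB", "KA0IAB"]  # one frame misreads '1' as 'I'
plate = vote_per_position(frames)
```

Voting across frames suppresses transient per-frame errors such as motion blur on a single capture.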
- FIG. 3 is a block diagram of an exemplary computer system for implementing embodiments consistent with the present disclosure.
- Computer system 301 may be used for implementing all the computing systems that may be utilized to implement the features of the present disclosure.
- Computer system 301 may comprise a central processing unit (“CPU” or “processor”) 302 .
- Processor 302 may comprise at least one data processor for executing program components for executing user- or system-generated requests.
- the processor may include specialized processing units such as integrated system (bus) controllers, memory management control units, floating point units, graphics processing units, digital signal processing units, etc.
- the processor 302 may include a microprocessor, such as AMD Athlon, Duron or Opteron, ARM's application, embedded or secure processors, IBM PowerPC, Intel's Core, Itanium, Xeon, Celeron or other line of processors, etc.
- the processor 302 may be implemented using mainframe, distributed processor, multi-core, parallel, grid, or other architectures. Some embodiments may utilize embedded technologies like application-specific integrated circuits (ASICs), digital signal processors (DSPs), Field Programmable Gate Arrays (FPGAs), etc.
- Processor 302 may be disposed in communication with one or more input/output (I/O) devices via I/O interface 303 .
- the I/O interface 303 may employ communication protocols/methods such as, without limitation, audio, analog, digital, monoaural, RCA, stereo, IEEE-1394, serial bus, universal serial bus (USB), infrared, PS/2, BNC, coaxial, component, composite, digital visual interface (DVI), high-definition multimedia interface (HDMI), RF antennas, S-Video, VGA, IEEE 802.11a/b/g/n/x, Bluetooth, cellular (e.g., code-division multiple access (CDMA), high-speed packet access (HSPA+), global system for mobile communications (GSM), long-term evolution (LTE), WiMax, or the like), etc.
- the computer system 301 may communicate with one or more I/O devices.
- the input device 304 may be an antenna, keyboard, mouse, joystick, (infrared) remote control, camera, card reader, fax machine, dongle, biometric reader, microphone, touch screen, touchpad, trackball, sensor (e.g., accelerometer, light sensor, GPS, gyroscope, proximity sensor, or the like), stylus, scanner, storage device, transceiver, video device/source, visors, etc.
- Output device 305 may be a printer, fax machine, video display (e.g., cathode ray tube (CRT), liquid crystal display (LCD), light-emitting diode (LED), plasma, or the like), audio speaker, etc.
- a transceiver 306 may be disposed in connection with the processor 302 . The transceiver may facilitate various types of wireless transmission or reception.
- the transceiver may include an antenna operatively connected to a transceiver chip (e.g., Texas Instruments WiLink WL1283, Broadcom BCM4750IUB8, Infineon Technologies X-Gold 618-PMB9800, or the like), providing IEEE 802.11a/b/g/n, Bluetooth, FM, global positioning system (GPS), 2G/3G HSDPA/HSUPA communications, etc.
- the processor 302 may be disposed in communication with a communication network 308 via a network interface 307 .
- the network interface 307 may communicate with the communication network 308 .
- the network interface 307 may employ connection protocols including, without limitation, direct connect, Ethernet (e.g., twisted pair 10/100/1000 Base T), transmission control protocol/internet protocol (TCP/IP), token ring, IEEE 802.11a/b/g/n/x, etc.
- the communication network 308 may include, without limitation, a direct interconnection, local area network (LAN), wide area network (WAN), wireless network (e.g., using Wireless Application Protocol), the Internet, etc.
- the computer system 301 may communicate with devices 309 , 310 , and 311 .
- These devices may include, without limitation, personal computer(s), server(s), fax machines, printers, scanners, various mobile devices such as cellular telephones, smartphones (e.g., Apple iPhone, Blackberry, Android-based phones, etc.), tablet computers, eBook readers (Amazon Kindle, Nook, etc.), laptop computers, notebooks, gaming consoles (Microsoft Xbox, Nintendo DS, Sony PlayStation, etc.), or the like.
- the computer system 301 may itself embody one or more of these devices.
- the processor 302 may be disposed in communication with one or more memory devices (e.g., RAM 313 , ROM 314 , etc.) via a storage interface 312 .
- the storage interface may connect to memory devices including, without limitation, memory drives, removable disc drives, etc., employing connection protocols such as serial advanced technology attachment (SATA), integrated drive electronics (IDE), IEEE-1394, universal serial bus (USB), fiber channel, small computer systems interface (SCSI), etc.
- the memory drives may further include a drum, magnetic disc drive, magneto-optical drive, optical drive, redundant array of independent discs (RAID), solid-state memory devices, solid-state drives, etc.
- the memory 315 may store a collection of program or database components, including, without limitation, an operating system 316 , user interface application 317 , web browser 318 , mail server 319 , mail client 320 , user/application data 321 (e.g., any data variables or data records discussed in this disclosure), etc.
- the operating system 316 may facilitate resource management and operation of the computer system 301 .
- Operating systems include, without limitation, Apple Macintosh OS X, UNIX, Unix-like system distributions (e.g., Berkeley Software Distribution (BSD), FreeBSD, NetBSD, OpenBSD, etc.), Linux distributions (e.g., Red Hat, Ubuntu, Kubuntu, etc.), IBM OS/2, Microsoft Windows (XP, Vista/7/8, etc.), Apple iOS, Google Android, Blackberry OS, or the like.
- User interface 317 may facilitate display, execution, interaction, manipulation, or operation of program components through textual or graphical facilities.
- user interfaces may provide computer interaction interface elements on a display system operatively connected to the computer system 301 , such as cursors, icons, check boxes, menus, scrollers, windows, widgets, etc.
- Graphical user interfaces (GUIs) may be employed, including, without limitation, Apple Macintosh operating systems' Aqua, IBM OS/2, Microsoft Windows (e.g., Aero, Metro, etc.), Unix X-Windows, web interface libraries (e.g., ActiveX, Java, Javascript, AJAX, HTML, Adobe Flash, etc.), or the like.
- the computer system 301 may implement a web browser 318 stored program component.
- the web browser may be a hypertext viewing application, such as Microsoft Internet Explorer, Google Chrome, Mozilla Firefox, Apple Safari, etc. Secure web browsing may be provided using HTTPS (secure hypertext transport protocol), secure sockets layer (SSL), Transport Layer Security (TLS), etc. Web browsers may utilize facilities such as AJAX, DHTML, Adobe Flash, JavaScript, Java, application programming interfaces (APIs), etc.
- the computer system 301 may implement a mail server 319 stored program component.
- the mail server may be an Internet mail server such as Microsoft Exchange, or the like.
- the mail server may utilize facilities such as ASP, ActiveX, ANSI C++/C#, Microsoft .NET, CGI scripts, Java, JavaScript, PERL, PHP, Python, WebObjects, etc.
- the mail server may utilize communication protocols such as internet message access protocol (IMAP), messaging application programming interface (MAPI), Microsoft Exchange, post office protocol (POP), simple mail transfer protocol (SMTP), or the like.
- the computer system 301 may implement a mail client 320 stored program component.
- the mail client may be a mail viewing application, such as Apple Mail, Microsoft Entourage, Microsoft Outlook, Mozilla Thunderbird, etc.
- computer system 301 may store user/application data 321 , such as the data, variables, records, etc. as described in this disclosure.
- databases may be implemented as fault-tolerant, relational, scalable, secure databases such as Oracle or Sybase.
- databases may be implemented using standardized data structures, such as an array, hash, linked list, struct, structured text file (e.g., XML), table, or as object-oriented databases (e.g., using ObjectStore, Poet, Zope, etc.).
- Such databases may be consolidated or distributed, sometimes among the various computer systems discussed above in this disclosure. It is to be understood that the structure and operation of any computer or database component may be combined, consolidated, or distributed in any working combination.
- the modules 112 include routines, programs, objects, components, and data structures, which perform particular tasks or implement particular abstract data types.
- the modules 112 may also be implemented as signal processor(s), state machine(s), logic circuitries, and/or any other device or component that manipulates signals based on operational instructions. Further, the modules 112 can be implemented by one or more hardware components, by computer-readable instructions executed by a processing unit, or by a combination thereof.
- a computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored.
- a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein.
- the term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., to be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD-ROMs, DVDs, flash drives, disks, and any other known physical storage media.
Abstract
Description
- This application claims the benefit of Indian Patent Application Serial No. 6520/CHE/2014 filed Dec. 24, 2014, which is hereby incorporated by reference in its entirety.
- The present subject matter is related, in general, to a recognition method and a recognition system, and more particularly, but not exclusively, to a method and system for recognizing characters.
- Generally available character recognition techniques, such as optical character recognition (OCR), are used to automatically convert written, printed, or even handwritten text into a data form that can be electronically processed. The characters to be recognized are optically scanned and binarized into a form suitable for analysis and further processing. Various feature recognition techniques are known for examining and recognizing characters based on predetermined patterns stored in the memory of an electronic device. However, these known techniques depend heavily on standardized fonts or approximations thereof. Further, problems exist in identifying characters due to magnification, reduction or rotation, varying lighting conditions, resolution limitations, perspective distortions, arbitrary orientation, bad character quality, and non-uniform illumination during image acquisition.
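The scan-and-binarize step described above can be illustrated with a short sketch. The global mean threshold and the projection-profile segmentation below are illustrative assumptions for the sketch only; they stand in for the adaptive thresholding and segmentation techniques an actual system would use:

```python
import numpy as np

def binarize_and_segment(gray, threshold=None):
    """Binarize a grayscale image (dark glyphs on a light background) and
    split it into per-character sub-images using horizontal, then
    vertical, projection profiles."""
    if threshold is None:
        # Crude global threshold; a real system would use an adaptive one.
        threshold = gray.mean()
    ink = gray < threshold  # True where a glyph pixel lies
    # Horizontal projection: rows containing any ink form the text band.
    rows = np.nonzero(ink.any(axis=1))[0]
    band = ink[rows.min():rows.max() + 1]
    # Vertical projection: each run of inked columns is one character.
    segments, start = [], None
    for x, has_ink in enumerate(band.any(axis=0)):
        if has_ink and start is None:
            start = x
        elif not has_ink and start is not None:
            segments.append(band[:, start:x])
            start = None
    if start is not None:
        segments.append(band[:, start:])
    return segments
```

Each returned segment is a binary mask of one candidate character, ready for skeletonization and feature extraction.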
- Therefore, there is a need to provide a method and a system to recognize characters automatically.
- One or more shortcomings of the prior art are overcome and additional advantages are provided through the present disclosure. Additional features and advantages are realized through the techniques of the present disclosure. Other embodiments and aspects of the disclosure are described in detail herein and are considered a part of the claimed disclosure.
- Accordingly, the present disclosure relates to a recognition method. In one embodiment, the method comprises receiving an input image comprising one or more characters from an image sensor. The received input image is then processed to extract one or more nodes and edges of each character in the input image, and a graphical representation of each character is generated based on the one or more edges. The method further comprises comparing the graphical representation of each character with a predetermined graphical representation of each reference character stored in a reference repository. Based on the comparison, the characters in the input image are recognized.
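The node-and-edge graphical representation summarized above can be sketched as follows. The node labels, the rough 'A' skeleton, and the fixed 90-degree step used to mimic a wave ending position angle are illustrative assumptions, not details taken from this disclosure:

```python
def adjacency_matrix(nodes, edges):
    """Build the adjacency matrix of a character's skeleton graph.
    `nodes` is an ordered list of node labels (e.g., in the clockwise
    extraction order); `edges` is a list of (u, v) node pairs."""
    index = {n: i for i, n in enumerate(nodes)}
    mat = [[0] * len(nodes) for _ in range(len(nodes))]
    for u, v in edges:
        mat[index[u]][index[v]] = 1
        mat[index[v]][index[u]] = 1  # skeleton strokes are undirected
    return mat

def wave_ending_angle(edges, step_deg=90.0):
    """Hypothetical stand-in for the wave ending position angle: each
    edge, plotted once, advances the wave by a fixed angular step, and
    the ending angle is the total sweep modulo 360 degrees."""
    return (len(edges) * step_deg) % 360.0

# A rough skeleton of the letter 'A': apex, two mid-stroke nodes joined
# by the crossbar, and two feet.
nodes = ["apex", "mid_l", "mid_r", "foot_l", "foot_r"]
edges = [("apex", "mid_l"), ("apex", "mid_r"), ("mid_l", "mid_r"),
         ("mid_l", "foot_l"), ("mid_r", "foot_r")]
```

Comparing two characters then reduces to comparing their adjacency matrices and ending angles for equality.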
- Further, the present disclosure relates to a recognition system comprising an image sensor and a processor coupled with the image sensor. The system further comprises a memory communicatively coupled to the processor, wherein the memory stores processor-executable instructions, which, on execution, cause the processor to receive an input image comprising one or more characters from the image sensor. Further, the processor is configured to extract one or more nodes and edges of each character from the input image and generate a graphical representation of each character based on the one or more edges. The system further comprises a comparison unit coupled with the processor and configured to compare the graphical representation of each character with the predetermined graphical representation of each reference character stored in a reference repository. Furthermore, the system comprises a recognition unit coupled with the comparison unit and configured to recognize the reference character as one of the characters in the input image based on the comparing.
- Furthermore, the present disclosure relates to a non-transitory computer readable medium including instructions stored thereon that when processed by at least one processor cause a system to perform the act of receiving an image comprising one or more characters. Further, the instructions cause the processor to perform the acts of extracting one or more nodes and edges of each character from the input image and generating a graphical representation of each character based on the one or more edges. Furthermore, the instructions cause the processor to perform the acts of comparing the generated graphical representation of each character with the predetermined representation of each reference character stored in a reference repository and recognizing the reference character as one of the characters in the input image based on the comparing.
- The foregoing summary is illustrative only and is not intended to be in any way limiting. In addition to the illustrative aspects, embodiments, and features described above, further aspects, embodiments, and features will become apparent by reference to the drawings and the following detailed description.
- The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. The same numbers are used throughout the figures to reference like features and components. Some embodiments of system and/or methods in accordance with embodiments of the present subject matter are now described, by way of example only, and with reference to the accompanying figures, in which:
-
FIG. 1a illustrates a block diagram of a recognition system for recognizing characters in accordance with some embodiments of the present disclosure; -
FIGS. 1b-1k illustrate a graphical representation and adjacency matrix of exemplary characters in accordance with some embodiments of the present disclosure; -
FIG. 2 illustrates a flowchart of a method of recognizing characters in accordance with some embodiments of the present disclosure; -
FIG. 3 is a block diagram of an exemplary computer system for implementing embodiments consistent with the present disclosure. - It should be appreciated by those skilled in the art that any block diagrams herein represent conceptual views of illustrative systems embodying the principles of the present subject matter. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudo code, and the like represent various processes which may be substantially represented in computer readable medium and executed by a computer or processor, whether or not such computer or processor is explicitly shown.
- In the present document, the word “exemplary” is used herein to mean “serving as an example, instance, or illustration.” Any embodiment or implementation of the present subject matter described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other embodiments.
- While the disclosure is susceptible to various modifications and alternative forms, specific embodiments thereof have been shown by way of example in the drawings and will be described in detail below. It should be understood, however, that it is not intended to limit the disclosure to the particular forms disclosed; on the contrary, the disclosure is to cover all modifications, equivalents, and alternatives falling within the spirit and the scope of the disclosure.
- The terms “comprises”, “comprising”, or any other variations thereof, are intended to cover a non-exclusive inclusion, such that a setup, device or method that comprises a list of components or steps does not include only those components or steps but may include other components or steps not expressly listed or inherent to such setup or device or method. In other words, one or more elements in a system or apparatus preceded by “comprises . . . a” does not, without more constraints, preclude the existence of other elements or additional elements in the system or apparatus.
- The present disclosure relates to a method and a system for recognizing characters. In one embodiment, the input image comprising one or more characters to be recognized is received and processed to extract one or more nodes and edges of each character in the input image. Using the extracted nodes and edges, a graphical representation and adjacency matrix of each character is generated and compared with a predetermined graphical representation and adjacency matrix to determine a match. Based on the comparison, a matching probability is determined, based on which one or more characters in the input image are recognized and displayed as output. The proposed recognition method and system recognize characters with greater accuracy and speed. Further, the present disclosure is simple, cost-effective and reduces the complexity involved in automatic recognition of characters.
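A minimal sketch of the comparison and matching-probability step described above, assuming each character has already been reduced to a descriptor such as an (ending angle, adjacency matrix) pair; the exact-equality comparison and the descriptor shape are illustrative simplifications:

```python
def match_probability(frame_descriptors, reference):
    """Fraction of the n captured frames whose descriptor for this
    character position equals the reference descriptor exactly."""
    hits = sum(1 for d in frame_descriptors if d == reference)
    return hits / len(frame_descriptors)

def best_match(frame_descriptors, repository):
    """Score every reference character in the repository and return the
    best-scoring label together with its matching probability."""
    scored = {label: match_probability(frame_descriptors, ref)
              for label, ref in repository.items()}
    label = max(scored, key=scored.get)
    return label, scored[label]
```

A probability of 1.0 corresponds to all n frames agreeing with a reference character; anything lower is handed to the validation step described later.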
- In the following detailed description of the embodiments of the disclosure, reference is made to the accompanying drawings that form a part hereof, and in which are shown by way of illustration specific embodiments in which the disclosure may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the disclosure, and it is to be understood that other embodiments may be utilized and that changes may be made without departing from the scope of the present disclosure. The following description is, therefore, not to be taken in a limiting sense.
-
FIG. 1 illustrates a block diagram of a recognition system 100 for recognizing characters automatically in accordance with some embodiments of the present disclosure. - The
recognition system 100 is configured to recognize characters that are printed, handwritten or captured by an image capturing device or image sensor. For example, the recognition system 100 may be used to automatically recognize characters on number plates of automobiles or vehicles in motion. - The recognition system is described in greater detail below with reference to
FIG. 1. In one implementation, the exemplary recognition system 100 includes a central processing unit (“CPU” or “processor”) 102, the memory 104 and an I/O interface 106. The I/O interface 106 is coupled with the processor 102 and an I/O device. The I/O device is configured to receive inputs via the I/O interface 106 and transmit outputs for display in the I/O device via the I/O interface 106. The recognition system 100 further comprises an image sensor 108 coupled with the processor 102 and configured to capture an input image comprising one or more characters for recognition. - The
recognition system 100 also comprises data 110 and modules 112. In one implementation, the data 110 and the modules 112 may be stored within the memory 104. In one example, the data 110 may include input image 114, reference repository 116, image of skeletonized character 118, matching probability data 120, adjacency matrix 122, recognized characters 124 and other data 126. In one embodiment, the data 110 may be stored in the memory 104 in the form of various data structures. Additionally, the aforementioned data can be organized using data models, such as relational or hierarchical data models. The other data 126 may be used to store data, including temporary data and temporary files, generated by the modules 112 for performing the various functions of the recognition system 100. - The
modules 112 may include, for example, a pre-processing and feature extracting unit (hereinafter referred to as pre-processing unit) 128, a comparison unit 130, and a character validation unit 132. The modules 112 may also comprise other modules 134 to perform various miscellaneous functionalities of the recognition system 100. It will be appreciated that such aforementioned modules may be represented as a single module or a combination of different modules. - In operation, the
image sensor 108 captures the input image 114 comprising one or more characters in a capture frame, such as a rectangular mask. The image sensor 108 may be, for example, a still/video camera, a camera in a mobile device or any other known image capturing device. The input image 114 comprises one or more frames captured over a predetermined time, for example, ‘n’ frames per second. Each frame of the input image 114 is hereinafter referred to as input image 114. The input image 114 may have limited resolution, background images that cross the text or characters or create graphic noise, changing light intensity, or perspective distortion from capture at an arbitrary viewpoint. In another embodiment, the image may have characters with different fonts, sizes, shapes, styles and thicknesses, shadows, embossed text, geometric distortions, or image resolution limitations, or may be of bad quality. The input image 114 is pre-processed by the pre-processing unit 128 to remove image resolution limitations and to extract one or more predetermined features used for recognition of characters in the input image 114. - In one embodiment, the
pre-processing unit 128 pre-processes the input image 114 by removing background or graphical noise from the input image 114 using any known filters and converting the filtered input image 114 from RGB format into a corresponding grayscale format. The pre-processing unit 128 detects the boundary of the number plate in the input image 114 by using any known boundary detection technique to locate the images of the characters. Number plates are generally rectangular in shape with dark characters on a light/bright background, for example, black fonts on a white background, and may have characters of the same or different sizes. The number plates may also have a fixed width-to-height ratio. In one embodiment, the pre-processing unit 128 analyses the input image 114, determines the extent of intensity variation of each row and selects the adjacent rows that exhibit the maximum variation as containing the characters of the number plate. The pre-processing unit 128 further determines the borders of the input image 114 by detecting the edges of the input image 114 using any edge detection technique. - In one aspect, the
pre-processing unit 128 determines the horizontal and vertical edges using, for example, the Hough transformation technique, and extracts the number plate based on the determined edges. Further, the pre-processing unit 128 processes the extracted number plate input image 114 for skew/slant correction using known skew/slant correction techniques. Upon skew/slant correction of the input image 114, the pre-processing unit 128 detects the orientation of the input image 114 and resizes the dimensions of the extracted number plate input image 114 based on the detected orientation. In one embodiment, the pre-processing unit 128 detects the orientation of the number plate input image 114 and aligns the input image 114 using a bounding box. Upon correcting the orientation, the dimensions of the input image 114 are altered by the pre-processing unit 128 to a predetermined size. - The
pre-processing unit 128 is further configured to locate the characters of interest, for example, one or more characters in the number plate, and extract the corresponding images. In one embodiment, the pre-processing unit 128 identifies the location of the number plate and extracts images of each alphanumeric character from the identified location. The pre-processing unit 128 extracts images of each character by segmenting the input image 114 using any known segmentation technique. In one embodiment, the pre-processing unit 128 converts the grayscale input image 114 into a corresponding binary image using a predetermined adaptive threshold value. Upon conversion, the pre-processing unit 128 extracts one or more characters from the binary input image 114 by computing horizontal projections on the binary input image 114 to obtain horizontal image segments and further computing vertical projections on the horizontal image segments to obtain the one or more image segments of the one or more characters of the input image 114. Upon segmentation, the pre-processing unit 128 generates skeletonized image segments of each character to extract one or more features representing a general form of the characters, as shown in FIG. 1b. - In one embodiment, the
pre-processing unit 128 normalizes the dimensions of the character contained in the image segments using techniques such as nearest-neighbor or weighted-average down-sampling to generate one or more normalized characters. The pre-processing unit 128 further extracts appropriate descriptors from the one or more normalized characters, performs structural analysis on the extracted descriptors and extracts region-based shape features representing the general form of the character. - Upon obtaining each skeletonized character, the
pre-processing unit 128 generates the graphical representation and adjacency matrix of each skeletonized character 118. In one embodiment, the pre-processing unit 128 converts each skeletonized character 118 into a graphical representation comprising one or more nodes and edges using a bounding box and edge detection techniques known in the art. Upon conversion, the pre-processing unit 128 extracts the one or more nodes and the edges using a bounding box in the clockwise direction, for example, as illustrated in FIG. 1c. Based on the extracted nodes and edges, a wave graph of each character is generated using the extracted edges and the corresponding graphical ending position angle is determined. In one example, the pre-processing unit 128 generates a sine wave graph of each character using the extracted edges, only once per edge, and determines the sine wave ending position angle for each character. For example, the pre-processing unit 128 plots each character onto the sine wave graph by plotting the extracted edges of each character on the sine wave graph and determining the sine wave ending position angle of the plotted edges. Further, the pre-processing unit 128 determines the adjacency matrix 122 for each character based on the nodes and edges extracted in the clockwise direction. A sample illustration of the sine wave graph of the letter ‘A’ and the digit ‘4’ is shown in FIGS. 1a-1j. - The
recognition system 100 recognizes one or more characters of the input image 114 by comparing them with previously stored reference characters in the reference repository 116. In one embodiment, the reference repository is predetermined and stored in the memory 104. The pre-processing unit 128 pre-processes images of one or more characters, including, for example, digits, capital letters, small letters and special characters, by removing noise and converting the image from RGB format into its corresponding grayscale format. The images are then normalized, re-sampled to create images of equal dimensions, and skeletonized to extract region-based shape features representing the general form of the characters. Further, the location of the characters in the input image is detected and identified. The images are then converted into a graphical representation comprising one or more nodes and edges, based on which the graph wave ending position angle and adjacency matrix are determined and stored for each reference character. - The
recognition system 100 recognizes characters in the input image 114 of the number plate based on the graphical representation and the adjacency matrix 122 of each character in the input image 114. In one embodiment, the recognition system 100 recognizes characters in the input image 114 by comparing each character in the input image 114 with each of the reference characters in the reference repository 116. The comparison unit 130 compares the sine wave ending position angle of each skeletonized character 118 in the ‘n’ frames of the input image 114 with the sine wave ending position angle of one or more reference characters in the reference repository 116 and further compares the adjacency matrix 122 of the character in the ‘n’ frames of the input image 114 with the adjacency matrix of the reference character. If all the characters of the ‘n’ frames of the input image 114 match the reference characters, then the reference characters are identified and recognized as the characters in the input image 114. If at least one character of the ‘n’ frames of the input image 114 does not match the reference characters, then the unmatched character is recognized based on the matching probability data 120 of the unmatched character. - In one embodiment, the character validation unit (hereinafter referred to as validation unit) 132 determines the matching
probability data 120 of the at least one unmatched character and recognizes the at least one unmatched character of the input image 114 based on the matching probability data 120. If the validation unit 132 determines that the matching probability data 120 exceeds the predetermined matching probability threshold data stored in the other data 126, then the validation unit 132 selects the reference character as the recognized character in the input image 114 and displays the selected character as output. If the validation unit 132 determines that the matching probability data 120 of the unmatched character does not exceed the predetermined matching probability threshold data, then the validation unit 132 discards the matching probability data 120 of the unmatched character and repeats the entire process. -
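The accept-or-retry behaviour of the validation unit described above can be sketched as a small control loop. The 0.75 threshold, the attempt budget, and the `recognize_once` callback are illustrative assumptions, not values or interfaces taken from this disclosure:

```python
def validate_character(candidate, probability, threshold=0.75):
    """Accept the candidate reference character only when its matching
    probability exceeds the predetermined threshold; otherwise return
    None to signal that the recognition pass should be repeated."""
    return candidate if probability > threshold else None

def recognize_with_retry(recognize_once, max_attempts=3, threshold=0.75):
    """Repeat the capture-and-recognize pass (recognize_once returns a
    (candidate, probability) pair) until validation succeeds or the
    attempt budget is exhausted."""
    for _ in range(max_attempts):
        candidate, probability = recognize_once()
        accepted = validate_character(candidate, probability, threshold)
        if accepted is not None:
            return accepted
    return None
```

A low-confidence pass (say, a glare-distorted '8' read at probability 0.6) is discarded and the capture is repeated, matching the discard-and-repeat behaviour described in the text.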
FIG. 2 illustrates a flowchart of a method of recognizing characters in accordance with some embodiments of the present disclosure. - As illustrated in
FIG. 2 , the method 200 comprises one or more blocks implemented by the processor 102 for recognizing characters. The method 200 may be described in the general context of computer executable instructions. Generally, computer executable instructions can include routines, programs, objects, components, data structures, procedures, modules, and functions, which perform particular functions or implement particular abstract data types. - The order in which the
method 200 is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method 200. Additionally, individual blocks may be deleted from the method 200 without departing from the spirit and scope of the subject matter described herein. Furthermore, the method 200 can be implemented in any suitable hardware, software, firmware, or combination thereof. - At
block 202, receive the input image. In one embodiment, the image sensor 108 captures the input image 114 comprising one or more characters in a capture frame, such as a rectangular mask. The image sensor 108 may be, for example, a still/video camera, a camera in a mobile device, or any other known image capturing device. The input image 114 may be an image with resolution limitations, background images that cross the text or characters or create graphic noise, or changing light intensity, or it may be captured from an arbitrary viewpoint and exhibit perspective distortion. In another embodiment, the image may have characters with different fonts, sizes, shapes, styles, and thicknesses, shadows, embossed text, geometric distortions, and resolution limitations, or may be of bad quality. - At
block 204, pre-process the input image. The input image 114 is pre-processed by the pre-processing unit 128 to remove image resolution limitations and to extract one or more predetermined features used for recognition of characters in the input image 114. In one embodiment, the pre-processing unit 128 pre-processes the input image 114 by removing background or graphical noise from the input image 114 using any known filters and converting the filtered input image 114 from RGB format into a corresponding grayscale format. The pre-processing unit 128 detects the boundary of the number plate in the input image 114 using any known boundary detection technique to locate the images of the characters. The number plates are generally rectangular in shape, with dark characters on a light/bright background (for example, black fonts on a white background), and may have characters of the same or different sizes. The number plates may also have a fixed width-to-height ratio. In one embodiment, the pre-processing unit 128 analyses the input image 114, determines the extent of intensity variation of each row, and selects the adjacent rows that exhibit the maximum variation as the rows containing the characters of the number plate. The pre-processing unit 128 further determines the borders of the input image 114 by detecting the edges of the input image 114 using any edge detection technique. - In one aspect, the
pre-processing unit 128 determines the horizontal and vertical edges using, for example, the Hough Transformation technique, and extracts the number plate based on the determined edges. Further, the pre-processing unit 128 processes the extracted number plate input image 114 for skew/slant correction using known skew/slant correction techniques. Upon skew/slant correcting the input image 114, the pre-processing unit 128 detects the orientation of the input image 114 and resizes the dimensions of the extracted number plate input image 114 based on the detected orientation. In one embodiment, the pre-processing unit 128 detects the orientation of the number plate input image 114 and aligns the input image 114 using a bounding box. Upon correcting the orientation, the dimensions of the input image 114 are altered by the pre-processing unit 128 to a predetermined size. - The
pre-processing unit 128 is further configured to locate the characters of interest, for example one or more characters in the number plate, and extract the corresponding images. In one embodiment, the pre-processing unit 128 identifies the location of the number plate and extracts images of each alphanumeric character from the identified location. The pre-processing unit 128 extracts images of each character by segmenting the input image 114 using any known segmentation technique. In one embodiment, the pre-processing unit 128 converts the grayscale input image 114 into a corresponding binary image using a predetermined adaptive threshold value. Upon conversion, the pre-processing unit 128 extracts one or more characters from the binary input image 114 by computing horizontal projections on the binary input image 114 to obtain horizontal image segments, and further computing vertical projections on the horizontal image segments to obtain the one or more image segments of the one or more characters of the input image 114. Upon segmentation, the pre-processing unit 128 generates skeletonized image segments of each character to extract one or more features representing the general form of the characters, as shown in FIG. 1c . - In one embodiment, the
pre-processing unit 128 normalizes the dimensions of the character contained in the image segments using techniques including nearest-neighbor or weighted-average downsampling to generate one or more normalized characters. The pre-processing unit 128 further extracts appropriate descriptors from the one or more normalized characters, performs structural analysis on the extracted descriptors, and extracts region-based shape features representing the general form of the character. - At
block 206, generate a graphical representation of each character in the input image. Upon obtaining each skeletonized character, the pre-processing unit 128 generates a graphical representation and an adjacency matrix of each skeletonized character 118. In one embodiment, the pre-processing unit 128 converts each skeletonized character 118 into a graphical representation comprising one or more nodes and edges using a bounding box and edge detection techniques known in the art. Upon conversion, the pre-processing unit 128 extracts the one or more nodes and edges using a bounding box in the clockwise direction, for example as illustrated in FIG. 1d . Based on the extracted nodes and edges, a wave graph of each character is generated using the extracted edges, and the corresponding graphical ending position angle is determined. In one example, the pre-processing unit generates a sine wave graph of each character, using each extracted edge only once, and determines the sine wave ending position angle for each character. Further, the pre-processing unit 128 determines the adjacency matrix 122 for each character based on the nodes and edges extracted in the clockwise direction. A sample illustration of the sine wave graph of the letter 'A' and the digit '4' is shown in FIGS. 1b-1k . - At
block 208, compare the graphical representation with the predetermined graphical representations. In one implementation, the recognition system 100 recognizes characters in the input image 114 of the number plate based on the graphical representation and the adjacency matrix 122 of each character in the input image 114. In one embodiment, the recognition system 100 recognizes characters in the input image 114 by comparing each character in the input image 114 with each reference character in the reference repository 116. The comparison unit 130 compares the sine wave ending position angle of each skeletonized character 118 in 'n' frames of the input image 114 with the sine wave ending position angle of one or more reference characters in the reference repository 116, and further compares the adjacency matrix 122 of the character in the 'n' frames of the input image 114 with the adjacency matrix of the reference character. If all the characters of the 'n' frames of the input image 114 match the reference characters, then the reference characters are identified and recognized as the characters in the input image 114. If at least one character of the 'n' frames of the input image 114 does not match the reference characters, then the unmatched character is recognized based on the matching probability data 120 of the unmatched character. - At
block 210, recognize the character. In one embodiment, the character validation unit 132 determines the matching probability data 120 of the at least one unmatched character and recognizes the at least one unmatched character of the input image 114 based on the matching probability data 120. If the validation unit 132 determines that the matching probability data 120 exceeds the predetermined matching probability threshold data stored in the other data 126, then the validation unit 132 selects the reference character as the recognized character in the input image 114 and displays the selected character as output. If the validation unit 132 determines that the matching probability data 120 of the unmatched character does not exceed the predetermined matching probability threshold data, then the validation unit 132 discards the matching probability data 120 of the unmatched character and repeats the entire process. -
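The accept-or-retry decision at block 210 can be sketched as follows. The 0.8 threshold and the function name are assumed placeholders for illustration; the disclosure only states that a predetermined threshold is stored in the other data 126.

```python
def validate_character(match_probability, reference_char, threshold=0.8):
    """Return the recognized character when the matching probability
    exceeds the stored threshold; otherwise return None so the caller
    can discard the result and repeat the process, as described above.
    The 0.8 threshold is an assumed placeholder, not from the patent."""
    if match_probability > threshold:
        return reference_char
    return None

print(validate_character(0.92, "A"))  # -> A
print(validate_character(0.41, "A"))  # -> None
```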
FIG. 3 is a block diagram of an exemplary computer system for implementing embodiments consistent with the present disclosure. - Variations of
computer system 301 may be used for implementing all the computing systems that may be utilized to implement the features of the present disclosure. Computer system 301 may comprise a central processing unit (“CPU” or “processor”) 302. Processor 302 may comprise at least one data processor for executing program components for executing user- or system-generated requests. The processor may include specialized processing units such as integrated system (bus) controllers, memory management control units, floating point units, graphics processing units, digital signal processing units, etc. The processor 302 may include a microprocessor, such as AMD Athlon, Duron or Opteron, ARM's application, embedded or secure processors, IBM PowerPC, Intel's Core, Itanium, Xeon, Celeron or other line of processors, etc. The processor 302 may be implemented using mainframe, distributed processor, multi-core, parallel, grid, or other architectures. Some embodiments may utilize embedded technologies like application-specific integrated circuits (ASICs), digital signal processors (DSPs), Field Programmable Gate Arrays (FPGAs), etc. -
Processor 302 may be disposed in communication with one or more input/output (I/O) devices via I/O interface 303. The I/O interface 303 may employ communication protocols/methods such as, without limitation, audio, analog, digital, monoaural, RCA, stereo, IEEE-1394, serial bus, universal serial bus (USB), infrared, PS/2, BNC, coaxial, component, composite, digital visual interface (DVI), high-definition multimedia interface (HDMI), RF antennas, S-Video, VGA, IEEE 802.11a/b/g/n/x, Bluetooth, cellular (e.g., code-division multiple access (CDMA), high-speed packet access (HSPA+), global system for mobile communications (GSM), long-term evolution (LTE), WiMax, or the like), etc. - Using the I/
O interface 303, the computer system 301 may communicate with one or more I/O devices. For example, the input device 304 may be an antenna, keyboard, mouse, joystick, (infrared) remote control, camera, card reader, fax machine, dongle, biometric reader, microphone, touch screen, touchpad, trackball, sensor (e.g., accelerometer, light sensor, GPS, gyroscope, proximity sensor, or the like), stylus, scanner, storage device, transceiver, video device/source, visors, etc. Output device 305 may be a printer, fax machine, video display (e.g., cathode ray tube (CRT), liquid crystal display (LCD), light-emitting diode (LED), plasma, or the like), audio speaker, etc. In some embodiments, a transceiver 306 may be disposed in connection with the processor 302. The transceiver may facilitate various types of wireless transmission or reception. For example, the transceiver may include an antenna operatively connected to a transceiver chip (e.g., Texas Instruments WiLink WL1283, Broadcom BCM4750IUB8, Infineon Technologies X-Gold 618-PMB9800, or the like), providing IEEE 802.11a/b/g/n, Bluetooth, FM, global positioning system (GPS), 2G/3G HSDPA/HSUPA communications, etc. - In some embodiments, the
processor 302 may be disposed in communication with a communication network 308 via a network interface 307. The network interface 307 may communicate with the communication network 308. The network interface 307 may employ connection protocols including, without limitation, direct connect, Ethernet (e.g., twisted pair 10/100/1000 Base T), transmission control protocol/internet protocol (TCP/IP), token ring, IEEE 802.11a/b/g/n/x, etc. The communication network 308 may include, without limitation, a direct interconnection, local area network (LAN), wide area network (WAN), wireless network (e.g., using Wireless Application Protocol), the Internet, etc. Using the network interface 307 and the communication network 308, the computer system 301 may communicate with one or more devices. In some embodiments, the computer system 301 may itself embody one or more of these devices. - In some embodiments, the
processor 302 may be disposed in communication with one or more memory devices (e.g., RAM 313, ROM 314, etc.) via a storage interface 312. The storage interface may connect to memory devices including, without limitation, memory drives, removable disc drives, etc., employing connection protocols such as serial advanced technology attachment (SATA), integrated drive electronics (IDE), IEEE-1394, universal serial bus (USB), fiber channel, small computer systems interface (SCSI), etc. The memory drives may further include a drum, magnetic disc drive, magneto-optical drive, optical drive, redundant array of independent discs (RAID), solid-state memory devices, solid-state drives, etc. - The
memory 315 may store a collection of program or database components, including, without limitation, an operating system 316, user interface application 317, web browser 318, mail server 319, mail client 320, user/application data 321 (e.g., any data variables or data records discussed in this disclosure), etc. The operating system 316 may facilitate resource management and operation of the computer system 301. Examples of operating systems include, without limitation, Apple Macintosh OS X, UNIX, Unix-like system distributions (e.g., Berkeley Software Distribution (BSD), FreeBSD, NetBSD, OpenBSD, etc.), Linux distributions (e.g., Red Hat, Ubuntu, Kubuntu, etc.), IBM OS/2, Microsoft Windows (XP, Vista/7/8, etc.), Apple iOS, Google Android, Blackberry OS, or the like. User interface 317 may facilitate display, execution, interaction, manipulation, or operation of program components through textual or graphical facilities. For example, user interfaces may provide computer interaction interface elements on a display system operatively connected to the computer system 301, such as cursors, icons, check boxes, menus, scrollers, windows, widgets, etc. Graphical user interfaces (GUIs) may be employed, including, without limitation, Apple Macintosh operating systems' Aqua, IBM OS/2, Microsoft Windows (e.g., Aero, Metro, etc.), Unix X-Windows, web interface libraries (e.g., ActiveX, Java, Javascript, AJAX, HTML, Adobe Flash, etc.), or the like. - In some embodiments, the
computer system 301 may implement a web browser 318 stored program component. The web browser may be a hypertext viewing application, such as Microsoft Internet Explorer, Google Chrome, Mozilla Firefox, Apple Safari, etc. Secure web browsing may be provided using HTTPS (secure hypertext transport protocol), secure sockets layer (SSL), Transport Layer Security (TLS), etc. Web browsers may utilize facilities such as AJAX, DHTML, Adobe Flash, JavaScript, Java, application programming interfaces (APIs), etc. In some embodiments, the computer system 301 may implement a mail server 319 stored program component. The mail server may be an Internet mail server such as Microsoft Exchange, or the like. The mail server may utilize facilities such as ASP, ActiveX, ANSI C++/C#, Microsoft .NET, CGI scripts, Java, JavaScript, PERL, PHP, Python, WebObjects, etc. The mail server may utilize communication protocols such as internet message access protocol (IMAP), messaging application programming interface (MAPI), Microsoft Exchange, post office protocol (POP), simple mail transfer protocol (SMTP), or the like. In some embodiments, the computer system 301 may implement a mail client 320 stored program component. The mail client may be a mail viewing application, such as Apple Mail, Microsoft Entourage, Microsoft Outlook, Mozilla Thunderbird, etc. - In some embodiments,
computer system 301 may store user/application data 321, such as the data, variables, records, etc. described in this disclosure. Such databases may be implemented as fault-tolerant, relational, scalable, secure databases such as Oracle or Sybase. Alternatively, such databases may be implemented using standardized data structures, such as an array, hash, linked list, struct, structured text file (e.g., XML), table, or as object-oriented databases (e.g., using ObjectStore, Poet, Zope, etc.). Such databases may be consolidated or distributed, sometimes among the various computer systems discussed above in this disclosure. It is to be understood that the structure and operation of any computer or database component may be combined, consolidated, or distributed in any working combination. - As described above, the
modules 112, amongst other things, include routines, programs, objects, components, and data structures, which perform particular tasks or implement particular abstract data types. The modules 112 may also be implemented as signal processor(s), state machine(s), logic circuitries, and/or any other device or component that manipulates signals based on operational instructions. Further, the modules 112 can be implemented by one or more hardware components, by computer-readable instructions executed by a processing unit, or by a combination thereof. - The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope and spirit of the disclosed embodiments. Also, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.
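As a concrete illustration of the projection-based segmentation described at block 204 above, the following sketch splits a binary character image into per-character bounding boxes using a horizontal projection followed by vertical projections. The function names and the toy image are illustrative assumptions, not the disclosed implementation.

```python
import numpy as np

def nonzero_runs(profile):
    """Return [start, end) index ranges where the 1-D profile is non-zero."""
    runs, start = [], None
    for i, v in enumerate(profile):
        if v and start is None:
            start = i
        elif not v and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs

def segment_characters(binary):
    """Horizontal projection finds the text bands; a vertical projection
    inside each band then separates the individual characters."""
    boxes = []
    for r0, r1 in nonzero_runs(binary.sum(axis=1)):
        band = binary[r0:r1]
        for c0, c1 in nonzero_runs(band.sum(axis=0)):
            boxes.append((r0, r1, c0, c1))
    return boxes

# toy binary "plate": two 2x2 characters separated by a blank column
plate = np.zeros((4, 7), dtype=int)
plate[1:3, 1:3] = 1
plate[1:3, 4:6] = 1
print(segment_characters(plate))  # -> [(1, 3, 1, 3), (1, 3, 4, 6)]
```

Each returned box would then be skeletonized and converted into the node/edge graph and adjacency matrix described at block 206.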
- Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., are non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
- It is intended that the disclosure and examples be considered as exemplary only, with a true scope and spirit of disclosed embodiments being indicated by the following claims.
Claims (13)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
IN6520CH2014 | 2014-12-24 | ||
IN6520/CHE/2014 | 2014-12-24 |
Publications (2)
Publication Number | Publication Date |
---|---|
US9373048B1 US9373048B1 (en) | 2016-06-21 |
US20160188990A1 true US20160188990A1 (en) | 2016-06-30 |
Family
ID=56118287
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/636,929 Active US9373048B1 (en) | 2014-12-24 | 2015-03-03 | Method and system for recognizing characters |
Country Status (1)
Country | Link |
---|---|
US (1) | US9373048B1 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11551034B2 (en) * | 2019-10-09 | 2023-01-10 | Ancestry.Com Operations Inc. | Adversarial network for transforming handwritten text |
US20230049395A1 (en) * | 2020-01-16 | 2023-02-16 | Visa International Service Association | System and Computer-Implemented Method for Character Recognition in Payment Card |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5267332A (en) * | 1991-06-19 | 1993-11-30 | Technibuild Inc. | Image recognition system |
US5588072A (en) * | 1993-12-22 | 1996-12-24 | Canon Kabushiki Kaisha | Method and apparatus for selecting blocks of image data from image data having both horizontally- and vertically-oriented blocks |
EP1634135B1 (en) * | 2003-02-28 | 2011-09-14 | Gannon Technologies Group | Systems and methods for source language word pattern matching |
WO2005024711A1 (en) * | 2003-09-05 | 2005-03-17 | Gannon Technologies Group | Systems and methods for biometric identification using handwriting recognition |
US7724958B2 (en) * | 2004-09-07 | 2010-05-25 | Gannon Technologies Group Llc | Systems and methods for biometric identification using handwriting recognition |
US8452108B2 (en) * | 2008-06-25 | 2013-05-28 | Gannon Technologies Group Llc | Systems and methods for image recognition using graph-based pattern matching |
US20150055866A1 (en) * | 2012-05-25 | 2015-02-26 | Mark Joseph Cummins | Optical character recognition by iterative re-segmentation of text images using high-level cues |
US9740925B2 (en) * | 2012-11-19 | 2017-08-22 | Imds America Inc. | Method and system for the spotting of arbitrary words in handwritten documents |
US9213910B2 (en) * | 2013-11-06 | 2015-12-15 | Xerox Corporation | Reinforcement learning approach to character level segmentation of license plate images |
- 2015-03-03: US application US14/636,929 filed; issued as US9373048B1 (status: active)
Also Published As
Publication number | Publication date |
---|---|
US9373048B1 (en) | 2016-06-21 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: WIPRO LIMITED, INDIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:HOSABETTU, RAGHAVENDRA;LENKA, ANIL KUMAR;REEL/FRAME:035155/0254 Effective date: 20141223 |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 4 |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 8 |