AU2011253930A1 - Document image line detector using tiles and projection - Google Patents


Info

Publication number
AU2011253930A1
Authority
AU
Australia
Prior art keywords
line
tile
reference point
code
segments
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
AU2011253930A
Inventor
Yu-Ling Chen
Timothy Stephen Mason
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canon Inc
Original Assignee
Canon Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Canon Inc filed Critical Canon Inc
Priority to AU2011253930A priority Critical patent/AU2011253930A1/en
Publication of AU2011253930A1 publication Critical patent/AU2011253930A1/en
Abandoned legal-status Critical Current

Links

Landscapes

  • Image Analysis (AREA)

Abstract

DOCUMENT IMAGE LINE DETECTOR USING TILES AND PROJECTION

Disclosed is a method (200) of detecting a line in a digital image (111). The digital image is processed to form connected components for adjacently located pixels of similar colour. The method forms (230) a tiled line candidate bitmap (183) of the digital image using line candidates, the line candidates being at least one of the connected components (181) selected according to geometric properties of the at least one connected component. A line segment is detected (235) in each tile of the tiled line candidate bitmap, the line segment (187) passing a reference point (930) corresponding to the peak formed from a projection profile (186, 802, 803) of the tile, with a measure (250) being determined by a polar transform (1001) of the tile about the reference point. The method forms (260) a line based on the reference point and the measure of line segments for at least two adjacent tiles.

[Abstract figure: Fig. 10, an example angular projection profile for a tile.]

Description

S&F Ref: P015747

AUSTRALIA, PATENTS ACT 1990
COMPLETE SPECIFICATION FOR A STANDARD PATENT (lodged at IP Australia, 9 Dec 2011)

Name and Address of Applicant: Canon Kabushiki Kaisha, of 30-2, Shimomaruko 3-chome, Ohta-ku, Tokyo, 146, Japan
Actual Inventor(s): Timothy Stephen Mason, Yu-Ling Chen
Address for Service: Spruson & Ferguson, St Martins Tower, Level 35, 31 Market Street, Sydney NSW 2000 (CCN 3710000177)
Invention Title: Document image line detector using tiles and projection

The following statement is a full description of this invention, including the best method of performing it known to me/us:

DOCUMENT IMAGE LINE DETECTOR USING TILES AND PROJECTION

TECHNICAL FIELD

The current invention relates to scan processing and, in particular, to the detection of lines in a document image.

BACKGROUND

Straight lines are an important element in documents. They can divide a page into separate regions (e.g. as column separators); they can emphasise or de-emphasise specific parts of a page (e.g. as underlines or strikethrough lines); and they can be used to form complex objects (e.g. as table borders). Detecting straight lines is an important step in many document analysis systems, and has garnered some attention in the past.

Straight lines in documents are often laid out horizontally or vertically on the page. When a hardcopy of a document is imaged, however, the resulting document image rarely contains lines that are perfectly horizontal or vertical. This can be due to skew introduced when producing the hardcopy, or through imaging the hardcopy document, or physical warping of the hardcopy material, etc. As a result, document images may contain lines that are only approximately horizontal or vertical.

Some line detection methods aim to detect lines for a specific purpose. Purpose-driven line detection methods often make assumptions about the context of the desired lines, such as regions to search, or the type of surrounding content. For example, one automated mail processing system extracts address text from envelopes by first detecting text regions and then detecting underlines adjoining those text regions. Detected underlines are then removed from the text regions, so that only text remains. To detect an underline, this method first checks if a dashed underline is present, by inspecting a histogram of intervals from a thresholded orthogonal distance profile for an indication of regularly spaced gaps. If no dashed underline is found, the method then checks if a solid underline is present, by inspecting a histogram of an orthogonal distance profile for a narrow distribution indicative of a solid underline possibly pierced by some descenders. Descenders are parts of text that may extend beyond an underline, such as the tails on letters such as "y" and "p". If neither a dashed nor a solid underline is found, the method concludes there is no underline present. This method has the disadvantage that it only finds lines adjoining text regions.

As another example of a purpose-driven line detection method, line detection used by one table analysis module looks within known table areas to find lines representing table cell borders. First, horizontal and vertical projection profiles are calculated for the entire table area, to locate x coordinates of vertical lines and y coordinates of horizontal lines. The next step is to find where the lines begin and end. Because this method is intended to find lines in a table, it expects that lines begin and end either at an extremity of the table area, or at an intersection of lines (i.e. at a table cell corner). The method inspects local regions surrounding each possible intersection of vertical and horizontal lines, and determines which type of intersection (if any) is present, using local horizontal and vertical projection profiles and using template matching. From this information, cell boundaries are inferred according to allowable table structure, and these cell boundaries describe the lines of the table. This method has the disadvantage that it requires a known table region consistent with expected line structure.

Other line detection methods can be more general. A number of transform methods have been used to simplify the detection of lines. These methods work by transforming a document image into a parameter space that readily describes lines, finding lines in that parameter space, then inverse transforming those lines back onto the image. Such transform methods can involve arbitrarily complex transforms, depending on the parameters desired. General transform methods can be slow and require large amounts of memory when performed on sizeable images.

For computational efficiency, some line detection methods first find short sections of lines, then combine multiple short sections into longer lines. The line joining step is a potential source of error, as it must accurately and reliably determine which line segments to connect. Overeager joining can lead to line segments being joined incorrectly, whereas conservative joining can lead to fragmented lines. Furthermore, such methods often suffer severe false negatives in finding short sections of lines when the page is skewed.

Purpose-driven line detection methods can only detect lines of a specific document type under tightly controlled operating conditions. Other line detection methods are either inefficient for sizeable images or not robust to unconstrained operating conditions. A need therefore exists for a line detection method that can detect lines of various types efficiently and robustly.
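The horizontal and vertical projection profiles referred to above are simply per-row and per-column counts of foreground pixels. As a minimal sketch (the function name and the NumPy representation are our own illustration, not taken from the specification):

```python
import numpy as np

def projection_profiles(tile):
    """Count the 'on' pixels in each row and each column of a binary tile.

    A peak in the horizontal profile marks the y coordinate of a near-
    horizontal line; a peak in the vertical profile marks the x coordinate
    of a near-vertical line.
    """
    tile = np.asarray(tile, dtype=bool)
    horizontal = tile.sum(axis=1)  # one count per row
    vertical = tile.sum(axis=0)    # one count per column
    return horizontal, vertical

# A 5x5 tile with a solid horizontal line in row 2:
tile = np.zeros((5, 5), dtype=bool)
tile[2, :] = True
h, v = projection_profiles(tile)
# h peaks sharply at row 2, while v is flat - the signature of a
# horizontal line.
```

Skew spreads a line's pixels over several rows or columns, flattening the peak, which is one reason the later arrangements work on small tiles rather than whole pages.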
SUMMARY

According to an aspect of the present disclosure, there is provided a method of detecting a line in a digital image, the digital image being processed to form connected components for adjacently located pixels of similar colour. The method comprises: (a) forming a tiled line candidate bitmap of the digital image using line candidates, the line candidates being at least one of the connected components selected according to geometric properties of the at least one connected component; (b) detecting a line segment in each tile of the tiled line candidate bitmap, said line segment passing a reference point corresponding to the peak formed from a projection profile of the tile with a measure being determined by a polar transform of the tile about the reference point; and (c) forming a line based on the reference point and the measure of line segments for at least two adjacent tiles.

Desirably, the forming of the tiled line candidate bitmap comprises (a) creating an empty line candidate bitmap corresponding to the digital image; (b) selecting high-confidence line-like connected components; (c) marking pixels corresponding to the selected high-confidence line-like connected components in the line candidate bitmap; (d) selecting low-confidence line-like connected components; (e) marking pixels corresponding to the selected low-confidence line-like connected components in the line candidate bitmap; and (f) dividing the line candidate bitmap into tiles to form the tiled line candidate bitmap.

Preferably, the detecting of the line segment in each tile comprises (a) determining at least one of a horizontal projection profile and a vertical projection profile of the tile; (b) locating a peak from the at least one horizontal and vertical projection profiles; (c) determining a reference point based on the peak location and mid-point of the tile; (d) calculating a polar transform about the reference point; (e) analysing peaks in each angle of the polar transform to find a line segment passing the reference point; and (f) creating the measure for the line segment based on the angle and distribution of distance from the reference point.

Advantageously, the forming of the at least one line comprises estimating a location of line segments of neighbouring tiles and, where one or more line segments exist at an estimated location, linking at least the neighbouring line segments to form a line of constituent line segments. Preferably this forming comprises iterating the forming to link a current line segment to a previously formed line of constituent line segments. Typically the line is one of a plurality of line types, said method further comprising determining a line type for each of the formed lines based on accumulated statistics of the constituent line segments.
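Steps (d) and (e) above can be illustrated with a small sketch: an angular histogram about the reference point concentrates the pixels of a straight segment into one angle bin, whose position gives the segment's angle. The folding into [0, 180) degrees, the bin count, and the function name are our own assumptions, not the specification's exact measure:

```python
import numpy as np

def dominant_angle(tile, ref, n_bins=180):
    """Polar transform about a reference point: bin each foreground pixel
    by its angle from `ref` (folded into [0, 180) degrees so the two
    directions along one line coincide) and return the centre of the most
    populated angle bin, in degrees."""
    ys, xs = np.nonzero(np.asarray(tile, dtype=bool))
    angles = np.degrees(np.arctan2(ys - ref[0], xs - ref[1])) % 180.0
    hist, edges = np.histogram(angles, bins=n_bins, range=(0.0, 180.0))
    peak = int(hist.argmax())
    return 0.5 * (edges[peak] + edges[peak + 1])

# A horizontal stroke through the reference point lands in the bin
# nearest 0 degrees.
tile = np.zeros((5, 5), dtype=bool)
tile[2, :] = True
angle = dominant_angle(tile, ref=(2, 2))
```

A fuller implementation corresponding to step (f) would also examine the distribution of radial distances within the winning bin, to estimate the segment's extent and a confidence measure.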
Also disclosed is a computer-implementable method of detecting a line of a plurality of types in a digital image, the digital image being processed to form connected components for adjacently located pixels of similar colour, the method comprising: (a) forming a tiled line candidate bitmap of the digital image using line candidates, the line candidates being at least one of the connected components selected according to geometric properties of the at least one connected component; (b) detecting a line segment in each tile of the tiled line candidate bitmap, said line segment passing a reference point corresponding to the peak formed from a projection profile of the tile with a measure being determined by a polar transform of the tile about the reference point; (c) forming at least one line based on the reference point and the measure of line segments for at least two adjacent tiles; and (d) determining a line type for each of the formed lines based on accumulated statistics of the constituent line segments.

Other aspects are also disclosed.

BRIEF DESCRIPTION OF THE DRAWINGS

At least one embodiment of the present invention will now be described with reference to the following drawings, in which:

Fig. 1 is a functional block diagram of a system for finding lines in an image;
Fig. 2 is a schematic flow diagram of a method for finding lines in an image;
Fig. 3 is a schematic flow diagram illustrating a method of composing a tiled bitmap of line-like connected components as used in the method of Fig. 2;
Fig. 4 is a schematic flow diagram illustrating a method of localising approximately horizontal and approximately vertical line segments as used in the method of Fig. 2;
Fig. 5 is a schematic flow diagram illustrating a method of determining line segment angles as used in the method of Fig. 2;
Fig. 6 is a schematic flow diagram illustrating a method of forming lines from line segments as used in the method of Fig. 2;
Fig. 7 is a schematic flow diagram illustrating a method of classifying line types, as optionally used in the method of Fig. 2;
Fig. 8A is a diagram illustrating the process of calculating a horizontal projection profile;
Fig. 8B is an example of a horizontal projection profile and a vertical projection profile for a tile;
Fig. 9A is a diagram illustrating a tile with a reference point;
Fig. 9B is a diagram illustrating the process of calculating a polar transform centred on the reference point shown in Fig. 9A;
Fig. 10 is an example of an angular projection profile for a tile; and
Figs. 11A and 11B collectively form a schematic block diagram representation of an electronic device upon which described arrangements can be practised.

DETAILED DESCRIPTION INCLUDING BEST MODE

The arrangements described below serve to efficiently locate approximately horizontal and approximately vertical lines in a document image. In one implementation, the located lines are further classified by line type (e.g. solid/dashed). The located lines may be used for further document analysis, such as layout analysis or table analysis.

System Context

Fig. 1 depicts a system 100 for finding lines in a document image. The system 100 processes a document image 111 of an input document using a scan processor 150 to produce an electronic document 190 containing demarcated lines such as the underline 191, the dashed line 192 and the solid line 193. The document image 111 is typically a bitmap image but may be configured in another format, such as a compressed format (e.g. JPEG), scalable vector graphics (SVG), or portable document format (PDF), to name but a few. The electronic document 190 may be an image file, an editable document, a Portable Document Format (PDF, a proprietary format created by Adobe Systems Inc.) file, or a record stored in an electronic database, for example.
The document image 111 may be produced by any of a number of sources, such as by a scanner 120 scanning a hardcopy document 110; by retrieval from a data storage system 130, such as a hard disk having a database of images stored on the hard disk; or by digital photography using a camera 140. These are merely examples of how the document image 111 might be provided. As another example, the document image 111 could be created by a software application as an extension of a printing functionality of the software application.

The scan processor 150 implements a process of finding lines in the document image 111, and comprises an optional preprocessing unit 160, a line extraction module 170, and a memory 180. The scan processor 150 performs a combination of document analysis tasks on the document image 111. The tasks include document preprocessing, colour quantisation, connected component (CC) analysis, tamper-proofing, visual quality enhancement, output compression, and optical character recognition (OCR).

The memory 180 is used to store data pertaining to scan processing tasks. In one configuration of the system 100, this memory 180 contains a document representation buffer 182 that stores the document image 111 that is being processed, preferably in bitmap form at a resolution of 300 dots per inch (DPI). Other resolutions and configurations may be practised without departing from the scope of the present disclosure. The memory 180 may also contain:

(a) a collection 181 of CCs that records characterising information about each CC of the document image 111, such information for example comprising any one or combination of location, colour, shape, topology, classification (e.g., text, image, logo, etc.) and other descriptive features of each CC. The collection 181 may, for example, be organised as a list of CCs;

(b) a tiled line candidate bitmap 183 that holds a rendered image displaying selected CCs that might be part of a line, the rendered image being divided or partitioned along a regular grid into non-overlapping regions of the document image referred to as "tiles";

(c) a collection 184 of straight lines found in the document image 111, that may record characterising information about each line, such information for example comprising any one or combination of location, colour, angle, type (e.g., solid, dashed, underline, etc.) and other descriptive features of each straight line. The collection 184 may, for example, be organised as a list of straight lines; and

(d) a further memory 185 associated with each of the aforementioned tiles, referred to as the "per-tile memory", that preferably contains stores for:

(i) projection profiles 186 that each record a histogram of the tile with respect to a parameter; and

(ii) line segments 187 found in the tile, each line segment having characterising information for example comprising any one or combination of location, width, angle, a reference point within the line segment, a polar transform centred on the line segment's reference point, and other descriptive features of the line segment.

The memory 180 and the further memory 185 of the scan processor 150 may be formed using one or more of Random Access Memory (RAM), Hard Disk Drive (HDD) memory, or other types of memory such as flash-based memory or memristor-based memory, for example. The scan processor 150 initially stores the input document image 111 in the document representation buffer 182, decoding and/or decompressing the input document image 111 where necessitated by the format of the input document image 111.
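The partitioning of the line candidate bitmap 183 into non-overlapping tiles along a regular grid might look like the following sketch; the tile size and the zero-padding of the right and bottom edges are our own assumptions, not fixed by the specification:

```python
import numpy as np

def tile_bitmap(bitmap, tile_size):
    """Partition a 2-D bitmap into non-overlapping square tiles along a
    regular grid, zero-padding the right and bottom edges so that every
    tile is full size."""
    h, w = bitmap.shape
    pad_h = (-h) % tile_size
    pad_w = (-w) % tile_size
    padded = np.pad(bitmap, ((0, pad_h), (0, pad_w)))
    rows = padded.shape[0] // tile_size
    cols = padded.shape[1] // tile_size
    # Result shape: (tile_rows, tile_cols, tile_size, tile_size)
    return (padded.reshape(rows, tile_size, cols, tile_size)
                  .swapaxes(1, 2))

tiles = tile_bitmap(np.ones((70, 100), dtype=np.uint8), 32)
# A 70x100 bitmap with 32-pixel tiles yields a 3x4 grid of tiles.
```

Each tile of the returned grid would then own its entry in the per-tile memory 185 (projection profiles 186 and line segments 187).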
In one configuration of the system 100, the preprocessing unit 160 performs image processing tasks, such as colour quantisation, on the document image 111 and writes the resulting preprocessed image back to the document representation buffer 182. Next, the line extraction module 170 operates on the (preprocessed) document image from the buffer 182 to find lines. In the process of finding the lines, the line extraction module 170 generates the collection of CCs 181 of the document image. Connected component collection generation is described in more detail later. Using a subset of the CCs in the CC collection 181, a tiled line candidate bitmap 183 is formed. The line extraction module 170 analyses each tile of the tiled line candidate bitmap 183 and records projection profiles 186 and line segments 187 of each tile in the per-tile memory 185. Next, the line extraction module 170 forms lines 184 by combining line segments 187. Finally, the scan processor 150 produces an electronic document 190 using the extracted lines 184.

In various configurations of the system 100, the preprocessing unit 160 and the line extraction module 170 are implemented, either separately or together, as application-specific integrated circuits (ASICs), on embedded processors, on general purpose computers, or using other such platforms. In one configuration of the system 100, the scanner 120 and the scan processor 150 are integrated into a single device such as a multi-function printer (MFP). The local data store 130 may be formed within the MFP and may be used to temporarily cache documents 110 that have been scanned by the scanner 120 but not yet processed by the scan processor 150. The format of the produced electronic document 190 may be chosen by the user operating the MFP, and the document analysis tasks performed by the scan processor 150 may be implied by the user's choice. One of these tasks may be line extraction, as performed by the line extraction module 170.
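The specification selects line candidates "according to geometric properties" of each connected component without fixing the exact test at this point. One plausible illustration, with thresholds that are entirely our own, is an elongation-and-thinness check on a CC's bounding box and pixel count:

```python
def is_line_like(width, height, pixel_count,
                 min_aspect=5.0, max_thickness=6, min_fill=0.7):
    """Illustrative geometric filter for line-like connected components:
    the bounding box must be elongated, its short side must be thin, and
    the CC must fill most of the box (rejecting sparse noise blobs).
    All thresholds are hypothetical examples."""
    long_side, short_side = max(width, height), min(width, height)
    if short_side == 0 or pixel_count == 0:
        return False
    aspect = long_side / short_side
    fill = pixel_count / float(width * height)
    return (aspect >= min_aspect and
            short_side <= max_thickness and
            fill >= min_fill)

# A 100x3 horizontal stroke qualifies; a 20x18 text blob does not.
```

A two-level version of such a test, with stricter and looser thresholds, would yield the high-confidence and low-confidence line-like CCs described in the summary.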
Generic Implementation

Whilst the arrangements to be described consistent with Fig. 1 may be implemented in devices such as scanners, copiers and multi-function printers, a generic implementation of the processing is one formed within a general purpose computer to which appropriate interface devices may be coupled. Figs. 11A and 11B depict a general-purpose computer system 1100, upon which the various arrangements described can be practised.

As seen in Fig. 11A, the computer system 1100 includes: a computer module 1101; input devices such as a keyboard 1102, a mouse pointer device 1103, the scanner 120, the camera 140, and a microphone 1180; and output devices including a printer 1115, a display device 1114 and loudspeakers 1117. An external Modulator-Demodulator (Modem) transceiver device 1116 may be used by the computer module 1101 for communicating to and from a communications network 1120 via a connection 1121. The communications network 1120 may be a wide-area network (WAN), such as the Internet, a cellular telecommunications network, or a private WAN. Where the connection 1121 is a telephone line, the modem 1116 may be a traditional "dial-up" modem. Alternatively, where the connection 1121 is a high capacity (e.g., cable) connection, the modem 1116 may be a broadband modem. A wireless modem may also be used for wireless connection to the communications network 1120.

The computer module 1101 typically includes at least one processor unit 1105, and a memory unit 1106. For example, the memory unit 1106 may have semiconductor random access memory (RAM) and semiconductor read only memory (ROM). The computer module 1101 also includes a number of input/output (I/O) interfaces including: an audio-video interface 1107 that couples to the video display 1114, loudspeakers 1117 and microphone 1180; an I/O interface 1113 that couples to the keyboard 1102, mouse 1103, scanner 120, camera 140 and optionally a joystick or other human interface device (not illustrated); and an interface 1108 for the external modem 1116 and printer 1115. In some implementations, the modem 1116 may be incorporated within the computer module 1101, for example within the interface 1108. The computer module 1101 also has a local network interface 1111, which permits coupling of the computer system 1100 via a connection 1123 to a local-area communications network 1122, known as a Local Area Network (LAN). As illustrated in Fig. 11A, the local communications network 1122 may also couple to the wide network 1120 via a connection 1124, which would typically include a so-called "firewall" device or device of similar functionality. The local network interface 1111 may comprise an Ethernet™ circuit card, a Bluetooth™ wireless arrangement or an IEEE 802.11 wireless arrangement; however, numerous other types of interfaces may be practised for the interface 1111.

The I/O interfaces 1108 and 1113 may afford either or both of serial and parallel connectivity, the former typically being implemented according to the Universal Serial Bus (USB) standards and having corresponding USB connectors (not illustrated). Storage devices 1109 are provided and typically include a hard disk drive (HDD) 1110. Other storage devices such as a floppy disk drive and a magnetic tape drive (not illustrated) may also be used. An optical disk drive 1112 is typically provided to act as a non-volatile source of data. Portable memory devices, such as optical disks (e.g., CD-ROM, DVD, Blu-ray Disc™), USB-RAM, portable external hard drives, and floppy disks, for example, may be used as appropriate sources of data to the system 1100.

The components 1105 to 1113 of the computer module 1101 typically communicate via an interconnected bus 1104 and in a manner that results in a conventional mode of operation of the computer system 1100 known to those in the relevant art. For example, the processor 1105 is coupled to the system bus 1104 using a connection 1118. Likewise, the memory 1106 and optical disk drive 1112 are coupled to the system bus 1104 by connections 1119. Examples of computers on which the described arrangements can be practised include IBM-PCs and compatibles, Sun Sparcstations, Apple Macs or like computer systems.

In the computer system 1100 of Fig. 11A, the previously described data storage system 130 may be formed by a server (not illustrated) coupled in one of the networks 1120 and 1122, a combination of the optical drive 1112 and optical disk 1125, or the HDD 1110, all of which may be configured to provide the input document image 111 to the computer module 1101 for line extraction processing. Further, the memory 180 may be implemented using the memory 1106, possibly in concert with the HDD 1110.

The methods of detecting lines and line extraction may typically be implemented using the computer system 1100, wherein the processes of Figs. 2 to 10, to be described, may be implemented as one or more software application programs 1133 executable within the computer system 1100. In particular, the steps of the methods of detecting lines and line extraction are effected by instructions 1131 (see Fig. 11B) in the software 1133 that are carried out within the computer system 1100. The software instructions 1131 may be formed as one or more code modules, each for performing one or more particular tasks. The software may also be divided into two separate parts, in which a first part and the corresponding code modules perform the detecting lines and line extraction methods and a second part and the corresponding code modules manage a user interface between the first part and the user.

The software may be stored in a computer readable medium, including the storage devices described below, for example. The software is loaded into the computer system 1100 from the computer readable medium, and then executed by the computer system 1100. A computer readable medium having such software or computer program recorded on the computer readable medium is a computer program product. The use of the computer program product in the computer system 1100 preferably effects an advantageous apparatus for detecting lines and line extraction.

The software 1133 is typically stored in the HDD 1110 or the memory 1106. Thus, for example, the software 1133 may be stored on an optically readable disk storage medium (e.g., CD-ROM) 1125 that is read by the optical disk drive 1112. In some instances, the application programs 1133 may be supplied to the user encoded on one or more CD-ROMs 1125 and read via the corresponding drive 1112, or alternatively may be read by the user from the networks 1120 or 1122. Still further, the software can also be loaded into the computer system 1100 from other computer readable media.
Computer readable storage media refers to any non-transitory tangible storage medium that provides recorded instructions and/or data to the computer system 1100 for execution and/or processing. Examples of such storage media include floppy disks, 25 magnetic tape, CD-ROM, DVD, Blu-rayTM Disc, a hard disk drive, a ROM or integrated circuit, USB memory, a magneto-optical disk, or a computer readable card such as a PCMCIA card and the like, whether or not such devices are internal or external of the computer module 1101. Examples of transitory or non-tangible computer readable transmission media that may also participate in the provision of software, application 30 programs, instructions and/or data to the computer module 1101 include radio or infra-red transmission channels as well as a network connection to another computer or networked device, and the Internet or Intranets including e-mail transmissions and information recorded on Websites and the like. 5816796_1 P015747_speci_lodge - 11 The second part of the application programs 1133 and the corresponding code modules mentioned above may be executed to implement one or more graphical user interfaces (GUIs) to be rendered or otherwise represented upon the display 1114. Through manipulation of typically the keyboard 1102 and the mouse 1103, a user of the computer 5 system 1100 and the application may manipulate the interface in a functionally adaptable manner to provide controlling commands and/or input to the applications associated with the GUI(s). Other forms of functionally adaptable user interfaces may also be implemented, such as an audio interface utilizing speech prompts output via the loudspeakers 1117 and user voice commands input via the microphone 1180. 10 Fig. 11B is a detailed schematic block diagram of the processor 1105 and a "memory" 1134. 
The memory 1134 represents a logical aggregation of all the memory modules (including the HDD 1109 and semiconductor memory 1106) that can be accessed by the computer module 1101 in Fig. 11 A. When the computer module 1101 is initially powered up, a power-on self-test 15 (POST) program 1150 executes. The POST program 1150 is typically stored in a ROM 1149 of the semiconductor memory 1106 of Fig. 11 A. A hardware device such as the ROM 1149 storing software is sometimes referred to as firmware. The POST program 1150 examines hardware within the computer module 1101 to ensure proper functioning and typically checks the processor 1105, the memory 1134 (1109, 1106), and a 20 basic input-output systems software (BIOS) module 1151, also typically stored in the ROM 1149, for correct operation. Once the POST program 1150 has run successfully, the BIOS 1151 activates the hard disk drive 1110 of Fig. 11 A. Activation of the hard disk drive 1110 causes a bootstrap loader program 1152 that is resident on the hard disk drive 1110 to execute via the processor 1105. This loads an operating system 1153 into the 25 RAM memory 1106, upon which the operating system 1153 commences operation. The operating system 1153 is a system level application, executable by the processor 1105, to fulfil various high level functions, including processor management, memory management, device management, storage management, software application interface, and generic user interface. 30 The operating system 1153 manages the memory 1134 (1109, 1106) to ensure that each process or application running on the computer module 1101 has sufficient memory in which to execute without colliding with memory allocated to another process. Furthermore, the different types of memory available in the system 1100 of Fig. 11 A must 5816796_1 P015747_specilodge - 12 be used properly so that each process can run effectively. 
Accordingly, the aggregated memory 1134 is not intended to illustrate how particular segments of memory are allocated (unless otherwise stated), but rather to provide a general view of the memory accessible by the computer system 1100 and how such is used. 5 As shown in Fig. 11 B, the processor 1105 includes a number of functional modules including a control unit 1139, an arithmetic logic unit (ALU) 1140, and a local or internal memory 1148, sometimes called a cache memory. The cache memory 1148 typically include a number of storage registers 1144 - 1146 in a register section. One or more internal busses 1141 functionally interconnect these functional modules. The 10 processor 1105 typically also has one or more interfaces 1142 for communicating with external devices via the system bus 1104, using a connection 1118. The memory 1134 is coupled to the bus 1104 using a connection 1119. The application program 1133 includes a sequence of instructions 1131 that may include conditional branch and loop instructions. The program 1133 may also include 15 data 1132 which is used in execution of the program 1133. The instructions 1131 and the data 1132 are stored in memory locations 1128, 1129, 1130 and 1135, 1136, 1137, respectively. Depending upon the relative size of the instructions 1131 and the memory locations 1128-1130, a particular instruction may be stored in a single memory location as depicted by the instruction shown in the memory location 1130. Alternately, an instruction 20 may be segmented into a number of parts each of which is stored in a separate memory location, as depicted by the instruction segments shown in the memory locations 1128 and 1129. In general, the processor 1105 is given a set of instructions which are executed therein. The processor 1105 waits for a subsequent input, to which the processor 1105 25 reacts to by executing another set of instructions. 
Each input may be provided from one or more of a number of sources, including data generated by one or more of the input devices 1102, 1103, data received from an external source across one of the networks 1120, 1102, data retrieved from one of the storage devices 1106, 1109 or data retrieved from a storage medium 1125 inserted into the corresponding reader 1112, all depicted in Fig. 11A. The execution of a set of the instructions may in some cases result in output of data. Execution may also involve storing data or variables to the memory 1134.
The disclosed detecting lines and line extraction arrangements use input variables 1154, which are stored in the memory 1134 in corresponding memory locations 1155, 1156, 1157. The detecting lines and line extraction arrangements produce output variables 1161, which are stored in the memory 1134 in corresponding memory locations 1162, 1163, 1164. Intermediate variables 1158 may be stored in memory locations 1159, 1160, 1166 and 1167.
Referring to the processor 1105 of Fig. 11B, the registers 1144, 1145, 1146, the arithmetic logic unit (ALU) 1140, and the control unit 1139 work together to perform sequences of micro-operations needed to perform "fetch, decode, and execute" cycles for every instruction in the instruction set making up the program 1133. Each fetch, decode, and execute cycle comprises:
(a) a fetch operation, which fetches or reads an instruction 1131 from a memory location 1128, 1129, 1130;
(b) a decode operation in which the control unit 1139 determines which instruction has been fetched; and
(c) an execute operation in which the control unit 1139 and/or the ALU 1140 execute the instruction.
Thereafter, a further fetch, decode, and execute cycle for the next instruction may be executed. Similarly, a store cycle may be performed by which the control unit 1139 stores or writes a value to a memory location 1132.
Each step or sub-process in the processes of Figs. 
2 to 10 is associated with one or more segments of the program 1133 and is performed by the register section 1144, 1145, 1147, the ALU 1140, and the control unit 1139 in the processor 1105 working together to perform the fetch, decode, and execute cycles for every instruction in the instruction set for the noted segments of the program 1133.
The one or more parts of the methods of detecting lines and line extraction may alternatively be implemented in dedicated hardware such as one or more integrated circuits performing the functions or sub-functions of detecting lines and line extraction. Such dedicated hardware may include graphic processors, digital signal processors, or one or more microprocessors and associated memories. Nevertheless, in a typical implementation, operations of the scan processor 150 including the preprocessing unit 160 and the line extraction module 170 are performed in software, for example by way of the application program 1133.
Line extraction
Fig. 2 illustrates a method 200 of extracting lines, as can be implemented in the line extraction module 170 of the system 100. In the computer system 1100 of Figs. 11A and 11B, the extraction module 170 and thus the method 200 may be implemented entirely in software as the application program 1133. The method 200 has a CC generating step 210, a CC classifying step 220, a tiled line candidate bitmap composition step 230, a line segment detecting step 235, a line forming step 260, and an optional line type determining step 270. The line segment detecting step 235 comprises a reference point locating substep 240 and a segment measure determining substep 250. The optional line type determining step 270 is indicated by dashed lines. The method 200 ends at step 299.
The CC generating step 210 processes pixels of the document image 111 in a raster scan order and connects adjacent pixels that meet a connectedness criterion as described hereinafter. 
In a specific implementation, the CC generating step 210 operates on a colour quantised document image as provided by the preprocessing unit 160 of Fig. 1, and the CC generating step 210 connects adjacently located pixels meeting the connectedness criterion, preferably being that the pixels share the same quantised colour. In another implementation, the document image 111 is not preprocessed by the preprocessing unit 160, and the CC generating step 210 connects adjacently located pixels meeting a connectedness criterion that the pixels are substantially similar in colour. Two colours are substantially similar if their colour difference is relatively small compared to the colour difference between a foreground object and a background object, which can be determined by checking whether the difference between the two colours is within a predefined colour difference threshold. Thus, the connected components are generated from the colour bitmap image 111 by grouping substantially similarly coloured and adjacent pixels. In this manner, CCs are formed. Each CC is written to the connected component memory 181 of Fig. 1.
The CC classifying step 220 determines which types of content best characterise each CC, using attributes of the connected components described hereinafter. More particularly, this step 220 classifies the CCs by assigning a class (e.g., text, image, logo, etc.) to each of the CCs formed by the CC generating step 210. An assigned class indicates that a CC has characteristics associated with that specific type of content. In a particular implementation, the CC classifying step 220 analyses CCs based on statistical measures relating to those CCs, independent of each CC's colour, and assigns each CC a class based on those statistical measures using existing techniques. 
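By way of illustration, the pixel-grouping behaviour described above can be sketched in Python as follows. This is a minimal sketch only, not the implementation of the specification: the function name, the use of 4-connectedness, and the per-channel threshold parameter are illustrative assumptions.

```python
from collections import deque

def connected_components(image, threshold=0):
    """Group adjacent pixels of substantially similar colour into CCs.

    image: 2-D list of colour tuples. threshold: maximum per-channel
    difference for two colours to be considered similar (0 reproduces
    the quantised-colour case, where adjacent pixels must match exactly).
    Returns a list of components, each a list of (row, col) pixels.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            # Breadth-first flood fill from the unvisited seed pixel.
            queue, component = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < rows and 0 <= nx < cols and not seen[ny][nx]:
                        if all(abs(a - b) <= threshold
                               for a, b in zip(image[ny][nx], image[y][x])):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(component)
    return components
```

In practice the step 210 processes pixels in raster scan order with a single-pass labelling scheme; the flood fill above merely shows the grouping criterion.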
The statistical measures used may be the attributes of the CC, such as the number of pixels that make up the CC, the height and the width of a bounding box of the CC, and a fill ratio of the bounding box of the CC. The bounding box of a CC is the minimum rectangle that completely encloses the CC. The fill ratio is the ratio of the number of pixels belonging to the CC in the bounding box to the total number of pixels in the bounding box. A classifier can be trained to classify a CC based on the statistical measures into its most likely class. The class assigned to each CC is added to the connected component memory 181 of Fig. 1.
The tiled line candidate bitmap composition step 230 selects CCs from the connected component memory 181 and composes a bitmap by rasterising the selected CCs. The selected CCs are chosen based on geometric properties that indicate that they are line-like. The resulting bitmap therefore principally contains a representation of CCs that are line constituents. The resulting bitmap is then divided into non-overlapping regular tiles, and stored as the tiled line candidate bitmap 183 in the memory 180. The composition step 230 is described in more detail later.
The line segment detecting step 235 detects short line segments exhibited by individual tiles of the tiled line candidate bitmap 183. Each detected short line segment is described by at least a reference point that localises the line segment within its tile, and a measure derived from a polar transform of the tile about the reference point. Detected line segments and their descriptions are stored as the line segments 187 of the per-tile memory 185. 
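The attributes named above (pixel count, bounding-box height and width, and fill ratio) can be computed directly from a CC's pixel list, as in the following sketch; the function name and the dictionary layout are illustrative assumptions.

```python
def cc_attributes(pixels):
    """Compute the classification attributes of one CC.

    pixels: list of (row, col) coordinates belonging to the CC.
    Returns the pixel count, the bounding-box height and width, and
    the fill ratio (CC pixels over bounding-box area).
    """
    rows = [p[0] for p in pixels]
    cols = [p[1] for p in pixels]
    height = max(rows) - min(rows) + 1   # bounding box: minimum enclosing rectangle
    width = max(cols) - min(cols) + 1
    fill_ratio = len(pixels) / (height * width)
    return {"count": len(pixels), "height": height,
            "width": width, "fill_ratio": fill_ratio}
```

For example, a three-pixel diagonal stroke occupies a 3 x 3 bounding box and so has a fill ratio of one third, which is one reason thin oblique CCs score low on this attribute.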
The line segment detecting step comprises a reference point locating substep 240, which writes at least a projection profile 186 to the per-tile memory 185 and uses it to locate reference points corresponding to projection profile peaks, and a segment measure determining substep 250, which creates and analyses a polar transform to determine a measure corresponding to each individual line segment. According to one implementation, the determined measure is the angle of each line segment with respect to the axes of the document image. The substeps 240 and 250 of the line segment detecting step 235 are described in more detail later.
The line forming step 260 combines line segments 187 from adjacent tiles to form lines, where the combining process is guided by the measures determined by the segment measure determining substep 250. As part of the line forming step, characterising information about the formed lines may also be produced, such as, for example, the width, length, colour, etc. of the formed lines. The formed lines and their associated characterising information are written to the corresponding line storage location in the line collection 184 in the memory 180. The line forming step 260 is described in more detail later.
The optional line type determining step 270 analyses the characterising information associated with lines formed by the line forming step 260, and annotates each line in the line storage location 184 with the line types that best describe the line. In one implementation, the line types are solid, dashed (i.e. deliberately non-solid) and underline (i.e. running under text in order to accentuate that text), while an alternative implementation uses broken lines to include lines that were intended to be non-solid and lines intended to be solid but that have breaks in them caused by a process such as scanning. 
In implementations where the optional line type determining step 270 is performed, further optional steps may also be performed in order to produce further line characterising information that can better determine a line type. For example, the optional transition counting step 430, described later, is one such further optional step. The method 200 of finding lines in a bitmap image of a document ends at step 299.
Tiled line candidate bitmap composition
Fig. 3 illustrates in detail the tiled line candidate bitmap composition step 230 of Fig. 2. Step 230 preferably comprises a high-confidence line candidate selection step 310, a first selected line candidate traversal loop (with a first initialiser 320 and a first iterator 325), a high-confidence line candidate marking step 330, a high-confidence line candidate tagging step 340, a low-confidence line candidate selection step 350, a second selected line candidate traversal loop (with a second initialiser 360 and a second iterator 365), a low-confidence line candidate marking step 370, and a bitmap division step 380. A "line candidate" is a CC that exhibits geometric properties similar to CCs that constitute a line. A line candidate can be either "high-confidence", meaning that there is a high likelihood it is a line constituent, or "low-confidence", meaning that there is a reduced yet nonetheless significant likelihood it is a line constituent.
First, the tiled line candidate bitmap composition step 230 creates a bitmap of high-confidence line candidates. To begin with, step 230 writes an empty bitmap to an impermanent location in the memory 180. This impermanent location is referred to as the "temporary bitmap", and corresponds in size to the document image 111, and is preserved throughout the tiled line candidate bitmap composition step 230. 
Then, the high-confidence line candidate selection step 310 examines each of the CCs in the collection of CCs 181 of the memory 180, and selects those CCs that are high-confidence line candidates. Desirably, a CC is selected based on its assigned class and possibly further geometric properties of the CC, where the selection criteria are intended to target constituent CCs of common line types.
For example, well-defined solid lines are often single CCs correctly classified as either lines or table frames using existing techniques, so CCs of these classes can be selected as high-confidence line candidates without further examination. However, the constituent CCs of fragmented solid lines and dashed lines are often misclassified as text, due to a geometric similarity between short line segments and characters such as "l" (the lower case letter L) or "1" (the numeral one). Thus the high-confidence line candidate selection step 310 further examines the geometric properties of CCs classified as text, to determine if those CCs should be selected. A CC classified as text is selected as line-like if the CC further:
(i) is formed of an appropriate number of pixels, dependent on the image resolution (e.g. more than 4 pixels and fewer than 1000 pixels at a resolution of 300 dots per inch);
(ii) encloses no other CCs; and
(iii) has a fill ratio at least as great as the fill ratio of a circle, with some leeway for imperfect quantisation.
A person skilled in the art will recognise that the high-confidence line candidate selection step 310 could equivalently use a different combination of geometric CC properties to achieve a similar selection. 
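A minimal sketch of such a selection predicate is given below, using the example criteria and 300 dots per inch thresholds above. The class labels, the circle fill ratio of pi/4 (a circle inscribed in its bounding box), and the leeway value are illustrative assumptions.

```python
import math

def is_high_confidence_candidate(cc):
    """Selection sketch for step 310 using the example criteria above.

    cc: dict with 'class', 'count', 'fill_ratio' and 'encloses' (the
    number of CCs enclosed by this CC). Thresholds follow the 300 dpi
    example figures; the leeway value is an assumption.
    """
    if cc["class"] in ("line", "table_frame"):
        return True  # well-defined solid lines need no further examination
    if cc["class"] != "text":
        return False
    CIRCLE_FILL = math.pi / 4  # fill ratio of a circle in its bounding box
    LEEWAY = 0.05              # assumed allowance for imperfect quantisation
    return (4 < cc["count"] < 1000            # criterion (i)
            and cc["encloses"] == 0           # criterion (ii)
            and cc["fill_ratio"] >= CIRCLE_FILL - LEEWAY)  # criterion (iii)
```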
The first selected line candidate traversal loop (formed between the first initialiser 320 and the first iterator 325) causes the high-confidence line candidate marking step 330 and the high-confidence line candidate tagging step 340 to be performed for each of the line candidates selected by the high-confidence line candidate selection step 310.
The high-confidence line candidate marking step 330 rasterises the current line candidate to the temporary bitmap. In a preferred implementation of step 330, the line candidate is converted to a binary mask, and logically ORed onto the temporary bitmap, thus marking the temporary bitmap with the current line candidate.
The high-confidence line candidate tagging step 340 annotates the current CC in the collection of CCs 181 in the memory 180 to note that the current CC is a high-confidence line candidate.
Secondly, the tiled line candidate bitmap composition step 230 further composes the temporary bitmap with low-confidence line candidates. The low-confidence line candidate selection step 350 examines each of the CCs in the collection of CCs 181 of the memory 180, and selects those that are low-confidence line candidates. Step 350 proceeds similarly to the high-confidence line candidate selection step 310, but with different selection criteria intended to target constituent CCs of line types that are difficult to distinguish.
For example, underlines are lines that adjoin text. The geometric properties of underlines are difficult to characterise, and underlines may be assigned a wide variety of classes using existing techniques. Thus a CC is selected if the CC:
(i) has a large aspect ratio;
(ii) has a small fill ratio (as adhered text characters will expand its bounding box); and
(iii) does not touch a CC that is tagged as a high-confidence line candidate in the collection of CCs 181 of the memory 180 (as a lower-confidence line candidate may disrupt a high-confidence line). 
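The low-confidence criteria can similarly be sketched as a predicate. The numeric thresholds for a "large" aspect ratio and a "small" fill ratio below are illustrative assumptions only, as the specification gives no values for them.

```python
def is_low_confidence_candidate(cc, touches_high_confidence):
    """Selection sketch for step 350; thresholds are assumed values.

    cc: dict with 'width', 'height' and 'fill_ratio'.
    touches_high_confidence: True if the CC touches a CC already tagged
    as a high-confidence line candidate (criterion (iii)).
    """
    aspect = max(cc["width"], cc["height"]) / min(cc["width"], cc["height"])
    return (aspect >= 8                    # (i) large aspect ratio (assumed)
            and cc["fill_ratio"] <= 0.5    # (ii) small fill ratio (assumed)
            and not touches_high_confidence)  # (iii) keeps high-confidence lines intact
```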
The second selected line candidate traversal loop (defined by the second initialiser 360 and the second iterator 365) causes the low-confidence line candidate marking step 370 to be performed for each of the line candidates selected by the low-confidence line candidate selection step 350. The low-confidence line candidate marking step 370 behaves equivalently to the high-confidence line candidate marking step 330.
Thirdly, the tiled line candidate bitmap composition step 230 divides the temporary bitmap into non-overlapping regular tiles. This is performed by the bitmap division step 380. In a specific implementation, the tiles are squares of size 32 x 32 pixels for a document image resolution of 300 dots per inch. Thresholds introduced later will be given with reference to tiles of this shape and size, however a person skilled in the art will understand that tiles of different shape and size could be used with minor alterations to the procedure described below. The divided tiles are written to the tiled line candidate bitmap 183 of the memory 180.
Line segment detection
Fig. 4 illustrates in detail the reference point locating substep 240 of Fig. 2. Step 240 comprises a tile loop (defined by the steps of an initialiser 410 and an iterator 415), where the tile loop comprises a projection determining step 420, an optional transition counting step 430, a projection peak locating step 440, a line segment defining step 450, and a reference point determining step 460.
The tile loop 410 and 415 iterates over each of the tiles of the tiled line candidate bitmap 183 in an arbitrary order. In other words, the steps 420, optionally 430, 440, 450 and 460 are carried out independently for each tile. The projection determining step 420 determines at least one projection profile of the current tile of the tile loop 410. 
A projection profile is calculated by: the processor 1105 determining a projection axis onto which the tile should be projected; the processor 1105 calculating the projection vectors as vectors normal to the projection axis; and the processor 1105 summing the tile values along each projection vector to form a histogram with respect to the projection axis. In a preferred implementation, two projection profiles are calculated: a horizontal projection profile (i.e. a projection profile where the projection vectors are horizontal and the projection axis is vertical, such that a horizontal line is represented by a peak of the projection profile) and a vertical projection profile. These are written to a corresponding storage location in the projection profiles 186 of the per-tile memory 185 corresponding to the current tile.
Fig. 8A and Fig. 8B collectively illustrate the projection determining step 420. Fig. 8A shows the elements of a horizontal projection profile of a 5 x 5 pixel tile 800. The pixel boundaries are illustrated by dashed lines, and the tile boundary is illustrated by solid lines. The projection axis 810, illustrated by a dotted line, is vertically oriented, and each of the five projection vectors 820, illustrated by an arrow, is horizontally oriented. There is one projection vector for each row of the tile. The horizontal projection profile is calculated by summing, for each projection vector, the values of each pixel that the projection vector coincides with. As shown by Fig. 8A, for a horizontal projection, each value of the projection profile is the sum of all the pixel values of a row.
Fig. 8B illustrates an example of the horizontal projection profile 802 and vertical projection profile 803 of a 5 x 5 pixel tile 801. As in Fig. 8A, the pixel boundaries are illustrated by dashed lines 850, and the tile boundary is illustrated by solid lines 860. Unfilled pixels 830 illustrate pixels that are cleared (i.e. 
with a value of 0), and crosshatched pixels 840 illustrate pixels that are set (i.e. with a value of 1). The horizontal projection profile 802 of the tile 801 indicates that there is 1 set pixel in the first row, no set pixels in the second row, 5 set pixels in the third row, 4 set pixels in the fourth row, and no set pixels in the fifth row. The vertical projection profile 803 of the tile 801 indicates that there are 2 set pixels in the first column, 1 set pixel in the second column, 3 set pixels in the third column, 2 set pixels in the fourth column, and 2 set pixels in the fifth column. Note that the apparent line in the third and fourth rows of the tile 801 corresponds with a relative peak in the horizontal projection profile 802.
The optional transition counting step 430 counts transitions of the current tile of the tile loop 410, 415. This is useful for the optional line type determining step 270 of Fig. 2, which is described in more detail later. For binary tiles, a transition is a change from a black (set) pixel to a white (cleared) pixel or from a white (cleared) pixel to a black (set) pixel. The transitions are counted along the same projection vectors as were used in the projection determining step 420, and are written alongside the projection profiles 186 of the per-tile memory 185 corresponding to the current tile. Referring again to Fig. 8B, the first row of the tile 801 has two transitions, as there is one change from an unset pixel to a set pixel (columns two and three), and another change from a set pixel to an unset pixel (columns three and four) along that row. Similarly, the fourth row has two transitions (between columns one and two, and between columns two and three). The second, third and fifth rows each have no transitions, as each of those rows contains no change from a set pixel to a cleared pixel or vice versa. 
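The projection profiles of step 420 and the transition counts of step 430 can be reproduced for the tile 801 of Fig. 8B as follows. The tile contents are a reconstruction consistent with the row and column counts and transition positions stated above; the function names are illustrative.

```python
def projection_profiles(tile):
    """Horizontal and vertical projection profiles of a binary tile
    (step 420): the row sums and the column sums of the pixel values."""
    horizontal = [sum(row) for row in tile]
    vertical = [sum(col) for col in zip(*tile)]
    return horizontal, vertical

def row_transitions(tile):
    """Transitions per row (optional step 430): counts of set/cleared
    changes between horizontally adjacent pixels."""
    return [sum(1 for a, b in zip(row, row[1:]) if a != b) for row in tile]

# Reconstruction of the 5 x 5 tile 801 of Fig. 8B: rows with 1, 0, 5,
# 4 and 0 set pixels, matching the stated transition positions.
tile_801 = [
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [1, 0, 1, 1, 1],
    [0, 0, 0, 0, 0],
]
horizontal, vertical = projection_profiles(tile_801)
```

The row sums reproduce the horizontal profile 802 (1, 0, 5, 4, 0), the column sums the vertical profile 803 (2, 1, 3, 2, 2), and the first and fourth rows each show two transitions, as described above.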
The projection peak locating step 440 identifies significant peaks of the at least one projection profile 186 corresponding to the current tile. A significant peak is a relatively high point or region of a projection profile. The projection peak locating step 440 requires that peaks meet certain criteria, such as, in one specific implementation:
(i) having a relative peak height above a threshold according to the tile geometry (e.g. two-fifths of the maximum possible height); and
(ii) having a peak width below a threshold (e.g. 16 pixels for a document image resolution of 300 dots per inch).
Returning to Fig. 4, the line segment defining step 450 writes a new line segment to the corresponding storage location in the line segments store 187, of the per-tile memory 185 corresponding to the current tile, for each of the projection peaks located by the projection peak locating step 440. Step 450 also records information about each located peak in order to define the line segments, for example comprising any one or more or combination of: peak region start and end positions, peak region width, maximum peak height, peak region integral, and minimum number of transitions within the peak region (if the optional transition counting step 430 was performed).
The reference point determining step 460 localises each of the line segments 187, of the per-tile memory 185 corresponding to the current tile, within the document image. According to a specific implementation of the methods described herein, a line segment is localised by selecting a mid-point, halfway along the tile of the line segment, in the dimension of the projection vector of the line segment, and halfway between the peak region start and end positions of the line segment along the dimension of the projection axis of the line segment. 
For example, a horizontal line segment would be localised halfway along the corresponding tile in the x-direction, and halfway between the start and end positions in the y-direction of the peak region of the line segment. Each reference point is appropriately annotated in the storage location associated with the line segment in the line segment store 187. Thus the reference point is determined using the peak location to localise along one axis, and using the mid-point of the current tile to localise along another axis.
Fig. 5 illustrates in detail the segment measure determining substep 250 of Fig. 2 according to a specific implementation where the segment measure is an angle. Step 250 comprises a tile loop (defined by a tile initialiser 510 and a tile iterator 515) that iterates through each of the tiles of the tiled line candidate bitmap 183 in an arbitrary order. For each iteration of the tile loop 510 and 515, a reference point loop (defined by a reference point initialiser 520 and a reference point iterator 525) is performed that iterates through each of the reference points of the line segments 187 of the current tile. The reference point loop 520 and 525 comprises a polar transform calculating step 530, an angular projection performing step 540, an angular projection wrapping step 550, an angular peak locating step 560, and a line segment angle estimating step 570.
The polar transform calculating step 530 calculates and performs a transform of the current tile into polar coordinates, centred on the current reference point. The result of the transform is a transformed tile, with each column of the transformed tile equivalent to a line of the current tile extending radially outwards from the reference point at a known angle, and each row of the transformed tile equivalent to a circle of the current tile centred on the reference point of a known radius. 
The number of columns of the transformed tile is chosen as a multiple of 4, so that a line passing through the reference point at an angle of 0 degrees, 90 degrees, 180 degrees or 270 degrees is equivalent to a column of the transformed tile. The transformed tile is annotated alongside the current reference point in the associated line segment's storage location 187.
Fig. 9A and Fig. 9B collectively illustrate the polar transform calculating step 530. Fig. 9A shows a 5 x 5 pixel tile 900. The pixel boundaries are illustrated by dashed lines 910, and the tile boundary is illustrated by solid lines 920. The central pixel 930 is marked to indicate it is a reference point. Fig. 9B shows the elements of a polar transform of the 5 x 5 pixel tile 901. As in Fig. 9A, the reference point is the central pixel. Radial lines 940 extending outwards from the reference point are indicated by solid lines, and circles 950 centred on the reference point are indicated by dotted lines. The top row of the transformed tile is generated by sampling the innermost dotted circle at each point the innermost dotted circle crosses a radial line, in order of increasing angle. Similarly, the second row of the transformed tile is generated by sampling the next innermost dotted circle at each point it crosses a radial line, in order of increasing angle, and so on. The sampling of a point requires the interpolation of pixels of the tile 901. In a preferred implementation, the interpolation uses nearest neighbour interpolation, but in other implementations may use bilinear interpolation, trilinear interpolation, or sinc interpolation, to name but a few alternatives. Fig. 9B shows radial lines with regularly spaced angles. In another implementation, the angles of the radial lines are not spaced regularly.
Returning to Fig. 
5, the angular projection performing step 540 creates a projection profile of the transformed tile produced by the polar transform calculating step 530. The created projection profile is an angular projection profile, where the projection axis is parallel to the rows of the transformed tile, and the projection vectors are along the columns of the transformed tile. Thus a peak of the angular projection profile corresponds to a line segment of constant angle passing through the current reference point in the current tile. The angular projection profile is annotated alongside the current reference point in the storage location of the line segment in the line segment store 187.
The angular projection wrapping step 550 sums pairs of rows of the angular projection profile created by the angular projection performing step 540, where the pairs of rows summed represent lines of angles separated by 180 degrees. That is, the row representing the line extending to the right of the current reference point in the current tile is summed with the row representing the line extending to the left, and so on. The resulting wrapped angular projection profile replaces the non-wrapped angular projection profile.
Fig. 10 illustrates an example of calculating the wrapped angular projection profile 1005 for a 5 x 5 pixel tile 1000. The tile 1000 of Fig. 10 is similar to the tile 901 of Fig. 9B, in that its central pixel is the reference point being considered. Radial lines 1040 extending outwards from the reference point, indicated by solid lines, and circles 1030 centred on the reference point, indicated by dotted lines, have been overlaid on the tile 1000 for the purpose of illustration. Unfilled pixels 1010 illustrate pixels that are cleared (i.e. with a value of 0), and crosshatched pixels 1020 illustrate pixels that are set (i.e. with a value of 1). 
The polar transform 1001 of the tile 1000 is formed using the same process described earlier with reference to Fig. 9B. The first row is formed by sampling along the innermost dotted circle (in this case entirely coincident with the reference pixel) of the tile 1000 starting at 0 degrees (the x-axis), and so on. The angular projection profile 1002 is formed by summing along each of the columns of the polar transform 1001. The angular projection profile 1002 is split into two halves: a left half 1003 representing angles in the half-open interval [0, 180 degrees) and a right half 1004 representing angles in the half-open interval [180, 360 degrees). The wrapped angular projection profile 1005 is formed by summing the first column of the left half 1003 with the first column of the right half 1004, the second column of the left half 1003 with the second column of the right half 1004, and so on. The wrapped angular projection profile 1005 represents angles in the half-open interval [0, 180 degrees). Note that the wrapped angular projection profile 1005 suggests a line segment of angle less than 180 degrees, which reflects the contents of the tile 1000.
Returning to Fig. 5, the angular peak locating step 560 finds peaks in the wrapped angular projection profile produced by the angular projection wrapping step 550. Angular peaks are located similarly to the manner employed by the projection peak locating step 440 of the reference point locating substep 240. However, further care is desirably taken when determining angular peak height. For example, if the current reference point is at the centre of a tile, a line segment oriented horizontally or vertically will be shorter within the tile than a line segment oriented at another angle. 
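The polar transform of step 530 and the angular projection and wrapping of steps 540 and 550 can be sketched together as follows. This is a minimal sketch under stated assumptions: nearest-neighbour sampling, integer pixel radii, and a small angle count (a multiple of 4, as above); the function names and parameters are illustrative.

```python
import math

def polar_transform(tile, ref, n_angles=16, n_radii=3):
    """Polar transform of a binary tile about a reference point
    (step 530). Each row samples a circle of growing radius; each
    column corresponds to one radial direction. Nearest-neighbour
    sampling; out-of-tile samples are treated as cleared."""
    rows, cols = len(tile), len(tile[0])
    out = []
    for radius in range(n_radii):  # radius 0 coincides with the reference pixel
        row = []
        for ai in range(n_angles):
            theta = 2 * math.pi * ai / n_angles
            y = ref[0] + radius * math.sin(theta)
            x = ref[1] + radius * math.cos(theta)
            yi, xi = round(y), round(x)
            row.append(tile[yi][xi] if 0 <= yi < rows and 0 <= xi < cols else 0)
        out.append(row)
    return out

def wrapped_angular_profile(polar):
    """Column sums of the polar image (step 540), with opposite
    angles, 180 degrees apart, summed together (step 550)."""
    n = len(polar[0])
    profile = [sum(row[i] for row in polar) for i in range(n)]
    return [profile[i] + profile[i + n // 2] for i in range(n // 2)]

# Example: a horizontal line through the centre of a 5 x 5 tile.
tile = [[0] * 5, [0] * 5, [1, 1, 1, 1, 1], [0] * 5, [0] * 5]
wrapped = wrapped_angular_profile(polar_transform(tile, (2, 2),
                                                  n_angles=8, n_radii=3))
```

For this example tile the wrapped profile peaks in its first bin, i.e. at 0 degrees, matching the horizontal line through the reference point.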
To ensure that all angles are considered equivalently, the wrapped angular projection profile is preferably normalised at each angle, according to the reference point, to the maximum possible angular peak height in a tile at that angle from the reference point. Following this normalisation, angular peak locating continues as before, also calculating angular peak information in a similar manner to the line segment defining step 450. The successful location of an angular peak validates the line segment associated with the current reference point, and thus that line segment is considered found. Step 560 thus analyses peaks in each angle of the polar transform to find a line segment passing the reference point.
The line segment angle estimating step 570 estimates the angle at which the line segment associated with the current reference point is oriented, the angle being a measure of the line segment. Step 570 examines the angular peaks located by the angular peak locating step 560, and selects the angular peak region most similar to the projection vector that located the line segment during the reference point locating substep 240. For example, if the line segment was located with a horizontal projection during the reference point locating substep 240 and has multiple angular peak regions in its wrapped angular projection profile, step 570 would select the angular peak region representing angles closest to the horizontal. Step 570 then expands the selected angular peak region to either side, and estimates the line segment angle using a weighted average of the angles in the expanded angular peak region, where the weights are the normalised values for those angles as calculated in the angular peak locating step 560. The estimated line segment angle is annotated in the storage location of the associated line segment in the line segment store 187.
Line formation
Fig. 6 illustrates in detail the line forming step 260 of Fig. 2. 
Step 260 comprises a line segment loop (defined by an initialiser step 610 and an iterator step 615), a neighbour location estimating step 620, an existence decision 630, and a line segment linking step 640.
The line segment loop defined by steps 610 and 615 iterates through the line segments 187 of every per-tile memory 185. For each line segment 187, the neighbour location estimating step 620 and the existence decision 630 are performed, and the line segment linking step 640 is possibly performed.
The neighbour location estimating step 620 estimates the location in the document image 111 of the continuation on either side of the current line segment, where the line segment is assumed to be part of a larger line. This estimation is performed using the reference point of the line segment and the measure of the line segment, determined in step 250. According to an implementation of the method 200 where the measure is an angle, the adjoining reference points are determined using trigonometry. For example, for a line segment that was located with a horizontal projection during the reference point locating substep 240, the next reference point to the right is found using:
x_next = x_ref + 32
y_next = y_ref + 32 * tan(angle)
where (x_ref, y_ref) is the reference point of the current line segment, (x_next, y_next) is the estimated position of the reference point to the right, and angle is the estimated angle of the current line segment as estimated by the line segment angle estimating step 570. The constant value of 32 is selected based on tile size, in this case for a tile of size 32 pixels.
The existence decision 630 searches the line segments 187 of every per-tile memory 185 to find any line segment that corresponds to the estimated neighbour reference locations given by the neighbour location estimating step 620. 
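The neighbour estimation formulas above can be written directly, as in the following sketch, which assumes the measure is an angle expressed in degrees and a 32 pixel tile; the function name is illustrative.

```python
import math

def next_reference_point(x_ref, y_ref, angle_degrees, tile_size=32):
    """Estimate the neighbouring reference point to the right (step 620)
    for a segment located by horizontal projection, per the formulas
    x_next = x_ref + tile_size and y_next = y_ref + tile_size * tan(angle)."""
    x_next = x_ref + tile_size
    y_next = y_ref + tile_size * math.tan(math.radians(angle_degrees))
    return x_next, y_next
```

A perfectly horizontal segment (angle of 0) yields a neighbour one tile to the right at the same height, while a segment at 45 degrees yields a neighbour one tile to the right and one tile down in image coordinates.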
The determination of whether a line segment under test corresponds to an estimated reference location uses a combination of:

(i) the distance between the estimated reference location and the reference location under test;
(ii) the difference in estimated angles of the current line segment and the line segment under test, according to an implementation where the line segment measure is an angle;
(iii) the difference in line widths of the current line segment and the line segment under test, as calculated from the angular peak region start and end positions of the line segments; and
(iv) other comparisons associated with the attributes of the line segments, such as the maximum heights of their associated peaks.

If any line segments are found to correspond to an estimated neighbour reference location, the line segment linking step 640 is performed on these neighbouring line segments; otherwise the line segment loop 610 and 615 continues with its next iteration.

The line segment linking step 640 links together neighbouring line segments, as determined by the existence decision 630, and the current line segment. If any of the neighbouring line segments or the current line segment is part of a line, that line is selected as the destination line; otherwise, a new line is written to the corresponding storage location in the line collection store 184 in the memory 180 and used as the destination line.

By choosing the order of iteration of the line segment loop (610 and 615) carefully, there should be at most one destination line for each iteration of the line segment loop (610 and 615). For example, the order of iteration of reference locations, for line segments that were located with a horizontal projection during the reference point locating substep 240, should be monotonic in the x-dimension.
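The combination of criteria (i) to (iii) might be sketched as below (hypothetical Python; the threshold values and field names are illustrative assumptions only, as the specification does not give concrete values):

```python
def corresponds(est, cand, max_dist=4.0, max_angle_diff=10.0, max_width_diff=2.0):
    """Decide whether a candidate line segment matches an estimated neighbour
    reference location, by combining positional distance, angle difference and
    width difference. `est` and `cand` are dicts with 'x', 'y', 'angle' and
    'width' entries (an assumed layout, not prescribed by the specification)."""
    dist = ((est['x'] - cand['x']) ** 2 + (est['y'] - cand['y']) ** 2) ** 0.5
    return (dist <= max_dist
            and abs(est['angle'] - cand['angle']) <= max_angle_diff
            and abs(est['width'] - cand['width']) <= max_width_diff)
```

Criterion (iv), comparing other attributes such as peak heights, would add further conjuncts of the same shape.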
The line segment linking step 640 then links each of the currently unlinked neighbouring line segments, and the current line segment if unlinked, into the destination line. The characterising information of the destination line is updated to reflect all constituent line segments of the destination line, such as by calculating the line width as the average width of all the constituent line segments, or by calculating the line angle using trigonometry to find the angle between the two most separated reference points of the constituent line segments of the destination line. This characterising information is annotated to the destination line. Each destination line is thus formed of at least two constituent line segments.

Optional line type determination

Fig. 7 illustrates in detail the optional line type determining step 270 of Fig. 2. Step 270 comprises a line loop (defined by an initialiser step 710 and an iterator step 715) that iterates through the lines 184 of the memory 180 and, for each iteration, performs a line feature calculation step 720, a dashed feature decision 730, possibly a dashed line marking step 740, an adhered text decision 750, and possibly an adhered text marking step 760.

The line feature calculating step 720 calculates further characterising information that describes the current line, where the further characterising information is useful for the purposes of line type determination. This further characterising information is calculated using all of the constituent line segments of the current line, which may include at least two line segments.

For example, in a specific implementation, the line feature calculating step 720 calculates a statistic comprising the total number of transitions along the line by summing the minimum number of line transitions in each constituent line segment's peak region, as were recorded by the line segment defining step 450.
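The per-line bookkeeping described above, namely the width and angle summary of the linking step 640 and the transition statistic of step 720, might be sketched as follows (hypothetical field names; the specification does not prescribe a data layout):

```python
import math

def characterise_line(segments):
    """Summarise a destination line from its constituent segments: width as
    the mean segment width and angle from the two most separated reference
    points (as suggested for step 640), plus the total-transitions statistic
    of step 720. Field names are illustrative assumptions."""
    width = sum(s['width'] for s in segments) / len(segments)
    pts = [(s['x'], s['y']) for s in segments]
    # choose the pair of reference points with the greatest separation
    (x1, y1), (x2, y2) = max(((p, q) for p in pts for q in pts),
                             key=lambda pq: (pq[0][0] - pq[1][0]) ** 2
                                          + (pq[0][1] - pq[1][1]) ** 2)
    angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
    transitions = sum(s['min_transitions'] for s in segments)
    return {'width': width, 'angle': angle, 'transitions': transitions}
```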
As another example, the line feature calculating step 720 further calculates a statistic comprising the sum of angular peak region integrals, as were determined by the angular peak locating step 560 following normalisation. This sum is divided by the sum of angular peak region widths, as were also determined by the angular peak locating step 560, to produce a line duty cycle.

As yet another example, the line feature calculating step 720 may further calculate a statistic comprising an estimate of the number of line segments that intersect the current line at angles other than right angles. This estimate is calculated by inspecting all the tiles that the current line passes through. Each line segment in those tiles that is not a constituent line segment of the current line is inspected, and a tally is incremented if the line segment:

(i) intersects the current line, and
(ii) has an angle that is not within a threshold of the angle + 90 degrees of the current line (e.g. with a threshold of, say, ±5 degrees).

The line feature calculating step 720 annotates all further characterising information and corresponding accumulated statistics to the current line.

The dashed feature decision 730 determines if the current line has statistical features that indicate it is a dashed line. According to one implementation, the current line is decided to have dashed features if:

(i) the total number of transitions of the line is high in proportion to the length of the current line (e.g. the line has more than 3 times as many transitions as the number of tiles the line passes through); and
(ii) the duty cycle of the line is below a threshold (e.g. 0.8).

If the dashed feature decision 730 decides that the current line has dashed features, the dashed line marking step 740 is performed; otherwise the dashed line marking step 740 is skipped.
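Decision 730 thus reduces to two comparisons; a minimal sketch using the example thresholds given above (more than 3 transitions per tile, duty cycle below 0.8 — the function name and parameterisation are assumptions):

```python
def is_dashed(num_transitions, num_tiles, duty_cycle,
              transition_factor=3, duty_threshold=0.8):
    """Dashed-feature decision: many transitions relative to line length
    (more than `transition_factor` per tile crossed) and a duty cycle
    below `duty_threshold`."""
    return (num_transitions > transition_factor * num_tiles
            and duty_cycle < duty_threshold)
```

A solid line over five tiles with two transitions and a duty cycle near 1.0 fails both tests, while a dashed line over the same tiles typically shows many transitions and a low duty cycle.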
The dashed line marking step 740 annotates the memory location of the current line in the line collection store 184 with an indication that the current line is a dashed line.

The adhered text decision 750 then determines if the current line has statistical features that indicate whether there is text attached to the current line (e.g. as may occur if the current line is an underline). In an exemplary implementation, the current line is determined to have adhered text features if the estimated number of line segments that intersect the current line at angles other than right angles is higher than a threshold relative to the length of the current line (e.g. the line has at least one such intersection for each five tiles that it passes through).

If the adhered text decision 750 decides that the current line has adhered text features, the adhered text marking step 760 is performed; otherwise the line loop 710 and 715 continues with the next iteration.

The adhered text marking step 760 annotates the memory location of the current line in the line collection store 184 with an indication that the current line has adhered text.

INDUSTRIAL APPLICABILITY

The arrangements described are applicable to the computer and data processing industries, and particularly to the analysis of documents and the generation of object representations of documents from scanned images of documents.

The foregoing describes only some embodiments of the present invention, and modifications and/or changes can be made thereto without departing from the scope and spirit of the invention, the embodiments being illustrative and not restrictive.

(Australia Only) In the context of this specification, the word "comprising" means "including principally but not necessarily solely" or "having" or "including", and not "consisting only of". Variations of the word "comprising", such as "comprise" and "comprises", have correspondingly varied meanings.
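The adhered-text decision 750 described above can be sketched in the same style, using the one-intersection-per-five-tiles example ratio (the function name and parameterisation are assumptions, not part of the specification):

```python
def has_adhered_text(num_oblique_intersections, num_tiles, tiles_per_hit=5):
    """Adhered-text decision: at least one non-right-angle intersection for
    every `tiles_per_hit` tiles the line passes through."""
    return num_oblique_intersections * tiles_per_hit >= num_tiles
```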

Claims (16)

  2. The method as claimed in claim 1, wherein said forming a tiled line candidate bitmap comprises the steps of:
(a) creating an empty line candidate bitmap corresponding to the digital image;
(b) selecting high-confidence line-like connected components;
(c) marking pixels corresponding to the selected high-confidence line-like connected components in the line candidate bitmap;
(d) selecting low-confidence line-like connected components;
(e) marking pixels corresponding to the selected low-confidence line-like connected components in the line candidate bitmap; and
(f) dividing the line candidate bitmap into tiles to form the tiled line candidate bitmap.
  3. The method as claimed in claim 1, wherein said detecting a line segment in each tile comprises the steps of:
(a) determining at least one of a horizontal projection profile and a vertical projection profile of the tile;
(b) locating a peak from the at least one horizontal and vertical projection profiles;
(c) determining a reference point based on the peak location and mid-point of the tile;
(d) calculating a polar transform about the reference point;
(e) analysing peaks in each angle of the polar transform to find a line segment passing the reference point; and
(f) creating the measure for the line segment based on the angle and distribution of distance from the reference point.
  4. A method as claimed in claim 1, wherein the forming of at least one said line comprises estimating a location of line segments of neighbouring tiles and, where one or more line segments exist at an estimated location, linking at least the neighbouring line segments to form a line of constituent line segments.
  5. A method as claimed in claim 4, wherein the forming comprises iterating the forming to link a current line segment to a previously formed line of constituent line segments.
  6. A method as claimed in claim 1, wherein the line is one of a plurality of line types, said method further comprising determining a line type for each of the formed lines based on accumulated statistics of the constituent line segments.
  7. A computer-implemented method of detecting a line of a plurality of types in a digital image, the digital image being processed to form connected components for adjacently located pixels of similar colour, the method comprising the steps of:
(a) forming a tiled line candidate bitmap of the digital image using line candidates, the line candidates being at least one of the connected components selected according to geometric properties of the at least one connected component;
(b) detecting a line segment in each tile of the tiled line candidate bitmap, said line segment passing a reference point corresponding to the peak formed from a projection profile of the tile with a measure being determined by a polar transform of the tile about the reference point;
(c) forming at least one line based on the reference point and the measure of line segments for at least two adjacent tiles; and
(d) determining a line type for each of the formed lines based on accumulated statistics of the constituent line segments.
  8. A computer readable storage medium having a program recorded thereon, the program being executable by computer apparatus to detect a line in a digital image, the digital image being processed to form connected components for adjacently located pixels of similar colour, the program comprising:
code for forming a tiled line candidate bitmap of the digital image using line candidates, the line candidates being at least one of the connected components selected according to geometric properties of the at least one connected component;
code for detecting a line segment in each tile of the tiled line candidate bitmap, said line segment passing a reference point corresponding to the peak formed from a projection profile of the tile with a measure being determined by a polar transform of the tile about the reference point; and
code for forming a line based on the reference point and the measure of line segments for at least two adjacent tiles.
  9. A computer readable storage medium as claimed in claim 8, wherein said code for forming a tiled line candidate bitmap comprises:
code for creating an empty line candidate bitmap corresponding to the digital image;
code for selecting high-confidence line-like connected components;
code for marking pixels corresponding to the selected high-confidence line-like connected components in the line candidate bitmap;
code for selecting low-confidence line-like connected components;
code for marking pixels corresponding to the selected low-confidence line-like connected components in the line candidate bitmap; and
code for dividing the line candidate bitmap into tiles to form the tiled line candidate bitmap.
  10. A computer readable storage medium as claimed in claim 9, wherein said code for detecting a line segment in each tile comprises:
code for determining at least one of a horizontal projection profile and a vertical projection profile of the tile;
code for locating a peak from the at least one horizontal and vertical projection profiles;
code for determining a reference point based on the peak location and mid-point of the tile;
code for calculating a polar transform about the reference point;
code for analysing peaks in each angle of the polar transform to find a line segment passing the reference point; and
code for creating the measure for the line segment based on the angle and distribution of distance from the reference point.
  11. A computer readable storage medium as claimed in claim 8, wherein the code for forming of at least one said line comprises code for estimating a location of line segments of neighbouring tiles and, where one or more line segments exist at an estimated location, code for linking at least the neighbouring line segments to form a line of constituent line segments.
  12. A computer readable storage medium as claimed in claim 11, wherein the code for forming comprises code for iterating the forming to link a current line segment to a previously formed line of constituent line segments.
  13. A computer readable storage medium as claimed in claim 8, wherein the line is one of a plurality of line types, and said program further comprises code for determining a line type for each of the formed lines based on accumulated statistics of the constituent line segments.
  14. Computer apparatus comprising:
an input for receiving a digital document image;
a memory in which the digital document image is stored;
a processor coupled to the input and the memory and operative to store the document image in the memory and to process the digital document image to detect a line in the digital image, said processor being configured to process the digital image to:
form connected components for adjacently located pixels of similar colour in the digital document image;
form a tiled line candidate bitmap of the digital image using line candidates, the line candidates being at least one of the connected components selected according to geometric properties of the at least one connected component;
detect a line segment in each tile of the tiled line candidate bitmap, said line segment passing a reference point corresponding to the peak formed from a projection profile of the tile with a measure being determined by a polar transform of the tile about the reference point; and
form a line based on the reference point and the measure of line segments for at least two adjacent tiles.
  15. A method of detecting a line in a digital image, the method being substantially as described herein with reference to any one of the embodiments as that embodiment is illustrated in the drawings.
  16. A computer program adapted to perform the method of any one of claims 1 to 7 or 15.
  17. Computer apparatus adapted to perform the method of any one of claims 1 to 7 or 15.

Dated this 9th day of December 2011
CANON KABUSHIKI KAISHA
Patent Attorneys for the Applicant
Spruson & Ferguson
AU2011253930A 2011-12-09 2011-12-09 Document image line detector using tiles and projection Abandoned AU2011253930A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2011253930A AU2011253930A1 (en) 2011-12-09 2011-12-09 Document image line detector using tiles and projection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
AU2011253930A AU2011253930A1 (en) 2011-12-09 2011-12-09 Document image line detector using tiles and projection

Publications (1)

Publication Number Publication Date
AU2011253930A1 true AU2011253930A1 (en) 2013-06-27

Family

ID=48670239

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2011253930A Abandoned AU2011253930A1 (en) 2011-12-09 2011-12-09 Document image line detector using tiles and projection

Country Status (1)

Country Link
AU (1) AU2011253930A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110634098A (en) * 2019-06-13 2019-12-31 眸芯科技(上海)有限公司 Lossless sparse image display method, device and system


Similar Documents

Publication Publication Date Title
US11694456B2 (en) Object detection and image cropping using a multi-detector approach
US10699146B2 (en) Mobile document detection and orientation based on reference object characteristics
US7894689B2 (en) Image stitching
US8532374B2 (en) Colour document layout analysis with multi-level decomposition
US8351691B2 (en) Object extraction in colour compound documents
JP4516778B2 (en) Data processing system
US8712188B2 (en) System and method for document orientation detection
US6839466B2 (en) Detecting overlapping images in an automatic image segmentation device with the presence of severe bleeding
EP2974261A2 (en) Systems and methods for classifying objects in digital images captured using mobile devices
US9066036B2 (en) Determining transparent fills based on a reference background colour
US20230005108A1 (en) Method and system for replacing scene text in a video sequence
US9424488B2 (en) Applying a segmentation engine to different mappings of a digital image
Fang et al. 1-D barcode localization in complex background
Liu et al. Detection and segmentation text from natural scene images based on graph model
Bhaskar et al. Implementing optical character recognition on the android operating system for business cards
AU2011253930A1 (en) Document image line detector using tiles and projection
Roullet et al. An automated technique to recognize and extract images from scanned archaeological documents
AU2014277851A1 (en) Detecting a gap between text columns from text line fragments
AU2015201663A1 (en) Dewarping from multiple text columns
Elmore et al. A morphological image preprocessing suite for ocr on natural scene images
US20220343464A1 (en) Extracting region of interest from scanned images and determining an associated image type thereof
AU2007249103B2 (en) Document analysis method
Safonov et al. Automatic Cropping and Deskew of Multiple Objects
CN116310351A (en) Image processing method, device and storage medium
Itani et al. Text Line Extraction Method Using Domain-Based Active Contour Model

Legal Events

Date Code Title Description
DA3 Amendments made section 104

Free format text: THE NATURE OF THE AMENDMENT IS: AMEND THE NAME OF THE INVENTOR TO READ MASON, TIMOTHY STEPHEN AND CHEN, YU-LING

MK4 Application lapsed section 142(2)(d) - no continuation fee paid for the application