SE522437C2

SE522437C2 - Method and apparatus for extracting information from a target area within a two-dimensional graphic object in an image

Info

Publication number: SE522437C2
Application number: SE0102021A
Authority: SE
Inventors: Karl Aastroem; Andreas Bjoerklund; Martin Sjoelin; Markus Andreasson
Original assignee: C Technologies Ab
Priority date: 2001-06-07
Filing date: 2001-06-07
Publication date: 2004-02-10
Also published as: WO2002099738A1; EP1412910A1; SE0102021D0; SE0102021L

Abstract

A method is presented for extracting information from a target area (101) within a two-dimensional graphical object (100) having a plurality of predetermined features (23) with known characteristics in a first plane. An image (102) is read where the object (100) is located in a second plane, which is a priori unknown. A plurality of candidates (108) to the features in the second plane are identified in the image. A transformation matrix (H) for projective mapping between the second and first planes is calculated from the identified feature candidates. The target area (101) of the object is transformed from the second plane into the first plane. Finally, the target area is processed so as to extract the information.

Description

The high resolution of these sensors makes it possible to capture images of objects with sufficient accuracy for the images to be processed with satisfactory results.

However, an image taken with a handheld device is subject to rotation and perspective effects.

Thus, to extract and interpret the desired information within the image, a projective transformation is needed. Such a projective transformation requires at least four distinct point correspondences in which no three points lie on the same line.
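The requirement that no three of the points lie on the same line can be checked before a transformation is attempted. The following is a minimal sketch in Python; the function names, the tolerance `eps` and the sample coordinates are illustrative assumptions and do not come from the specification.

```python
from itertools import combinations


def signed_area2(p, q, r):
    """Twice the signed area of triangle pqr; zero when the points are collinear."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def in_general_position(points, eps=1e-9):
    """Return True if no three of the given 2D points are (nearly) collinear."""
    return all(abs(signed_area2(p, q, r)) > eps
               for p, q, r in combinations(points, 3))


# Hypothetical feature points: the corners of a rectangular frame are usable,
# whereas a set with three points on the line y = x is not.
corners = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
degenerate = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (0.0, 2.0)]

print(in_general_position(corners))
print(in_general_position(degenerate))
```

With real image data the tolerance would have to be chosen relative to the pixel scale, since detected points are never exactly collinear.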

Summary of the invention

In light of the above, an object of the invention is to facilitate detection of a known two-dimensional object in an image, so as to enable extraction of desired information stored in a target area within the object, even if the image is recorded in an unpredictable environment and thus at unknown angle, rotation and lighting conditions.

Another object is to provide a universal detection method which, with a minimum of adjustments, is adaptable to a variety of known objects.

Yet another object is to provide a detection method which is efficient in terms of computing power and memory usage, and which is thus particularly suited for handheld image-recording devices.

The above objects are generally achieved by a method and an apparatus according to the appended independent claims.

According to the invention, there is thus provided a method for extracting information from a target area within a two-dimensional graphical object having a plurality of predetermined features with known characteristics in a first plane. The method comprises the steps of: reading an image in which said object is located in a second plane, said second plane being a priori unknown; identifying in said image a plurality of candidates for said predetermined features in said second plane; calculating, from said identified plurality of feature candidates, a transformation matrix for projective mapping between said second and first planes; transforming said target area of said object from said second plane into said first plane; and processing said target area so as to extract said information.

The apparatus according to the invention is advantageously a handheld device used for detecting and interpreting a known two-dimensional object in the form of a sign in a single image, which is recorded at unknown angle, rotation and lighting conditions. To locate the known sign in such an image, specific features of the sign are identified. The feature identification is preferably based on the edges of the sign. This provides a solution that is adaptable to most existing signs, since the features are as general as possible and common to most signs. To find lines based on the edges of the sign, an edge detector based on the Gaussian kernel is preferably used. Once all edge points have been identified, they are grouped into lines. The Gaussian kernel is advantageously also used to determine the gradient at the edge points. The corner points on the inside of the edges are then used as feature point candidates. These corner points are obtained where the lines running along the edges intersect.

In an alternative embodiment, if there are other very prominent features on the sign (for example, points with a specific grayscale value, color, intensity or luminescence), these may be used instead of or in addition to the edges, since such prominent features are easier to detect.
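The use of a Gaussian kernel for both smoothing and gradient estimation can be illustrated in one dimension. The sketch below is an assumption-laden simplification, not the detector of the preferred embodiment: it convolves a single image row with a sampled derivative-of-Gaussian kernel and reports where the response is strongest. The kernel width `sigma`, the `radius` and the sample row are hypothetical.

```python
import math


def gaussian_derivative_kernel(sigma=1.0, radius=3):
    """First derivative of a Gaussian, sampled at integer offsets -radius..radius."""
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma ** 3)
    return [-x * norm * math.exp(-x * x / (2.0 * sigma * sigma))
            for x in range(-radius, radius + 1)]


def convolve_valid(signal, kernel):
    """Discrete convolution, evaluated only where the kernel fits entirely."""
    flipped = kernel[::-1]
    size = len(kernel)
    return [sum(signal[i + j] * flipped[j] for j in range(size))
            for i in range(len(signal) - size + 1)]


def strongest_edge(signal, sigma=1.0, radius=3):
    """Return (index, gradient) of the largest smoothed gradient magnitude."""
    response = convolve_valid(signal, gaussian_derivative_kernel(sigma, radius))
    best = max(range(len(response)), key=lambda i: abs(response[i]))
    return best + radius, response[best]


# Hypothetical grey-level row: dark background, one transition pixel, bright frame.
row = [10] * 7 + [105] + [200] * 8
index, gradient = strongest_edge(row)
print(index, gradient > 0)
```

The sign of the response distinguishes dark-to-bright from bright-to-dark transitions, which is the kind of gradient information that can help decide on which side of an edge the inside of a frame lies.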

As soon as a specific number of feature candidates have been identified, an algorithm based on the algorithm commonly known as RANSAC is executed to verify that the features are in the correct configuration and to calculate a transformation matrix. Once it has been ensured that the features are in the correct geometric configuration, any target area of the object can be transformed, extracted and interpreted with, for example, an OCR or barcode interpreter.
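The general RANSAC principle (repeatedly fit a model to a minimal random sample and keep the model supported by the most data points) can be sketched on a deliberately simpler problem than the one in this specification. The example below fits a 2D line rather than a transformation matrix; the tolerance, iteration count, seed and data are all assumptions made for illustration only.

```python
import random


def line_through(p, q):
    """Line (a, b, c) with ax + by + c = 0 through p and q, scaled so a^2 + b^2 = 1."""
    a, b = q[1] - p[1], p[0] - q[0]
    c = -(a * p[0] + b * p[1])
    norm = (a * a + b * b) ** 0.5
    return a / norm, b / norm, c / norm


def distance(line, pt):
    """Perpendicular distance from a point to a normalized line."""
    return abs(line[0] * pt[0] + line[1] * pt[1] + line[2])


def ransac_line(points, iterations=100, tol=0.5, seed=0):
    """Return the sampled line with the largest consensus set, and that set."""
    rng = random.Random(seed)
    best_line, best_inliers = None, []
    for _ in range(iterations):
        p, q = rng.sample(points, 2)          # minimal sample: two distinct points
        candidate = line_through(p, q)
        inliers = [pt for pt in points if distance(candidate, pt) <= tol]
        if len(inliers) > len(best_inliers):  # keep the best-supported hypothesis
            best_line, best_inliers = candidate, inliers
    return best_line, best_inliers


# Hypothetical data: ten points on y = 2x + 1 plus three gross outliers.
on_line = [(float(x), 2.0 * x + 1.0) for x in range(10)]
outliers = [(1.0, 10.0), (4.0, -3.0), (7.0, 30.0)]
line, inliers = ransac_line(on_line + outliers)
print(len(inliers))
```

For a projective transformation the minimal sample would instead be four point correspondences, and the consensus test would measure how well the remaining candidates agree with the hypothesized matrix.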

Other objects, features and advantages of the present invention will appear from the following detailed description, from the appended dependent claims, and from the drawings.

Brief description of the drawings

A preferred embodiment of the present invention will now be described in more detail, with reference to the accompanying drawings.

Fig. 1 is a schematic view of an image-recording apparatus according to the invention, in the form of a handheld device.

Fig. 1a shows the image-recording apparatus of Fig. 1 together with a computer environment in which the apparatus can be used.

Fig. 2 is a block diagram illustrating important parts of the image-recording apparatus shown in Fig. 1.

Fig. 3 is a flowchart illustrating the overall steps carried out by the method according to the invention.

Fig. 4 is a flowchart illustrating one of the steps of Fig. 3 in more detail.

Fig. 5 is a diagram illustrating a smoothing and derivative mask, which is applied to a recorded image during a step of the method illustrated in Figs. 3 and 4.

Figs. 6-17 are photographs illustrating the processing of a recorded image during different steps of the method illustrated in Figs. 3 and 4.

Detailed description of a preferred embodiment

The remainder of this specification is organized as follows:

Section A gives a general overview of the method and apparatus according to the preferred embodiment.

To facilitate understanding of the material covered by this specification, Section B gives an introduction to projective geometry in terms of homogeneous notation and describes a camera projection matrix.

Section C explains how the transformation matrix, or homography matrix, can be obtained once correspondences between feature points have been identified.

An explanation of what kind of features should be chosen, and why, is found in Section D.

Section E describes a line detection algorithm.

Section F describes the kind of information that can be obtained from lines.

Once the feature points have been identified, the homography matrix can be calculated, which is done by means of a RANSAC algorithm, as explained in Section G.

Section H describes how the desired information can be extracted from the target area.

Finally, Section I addresses some alternative embodiments.

A. General overview

A preferred embodiment will now be described, in which the object to be recognized and read from is a sign 100, shown at the bottom of Fig. 1. It should be emphasized, however, that the invention is not limited to signs. The sign 100 is intended to look like any ordinary sign. The target area 101, from which the information is to be extracted and interpreted, is the area containing the digits "12345678" and is indicated in Fig. 1 by a dashed frame. As can be seen, the sign 100 does not contain much information that can be used as features.

As with many other signs, the sign 100 is surrounded by a frame. The edges of this frame give rise to lines. The preferred embodiment is based on using these lines as features. However, any kind of feature may be used, as long as a total of at least four feature points can be distinguished. If the sign contains any specific features (for example, points with a specific color), these may be used instead of, or in addition to, the frame, since they are usually easier to detect.

Fig. 1 illustrates an image-producing handheld device 300, which implements the apparatus according to the preferred embodiment and by means of which the method according to the preferred embodiment can be carried out. The handheld device 300 has a casing 1 of approximately the same shape as a conventional highlighter pen. One short side of the casing has a window 2, through which images are recorded for the various image-based functions of the handheld device.

The casing 1 essentially contains an optics part, an electronics part and a power supply.

The optics part comprises a number of light sources 6, such as light-emitting diodes (LEDs), a lens system 7 and an optical image sensor 8, which constitutes the interface with the electronics part. The LEDs 6 are intended to illuminate a surface of the object (sign) 100 which at each moment lies within the field of view of the window 2. The lens system 7 is intended to project an image of the surface onto the light-sensitive sensor 8 as accurately as possible. The optical sensor 8 may consist of an area sensor, such as a CMOS sensor or a CCD sensor with a built-in A/D converter. Such sensors are commercially available. The optical sensor 8 can produce VGA ("Video Graphics Array") images with a resolution of 640x480 and 24-bit color depth. The optics part thus forms a digital camera.

In this example, the power supply of the handheld device 300 is a battery 12, but it may alternatively be a mains connection (not shown).

As shown in more detail in Fig. 2, the electronics part comprises a processing device 20 with storage means 21. The processing device 20 may be implemented by a commercially available microprocessor, such as a CPU ("Central Processing Unit") or a DSP ("Digital Signal Processor"). Alternatively, the processing device 20 may be implemented as an ASIC ("Application-Specific Integrated Circuit"), as discrete analog and digital components, or as any combination thereof.

The storage means 21 comprises various types of memory, such as a working memory (RAM) and a read-only memory (ROM). Associated programs 22 for carrying out the method according to the preferred embodiment are stored in the storage means 21.

In addition, the storage means 21 comprises a set of object feature definitions 23 and a set of internal camera parameters 24, the purpose of which will be described in more detail below. Recorded images are stored in an area 25 of the storage means 21.

As shown in Fig. 1a, the handheld device 300 may be connected to a computer 200 via a transmission link 301. The computer 200 may be an ordinary personal computer with circuits and programs that allow communication with the handheld device 300 through a communication interface 210. To this end, the electronics part may also comprise a transceiver 26 for transmitting information to and from the computer 200. The transceiver 26 is preferably adapted for short-range radio communication in accordance with, for example, the Bluetooth standard in the 2.4 GHz ISM band ("Industrial, Scientific and Medical"). The transceiver may, however, alternatively be adapted for infrared communication (such as IrDA, "Infrared Data Association", indicated by dashed lines at 26') or wire-based serial communication (such as RS232, indicated by dashed lines at 26"), or essentially any other available standard for short-range communication between a handheld device and a computer.

The electronics part further comprises buttons 27, by means of which the user can control the handheld device 300 and, specifically, switch between its different modes of functionality.

Optionally, the handheld device 300 may comprise a display 28, such as a liquid crystal display (LCD).

In the context of the present invention, as shown in Fig. 3, the generally important function of the handheld device 300 is to identify a known two-dimensional object 100 in an image which is recorded by the handheld device 300 at unknown angle, rotation and illumination (steps 31-33 in Fig. 3). Once the two-dimensional object has been identified in the recorded image, a transformation matrix is determined (step 34 in Fig. 3) in order to projectively transform (step 35 in Fig. 3) the target area 101 within the recorded image of the two-dimensional object 100 into a plane suitable for further processing of the information within the target area.

The target area 101 may simply be transformed into a predetermined first plane, which may be the plane normal to the optical input axis of the handheld device 300, so that it appears as if the image had been recorded straight in front of the window 2 of the handheld device 300 rather than at an unknown angle and rotation.

The first plane comprises a number of features which can be used for the transformation. These features may be obtained directly from the physical object 100 to be imaged, through direct measurements on the object itself.

Another way to obtain such information is to take an image of the object and to measure in the image itself.
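To make the role of the transformation matrix concrete, the sketch below computes a 3x3 matrix from exactly four point correspondences and uses it to map points from the image plane (the second plane) to the reference plane (the first plane). It is only an illustration under stated assumptions: the last matrix element is fixed to 1, the linear system is solved by plain Gauss-Jordan elimination, and all coordinates are hypothetical. It is not the RANSAC-based procedure that the specification refers to.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate configuration, e.g. three collinear points")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_4(src, dst):
    """Matrix H (with H[2][2] fixed to 1) mapping each src point to its dst point."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]


def apply_homography(H, p):
    """Map a 2D point through H, dividing by the third homogeneous coordinate."""
    x, y = p
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


# Hypothetical frame corners as seen in the recorded image (second plane) ...
image_corners = [(120, 80), (520, 140), (500, 400), (90, 330)]
# ... and the same corners as measured on the sign itself (first plane).
reference_corners = [(0, 0), (200, 0), (200, 100), (0, 100)]

H = homography_from_4(image_corners, reference_corners)
print(apply_homography(H, image_corners[2]))
```

Once such a matrix is available, every pixel position of the target area can be mapped in the same way, so that the area appears as if viewed head-on before it is handed to further processing.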

Finally, the transformed target area is processed by, for example, optical character recognition (OCR) or barcode interpretation, in order to extract the desired information (steps 36 and 37 in Fig. 3). To this end, the preferred embodiment includes at least one of an OCR module 29 and a barcode module 29'. Such modules 29 or 29' may advantageously be implemented as program code 22, which is stored in the storage means 21 and executed by the processing device 20.

The extracted information can be used in many different ways, either internally in the handheld device 300 or externally in the computer 200 after having been transferred across the transmission link 301.

Non-limiting examples of use cases include: a night watchman verifying where and when during his night shift he was present at different locations, by capturing images of generally identical signs 100 containing different information as he walks around the protected area; a shop assistant using the handheld device 300 for stocktaking purposes; tracking of goods in industrial areas; and recording of registration numbers of cars and other vehicles.

The handheld device 300 may advantageously provide other image-based services, such as scanner functionality and mouse functionality.

The scanner functionality is used for recording text. The user moves the device 300 across the text he wants to record. The optical sensor 8 records images with partially overlapping contents. The images are assembled by the processing device 20. Each character in the assembled image is located and, by means of, for instance, neural network software in the processing device 20, its corresponding ASCII character is determined. The text thus converted to character-coded format can be stored, in the form of a text string, in the handheld device 300 or be transferred across the link 301 to the computer 200. The scanner functionality is described in more detail in the applicant's publication No. WO98/20446, which is incorporated herein by reference.

The mouse functionality can be used to control a cursor on the display 201 of the computer 200. When the handheld device 300 is moved across an external base surface, the optical sensor 8 records a plurality of partially overlapping images. The processing device 20 determines position signals for the cursor of the computer 200 based on the relative positions of the recorded images, which are determined by means of the contents of the images. The mouse functionality is described in more detail in the applicant's publication No. WO99/60469, which is incorporated herein by reference.

Further image-based services may be provided by the handheld device 300, for example conventional still-image or video camera functionality, drawing tools, translation of scanned text, address book, calendar, or e-mail/fax/SMS ("Short Messages Services") through a mobile telephone such as a GSM telephone ("Global System for Mobile Communications", not shown in Fig. 1).

B. Projective geometry

This chapter introduces the main geometric ideas and notation that are necessary for understanding the material covered in the remainder of this specification.

Introduction

In Euclidean geometry, the coordinate pair $(x, y)$ in the Euclidean space $\mathbb{R}^2$ can represent a point in the real plane. It is therefore common to identify a plane with $\mathbb{R}^2$. If $\mathbb{R}^2$ is regarded as a vector space, coordinates can be identified as vectors. This section introduces the homogeneous representation of points and lines in a plane. The homogeneous representation provides a consistent notation for projective mappings of points and lines. This notation will be used to explain mappings between different representations of planes.

Homogeneous coordinates

A line in a plane is represented by the equation $ax + by + c = 0$, where different choices of $a$, $b$ and $c$ give rise to different lines. The vector representation of this line is $\mathbf{l} = (a, b, c)^T$. On the other hand, the equation $(ka)x + (kb)y + kc = 0$ also represents the same line for any non-zero constant $k$. The correspondence between lines and vectors is therefore not one-to-one, since two vectors related by an overall scaling are considered equal. An equivalence class of vectors under this equivalence relation is known as a homogeneous vector. The set of equivalence classes of vectors in $\mathbb{R}^3 - (0, 0, 0)^T$ forms the projective space $\mathbb{P}^2$. The notation $-(0, 0, 0)^T$ means that the vector $(0, 0, 0)^T$ is excluded.

A point represented by the vector $\mathbf{x} = (x, y)^T$ lies on the line $\mathbf{l} = (a, b, c)^T$ if and only if $ax + by + c = 0$. This equation can be written as an inner product of two vectors, $(x, y, 1)(a, b, c)^T = 0$. Here the point is represented as a 3-vector $(x, y, 1)$, by appending a final coordinate of 1 to the 2-vector. Using the same reasoning as above, we note that $(kx, ky, k)(a, b, c)^T = 0$, which means that the vector $k(x, y, 1)$ represents the same point as $(x, y, 1)$ for every non-zero constant $k$. The set of vectors $k(x, y, 1)^T$ can therefore be regarded as the homogeneous representation of the point $(x, y)^T$ in $\mathbb{R}^2$. An arbitrary homogeneous vector representing a point has the form $\mathbf{x} = (x_1, x_2, x_3)^T$. This vector represents the point $(x_1/x_3, x_2/x_3)^T$ in $\mathbb{R}^2$, provided that $x_3 \neq 0$.
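The point and line representation described above can be illustrated with a minimal Python sketch. The function names (`to_homogeneous`, `from_homogeneous`, `on_line`) and the tolerance are assumptions introduced for illustration only; they do not appear in the specification.

```python
def to_homogeneous(x, y, k=1.0):
    """Return the homogeneous 3-vector k*(x, y, 1) for the point (x, y)."""
    return (k * x, k * y, k)

def from_homogeneous(v):
    """Return the point (x1/x3, x2/x3); requires x3 != 0."""
    x1, x2, x3 = v
    if x3 == 0:
        raise ValueError("x3 = 0: the vector does not represent a finite point")
    return (x1 / x3, x2 / x3)

def on_line(line, v, tol=1e-9):
    """True if the homogeneous point v lies on line = (a, b, c)."""
    return abs(sum(li * vi for li, vi in zip(line, v))) <= tol

line = (1.0, -1.0, 0.0)               # the line x - y = 0
p = to_homogeneous(2.0, 2.0)          # (2, 2, 1)
q = to_homogeneous(2.0, 2.0, k=5.0)   # same point at a different scale
```

Both `p` and `q` satisfy the line equation and map back to the same point in the plane, which is the scale invariance that makes the representation homogeneous.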

A point represented as a homogeneous vector is therefore also an element of the projective space $\mathbb{P}^2$. A special case of a point $\mathbf{x} = (x_1, x_2, x_3)^T$ in $\mathbb{P}^2$ is when $x_3 = 0$. This does not represent a finite point in $\mathbb{R}^2$. In $\mathbb{P}^2$ these points are known as ideal points, or points at infinity. The set of all ideal points is represented by $\mathbf{x} = (x_1, x_2, 0)^T$. This set lies on a single line, known as the line at infinity, which is denoted by the vector $\mathbf{l}_\infty = (0, 0, 1)^T$. By calculation one can verify that $\mathbf{l}_\infty^T \mathbf{x} = (0, 0, 1)(x_1, x_2, 0)^T = 0$.

Homographies or projective mappings

When points are mapped from one plane to another, the ultimate goal is to find a single function that maps each point in the first plane uniquely to a point in the second plane.

A projectivity is an invertible mapping $h$ from $\mathbb{P}^2 \to \mathbb{P}^2$ such that $\mathbf{x}_1$, $\mathbf{x}_2$ and $\mathbf{x}_3$ lie on the same line if and only if $h(\mathbf{x}_1)$, $h(\mathbf{x}_2)$ and $h(\mathbf{x}_3)$ do (see Hartley, R., and Zisserman, A., "Multiple View Geometry in Computer Vision", Cambridge University Press, 2000). A projectivity is also called a collineation, a projective transformation or a homography.

This mapping can also be written as $h(\mathbf{x}) = H\mathbf{x}$, where $\mathbf{x}, h(\mathbf{x}) \in \mathbb{P}^2$ and $H$ is a non-singular 3x3 matrix. $H$ is called a homography matrix. From now on we write $\mathbf{x}' = h(\mathbf{x})$, which gives:

$$\begin{pmatrix} x'_1 \\ x'_2 \\ x'_3 \end{pmatrix} = \begin{pmatrix} h_1 & h_2 & h_3 \\ h_4 & h_5 & h_6 \\ h_7 & h_8 & h_9 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$$

or simply $\mathbf{x}' = H\mathbf{x}$. Since both $\mathbf{x}'$ and $\mathbf{x}$ are homogeneous representations of points, $H$ can be multiplied by an arbitrary non-zero constant without changing the homography transformation. This means that $H$ is only determined up to scale. A matrix like this is called a homogeneous matrix. Consequently $H$ has only eight degrees of freedom, and the scale can be chosen such that one of its elements (for example $h_9$) can be assumed to be 1. If, however, the origin of the coordinate system is mapped by $H$ to a point at infinity, it can be shown that $h_9 = 0$, and scaling $H$ so that $h_9 = 1$ can therefore lead to unstable results. Another way of choosing a representation for a homography matrix is to require that the magnitude $|H| = 1$.
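The statement that $H$ is only determined up to scale can be checked numerically. The following Python sketch is an illustration only; the matrix values and the helper names `apply_homography` and `dehomogenize` are assumptions, not part of the specification.

```python
def apply_homography(H, x):
    """Map the homogeneous point x with the 3x3 matrix H (a list of rows)."""
    return tuple(sum(H[r][c] * x[c] for c in range(3)) for r in range(3))

def dehomogenize(v):
    """Convert a homogeneous 3-vector with v[2] != 0 to a point in the plane."""
    return (v[0] / v[2], v[1] / v[2])

# Example homography with h9 = 1 (arbitrary illustrative values).
H = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 3.0],
     [0.0, 0.5, 1.0]]
# The same homography multiplied by a non-zero constant.
H_scaled = [[4.0 * e for e in row] for row in H]

x = (2.0, 2.0, 1.0)
a = dehomogenize(apply_homography(H, x))
b = dehomogenize(apply_homography(H_scaled, x))
```

`a` and `b` denote the same image point, since the common factor cancels when dividing by the third coordinate.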

Camera projection matrix

A camera is a mapping from the three-dimensional world to the two-dimensional image. This mapping can be described as:

$$\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} p_{11} & p_{12} & p_{13} & p_{14} \\ p_{21} & p_{22} & p_{23} & p_{24} \\ p_{31} & p_{32} & p_{33} & p_{34} \end{pmatrix} \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}$$

or, more briefly, $\mathbf{x} = P\mathbf{X}$. $\mathbf{X}$ is the homogeneous representation of the point in the three-dimensional coordinate frame. $\mathbf{x}$ is the corresponding homogeneous representation of the point in the two-dimensional coordinate frame. $P$ is the homogeneous 3x4 camera projection matrix. For a complete derivation of $P$, see Hartley, R., and Zisserman, A., "Multiple View Geometry in Computer Vision", Cambridge University Press, 2000, pp. 139-144, where the camera projection matrix for the basic pinhole camera is derived. $P$ can be factorized as:

$$P = KR[I \mid -\mathbf{t}]$$

Here $K$ is the 3x3 calibration matrix, which contains the intrinsic parameters of the camera, $R$ is a 3x3 rotation matrix and $\mathbf{t}$ is the 3x1 translation vector. This factorization will be used below.

On planes

Suppose that we are only interested in mapping points from the world coordinate frame that lie in the same plane $\pi$. Since we are free to choose our world coordinate frame as we wish, we can for example define $\pi: Z = 0$. This reduces the equation above. If we denote the columns of the camera projection matrix by $\mathbf{p}_i$, we obtain:

$$\begin{pmatrix} x \\ y \\ z \end{pmatrix} = [\mathbf{p}_1\ \mathbf{p}_2\ \mathbf{p}_3\ \mathbf{p}_4] \begin{pmatrix} X \\ Y \\ 0 \\ 1 \end{pmatrix} = [\mathbf{p}_1\ \mathbf{p}_2\ \mathbf{p}_4] \begin{pmatrix} X \\ Y \\ 1 \end{pmatrix}$$

The mapping between the points $\mathbf{x}_\pi = (X, Y, 1)^T$ on $\pi$ and their corresponding points $\mathbf{x}'$ in the image is an ordinary planar homography $\mathbf{x}' = H\mathbf{x}_\pi$, where $H = [\mathbf{p}_1\ \mathbf{p}_2\ \mathbf{p}_4]$.
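The reduction from the 3x4 camera matrix to a 3x3 planar homography for points with $Z = 0$ can be sketched as follows. The numeric entries of `P` and the helper names are purely illustrative assumptions.

```python
def plane_homography(P):
    """Drop the third column of the 3x4 matrix P (a list of rows): H = [p1 p2 p4]."""
    return [[row[0], row[1], row[3]] for row in P]

def matvec(M, v):
    """Multiply the matrix M (a list of rows) by the vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Arbitrary example camera matrix (illustrative values only).
P = [[800.0, 0.0, 320.0, 10.0],
     [0.0, 800.0, 240.0, 20.0],
     [0.0, 0.0, 1.0, 5.0]]
H = plane_homography(P)

X, Y = 1.5, -2.0
full = matvec(P, [X, Y, 0.0, 1.0])   # projection of the world point (X, Y, 0)
planar = matvec(H, [X, Y, 1.0])      # same point mapped through the homography
```

Because the third world coordinate is zero, the column $\mathbf{p}_3$ never contributes, so `full` and `planar` agree.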

Additional constraints

If we have a calibrated camera, the calibration matrix $K$ will be known and we can obtain additional information. Since $P = KR[I \mid -\mathbf{t}]$ and the calibration matrix $K$ is invertible, we can obtain:

$$K^{-1}P = R[I \mid -\mathbf{t}] = K^{-1}[\mathbf{p}_1\ \mathbf{p}_2\ \mathbf{p}_3\ \mathbf{p}_4] = K^{-1}[\mathbf{h}_1\ \mathbf{h}_2\ \mathbf{p}_3\ \mathbf{h}_3]$$

The first two columns of the rotation matrix $R$ are equal to the first two columns of $K^{-1}H$. Denoting these two columns by $\mathbf{r}_1$ and $\mathbf{r}_2$, we get:

$$[\mathbf{r}_1\ \mathbf{r}_2] = K^{-1}[\mathbf{h}_1\ \mathbf{h}_2]$$

Since the rotation matrix is orthogonal, $\mathbf{r}_1$ and $\mathbf{r}_2$ should be orthogonal and of the same length. As mentioned before, however, $H$ is only determined up to scale, which means that $\mathbf{r}_1$ and $\mathbf{r}_2$ will not be normalized, but they should still be of the same length.
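The orthogonality and equal-length conditions on the first two columns of $K^{-1}H$ can be turned into a simple numerical check. The sketch below is one possible formulation; the function names, the example calibration matrix and the example rotation are assumptions made for illustration.

```python
import math

def inv3(M):
    """Inverse of a 3x3 matrix (list of rows) via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = M
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0:
        raise ValueError("matrix is singular")
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[v / det for v in row] for row in adj]

def rotation_constraints(K, H):
    """Return (r1 . r2, |r1| - |r2|) for [r1 r2] = K^-1 [h1 h2]."""
    Kinv = inv3(K)
    cols = []
    for j in (0, 1):
        hj = [H[r][j] for r in range(3)]
        cols.append([sum(Kinv[r][c] * hj[c] for c in range(3)) for r in range(3)])
    r1, r2 = cols
    dot = sum(p * q for p, q in zip(r1, r2))
    norm = lambda v: math.sqrt(sum(p * p for p in v))
    return dot, norm(r1) - norm(r2)

# Illustrative calibration matrix and a consistent homography H = K [s*r1 s*r2 v].
K = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]
c, s = math.cos(0.3), math.sin(0.3)
M = [[2.5 * c, -2.5 * s, 1.0],
     [2.5 * s, 2.5 * c, 2.0],
     [0.0, 0.0, 7.0]]
H = [[sum(K[r][k] * M[k][col] for k in range(3)) for col in range(3)]
     for r in range(3)]
dot, diff = rotation_constraints(K, H)

# An inconsistent candidate: the first column of H is doubled.
H_bad = [[2.0 * row[0], row[1], row[2]] for row in H]
dot_bad, diff_bad = rotation_constraints(K, H_bad)
```

For a consistent homography both returned values are close to zero; a candidate that violates the constraints, such as `H_bad`, yields a clearly non-zero length difference and could be rejected in a verification step.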

Conclusion: With a calibrated camera we obtain two additional constraints on $H$:

$$\mathbf{r}_1^T \mathbf{r}_2 = 0, \qquad |\mathbf{r}_1| = |\mathbf{r}_2|$$

where $[\mathbf{r}_1\ \mathbf{r}_2] = K^{-1}[\mathbf{h}_1\ \mathbf{h}_2]$.

C. Solving for the homography matrix H

The first thing to consider when solving for the homography matrix $H$ is how many point correspondences $\mathbf{x}' \leftrightarrow \mathbf{x}$ are needed. As mentioned in section B, $H$ has eight degrees of freedom. Since we are working in 2D, each point is constrained in two directions, and each point correspondence therefore has two degrees of freedom. This means that a lower bound of four corresponding points in the two different coordinate frames is needed to compute the homography matrix $H$. This section shows different ways of solving for $H$.

The direct linear transformation (DLT) algorithm

For each point correspondence we have the equation $\mathbf{x}'_i = H\mathbf{x}_i$. Note that since we are working with homogeneous vectors, $\mathbf{x}'_i$ and $H\mathbf{x}_i$ may differ up to scale. The equation can also be expressed as a vector cross product $\mathbf{x}'_i \times H\mathbf{x}_i = 0$, whereby the scale factor is eliminated. If we denote the $j$-th row of $H$ by $\mathbf{h}^{jT}$, then $H\mathbf{x}_i$ can be expressed as:

$$H\mathbf{x}_i = \begin{pmatrix} \mathbf{h}^{1T}\mathbf{x}_i \\ \mathbf{h}^{2T}\mathbf{x}_i \\ \mathbf{h}^{3T}\mathbf{x}_i \end{pmatrix}$$

Using the same notation as in section B, with $\mathbf{x}'_i = (x'_i, y'_i, w'_i)^T$, the cross product above can be expressed as:

$$\mathbf{x}'_i \times H\mathbf{x}_i = \begin{pmatrix} y'_i \mathbf{h}^{3T}\mathbf{x}_i - w'_i \mathbf{h}^{2T}\mathbf{x}_i \\ w'_i \mathbf{h}^{1T}\mathbf{x}_i - x'_i \mathbf{h}^{3T}\mathbf{x}_i \\ x'_i \mathbf{h}^{2T}\mathbf{x}_i - y'_i \mathbf{h}^{1T}\mathbf{x}_i \end{pmatrix} = 0$$

Since $\mathbf{h}^{jT}\mathbf{x}_i = \mathbf{x}_i^T \mathbf{h}^j$ for $j = 1..3$, we can rearrange the equation and obtain:

$$\begin{pmatrix} \mathbf{0}^T & -w'_i \mathbf{x}_i^T & y'_i \mathbf{x}_i^T \\ w'_i \mathbf{x}_i^T & \mathbf{0}^T & -x'_i \mathbf{x}_i^T \\ -y'_i \mathbf{x}_i^T & x'_i \mathbf{x}_i^T & \mathbf{0}^T \end{pmatrix} \begin{pmatrix} \mathbf{h}^1 \\ \mathbf{h}^2 \\ \mathbf{h}^3 \end{pmatrix} = 0$$

We are now facing three linear equations with eight unknown elements (the nine elements of $H$ minus one because of the scale factor). However, since the third row is linearly dependent on the other two rows, only two of the equations provide useful information.

Each point correspondence therefore gives us two equations. If we use four point correspondences we get eight equations with eight unknown elements. This system can now be solved using Gaussian elimination.

Another way of solving the system is by means of SVD, as described below.
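The four-point case solved by Gaussian elimination can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the specification: the scale is fixed by setting $h_9 = 1$, all points are given with $w_i = w'_i = 1$, and the function names and pivot threshold are hypothetical.

```python
def solve_linear(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system (degenerate point configuration)")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        s = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z

def homography_from_4_points(src, dst):
    """src, dst: four (x, y) pairs each. Returns H as a list of rows, h9 = 1."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        # Two independent equations per correspondence, with h9 moved to the right.
        A.append([x, y, 1.0, 0.0, 0.0, 0.0, -xp * x, -xp * y]); b.append(xp)
        A.append([0.0, 0.0, 0.0, x, y, 1.0, -yp * x, -yp * y]); b.append(yp)
    h = solve_linear(A, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]

def project(H, p):
    """Map the 2D point p through H and return the resulting 2D point."""
    v = [H[r][0] * p[0] + H[r][1] * p[1] + H[r][2] for r in range(3)]
    return (v[0] / v[2], v[1] / v[2])

H_true = [[2.0, 0.0, 1.0], [0.0, 3.0, 2.0], [0.1, 0.2, 1.0]]
src = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
dst = [project(H_true, p) for p in src]
H_est = homography_from_4_points(src, dst)
```

With exact correspondences `H_est` reproduces `H_true` up to rounding. As noted in section B, fixing $h_9 = 1$ is unsuitable when the true $h_9$ is zero or close to zero.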

Singular value decomposition (SVD)

In real life the positions of the points are usually not exact, because of noise in the image. The solution for $H$ will therefore be inexact. To obtain a more accurate $H$ we can use more than four point correspondences and then solve an over-determined system. If, on the other hand, the points are exact, the system will give rise to equations that are linearly dependent on each other, and we will once again have eight linearly independent equations.

If we have $n$ point correspondences, we can denote the set of equations by $A\mathbf{h} = 0$, where $A$ is a $2n \times 9$ matrix and $\mathbf{h} = (\mathbf{h}^{1T}, \mathbf{h}^{2T}, \mathbf{h}^{3T})^T$ is the vector of the nine elements of $H$. One way of solving this system is to minimize the Euclidean norm $\|A\mathbf{h}\|$ instead, subject to the constraint $\|\mathbf{h}\| = k$, where $k$ is a non-zero constant. This last constraint arises because $H$ is homogeneous. Minimizing the norm $\|A\mathbf{h}\|$ is the same as solving the optimization problem:

$$\min_{\|\mathbf{h}\| = k} \|A\mathbf{h}\|$$

A solution to this problem can be obtained by SVD.

A detailed description of SVD is given in Golub, G. H., and Van Loan, C. F., "Matrix Computations", 3rd edition, The Johns Hopkins University Press, Baltimore, MD, 1996.

Using SVD, the matrix $A$ can be factorized as $A = USV^T$, where the last column of $V$ gives the solution for $\mathbf{h}$.
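Why the last column of $V$ is the minimizer can be seen from a short derivation. It is given here as a sketch under the assumptions that $k = 1$, that the singular values on the diagonal of $S$ are sorted in descending order, $s_1 \ge s_2 \ge \dots \ge s_9 \ge 0$ (with $s_9$ taken as zero when $A$ has fewer than nine rows), and that $V$ is the full $9 \times 9$ orthogonal factor. The auxiliary vector $\mathbf{y}$ and the symbol $\mathbf{v}_9$ for the last column of $V$ are introduced only for this derivation.

```latex
\begin{align*}
\|A\mathbf{h}\| &= \|USV^T\mathbf{h}\| = \|SV^T\mathbf{h}\|
  && \text{($U$ has orthonormal columns)} \\
\mathbf{y} &:= V^T\mathbf{h}, \qquad \|\mathbf{y}\| = \|\mathbf{h}\| = 1
  && \text{($V$ is orthogonal)} \\
\|S\mathbf{y}\|^2 &= \sum_{i=1}^{9} s_i^2 y_i^2 \;\ge\; s_9^2 \sum_{i=1}^{9} y_i^2 = s_9^2 \\
\text{equality for } \mathbf{y} &= (0, \dots, 0, 1)^T
  \;\Rightarrow\; \mathbf{h} = V\mathbf{y} = \mathbf{v}_9
\end{align*}
```

The minimum value of $\|A\mathbf{h}\|$ is thus the smallest singular value, which is zero for exact data and small for noisy data.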

Restrictions on the corresponding points

If three of the four point correspondences are collinear, they give rise to an under-determined system (see Hartley, R., and Zisserman, A., "Multiple View Geometry in Computer Vision", Cambridge University Press, 2000, p. 74), and the solution from SVD will be degenerate. We are therefore restricted when choosing our feature points, in that we must not choose collinear points.
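The restriction on collinear points can be tested before a set of candidate points is passed on to the homography computation. The following Python sketch shows one way of doing so; the function names and the absolute tolerance are assumptions, and in practice the tolerance would have to be adapted to the image scale and noise level.

```python
from itertools import combinations

def collinear(p, q, r, tol=1e-9):
    """True if the 2D points p, q, r lie on one line (zero triangle area)."""
    area2 = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return abs(area2) <= tol

def usable_for_homography(points):
    """True if no three of the given points are collinear."""
    return not any(collinear(p, q, r) for p, q, r in combinations(points, 3))

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
bad = [(0, 0), (1, 1), (2, 2), (0, 1)]   # the first three lie on the line y = x
```

Here `square` passes the check, whereas `bad` is rejected because three of its points are collinear.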

D. Feature restrictions

An important question is how to find features in objects. Since the results are preferably intended to be applicable to already existing signs, it is desirable to find features that are commonly used and that are easy to detect in an image. A good feature should satisfy as many as possible of the following criteria:

Be easy to detect;
Be easy to distinguish;
Be located in a useful configuration.

In this section some different types of features are identified that can be used for computing the homography matrix $H$. The features should in some way be associated with points, since point correspondences are used to compute $H$. Programs for finding features, in which the user need only change a few constants, are implemented according to the present invention and stored in the object feature definition area 23 of the storage means 21, so as to adapt the feature finder to specific objects.

A very common feature of most signs is various combinations of lines. Most signs are surrounded by an edge, which gives rise to a line. Many signs also have frames around them, which give rise to double, parallel lines. Whatever type of feature is found, it is important to gather as much information as possible from each individual feature. Since lines are commonly used features, a description of how different kinds of lines can be found is given in section E.

Number of features

Since the images are of 2D planes and have been captured by a handheld camera 300, the scene plane and the image plane are related by a plane projective transformation. In section C we concluded that at least four point correspondences are needed to compute $H$. If four points in the scene plane and the four corresponding points in the image are found, $H$ can be computed. The problem is that we do not know whether we have the correct corresponding points. A verification procedure must therefore be performed to check whether $H$ is correct. To do this, $H$ can be verified with even more point correspondences. If the camera is calibrated, a verification of $H$ can be performed with the intrinsic parameters of the camera, as explained at the end of section B.

Restrictions on lines

In 2D, lines have two degrees of freedom and, in the same way as with points, four lines, of which no three are concurrent, can be used for computing the homography matrix. The computation must be modified slightly, however, since lines transform as $\mathbf{l}' = H^{-T}\mathbf{l}$, as opposed to points, which transform as $\mathbf{x}' = H\mathbf{x}$, for the same homography matrix $H$ (see Hartley, R., and Zisserman, A., "Multiple View Geometry in Computer Vision", Cambridge University Press, 2000, p. 15).
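The rule that a line $\mathbf{l}$ maps to $\mathbf{l}' = H^{-T}\mathbf{l}$ when points map as $\mathbf{x}' = H\mathbf{x}$ follows from the incidence relation of section B. A short derivation, given as a sketch and assuming only that $H$ is non-singular:

```latex
\begin{align*}
\mathbf{l}^T \mathbf{x} &= 0
  && \text{(the point $\mathbf{x}$ lies on the line $\mathbf{l}$)} \\
\mathbf{l}^T H^{-1} H \mathbf{x} &= 0
  && \text{(insert $H^{-1}H = I$)} \\
(H^{-T}\mathbf{l})^T \mathbf{x}' &= 0
  && \text{(with $\mathbf{x}' = H\mathbf{x}$)} \\
\Rightarrow\quad \mathbf{l}' &= H^{-T}\mathbf{l}
\end{align*}
```

All points on $\mathbf{l}$ are therefore mapped to points on $\mathbf{l}'$, so incidence is preserved.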

It is even possible to mix feature points and lines when computing the homography matrix. There are, however, some additional constraints when doing so, since points and lines are dependent on each other. As shown in section C, four points, and likewise four lines, have eight degrees of freedom. Three lines and one point are geometrically equivalent to four points, since three non-concurrent lines define a triangle, and the vertices of the triangle uniquely define three points. In the same way, three non-collinear points and one line are equivalent to four lines, which have eight degrees of freedom. Two points and two lines, however, cannot be used for computing the homography matrix. The reason is that a total of five lines and five points can be determined uniquely from two points and two lines. The problem, however, is that four of the five lines are concurrent and four of the five points are collinear. These two systems are thus degenerate and cannot be used for computing the homography matrix.

Choosing corner points

In the preferred embodiment, the equations of the lines are not used when computing the homography matrix. Instead the intersection points of the lines are computed, and only these points are used in the computations. One of the reasons for doing this is the proportions between the coordinates ($a$, $b$ and $c$) of the lines. In an image with VGA resolution, the values of the coordinates of a normalized line will be $0 \le |a|, |b| \le 1$, while $0 \le |c| \le \sqrt{640^2 + 480^2} = 800$. This means that the $c$-coordinate is not in proportion to the $a$- and $b$-coordinates. The effect is that a small variation in the gradient of the line (i.e. the $a$- and $b$-coordinates) can result in a large variation in the component $c$. This makes it difficult to verify line correspondences.

The problem with these disproportionate coordinates does not disappear when the intersection points of the lines are used instead of the line parameters; it is merely moved. This is just a way of normalizing the parameters so that they can easily be compared with each other in the verification procedure.
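Computing corner points as intersections of lines fits naturally into the homogeneous notation of section B: both the line through two points and the intersection of two lines can be obtained as cross products. The Python sketch below illustrates this; the function names, the normalization $a^2 + b^2 = 1$ and the parallelism threshold are assumptions made for the example.

```python
import math

def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def line_through(p, q):
    """Normalized line (a, b, c) with a^2 + b^2 = 1 through the 2D points p, q."""
    a, b, c = cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))
    n = math.hypot(a, b)
    return (a / n, b / n, c / n)

def intersection(l1, l2):
    """Intersection point of two lines, or None if they are parallel."""
    x1, x2, x3 = cross(l1, l2)
    if abs(x3) < 1e-12:
        return None                      # ideal point: no finite intersection
    return (x1 / x3, x2 / x3)

# Two border lines of a sign in a 640x480 image (illustrative coordinates).
top = line_through((0.0, 100.0), (640.0, 100.0))
left = line_through((50.0, 0.0), (50.0, 480.0))
corner = intersection(top, left)
parallel = intersection(top, line_through((0.0, 200.0), (640.0, 200.0)))
```

In this example the magnitude of the $c$ component of `top` is two orders of magnitude larger than its $a$ and $b$ components, which illustrates the disproportion discussed above, whereas `corner` is expressed directly in pixel coordinates.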

E. Line detection

With reference to Figs 4 and 5, details will now be given of how candidates for feature points are determined (i.e. step 33 in Fig. 3). Steps 41 and 42 in Fig. 4 are described in this section, while step 43 is described in the next section.

Edges are defined as points where the gradients of the image are large in terms of gray scale, color, intensity or luminescence. Once all edge points in an image have been obtained, they can be analyzed to determine how many of them lie on a straight line. These points can then be used as the basis for a line.

Extraction of edge points

There are several ways to extract edge points from the image. Most are based on thresholding, region growing, and region splitting and merging (see Gonzalez, R. C., and Woods, R. E., "Digital Image Processing", Addison Wesley, Reading, MA, 1993, p. 414). In practice it is common to run a mask across the image. An edge is defined as the intersection between two different homogeneous regions. Thus, the masks are usually based on the calculation of a local derivative. Digital images usually pick up an undetermined amount of noise as a result of sampling.

Therefore, a smoothing mask is preferably applied before the differentiating mask in order to reduce the noise. A smoothing mask that gives very good results is the Gaussian kernel $G_\sigma$:

$$G_\sigma(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-x^2/2\sigma^2},$$

where σ is the standard deviation (or the width of the kernel) and x is the distance from the point under investigation.

Instead of first running a smoothing mask over the image and then differentiating it, it is advantageous simply to convolve the image with the derivative of the Gaussian kernel:

$$\frac{\partial}{\partial x} G_\sigma(x) = -\frac{x}{\sigma^2}\,\frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-x^2/2\sigma^2}.$$

Fig 5 shows $\frac{\partial}{\partial x} G_\sigma(x)$ for σ = 1.2.
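
By way of illustration only, the derivative-of-Gaussian mask can be sampled at integer pixel offsets as in the following Python sketch. The function name, the sampling at integer offsets and the truncation of the mask at ±3σ are assumptions made for the example and are not taken from the embodiment.

```python
import math

def gaussian_derivative_kernel(sigma=1.2, radius=None):
    """Sample d/dx G_sigma(x) at the integer offsets -radius..radius."""
    if radius is None:
        radius = math.ceil(3 * sigma)  # assumed truncation at +/- 3 sigma
    norm = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    return [-(x / sigma ** 2) * norm * math.exp(-(x * x) / (2.0 * sigma ** 2))
            for x in range(-radius, radius + 1)]

# Mask for sigma = 1.2, the value used for Fig 5.
kernel = gaussian_derivative_kernel(1.2)
```

The mask is antisymmetric about its centre, so a region of constant intensity gives a zero response.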

Since images are two-dimensional, the filter is applied in both the x and y directions. To single out the edge points n, those filtered points f(n), i.e. the result of convolving the image with the derivative of the Gaussian kernel, are selected for which

$$f(n) > f(n-1), \qquad f(n) > f(n+1), \qquad f(n) > \mathit{thres},$$

where thres is a chosen threshold.

In Fig 7, all edge points detected in an original image 102 (Fig 6) are marked with a "+" sign, as indicated by reference numeral 103. A Gaussian kernel with σ = 1.2 and thres = 5 has been used here.
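
A minimal one-dimensional sketch of this selection rule is given below, assuming a single image row stored as a list of intensities. The function names, the clamping of indices at the ends of the row and the use of the absolute response (so that rising and falling edges are treated alike) are assumptions of the example.

```python
import math

def dgauss(sigma, radius):
    """Derivative-of-Gaussian weights keyed by integer offset k."""
    norm = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    return {k: -(k / sigma ** 2) * norm * math.exp(-(k * k) / (2.0 * sigma ** 2))
            for k in range(-radius, radius + 1)}

def edge_points_1d(signal, sigma=1.2, thres=5.0):
    """Indices n where |f(n)| is a strict local maximum above thres."""
    kernel = dgauss(sigma, math.ceil(3 * sigma))
    last = len(signal) - 1
    f = []
    for n in range(len(signal)):
        acc = 0.0
        for k, w in kernel.items():
            idx = min(max(n - k, 0), last)   # clamp at the ends of the row
            acc += signal[idx] * w           # convolution: sum of s[n-k] * g'[k]
        f.append(abs(acc))
    return [n for n in range(1, last)
            if f[n] > f[n - 1] and f[n] > f[n + 1] and f[n] > thres]

# A dark-to-bright transition centred on index 4.
row = [10, 10, 10, 10, 50, 90, 90, 90, 90, 90]
edges = edge_points_1d(row)
```

For a two-dimensional image the same test would be applied along both the x and y directions.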

Extraction of line information

Once all edge points have been obtained, it is possible to find the equation of the line of which they may form a part. The gradient at a point in the image is a vector pointing in the direction in which the image intensity at that point changes most rapidly. This vector has the same direction as the normal of the possible line. Therefore, the gradient at every edge point must be found. To extract the x-coefficient at an edge point, the derivative of the two-dimensional Gaussian kernel,

$$\frac{\partial}{\partial x} G_\sigma(x, y) = -\frac{x}{2\pi\sigma^4}\, e^{-(x^2+y^2)/2\sigma^2},$$

is applied to the image around the edge point. In this mask, (x, y) is the offset from the edge point. Usually a range of $-3\sigma \le x \le 3\sigma$ and $-3\sigma \le y \le 3\sigma$ is used, where σ is the standard deviation.

In the same way, the y-coefficient can be extracted. As mentioned above, the normal of the line has the same direction as the gradient. Thus, the a and b coefficients of the line have been obtained. The last coordinate c is easily calculated, since ax + by + c = 0. Preferably, the equation of the line is normalized so that the normal of the line has length 1:

$$l = \frac{(a, b, c)^T}{\sqrt{a^2 + b^2}}.$$

This means that the magnitude of the c-coordinate equals the distance from the line to the origin.
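
As an illustrative sketch, the normalized line through an edge point can be formed from the point and its gradient as follows. The function name and the tuple representation are assumptions of the example.

```python
import math

def line_from_edge_point(point, gradient):
    """Return (a, b, c) with a*x + b*y + c = 0 and a unit normal (a, b)."""
    x, y = point
    gx, gy = gradient
    norm = math.hypot(gx, gy)
    if norm == 0.0:
        raise ValueError("zero gradient: no line direction defined")
    a, b = gx / norm, gy / norm      # the normal follows the gradient
    c = -(a * x + b * y)             # forces the line through the point
    return (a, b, c)

# Edge point (3, 4) with a vertical gradient gives the line y = 4.
a, b, c = line_from_edge_point((3, 4), (0, 2))
```

Here |c| = 4, which is the distance from the line y = 4 to the origin.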

Clustering of edge points into lines

To find out whether edge points form part of a line, constraints must be applied to the points. There are two main constraints:
- The points shall have the same gradient.
- The proposed line shall pass through the points.

Since the image will be blurred, these constraints need only be satisfied within a certain threshold. The threshold will of course depend on the circumstances under which the image was taken, the resolution of the image and the object in the image. Since all data for the points are known, all that remains is to group the points together and fit lines to them (step 42 in Fig 4). The following algorithm is used according to the preferred embodiment:

For a certain number of loops:
Step 1: Randomly select a point $p = (x, y, 1)^T$ with line data $l = (a, b, c)^T$;
Step 2: Find, using $|p_n^T l| < \mathit{thres}_1$, all other points $p_n = (x_n, y_n, 1)^T$, with line data $l_n = (a_n, b_n, c_n)^T$, that lie on the same line;
Step 3: Check, using $(a_n, b_n)(a, b)^T > (1 - \mathit{thres}_2)$, whether these points have the same gradient as p;
Step 4: Fit, using SVD, a new line $l = (a, b, c)^T$ to all the points $p_n$ that satisfy the conditions in step 2 and step 3. Repeat steps 2-3;
Step 5: Repeat steps 2-4 twice;
Step 6: If there are at least a certain number of points that satisfy these conditions, define these points as forming a line;
End. Repeat with the remaining points.

This algorithm selects a point at random. The equation of the line of which this point may form a part is already known. The algorithm then finds all other points that have the same gradient and lie on the same line as the first point. Both checks must be performed within a certain threshold. In step 2, the algorithm checks whether the point lies closer to the line than the distance thres1. In step 3, the algorithm checks whether the gradients of the two points are the same. If they are, the product of the gradients is 1. Again, because of inaccuracy, it is sufficient if the product is greater than (1 - thres2).
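
The two checks of step 2 and step 3 can be sketched as follows, assuming that every line vector has been normalized so that its normal has length 1. The function name and the default threshold values are hypothetical; as the text notes, real thresholds depend on the application.

```python
def belongs_to_line(p_n, l_n, l, thres1=1.5, thres2=0.05):
    """Step 2 and step 3 for one point p_n with its own line data l_n."""
    x, y = p_n
    a, b, c = l
    a_n, b_n, _ = l_n
    close_to_line = abs(a * x + b * y + c) < thres1       # step 2
    same_gradient = (a_n * a + b_n * b) > (1.0 - thres2)  # step 3
    return close_to_line and same_gradient

seed_line = (0.0, 1.0, -4.0)   # the line y = 4 found for the seed point
accepted = belongs_to_line((10.0, 4.5), (0.0995, 0.995, -5.5), seed_line)
too_far = belongs_to_line((10.0, 8.0), (0.0, 1.0, -8.0), seed_line)
wrong_direction = belongs_to_line((10.0, 4.0), (1.0, 0.0, -10.0), seed_line)
```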

Since the edge points are not exactly located, and since the gradients will not have the exact value, a new line is calculated in step 4. This line is calculated, using SVD, in the following way from all the points that satisfy the conditions in step 2 and step 3. The points should also satisfy the condition $(x, y, 1)(a, b, c)^T = 0$. Thus, an n×3 matrix A consisting of these points can be assembled and, as in section C, the problem

$$\min_l \|Al\| \quad \text{subject to} \quad \|l\| = 1$$

is solved using SVD. To obtain better accuracy, steps 2 and 3 are repeated. To increase the accuracy further, yet another recursion takes place. As will be appreciated by a person skilled in the art, the threshold values will have to be determined depending on the actual application.
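
A minimal sketch of the SVD fit of step 4 is shown below, assuming NumPy is available. The function name is an assumption, and the final rescaling to a unit normal follows the normalization preferred earlier in this section rather than anything stated for step 4 itself.

```python
import numpy as np

def fit_line_svd(points):
    """Fit l = (a, b, c) to points by minimising ||A l|| with ||l|| = 1."""
    pts = np.asarray(points, dtype=float)
    A = np.column_stack([pts, np.ones(len(pts))])   # one row (x, y, 1) per point
    _, _, vt = np.linalg.svd(A)
    l = vt[-1]                  # right singular vector of the smallest singular value
    return l / np.hypot(l[0], l[1])                 # rescale to a unit normal

# Four points on the line y = 2x + 1.
samples = [(0, 1), (1, 3), (2, 5), (3, 7)]
line = fit_line_svd(samples)
```

The sign of the returned vector is arbitrary, since l and -l describe the same line.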

Fig 8 shows the lines 104 that were found, together with the edge points 103 used in the example above.

If the edge points used are omitted, it is easier to see how good the approximation of the estimated lines is; see Fig 9.

F. Information extracted from the lines

To calculate the homography matrix H, four corresponding points from the two coordinate frames are needed. Since many lines are available, additional information can be obtained.

Cross points

Corners are common features of signs. However, a sign usually has many corners that are of no interest; if, for example, there is text on the sign, the letters will give rise to many uninteresting corners. Now that the lines formed by edges have been obtained, the corner points of the edges can easily be calculated (step 43 in Fig 4) by taking the cross product of two lines:

$$x_c = l_i \times l_j.$$

The vector $x_c$ is the homogeneous representation of the point at which the lines $l_i$ and $l_j$ intersect. If the third coordinate of $x_c$ is 0, then $x_c$ is a point at infinity, and the lines $l_i$ and $l_j$ are parallel.
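
The cross-product construction can be sketched as follows. The function names and the tolerance eps used to decide that the third coordinate is zero are assumptions of the example.

```python
def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def cross_point(l_i, l_j, eps=1e-12):
    """Intersection of two homogeneous lines, or None if they are parallel."""
    x, y, w = cross(l_i, l_j)
    if abs(w) < eps:            # third coordinate 0: point at infinity
        return None
    return (x / w, y / w)

corner = cross_point((1.0, 0.0, -2.0), (0.0, 1.0, -3.0))        # x = 2 meets y = 3
no_corner = cross_point((1.0, 0.0, -2.0), (1.0, 0.0, -5.0))     # x = 2 and x = 5
```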

These cross points, combined with the information from the lines, provide even more information. It can be verified whether the lines actually have edge points at the cross points, or whether the intersection lies on the extension of the lines. This information can then be compared with the feature points sought, since it is known whether these should have edge points at the cross points. In this way, cross points that are of no interest can be eliminated. Such points can have different origins. One possibility is that they are cross points that are supposed to be there but are not used in this specific case. Another possibility is that they are generated by lines that should not exist but have nevertheless arisen because of disturbing elements in the image.

In Fig 10, all cross points are marked with a "+" sign, as seen at 105. The actual corners of the frame are marked with a "*" sign, as seen at 106.

Parallel lines

Another common feature of signs is frames, which give rise to parallel lines. If only lines originating from frames are of interest, all lines that do not have a parallel counterpart, i.e. a nearby line with a normal in the opposite direction, can be discarded. Since the lines are transformed, lines that are parallel in the 3D world may appear non-parallel in the 2D image scene. However, lines that lie close to each other will still be parallel within a certain margin of error.

The result of an algorithm that finds parallel lines 107, 107' is shown in Fig 11.
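
One possible way of testing for a parallel counterpart is sketched below, assuming every line is stored as (a, b, c) with a unit normal. The function names, the angular tolerance tol and the maximum separation max_gap are hypothetical choices for the example.

```python
def has_parallel_counterpart(i, lines, tol=0.05, max_gap=20.0):
    """True if some other line is nearby and has an almost opposite normal."""
    a, b, c = lines[i]
    for j, (a2, b2, c2) in enumerate(lines):
        if j == i:
            continue
        opposite = (a * a2 + b * b2) < -(1.0 - tol)
        # With opposite unit normals the distance between the lines is |c + c2|.
        nearby = abs(c + c2) < max_gap
        if opposite and nearby:
            return True
    return False

def frame_line_candidates(lines):
    """Keep only the lines that have a parallel counterpart."""
    return [l for i, l in enumerate(lines) if has_parallel_counterpart(i, lines)]

# y = 10 and y = 14 with opposite normals, plus an unrelated line x = 50.
found = [(0.0, 1.0, -10.0), (0.0, -1.0, 14.0), (1.0, 0.0, -50.0)]
kept = frame_line_candidates(found)
```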

Once all sets of parallel lines have been found, it is possible to work out which lines are candidates for being a line corresponding to the inner edge of a frame. If the cross products of all these lines are calculated, a set of points is obtained that are possible candidates for inner corner points of a frame, as marked with "*" signs at 108 in Fig 12.

Consecutive edge points

By coincidence, the line detection algorithm may produce a line that actually consists of many small edges lying on a straight line. For example, the edges of letters written on a straight line may give rise to such a line. If only lines consisting of consecutive edge points are of interest, it is desirable to eliminate these other lines. One way of doing this is to take the mean point of all the edge points on the line. From this point, a few more points along the line are extrapolated. The difference in intensity between the two sides of the line is then checked at the selected points. If the intensity differences at the points do not exceed a certain threshold, the line is not made up of consecutive edge points.
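
The check can be sketched as follows for a grayscale image stored as a list of rows and a line with unit normal (a, b). The function name, the number and spacing of the sampled points, the sampling offset across the line and the threshold are all assumptions of the example.

```python
def is_consecutive_edge(image, line, edge_points,
                        step=3, count=2, offset=2, thres=20):
    """Compare intensities on both sides of the line at a few sampled points."""
    a, b, _ = line
    mx = sum(x for x, _ in edge_points) / len(edge_points)   # mean point
    my = sum(y for _, y in edge_points) / len(edge_points)
    height, width = len(image), len(image[0])

    def pixel(x, y):
        col = min(max(int(round(x)), 0), width - 1)    # clamp to the image
        row = min(max(int(round(y)), 0), height - 1)
        return image[row][col]

    for k in range(-count, count + 1):
        # Step along the line direction (-b, a) from the mean point.
        px, py = mx - b * step * k, my + a * step * k
        side1 = pixel(px + a * offset, py + b * offset)
        side2 = pixel(px - a * offset, py - b * offset)
        if abs(side1 - side2) <= thres:
            return False
    return True

# A solid edge at y = 10 and a broken edge made of short segments.
solid = [[100 if y >= 10 else 0 for x in range(20)] for y in range(20)]
broken = [[100 if y >= 10 and (x // 3) % 2 == 0 else 0 for x in range(20)]
          for y in range(20)]
pts = [(5, 10), (10, 10), (15, 10)]
edge_line = (0.0, 1.0, -10.0)
```

A thin line has the same intensity on both sides at the sampling offset, so this sketch rejects it as well.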

With this algorithm, not only are lines originating from non-consecutive edge points eliminated; the algorithm also eliminates thin lines in the image. This is a positive effect if only edge lines originating from thick frames are to be used as features. In Fig 13, the same algorithms as used earlier have been applied to the image 102 shown in Fig 6. The only difference in the algorithms is that no check has been made as to whether the lines consist of consecutive edge points along edges.

Fig 14 shows an enlargement of the result of the algorithm, with the check for consecutive edge points applied to the line 109 at the lower part of the numbers "12345678". The algorithm gave a negative result as to whether there was a consecutive edge pattern. Fig 15 is an enlargement of the same algorithm applied to the line 110 at the lower part of the frame. Here the algorithm gave a positive result, in that the edge points are consecutive.

G. Calculation of the homography matrix H

Once the feature candidates in the image have been obtained, they must be matched against features of the original sign, which have known coordinates. If four feature candidates have been found, their coordinates can be matched with the corresponding coordinates of feature points of the object, which are stored in the area 23 of the storage means 21, and the homography matrix H can be calculated. Since more candidates for the features of interest than the intended ones will probably be found, a verification procedure must be carried out. This procedure must verify that the chosen feature point correspondences have been correctly matched. Thus, if there are many candidates for possible feature points, the homography matrix should be calculated several times and verified each time, to check whether the point correspondence is the correct one.
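
By way of example, a homography can be estimated from four point correspondences with the direct linear transform, solved here with NumPy's SVD. This is a common textbook formulation assumed for illustration; the function names are hypothetical and the method of the preceding sections may differ in detail.

```python
import numpy as np

def homography_from_points(src, dst):
    """Estimate H (up to scale) such that dst ~ H * src in homogeneous terms."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 3)      # null vector of the 8x9 system

def apply_homography(H, p):
    """Map a 2D point through H and return inhomogeneous coordinates."""
    x, y, w = H @ np.array([p[0], p[1], 1.0])
    return (x / w, y / w)

# Corners of a unit square and a hypothetical quadrilateral they map to.
src = [(0, 0), (1, 0), (1, 1), (0, 1)]
dst = [(10, 20), (110, 25), (100, 130), (5, 120)]
H = homography_from_points(src, dst)
```

The four points must be in general position, i.e. no three of them on the same line, for the solution to be unique up to scale.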

Advantageously, this matching procedure is optimized by means of the RANSAC algorithm of Fischler and Bolles (see Fischler, M. A., and Bolles, R. C., "Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography", Comm. Assoc. Comp. Mach., 24(6):381-395, 1981).

RANSAC

RANSAC (the RANdom SAmple Consensus algorithm) is an estimation algorithm capable of working with very large sets of possible correspondences. In principle, the best way of determining the homography matrix H is to calculate H for all possible combinations, verify each solution and then use the correspondence with the best verification result. Verification can, as described below, be carried out in different ways. Since calculating H for every possible combination is very time-consuming, this is not a good approach when the algorithm is to run in real time.

The RANSAC algorithm is also a hypothesize-and-verify algorithm, but it works in a different way. Instead of systematically working through the possible feature points, it selects its correspondence points at random and then calculates the homography matrix and performs the verification. RANSAC repeats this procedure a certain number of times and then settles on the set of correspondences with the best verification result.

The advantages of the RANSAC procedure are that it is more robust when there are many possible feature points, and that it tests correspondences in random order. If the point correspondences are tested in systematic order and the algorithm happens to start with an incorrect point, all correspondences that this point can give rise to must be verified by the algorithm. This does not happen with RANSAC, since a point is matched with only one possible point correspondence, after which new feature points are selected for matching with each other. The RANSAC matching procedure is performed only a specified number of times, and the best solution is selected. Since the points are selected at random, the correct match, or at least one close to it, will sometimes have been selected, and these point correspondences can be used for calculating H.

Verification procedures

Once the homography matrix has been calculated, it must be verified that correct point correspondences have been used. This can be done in a few different ways.

A fifth feature

The most common way of verifying H is to use additional feature points. In this case, more than the four feature points of the original object must be known. The remaining points of the original object can be transformed into the coordinate system of the image. A verification procedure can then be carried out to check whether the points have been found in the image. The more additional features that are found, the greater the likelihood that the correct set of point correspondences has been selected.
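
The random matching and the verification by additional feature points can be combined as in the following sketch, which assumes NumPy. The function names, the tolerance, the iteration count, the fixed random seed and the convention that the first four reference points are the ones being matched are assumptions of the example; the loop also stops early once every additional point has been found, which is optional.

```python
import random
import numpy as np

def fit_homography(src, dst):
    """Direct linear transform from four point pairs (src -> dst)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 3)

def project(H, p):
    x, y, w = H @ np.array([p[0], p[1], 1.0])
    return None if abs(w) < 1e-12 else (x / w, y / w)

def verification_score(H, extra_reference, candidates, tol):
    """Count additional reference points that land on some image candidate."""
    score = 0
    for p in extra_reference:
        q = project(H, p)
        if q is not None and any(
                (q[0] - cx) ** 2 + (q[1] - cy) ** 2 < tol ** 2
                for cx, cy in candidates):
            score += 1
    return score

def ransac_homography(reference, candidates, iterations=5000, tol=0.5, seed=0):
    """Randomly match four candidates to reference[:4]; keep the best H."""
    rng = random.Random(seed)
    base, extra = reference[:4], reference[4:]
    best_H, best_score = None, -1
    for _ in range(iterations):
        guess = rng.sample(candidates, 4)       # random ordered hypothesis
        H = fit_homography(base, guess)         # object plane -> image
        score = verification_score(H, extra, candidates, tol)
        if score > best_score:
            best_H, best_score = H, score
        if best_score == len(extra):            # optional early stop
            break
    return best_H, best_score
```

A usage example with synthetic data: six reference points are projected with a known homography, shuffled, and passed in as candidates.

```python
H_true = np.array([[0.9, 0.1, 20.0], [-0.05, 1.1, 30.0], [0.0004, 0.0002, 1.0]])
reference = [(0, 0), (100, 0), (100, 60), (0, 60), (30, 20), (80, 45)]
image_pts = [project(H_true, p) for p in reference]
candidates = [image_pts[i] for i in (3, 5, 0, 2, 4, 1)]
H_est, score = ransac_homography(reference, candidates)
```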

Inner parameters of the camera

If the camera is calibrated, the assumed homography matrix can be verified with the inner parameters 24 of the camera, which are stored in the storage means 21 (see the discussion in preceding sections). This imposes further constraints on the selected feature points. If the points represent the corners of a rectangle, the first and second rows, $r_1$ and $r_2$, will give rise to the same value if the points are correctly matched, up to a rotation error of the rectangle of 180 degrees. This is evident, since a rectangle rotated by 180 degrees gives exactly the same rectangle. In the same way, a square can be rotated by 90, 180 or 270 degrees and still give exactly the same square. In all these cases, $r_1$ and $r_2$ will still be orthogonal.

Although this verification procedure may have a rotation error when the corners of a rectangle are used as feature points, it is still very useful, since rectangles are common features. The rotation error can easily be checked later.

Verification errors

Depending on how the feature points are chosen, errors may still occur when the feature points are verified. As mentioned above, the homography matrix is a homogeneous matrix and is determined only up to scale. If the object has points arranged in exactly the same configuration as the feature and verification points, except that they are rotated and/or scaled, the verification procedure will give exactly the same values as if the correct point correspondence had been found. It is therefore important to choose feature points that are as distinctive as possible.

Restrictions on RANSAC

RANSAC is based purely on randomness. If further information is available, it should obviously be used to optimize the RANSAC algorithm. Some restrictions that can be added are the following.

Stop if the solution is found

Instead of repeating the calculations in the procedure a fixed number of times, it is possible to stop as soon as the verification indicates that a good solution has been found. To decide whether a solution is good, a condition can be set, for example that if at least a certain number of feature points have been found in the verification procedure, this must be the correct homography matrix. If the camera's internal parameters are used as the verification procedure, the search can be stopped if r1 and r2 are very close to having the same length and being orthogonal.
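The stop condition based on the camera's internal parameters can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of the patent itself: it assumes a pinhole camera with focal length `f` (in pixels), principal point `(cx, cy)`, square pixels and zero skew, and it assumes that `H` maps coordinates in the object plane to image pixels, so that K^-1 H is proportional to [r1 r2 t]. The text above refers to r1 and r2 as rows; the sketch uses the column convention of this standard decomposition. The function names and the tolerance `tol` are hypothetical.

```python
import math

def rotation_columns(H, f, cx, cy):
    """Apply K^-1 to the first two columns of H.

    Assumes K = [[f, 0, cx], [0, f, cy], [0, 0, 1]] (an assumption of this
    sketch). The result is (r1, r2) up to a common scale factor.
    """
    cols = []
    for j in (0, 1):
        h1, h2, h3 = H[0][j], H[1][j], H[2][j]
        cols.append(((h1 - cx * h3) / f, (h2 - cy * h3) / f, h3))
    return cols

def looks_like_rotation(H, f, cx, cy, tol=0.05):
    """Return True if r1 and r2 are nearly orthogonal and nearly equal in length."""
    r1, r2 = rotation_columns(H, f, cx, cy)
    n1 = math.sqrt(sum(v * v for v in r1))
    n2 = math.sqrt(sum(v * v for v in r2))
    if n1 == 0.0 or n2 == 0.0:
        return False
    cos_angle = sum(a * b for a, b in zip(r1, r2)) / (n1 * n2)
    length_ratio = min(n1, n2) / max(n1, n2)
    return abs(cos_angle) < tol and (1.0 - length_ratio) < tol
```

Inside a RANSAC loop, a hypothesis for which `looks_like_rotation` returns True could end the iteration early. Because H is only determined up to scale, the test compares the ratio of the two lengths rather than their absolute values.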

Feature points on the same line

The constraint that only sets of feature points in which no three points lie on the same line may be used can be included in the RANSAC algorithm. After four points have been randomly selected, it is possible to check whether three of them lie on the same line before proceeding to calculate the homography matrix. Combined with the following two restrictions, this check is very time-efficient.
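The collinearity check described above can be sketched with a cross-product test. The function names and the tolerance `eps` are assumptions made for illustration; with noisy pixel coordinates a larger tolerance than the default would probably be needed.

```python
from itertools import combinations

def cross(o, a, b):
    """Twice the signed area of triangle (o, a, b); zero when the points are collinear."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def has_three_collinear(points, eps=1e-9):
    """Return True if any three of the given 2D points lie on one line (within eps)."""
    return any(abs(cross(p, q, r)) <= eps
               for p, q, r in combinations(points, 3))
```

For four points this tests only four triples, so a sample can be rejected cheaply before any homography is computed.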

Convex hull

The convex hull of an arbitrary set S of points is the smallest convex polygon P for which each point in S lies either on the boundary of P or in its interior. Two of the most common algorithms for calculating the convex hull are Graham's scan and Jarvis's march. Both algorithms use a technique called "rotational sweep" (see Cormen, T. H., Leiserson, C. E., and Rivest, R. L., "Introduction to Algorithms", The Massachusetts Institute of Technology, 1990, p. 898). When calculating the convex hull, these algorithms also yield the order of the vertices as they appear on the hull, in counterclockwise order. Graham's scan runs in O(n lg n) time, whereas Jarvis's march runs in O(nh) time, where n is the number of points and h is the number of vertices of the hull.
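As an illustration, Jarvis's march can be sketched in a few lines. This is a simplified version written for this example rather than the textbook listing; points are assumed to be given as `(x, y)` tuples, and the function name is hypothetical. It returns the hull vertices in counterclockwise order, starting from the leftmost point.

```python
def cross(o, a, b):
    """Positive if o -> a -> b is a counterclockwise turn, negative if clockwise."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def dist2(a, b):
    """Squared Euclidean distance between a and b."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def jarvis_march(points):
    """Convex hull by gift wrapping, O(n*h); vertices in counterclockwise order."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts
    start = pts[0]  # the leftmost (lowest on ties) point is always on the hull
    hull = []
    current = start
    while True:
        hull.append(current)
        candidate = pts[0] if pts[0] != current else pts[1]
        for p in pts:
            if p == current or p == candidate:
                continue
            turn = cross(current, candidate, p)
            # Keep the point that leaves every other point on the left;
            # among collinear points, keep the one farthest away.
            if turn < 0 or (turn == 0 and
                            dist2(current, p) > dist2(current, candidate)):
                candidate = p
        current = candidate
        if current == start:
            break
    return hull
```

For the four-point samples used in RANSAC, h is at most four, so the O(nh) cost of this approach is negligible.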

Since projective mappings preserve lines, they must also preserve the convex hull. In a set of four points, where no three points lie on the same line, the convex hull will consist of either three or four of the points. This means that for two sets of corresponding points, both convex hulls will consist of either three points or four points. A check of this can be included in the RANSAC algorithm after the two sets of four points have been selected.

Systematic search

The principle behind RANSAC is to randomly select four points, match them with four putatively corresponding points, which are also selected at random, and then discard these points and select new ones. It is possible to modify this algorithm and include some systematic operations. As soon as the two sets of four points have been selected, all possible matching combinations between these points can be tested. This means that there are 4! = 24 different combinations to test. If the restrictions above have been included, this number can be reduced considerably. First, one can ensure that no three of the four points in each set lie on the same line. Then one can check whether both sets have the same number of points on the convex hull. If so, the order of the points on the hull is also obtained, and the points can then be matched with each other in only three or four different ways, depending on how many points the hull consists of.

Thus, out of 24 possible combinations, 0, 3 or 4 putative point correspondences remain. Calculating the convex hull and ensuring that no three points lie on the same line does of course take time, but this is insignificant compared with calculating the homography matrix 24 times.
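The reduction from 24 combinations to 0, 3 or 4 can be sketched as an enumeration of cyclic shifts of the hull order. The function name and the argument layout are assumptions for this illustration: each hull is passed as a list in counterclockwise order, together with the list of points that are not on the hull (empty or one element for four points). The sketch also assumes that the mapping preserves orientation, which is consistent with the counts given in the text.

```python
def candidate_matchings(hull_a, inner_a, hull_b, inner_b):
    """Enumerate the point matchings that are consistent with the hull order.

    Returns a list of matchings, each a list of (point_a, point_b) pairs.
    The list is empty when the two hulls have different sizes.
    """
    if len(hull_a) != len(hull_b) or len(inner_a) != len(inner_b):
        return []
    matchings = []
    for shift in range(len(hull_b)):
        # Rotate the second hull; the cyclic order itself is kept intact.
        rotated = hull_b[shift:] + hull_b[:shift]
        pairs = list(zip(hull_a, rotated)) + list(zip(inner_a, inner_b))
        matchings.append(pairs)
    return matchings
```

Two quadrilateral hulls give four matchings, two triangular hulls with one interior point each give three, and mismatched hull sizes give none, so at most four homographies need to be computed per sample.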

H. Extraction of the target area

As soon as the homography matrix is known, any area of the image can be extracted so that it appears as if the image had been taken from a position directly in front of it. To perform this extraction, all points within the area of interest are transformed into the image plane according to the chosen solution. Since the image is a discrete coordinate frame, it consists of pixels at integer coordinates. The transformed points, however, will probably not be integers.

Therefore, bilinear interpolation (see, for example, Heckbert, P. S., "Graphics Gems IV", Academic Press, Inc., 1994) must be performed to obtain the intensity from the image. The transformed image can be reconstructed either from the grayscale intensity alone, or all three intensity levels can be obtained from the original color image.
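The extraction step can be sketched as follows for a grayscale image. This is a minimal illustration, assuming that the image is stored as a list of rows, that `H_target_to_image` maps coordinates in the rectified target area to image coordinates, and that samples falling outside the image are clamped to the border. All names are hypothetical.

```python
def apply_homography(H, x, y):
    """Map the point (x, y) through the 3x3 matrix H using homogeneous coordinates."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)

def bilinear(image, x, y):
    """Interpolate the intensity at the non-integer position (x, y).

    x is the column coordinate and y the row coordinate; positions outside
    the image are clamped to the nearest border pixel.
    """
    rows, cols = len(image), len(image[0])
    x = min(max(x, 0.0), cols - 1.0)
    y = min(max(y, 0.0), rows - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, cols - 1), min(y0 + 1, rows - 1)
    fx, fy = x - x0, y - y0
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

def extract_target(image, H_target_to_image, width, height):
    """Build the rectified target area by sampling the image at mapped positions."""
    return [[bilinear(image, *apply_homography(H_target_to_image, u, v))
             for u in range(width)]
            for v in range(height)]
```

Iterating over the output pixels and sampling the source image gives every output pixel exactly one value, so the result has no holes. For a color image, the same interpolation would be applied to each of the three channels.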

Fig. 16 shows the target area 101 of the image 102 in Fig. 6, as found by the algorithms above.

In Fig. 17, the target area 101' has been transformed so that, for example, OCR or barcode interpretation can follow (steps 36 and 37 in Fig. 3). In this example, a resolution of 128 pixels in the x-direction was chosen.

I. Alternative embodiments

The invention has been described above with reference to a preferred embodiment. However, embodiments other than the one described above are equally possible within the scope of the invention, as defined by the appended claims. Specifically, it is noted that the invention may be implemented in portable devices other than the one described above, for example mobile telephones, portable digital assistants (PDAs), handheld computers, calendars, communication devices, etc.

Furthermore, it is possible, within the scope of the invention, to perform some of the steps of the inventive method in the external computer 200 rather than in the handheld device 300. For example, the transformed target area 101 can be transferred as a digital image (JPEG, GIF, TIFF, BMP, EPS, etc.) over the link 301 to the computer 200, which will then perform the actual processing of the transformed target area 101 in order to extract the desired information (OCR text, barcode, etc.).

Naturally, the computer 200 may, in a conventional manner, be connected to a local network or a global network, such as the Internet, which allows the extracted information to be forwarded to further applications outside the handheld device 300 and the computer 200. Alternatively, the extracted information can be communicated through a mobile telephone, which is operatively connected to the handheld device 300 via IrDA, Bluetooth or cable (not shown in the drawings).

Claims


A method of extracting information from a target area (101) within a two-dimensional graphic object (100) having a plurality of predetermined features (23) with known characteristics in a first plane, characterized by the steps of: reading an image (102) in which said object (100) is located in a second plane, said second plane not being known in advance; identifying, in said image, a plurality of candidates (108) for said predetermined features (23) in said second plane; calculating, from said identified plurality of feature candidates, a transformation matrix (H) for projective mapping between said second and first planes; transforming said target area (101) from said second plane to said first plane; and processing said target area to extract said information.

The method of claim 1, wherein said plurality of predetermined features (23) are read from a memory (21) before said plurality of feature candidates (108) are identified.

The method of claim 1 or 2, wherein said plurality of predetermined features (23) include at least four features.

The method of claim 3, wherein said at least four predetermined features are four points, four lines, three points and one line, or one point and three lines.

The method of claim 3, wherein said at least four predetermined features are four points, said plurality of feature candidates (108) being identified by: locating edge points (103) as points in said image (102) having large gradients, clustering said edge points into lines (104), and determining said plurality of feature candidates as intersection points (105, 106, 108) between any two of said lines.

The method of claim 5, wherein said intersection points (105, 106, 108) are at four corner points on a frame in said two-dimensional graphic object.

A method according to any one of the preceding claims, wherein said transformation matrix (H) is calculated by: randomly selecting, from said identified plurality of feature candidates, as many feature candidates as there are features in said plurality of predetermined features (23), calculating a hypothetical transformation matrix for said randomly selected candidates and said plurality of predetermined features, verifying the hypothetical transformation matrix, repeating the steps above several times, and selecting, as said transformation matrix (H), the specific hypothetical transformation matrix with the best result from the verification step.

A method according to claim 6 or 7, wherein the hypothetical transformation matrix is verified by means of at least one further predetermined feature.

A method according to any one of claims 6-8, wherein said plurality of predetermined features (23) comprises at least four points and wherein said step of random selection is limited to a set of four feature candidates which does not include three collinear points.

The method of claim 9, wherein said step of random selection is further limited by calculating the convex hull of said feature candidates.

A method according to any one of the preceding claims, wherein said plurality of predetermined features (23) comprises at least one point having a value for grayscale, color, intensity or luminescence which is clearly distinct from surrounding points in said two-dimensional graphic object (100).

A method according to any one of the preceding claims, wherein said two-dimensional graphic object (100) is a sign.

A method according to any preceding claim, wherein said processing step comprises optical character recognition (OCR) of said target area (101).

A method according to any preceding claim, wherein said processing step comprises barcode interpretation of said target area (101).

A method according to any preceding claim, wherein said processing step comprises transferring said target area (101) to an external computer (200).

A method according to any one of the preceding claims, wherein said first plane is the image plane of said read image (102).

A method according to any one of claims 1-15, wherein said first plane is the image plane of a previously read image.

A method according to claims 1-17, wherein said plurality of predetermined features (23) are obtained by direct measurement in said previously read image.

A computer program product which is directly loadable into an internal memory (21) of a processing device (20), the computer program product comprising program code (22) for performing the steps according to any one of claims 1-18 when executed by said processing device.

A computer program product according to claim 19, embodied on a computer-readable medium (21).

A handheld image-generating device (300) having storage means (21) and a processing device (20), the storage means containing program code (22) for performing the steps according to any one of claims 1-18 when executed by said processing device.

An apparatus for extracting information from a target area (101) within a two-dimensional graphic object (100) having a plurality of predetermined features (23) with known characteristics in a first plane, the apparatus comprising an image sensor (8), a processing device (20) and storage means (21), characterized by a first area (25) in said storage means (21), said first area being adapted for storing an image (102) recorded by said image sensor (8), in which said object (100) is located in a second plane, said second plane not being known in advance, and a second area (23) in said storage means (21), said second area being adapted for storing said plurality of predetermined features, wherein said processing device (20) is arranged to read said image (102) from said first area (25), to read said plurality of predetermined features from said second area (23), to identify in said image a plurality of candidates for said features in said second plane, to calculate, from said identified feature candidates, a transformation matrix (H) for projective mapping between said second and first planes, to transform said target area (101) in said object from said second plane to said first plane, and, after the transformation, to extract said information from said target area.

The apparatus of claim 22, further comprising an optical character recognition (OCR) module (29) adapted to extract said information from said target area (101).

The apparatus of claim 22, further comprising a module (29 ') for bar code interpretation, which is adapted to extract said information from said target area (101).

An apparatus according to any one of claims 22-24 in the form of a handheld device (300).

The apparatus of any one of claims 22-24, wherein said apparatus comprises a handheld device (300) and a computer (200).