WO2017091060A1 - A system and method for detecting objects from image - Google Patents


Info

Publication number
WO2017091060A1
Authority
WO
WIPO (PCT)
Prior art keywords
blobs
objects
image
spaces
groups
Prior art date
Application number
PCT/MY2016/050068
Other languages
French (fr)
Inventor
Hamam MOKAYED
Kim Meng Liang
Hock Woon Hon
Yan Chai HUM
Kelvin Yir Siang LO
Original Assignee
Mimos Berhad
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mimos Berhad filed Critical Mimos Berhad
Publication of WO2017091060A1 publication Critical patent/WO2017091060A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/63Scene text, e.g. street names
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds

Definitions

  • This disclosure describes a system and method for detecting one or more objects from at least an image. Such invention is applicable for detecting license plates of vehicles in order to identify the content which includes numeric or alphanumeric characters on the surface of the plates.
  • LPD License Plate Detection
  • Prior art US20140369566 discloses a method for electronically identifying a license plate. This method utilizes multiple image capture devices that are positioned within a predetermined geographical area to capture the image containing the license plate. A polygon algorithm is employed to locate the license plate within the captured image that contains alpha-numeric characters. Upon recognizing the characters, the characters are compared to a predetermined database that comprises vehicle identification values.
  • patent application WO201419320 describes a system and method for identifying multiple license plates from multiple video streams. The system adopts an edge detection based plate identification module to process multiple images of the video streams in parallel in order to locate the license plates. Data on the plate is extracted by a region-of-interest (ROI) based extraction module and sent to an optical character recognition (OCR) unit to identify the characters from the license plate area.
  • ROI region-of-interest
  • OCR optical character recognition
  • One aspect of this invention is to provide a system and method that is capable of identifying more than one object in a single image regardless of the distance between the objects.
  • this invention allows license plates of tailgating vehicles to be detected.
  • Yet another aspect of this invention is to provide a system and method that has the ability to detect and identify content on the object even when there is space within the content.
  • this invention aims to enable the characters on the license plate to be entirely detected and identified without being affected by the space between the characters.
  • Another aspect of this invention is to provide a system and method that is able to fulfil speed and accuracy demands through fully detecting the objects at the shortest time possible.
  • the preferred embodiment of this invention describes a system for detecting one or more objects from at least one image that comprises one or more servers having at least one processor for managing processes executed by the system and a database for storing data; an image capturing device for capturing the image of the objects; and a detection engine including an edge detection module for conducting edge-based technique on the image to identify Binary Large Objects (BLOBs) that potentially represent the objects upon binarizing the image; a dynamic dilation module for performing dynamic dilation that determines different bodies of which the objects are respectively located thereon to identify BLOBs of different objects that appear in a single image; a filtration module for applying group-based filtering on the BLOBs that groups similar BLOBs and filters away unwanted noise to determine entirety of the BLOBs for each object; and an aggregation module for aggregating the groups of BLOBs that belong to the same object; wherein the aggregated BLOBs are representations of each object that are individually, and entirely or substantially entirely detected from the images despite having gaps in between the different bodies or spaces, or in between both of the different bodies and spaces on each object.
  • this invention also relates to a method for detecting one or more objects from at least one image which comprises the steps of conducting edge-based technique on the image for identifying Binary Large Objects (BLOBs) that potentially represent the objects upon binarizing the image; performing dynamic dilation that determines different bodies of which the objects are respectively located thereon for identifying BLOBs of different objects that appear in a single image; applying group-based filtering on the BLOBs that groups similar BLOBs and filters away unwanted noise for determining entirety of the BLOBs for each object; and aggregating the groups of BLOBs that belong to the same object; wherein the aggregated BLOBs are representations of each object that are individually, and entirely or substantially entirely detected from the images despite having gaps in between the different bodies or spaces, or in between both of the different bodies or spaces on each object.
  • BLOBs Binary Large Objects
  • Figure 1 is a diagram showing the architecture of an overall system for detecting objects and identifying the content on the object.
  • Figure 2 is a flow chart showing the overall method for detecting one or more objects from an image.
  • Figure 3 illustrates an image of vehicles that is captured for vehicle license plate detection.
  • Figure 4 illustrates an image that has undergone edge-based detection for the detection of vehicle license plate.
  • Figure 5 is a flow chart showing the dynamic dilation process.
  • Figure 6 illustrates an image that has undergone dynamic dilation for the detection of vehicle license plate.
  • Figure 7 shows the effect of dynamic dilation by using the detection of vehicle license plate as an example.
  • Figure 8 is a flow chart showing the group-based filtration process.
  • Figure 9 illustrates an image that has undergone group-based filtration for the detection of vehicle license plate.
  • Figure 10 shows the effect of group-based filtration by using the detection of vehicle license plate as an example.
  • Figure 11 is a flow chart showing the similar BLOBs aggregation process.
  • Figure 12 illustrates an image that has undergone similar BLOBs aggregation for the detection of vehicle license plate.
  • Figure 13 shows the effect of similar BLOBs aggregation by using the detection of vehicle license plate as an example.
  • This disclosure describes a computer-implemented system and method for detecting one or more objects from at least an image.
  • Such invention is useful in locating objects within an area.
  • image capturing devices like surveillance cameras are used for capturing image (110) that contains the object along with its surrounding environment.
  • automatic detection by computer systems is more widely used for large-scale and long duration detection.
  • the number and types of objects, as well as the types of body where the object is positioned and the area in which the object is located are not limited to the example described herein.
  • the detection of vehicle license plate is used as an example of the object to be detected in this disclosure.
  • the term 'body' in this example refers to the vehicle of which the license plate is located thereon.
  • Figure 1 illustrates the architecture of a preferred embodiment of an overall system for detecting objects and identifying the content on the objects.
  • the object to be detected as shown in Figure 1 is the vehicle license plate, whereas the content to be identified is the characters on the license plate.
  • the main focus of this invention lies on the detection engine (2000) that is part of this system.
  • the system is operable by one or more servers that have at least one processor for managing processes executed by the system including all devices, engines and modules, and a database for storing data.
  • Prior to the detection stage (20) of the object is the initialization stage (10) in which the image of the object is captured (110) by an image capturing device (1000) which can be a still image camera or video camera.
  • the object can be in static or moving status.
  • a region of interest (ROI) is set such that when the body of the object passes by the ROI, the image capturing device (1000) will be initialized, thereby capturing the image (110) of the object. If the object is static, both the still image and video cameras can be used, whereas if the object is moving, a video camera is required. An image of the moving object is grabbed from the video for detection purposes.
  • Upon obtaining the image, the image capturing device (1000) sends the image to the detection engine (2000) which comprises at least four main modules, namely the edge detection module (2001), dynamic dilation module (2002), filtration module (2003) and aggregation module (2004).
  • Figure 2 is a flow chart showing the overall method for detecting one or more objects in an image.
  • For the binarization of the image to be conducted through the edge-based technique (210), the color of the image is first converted to grayscale (120). The binarized image turns into the form of Binary Large Object (BLOB) after the binarization process.
  • BLOB Binary Large Object
  • the edge-based detection (210) serves to identify edges of the object of interest. Regions with high edge variance or change in brightness are considered as BLOBs that potentially represent the objects.
  • An example of the edge-based detection for detection of vehicle license plate is depicted in Figure 4 whereby the edges of the vehicles and their license plates are evidently shown.
  • the accuracy in the number of detected objects can be affected in conventional detection systems where there are very close gaps in between objects or the bodies of the objects that cause them to seem connected with each other.
  • a second vehicle can be seen tailgating a first vehicle. Due to the angle of the shot of the image, both vehicles look connected with each other while in actuality there is a gap in between the two vehicles. Such a situation would likely cause conventional detection systems to wrongly determine the two vehicles as one. Standard dilation methods could also cause the license plate of the first vehicle to connect with the second vehicle, thus causing error in the detection.
  • this issue is solved by the dynamic dilation module (2002) through performing dynamic dilation (220) that determines the different bodies where the objects are respectively located thereon in order to identify BLOBs of different objects that appear in a single image.
  • the dynamic dilation module (2002) calculates densities (221) of surrounding pixels in at least four directions which include left, right, top and bottom of all pixels of the BLOBs.
  • the direction (222) and value (223) of the dilation are respectively determined based on the calculated densities to join bodies of the objects that have substantially similar shapes but of different sizes with lines.
  • the lines show the distances between the bodies, indicating the existence of gap between the bodies.
  • iDirOffset is the recommended offset value to scan the surrounding pixels
  • in Figure 6, the joining line (600) between the first and second vehicles shows that the two vehicle bodies are not connected to each other.
  • the dynamic dilation (220) process is able to detect the number of vehicle license plates appearing in the image.
  • Figure 7 shows the effect of dynamic dilation (220) by using the detection of vehicle license plate as an example.
  • the license plates of the two tailgating vehicles can be detected without being confused as a similar entity as the vehicles in front of them.
  • Figure 8 is a flow chart showing the group-based filtration (230) process which is conducted by the filtration module (2003) for applying group-based filtering (230) on the BLOBs after the dynamic dilation (220) process. This process groups similar BLOBs and filters away unwanted noise to determine entirety of the BLOBs.
  • iVerDiff1 = m_vecMiny[ii] - m_vecMiny[i] - Equation 5
  • iVerDiff2 = m_vecMaxy[ii] - m_vecMaxy[i] - Equation 6
  • iHorDiff1 = m_vecMaxx[ii] - m_vecMinx[i] - Equation 7
  • iHorDiff2 = m_vecMaxx[i] - m_vecMinx[ii] - Equation 8
  • the BLOBs are arranged into groups (234). BLOBs which are near together are arranged in the same group according to the following rule: if (iVerDiff1 < Th && iVerDiff2 < Th && (iHorDiff1 < Th || iHorDiff2 < Th)), both BLOBs (i, ii) belong to the same group.
  • the features including compactness, ratio, density and dimension of each group are calculated (235). These features are denoted as iMiny, iMinx, iMaxx, iMaxy, iWidth, iHeight, iArea, iCompactness and iRatio which are used as inputs for the filtration process through rules such as the compactness rule, ratio rule, white pixel density rule and dimension-based rule to remove unwanted noise.
  • the entirety of the BLOBS for each group is determined.
  • Figure 9 illustrates an image that has undergone the group-based filtration (230) stage for the detection of vehicle license plate.
  • the entire vehicle license plates can be seen clearly after the filtration (230).
  • Comparison between the pre and post group-based filtration (230) process is shown in Figure 10. The complete detection of the vehicle license plates indicates the effect of this process.
  • the aggregation module (2004) calculates features (241) including compactness, intensity and location of each group, and aggregates groups of BLOBs (240) which have similar features upon comparing the features (242).
  • the aggregated BLOBs are representations of each object that are individually, and entirely or substantially entirely detected from the images despite having gaps in between the different bodies or spaces, or in between both of the different bodies and spaces on each object.
  • Figure 12 illustrates an image that has undergone similar BLOBs aggregation process (240) for the detection of vehicle license plate.
  • the content on vehicle license plate which is a string of characters is divided into two groups by a space in between. Through the aggregation process (240), the two groups are identified to be of the same license plate. Therefore, the groups are aggregated (240) into a single BLOB that represents the entire license plate.
  • Figure 13 shows the effect of similar BLOBs aggregation (240) by using the detection of vehicle license plate as an example.
  • the aggregation (240) process not only ensures that all parts of the object can be fully detected but also allows the complete content of the object to be identified.
  • the BLOB is sent to an identification engine (3000) for identifying the content on the object (30).
  • the identification engine (3000) comprises a segmentation module (3001) for segmenting the content in the detected object into individual units; a recognition module (3002) for recognizing the individual units with characters; and an analyzing module (3003) for analyzing all the recognized characters to determine accuracy of the content.
  • a segmentation module 3001 for segmenting the content in the detected object into individual units
  • a recognition module 3002
  • an analyzing module 3003 for analyzing all the recognized characters to determine accuracy of the content.
  • a display device (4000) of the system is used to show the outcome of the detection (40) that includes the detected object and the recognized content of the object.
  • Vehicle license plates that are detected can be observed from the display device (4000) along with the identified characters on the plates.
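
The aggregation step described above (merging character groups that a plate's middle space has split apart) can be sketched as follows. This is a minimal illustration, not the patented implementation: the bounding-box representation, the `max_gap` threshold and the vertical-alignment tolerance are assumptions made for this example only.

```python
def aggregate_groups(groups, max_gap=15):
    """Union bounding boxes (min_x, min_y, max_x, max_y) of BLOB groups
    that are vertically aligned and horizontally closer than `max_gap`,
    so both halves of a spaced license plate become one BLOB."""
    merged = []
    for x0, y0, x1, y1 in sorted(groups):
        if merged:
            mx0, my0, mx1, my1 = merged[-1]
            # Same row (tops roughly aligned) and small horizontal gap: merge.
            if abs(y0 - my0) < 10 and x0 - mx1 < max_gap:
                merged[-1] = (mx0, min(my0, y0), max(mx1, x1), max(my1, y1))
                continue
        merged.append((x0, y0, x1, y1))
    return merged

# Two character groups separated by the plate's middle space merge into one.
print(aggregate_groups([(0, 0, 40, 14), (50, 1, 95, 14)]))
# → [(0, 0, 95, 14)]
```
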

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Traffic Control Systems (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

This disclosure describes a system and method for detecting objects from at least an image. Such invention is useful in detecting vehicle license plate and identifying the characters on the license plate. The method comprises steps of conducting edge-based technique (210) on the image for identifying Binary Large Objects (BLOBs) that potentially represent the objects upon binarizing the image; performing dynamic dilation (220) that determines different bodies of which the objects are respectively located thereon for identifying BLOBs of different objects that appear in a single image; applying group-based filtering (230) on the BLOBs that groups similar BLOBs and filters away unwanted noise for determining entirety of the BLOBs for each object; and aggregating the groups of BLOBs that belong to the same object (240). The aggregated BLOBs are representations of each object that are individually, and entirely or substantially entirely detected from the images despite having gaps in between the different bodies or spaces, or in between both of the different bodies and spaces on each object. Furthermore, this invention enables multiple objects to be detected from a single image.

Description

A SYSTEM AND METHOD FOR DETECTING OBJECTS FROM IMAGE
FIELD OF INVENTION This disclosure describes a system and method for detecting one or more objects from at least an image. Such invention is applicable for detecting license plates of vehicles in order to identify the content which includes numeric or alphanumeric characters on the surface of the plates.
BACKGROUND OF THE INVENTION
Detection of static or moving objects within a geographical area is a challenging task that involves the field of computer vision or digital image processing. Generally, detection is conducted on images of these objects and their surrounding environment that are captured by surveillance cameras. Algorithms are executed by detection systems to effectively locate the object of interest from the images. One such example is the License Plate Detection (LPD) algorithm that aids detection and identification of vehicle license plates. The implementation of LPD to the detection system facilitates a variety of applications including access-control, security, traffic monitoring, law enforcement, parking management and ticketing.
Prior art US20140369566 discloses a method for electronically identifying a license plate. This method utilizes multiple image capture devices that are positioned within a predetermined geographical area to capture the image containing the license plate. A polygon algorithm is employed to locate the license plate within the captured image that contains alpha-numeric characters. Upon recognizing the characters, the characters are compared to a predetermined database that comprises vehicle identification values. On the other hand, patent application WO201419320 describes a system and method for identifying multiple license plates from multiple video streams. The system adopts an edge detection based plate identification module to process multiple images of the video streams in parallel in order to locate the license plates. Data on the plate is extracted by a region-of-interest (ROI) based extraction module and sent to an optical character recognition (OCR) unit to identify the characters from the license plate area.
Despite the development of various detection methods and systems, the accuracy of detection is often affected by different conditions such as close distances between the objects, existence of spaces among content on the object and noise around the objects. In the example of detecting vehicle license plate, these conditions include vehicle tailgating, spaces among characters of the license identifier, number of borders on the license plate and noise surrounding the vehicles and the license plates. Solutions to these limitations are required for improving performance and accuracy of existing detection systems and methods. Therefore, the invention proposed herein is designed to address the issues listed above.
SUMMARY OF INVENTION
One aspect of this invention is to provide a system and method that is capable of identifying more than one object in a single image regardless of the distance between the objects. In the example of vehicle license plate detection, this invention allows license plates of tailgating vehicles to be detected.
Another aspect of this invention is to provide a system and method that enables separation of the objects of interest from surrounding noise that is caused by binarization and other pre-processing stages for enhancing detection accuracy. Still another aspect of this invention is to provide a system and method that connects split parts of the same object in order to efficiently detect the entirety of the object, whereby the existence of borders, lines and spaces on the object are possible causes for the splitting of the object.
Yet another aspect of this invention is to provide a system and method that has the ability to detect and identify content on the object even when there is space within the content. In the application of vehicle license plate detection, this invention aims to enable the characters on the license plate to be entirely detected and identified without being affected by the space between the characters.
Also another aspect of this invention is to provide a system and method that is able to fulfil speed and accuracy demands through fully detecting the objects at the shortest time possible.
At least one of the preceding aspects is met, in whole or in part, by this invention, in which the preferred embodiment of this invention describes a system for detecting one or more objects from at least one image that comprises one or more servers having at least one processor for managing processes executed by the system and a database for storing data; an image capturing device for capturing the image of the objects; and a detection engine including an edge detection module for conducting edge-based technique on the image to identify Binary Large Objects (BLOBs) that potentially represent the objects upon binarizing the image; a dynamic dilation module for performing dynamic dilation that determines different bodies of which the objects are respectively located thereon to identify BLOBs of different objects that appear in a single image; a filtration module for applying group-based filtering on the BLOBs that groups similar BLOBs and filters away unwanted noise to determine entirety of the BLOBs for each object; and an aggregation module for aggregating the groups of BLOBs that belong to the same object; wherein the aggregated BLOBs are representations of each object that are individually, and entirely or substantially entirely detected from the images despite having gaps in between the different bodies or spaces, or in between both of the different bodies and spaces on each object.
In accordance with the aforementioned aspects, this invention also relates to a method for detecting one or more objects from at least one image which comprises the steps of conducting edge-based technique on the image for identifying Binary Large Objects (BLOBs) that potentially represent the objects upon binarizing the image; performing dynamic dilation that determines different bodies of which the objects are respectively located thereon for identifying BLOBs of different objects that appear in a single image; applying group-based filtering on the BLOBs that groups similar BLOBs and filters away unwanted noise for determining entirety of the BLOBs for each object; and aggregating the groups of BLOBs that belong to the same object; wherein the aggregated BLOBs are representations of each object that are individually, and entirely or substantially entirely detected from the images despite having gaps in between the different bodies or spaces, or in between both of the different bodies or spaces on each object.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a diagram showing the architecture of an overall system for detecting objects and identifying the content on the object.
Figure 2 is a flow chart showing the overall method for detecting one or more objects from an image.
Figure 3 illustrates an image of vehicles that is captured for vehicle license plate detection.
Figure 4 illustrates an image that has undergone edge-based detection for the detection of vehicle license plate.
Figure 5 is a flow chart showing the dynamic dilation process.
Figure 6 illustrates an image that has undergone dynamic dilation for the detection of vehicle license plate.
Figure 7 shows the effect of dynamic dilation by using the detection of vehicle license plate as an example.
Figure 8 is a flow chart showing the group-based filtration process.
Figure 9 illustrates an image that has undergone group-based filtration for the detection of vehicle license plate.
Figure 10 shows the effect of group-based filtration by using the detection of vehicle license plate as an example.
Figure 11 is a flow chart showing the similar BLOBs aggregation process.
Figure 12 illustrates an image that has undergone similar BLOBs aggregation for the detection of vehicle license plate.
Figure 13 shows the effect of similar BLOBs aggregation by using the detection of vehicle license plate as an example.
DETAILED DESCRIPTION OF THE INVENTION
For a better understanding of the invention, preferred embodiments of the invention that are illustrated in the accompanying drawings will be described in detail.
This disclosure describes a computer-implemented system and method for detecting one or more objects from at least an image. Such invention is useful in locating objects within an area. Commonly, image capturing devices (1000) like surveillance cameras are used for capturing image (110) that contains the object along with its surrounding environment. As manual detection from images consumes time and human resources, and is also prone to errors, automatic detection by computer systems is more widely used for large-scale and long duration detection.
It should be noted that the number and types of objects, as well as the types of body where the object is positioned and the area in which the object is located are not limited to the example described herein. To ease the understanding of the invention, the detection of vehicle license plate is used as an example of the object to be detected in this disclosure. The term 'body' in this example refers to the vehicle of which the license plate is located thereon.
Figure 1 illustrates the architecture of a preferred embodiment of an overall system for detecting objects and identifying the content on the objects. The object to be detected as shown in Figure 1 is the vehicle license plate, whereas the content to be identified is the characters on the license plate. The main focus of this invention lies on the detection engine (2000) that is part of this system. The system is operable by one or more servers that have at least one processor for managing processes executed by the system including all devices, engines and modules, and a database for storing data.
Prior to the detection stage (20) of the object is the initialization stage (10) in which the image of the object is captured (110) by an image capturing device (1000) which can be a still image camera or video camera. The object can be in static or moving status. A region of interest (ROI) is set such that when the body of the object passes by the ROI, the image capturing device (1000) will be initialized, thereby capturing the image (110) of the object. If the object is static, both the still image and video cameras can be used, whereas if the object is moving, a video camera is required. An image of the moving object is grabbed from the video for detection purposes.
Upon obtaining the image, the image capturing device (1000) sends the image to the detection engine (2000) which comprises at least four main modules, namely the edge detection module (2001), dynamic dilation module (2002), filtration module (2003) and aggregation module (2004). Figure 2 is a flow chart showing the overall method for detecting one or more objects in an image. For the binarization of the image to be conducted through the edge-based technique (210), the color of the image is first converted to grayscale (120). The binarized image turns into the form of Binary Large Object (BLOB) after the binarization process. The edge-based detection (210) serves to identify edges of the object of interest. Regions with high edge variance or change in brightness are considered as BLOBs that potentially represent the objects. Scattered noise is also removed in this process. An example of the edge-based detection for detection of vehicle license plate is depicted in Figure 4 whereby the edges of the vehicles and their license plates are evidently shown. The accuracy in the number of detected objects can be affected in conventional detection systems where there are very close gaps in between objects or the bodies of the objects that cause them to seem connected with each other. For instance in Figure 3, a second vehicle can be seen tailgating a first vehicle. Due to the angle of the shot of the image, both vehicles look connected with each other while in actuality there is a gap in between the two vehicles. Such a situation would likely cause conventional detection systems to wrongly determine the two vehicles as one. Standard dilation methods could also cause the license plate of the first vehicle to connect with the second vehicle, thus causing error in the detection.
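
As a rough illustration of this binarization step, the sketch below converts an image to grayscale and keeps pixels whose local gradient magnitude is high, so edge-dense regions survive as candidate BLOB pixels. The function name, luminance weights and threshold value are assumptions for the example, not taken from the patent:

```python
import numpy as np

def edge_binarize(rgb, thresh=40.0):
    """Grayscale conversion followed by edge-strength thresholding.

    Pixels whose gradient magnitude (change in brightness) exceeds
    `thresh` are kept as candidate BLOB pixels (value 1)."""
    gray = rgb @ np.array([0.299, 0.587, 0.114])  # standard luminance weights
    gy, gx = np.gradient(gray)                    # vertical / horizontal gradients
    magnitude = np.hypot(gx, gy)                  # edge strength per pixel
    return (magnitude > thresh).astype(np.uint8)

# Synthetic scene: a bright rectangular "plate" on a dark background.
img = np.zeros((20, 30, 3))
img[5:15, 8:22] = 255.0
binary = edge_binarize(img)
print(binary.sum() > 0)  # True: edge pixels appear along the rectangle border
```

In a real pipeline this thresholding would be followed by connected-component labelling to obtain the BLOBs themselves.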
This issue is solved by the dynamic dilation module (2002) through performing dynamic dilation (220) that determines the different bodies where the objects are respectively located thereon in order to identify BLOBs of different objects that appear in a single image. With reference to Figure 5, once the BLOBs that potentially represent the object are detected by the edge detection module (2001), the dynamic dilation module (2002) calculates densities (221) of surrounding pixels in at least four directions which include left, right, top and bottom of all pixels of the BLOBs. Next, the direction (222) and value (223) of the dilation are respectively determined based on the calculated densities to join bodies of the objects that have substantially similar shapes but of different sizes with lines. The lines show the distances between the bodies, indicating the existence of gaps between the bodies. The number of gaps allows the number of bodies to be determined. Therefore, the BLOBs of different objects that appear in a single image can be accurately identified. Assuming iDirOffset is the recommended offset value to scan the surrounding pixels, the range of each direction is determined based on the following equations:
Left: (i+ii)*_iWidth+j ; ii: [-iDirOffset, -1] ; i: [iDirOffset*2, Height - iDirOffset*2] ; j: [iDirOffset*2, Width - iDirOffset*2] - Equation 1

Right: (i+ii)*_iWidth+j ; ii: [1, iDirOffset] ; i: [iDirOffset*2, Height - iDirOffset*2] ; j: [iDirOffset*2, Width - iDirOffset*2] - Equation 2

Bottom: (i)*_iWidth+j+ii ; ii: [-iDirOffset, -1] ; i: [iDirOffset*2, Height - iDirOffset*2] ; j: [iDirOffset*2, Width - iDirOffset*2] - Equation 3

Top: (i)*_iWidth+j+ii ; ii: [1, iDirOffset] ; i: [iDirOffset*2, Height - iDirOffset*2] ; j: [iDirOffset*2, Width - iDirOffset*2] - Equation 4
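The density scan over a flattened, row-major image buffer implied by Equations 1 to 4 can be sketched as follows. The direction labels and index forms follow the equations as printed; the helper names (`direction_densities`, `dilation_choice`) and the rule of dilating toward the densest neighbourhood are illustrative assumptions, not details stated in the description:

```python
def direction_densities(buf, width, i, j, offset):
    """Count set pixels around position (i, j) in four scan directions.

    `buf` is a flattened binary image (row-major, index = i*width + j),
    mirroring the (i+ii)*_iWidth+j and i*_iWidth+j+ii index forms of
    Equations 1-4; `offset` plays the role of iDirOffset.
    """
    left = sum(buf[(i + ii) * width + j] for ii in range(-offset, 0))
    right = sum(buf[(i + ii) * width + j] for ii in range(1, offset + 1))
    bottom = sum(buf[i * width + j + ii] for ii in range(-offset, 0))
    top = sum(buf[i * width + j + ii] for ii in range(1, offset + 1))
    return {"left": left, "right": right, "bottom": bottom, "top": top}

def dilation_choice(densities):
    """Pick the dilation direction (222) as the densest neighbourhood and
    use its pixel count as the dilation value (223) -- one plausible
    reading of those steps."""
    direction = max(densities, key=densities.get)
    return direction, densities[direction]
```

On a 5x5 buffer with a single set pixel one row below (i, j) = (2, 2), the scan reports a density of 1 in the corresponding direction and 0 elsewhere, so the dilation is applied toward that side only.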
In Figure 6, the joining line (600) between the first and second vehicles shows that the two vehicle bodies are not connected to each other. Hence, the dynamic dilation (220) process is able to detect the number of vehicle license plates appearing in the image. Figure 7 shows the effect of dynamic dilation (220) by using the detection of vehicle license plates as an example. The license plates of the two tailgating vehicles can be detected without being confused as a single entity with the vehicles in front of them.

Figure 8 is a flow chart showing the group-based filtration (230) process, which is conducted by the filtration module (2003) for applying group-based filtering (230) on the BLOBs after the dynamic dilation (220) process. This process groups similar BLOBs and filters away unwanted noise to determine the entirety of the BLOBs. Upon scanning the BLOBs (231), the values of horizontal and vertical spaces among the BLOBs are computed (232) based on Equations 5 to 8.

iVerDiff1 = m_vecMiny[ii] - m_vecMiny[i] - Equation 5

iVerDiff2 = m_vecMaxy[ii] - m_vecMaxy[i] - Equation 6

iHorDiff1 = m_vecMaxx[ii] - m_vecMinx[i] - Equation 7

iHorDiff2 = m_vecMaxx[i] - m_vecMinx[ii] - Equation 8
By comparing the computed values with a predetermined threshold value (233), the BLOBs are arranged into groups (234). BLOBs which are near to one another are arranged in the same group according to the following rule:

if (iVerDiff1 < Th && iVerDiff2 < Th && (iHorDiff1 < Th || iHorDiff2 < Th)), then both BLOBs (i, ii) belong to the same group.

After arranging each BLOB into groups (234), the features including compactness, ratio, density and dimension of each group are calculated (235). These features are denoted as iMiny, iMinx, iMaxx, iMaxy, iWidth, iHeight, iArea, iCompactness and iRatio, and are used as inputs for the filtration process through rules such as the compactness rule, ratio rule, white pixel density rule and dimension-based rule to remove unwanted noise. Upon verifying the validity of the BLOBs (236) after the filtration process, the entirety of the BLOBs for each group is determined.
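Equations 5 to 8 and the grouping rule can be sketched over BLOB bounding boxes as below. Taking the magnitude of each difference (so that the order of i and ii does not matter) and the greedy grouping strategy are illustrative assumptions on top of the rule as stated:

```python
def same_group(blob_i, blob_ii, th):
    """Apply Equations 5-8 and the grouping rule to two BLOB bounding
    boxes, each given as (minx, miny, maxx, maxy); `th` is the
    predetermined threshold Th (233)."""
    minx_i, miny_i, maxx_i, maxy_i = blob_i
    minx_ii, miny_ii, maxx_ii, maxy_ii = blob_ii
    iVerDiff1 = abs(miny_ii - miny_i)   # Equation 5 (magnitudes used so
    iVerDiff2 = abs(maxy_ii - maxy_i)   # Equation 6  the order of i and
    iHorDiff1 = abs(maxx_ii - minx_i)   # Equation 7  ii is immaterial)
    iHorDiff2 = abs(maxx_i - minx_ii)   # Equation 8
    return (iVerDiff1 < th and iVerDiff2 < th
            and (iHorDiff1 < th or iHorDiff2 < th))

def group_blobs(blobs, th):
    """Greedily merge BLOBs that satisfy the rule into groups (234)."""
    groups = []
    for blob in blobs:
        for group in groups:
            if any(same_group(blob, member, th) for member in group):
                group.append(blob)
                break
        else:
            groups.append([blob])
    return groups
```

Two adjacent character-sized boxes on the same baseline fall into one group, while a distant box forms a group of its own.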
Figure 9 illustrates an image that has undergone the group-based filtration (230) stage for the detection of vehicle license plates. The entire vehicle license plates can be seen clearly after the filtration (230). A comparison between the pre and post group-based filtration (230) process is shown in Figure 10. The complete detection of the vehicle license plates indicates the effect of this process.
Spaces between the content on an object can result in certain parts of similar objects being left out or identified as another object during the detection stage (20). The content could be characters consisting of letters, numbers, figures or a combination thereof. The groups as arranged in the group-based filtration process (230) are divided by the spaces. Thus, in order to combine groups of BLOBs that belong to the same object, the aggregation (240) process is performed. With reference to Figure 11, the aggregation module (2004) calculates features (241) including compactness, intensity and location of each group, and aggregates groups of BLOBs (240) which have similar features upon comparing the features (242). The aggregated BLOBs are representations of each object that are individually, and entirely or substantially entirely, detected from the images despite having gaps in between the different bodies or spaces, or in between both the different bodies and the spaces on each object.
Figure 12 illustrates an image that has undergone the similar BLOBs aggregation process (240) for the detection of vehicle license plates. The content on the vehicle license plate, which is a string of characters, is divided into two groups by a space in between. Through the aggregation process (240), the two groups are identified as belonging to the same license plate. Therefore, the groups are aggregated (240) into a single BLOB that represents the entire license plate. Figure 13 shows the effect of similar BLOBs aggregation (240) by using the detection of vehicle license plates as an example.
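The aggregation (240) of the two character groups of one plate can be sketched as follows. The patent names compactness, intensity and location as the compared features; this sketch compares mean intensity and horizontal location only, and the tolerance parameters (`tol_intensity`, `tol_gap`) are assumptions for illustration:

```python
def aggregate_groups(groups, tol_intensity, tol_gap):
    """Merge groups whose mean intensity is similar and whose bounding
    boxes are horizontally close (e.g. the two halves of one license
    plate split by the space in its character string).

    Each group is a dict: {"box": (minx, miny, maxx, maxy),
    "intensity": mean_intensity}.
    """
    merged = []
    for g in sorted(groups, key=lambda g: g["box"][0]):
        if merged:
            last = merged[-1]
            gap = g["box"][0] - last["box"][2]          # horizontal space
            if (abs(g["intensity"] - last["intensity"]) <= tol_intensity
                    and 0 <= gap <= tol_gap):
                # combine into a single BLOB covering both groups
                lb, gb = last["box"], g["box"]
                last["box"] = (min(lb[0], gb[0]), min(lb[1], gb[1]),
                               max(lb[2], gb[2]), max(lb[3], gb[3]))
                last["intensity"] = (last["intensity"] + g["intensity"]) / 2
                continue
        merged.append(dict(g))
    return merged
```

Two nearby groups of similar intensity collapse into one BLOB spanning both, while a distant, darker group (e.g. a different object) stays separate.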
The aggregation (240) process not only ensures that all parts of the object can be fully detected but also allows the complete content of the object to be identified. After the BLOB of the object is detected, the BLOB is sent to an identification engine (3000) for identifying the content on the object (30). In a preferred embodiment of the invention, the identification engine (3000) comprises a segmentation module (3001) for segmenting the content in the detected object into individual units; a recognition module (3002) for recognizing the individual units as characters; and an analyzing module (3003) for analyzing all the recognized characters to determine the accuracy of the content. Such a method enables the characters on a vehicle license plate to be identified.
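The three-module identification pipeline can be expressed as a simple composition. The callables stand in for the segmentation (3001), recognition (3002) and analyzing (3003) modules, whose internals the description does not specify:

```python
def identify_content(blob_content, segment, recognize, analyze):
    """Run the identification engine (3000) pipeline on a detected BLOB:
    segment the content into individual units, recognize each unit as a
    character, then analyze the assembled string.  The three callables
    are placeholders for modules 3001-3003."""
    units = segment(blob_content)            # segmentation module (3001)
    chars = [recognize(u) for u in units]    # recognition module (3002)
    return analyze("".join(chars))           # analyzing module (3003)
```

With toy stand-ins (splitting on "-", taking the first character of each unit, and upper-casing the result), the pipeline maps "w-x-y" to "WXY", illustrating the data flow rather than any real recognizer.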
A display device (4000) of the system is used to show the outcome of the detection (40) that includes the detected object and the recognized content of the object. Vehicle license plates that are detected can be observed from the display device (4000) along with the identified characters on the plates.
While this invention has been described in its preferred form, it is understood that the embodiments of the preferred form and the accompanying drawings are not to be regarded as a departure from the invention and it may be modified within the scope of the appended claims.

Claims

1. A system for detecting one or more objects from at least one image comprising one or more servers having at least one processor for managing processes executed by the system and a database for storing data;
an image capturing device (1000) for capturing the image of the objects; and a detection engine (2000) including :
an edge detection module (2001) for conducting edge-based technique (210) on the image to identify Binary Large Objects (BLOBs) that potentially represent the objects upon binarizing the image;
a dynamic dilation module (2002) for performing dynamic dilation (220) that determines different bodies of which the objects are respectively located thereon to identify BLOBs of different objects that appear in a single image;
a filtration module (2003) for applying group-based filtering (230) on the BLOBs that groups similar BLOBs and filters away unwanted noise to determine entirety of the BLOBs for each object; and an aggregation module (2004) for aggregating the groups of BLOBs that belong to the same object (240);
wherein the aggregated BLOBs are representations of each object that are individually, and entirely or substantially entirely detected from the images despite having gaps in between the different bodies or spaces, or in between both of the different bodies and spaces on each object.
2. The system according to claim 1, wherein the object is a license plate with characters thereon.
3. The system according to claim 1, wherein the bodies are vehicles.
4. A method for detecting one or more objects from at least one image comprising the steps of:
conducting edge-based technique (210) on the image for identifying Binary Large Objects (BLOBs) that potentially represent the objects upon binarizing the image;
performing dynamic dilation (220) that determines different bodies of which the objects are respectively located thereon for identifying BLOBs of different objects that appear in a single image;
applying group-based filtering (230) on the BLOBs that groups similar BLOBs and filters away unwanted noise for determining entirety of the BLOBs for each object; and
aggregating the groups of BLOBs that belong to the same object (240);
wherein the aggregated BLOBs are representations of each object that are individually, and entirely or substantially entirely detected from the images despite having gaps in between the different bodies or spaces, or in between both of the different bodies and spaces on each object.
5. The method according to claim 4, wherein the step of performing dynamic dilation (220) includes:
calculating densities (221) of surrounding pixels in at least four directions which are left, right, top and bottom of all pixels of the BLOBs; and determining direction (222) and value (223) of the dilation based on the calculated densities to identify different bodies of the objects that appear in the image.
6. The method according to claim 4, wherein the step of applying group-based filtering (230) includes:
computing values of horizontal and vertical spaces among the BLOBs (232) upon scanning the BLOBs;
arranging the BLOBs in groups (234) after comparing the computed values with a predetermined threshold value (233);
calculating features (235) including compactness, ratio, density and dimension of each group; and
determining entirety of the BLOBs for each group upon verifying validity of the BLOBs (236) and filtering away unwanted noises through rules based on the calculated features.
7. The method according to claim 6, wherein the rules include one or combination of compactness rule, ratio rule, white pixels density rule and dimension-based rule.
8. The method according to claim 4, wherein the step of aggregating BLOBs that belong to the same object (240) includes:
calculating features (241) including compactness, intensity and location of each group; and
aggregating groups of BLOBs (240) which have similar features upon comparing the features (242).
9. The method according to claim 4, further comprising the step of converting the image to grayscale (120) before the step of conducting edge-based technique
(210) on the image.
10. The method according to claim 4, further comprising the step of identifying content (30) on the detected objects using the aggregated BLOBs.
11. The method according to claim 4, wherein the object is either static or moving.
12. The method according to claim 4, wherein the images are captured (110) from a video that has recorded the objects with an image capturing device.
13. The method according to claim 4, wherein the edge-based technique identifies potential BLOBs that represent the objects based on regions with high edges variance or change in brightness.
PCT/MY2016/050068 2015-11-27 2016-10-14 A system and method for detecting objects from image WO2017091060A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
MYPI2015704351A MY174684A (en) 2015-11-27 2015-11-27 A system and method for detecting objects from image
MYPI2015704351 2015-11-27

Publications (1)

Publication Number Publication Date
WO2017091060A1 true WO2017091060A1 (en) 2017-06-01

Family

ID=58763553

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/MY2016/050068 WO2017091060A1 (en) 2015-11-27 2016-10-14 A system and method for detecting objects from image

Country Status (2)

Country Link
MY (1) MY174684A (en)
WO (1) WO2017091060A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019066642A3 (en) * 2017-09-29 2019-06-27 Mimos Berhad A system and method for detecting license plate
CN111797713A (en) * 2020-06-16 2020-10-20 浙江大华技术股份有限公司 License plate recognition method and photographing device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030123703A1 (en) * 2001-06-29 2003-07-03 Honeywell International Inc. Method for monitoring a moving object and system regarding same
CN1581212A (en) * 2004-05-14 2005-02-16 中华电信股份有限公司 Method for plucking license-plate region from vehicle image
US20070286499A1 (en) * 2006-03-27 2007-12-13 Sony Deutschland Gmbh Method for Classifying Digital Image Data
US20080008348A1 (en) * 2006-02-01 2008-01-10 Markmonitor Inc. Detecting online abuse in images


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HAMAM MOKAYED ET AL.: "Car Plate Detection Engine Based on Conventional Edge Detection Technique.", THE INTERNATIONAL CONFERENCE ON COMPUTER GRAPHICS, MULTIMEDIA AND IMAGE PROCESSING (CGMIP2014), SDIWC, November 2014 (2014-11-01), pages 101 - 106 *


Also Published As

Publication number Publication date
MY174684A (en) 2020-05-07


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16868953

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16868953

Country of ref document: EP

Kind code of ref document: A1