WO2021107763A1 - System and method for processing moving image - Google Patents

System and method for processing moving image Download PDF

Info

Publication number
WO2021107763A1
WO2021107763A1 (PCT/MY2020/050122)
Authority
WO
WIPO (PCT)
Prior art keywords
patch
image
noise
patches
image patch
Prior art date
Application number
PCT/MY2020/050122
Other languages
French (fr)
Inventor
Shang Li YUEN
Hamam MOKAYED
Hasmarina HASAN
Den Fairol SAMAON
Hock Woon Hon
Original Assignee
Mimos Berhad
Priority date
Filing date
Publication date
Application filed by Mimos Berhad filed Critical Mimos Berhad
Publication of WO2021107763A1 publication Critical patent/WO2021107763A1/en


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/63Scene text, e.g. street names
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/16Image preprocessing
    • G06V30/164Noise filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/18Extraction of features or characteristics of the image
    • G06V30/18105Extraction of features or characteristics of the image related to colour
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/625License plates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition

Definitions

  • the disclosures made herein relate generally to the field of moving image processing and, more particularly, to a system and method for processing a moving image for recognizing identification data from the moving image.
  • Imaging technologies have resulted in the ability to quickly and easily process still and moving images, in support of a wide variety of applications.
  • One of the moving image processing applications is surveillance of traffic, workplaces and the like.
  • When imaging systems are used for surveillance of an object or person, it may be highly desirable for the systems to quickly identify the objects and/or people captured in the surveillance images, especially in situations where any image processing delay hinders the intended response, such as a traffic pursuit or an unauthorized person entering a surveillance area.
  • Chinese patent number CN 101408942 B discloses a method for locating a vehicle license plate in a complex background, wherein two adjacent frames of the surveillance video are processed for maximum removal of complex background interference and comprehensive application of color and grayscale images to locate a license plate within the image frames. Although this approach locates the license plate effectively, it is time consuming and requires a large amount of processing resources, as two adjacent frames of the video need to be processed.
  • the present invention relates to a system and method for processing a moving image.
  • the system comprises an input unit for receiving an image frame of the moving image, a parsing unit for parsing the image frame into one or more image patches of preset dimensions, and a filtering unit for processing each image patch to identify and extract an identification (ID) data if the ID data is captured in the image patch.
  • a storage device connected to the filtering unit stores a set of pre-classified noise patches, wherein the filtering unit updates the storage device with image patches identified as noise patches during each process cycle.
  • a character recognizing unit recognizes characters in the extracted ID data.
  • the filtering unit includes an adaptive noise filter, a white noise filter and a color distance graph based noise filter.
  • the adaptive noise filter compares each image patch with the pre-classified noise patches in the storage device and passes the image patch to the white noise filter if the image patch does not match with any of the pre-classified noise patches.
  • the white noise filter detects whether the image patch is a plain and uniform noise patch and passes the image patch to the color distance graph based noise filter if the image patch is not a plain and uniform noise patch.
  • the color distance graph based noise filter detects if the image patch is a color distance graph based noise patch and extracts the image patch as the ID data if the image patch is not a color distance graph based noise patch.
  • the present invention also includes a method for processing a moving image, wherein the method comprises receiving the image frame of the moving image, parsing the image frame into multiple image patches and processing each of the image patches to identify and extract an identification (ID) data if said ID data is captured in the image patch. Further, characters in the extracted ID data are recognized.
  • each image patch is compared with a set of pre-classified noise patches stored in a storage device and is passed to a white noise filter if the image patch does not match with any of the pre-classified noise patches.
  • at the white noise filter, it is identified whether the image patch is a plain and uniform noise patch. If not, it is determined whether the image patch is a color distance graph based noise patch. Then, the image patch is extracted as the ID data if the image patch is not a color distance graph based noise patch.
  • the present invention accurately identifies image patches containing the ID data, such that the need to process the entire image frame to recognize the ID data is avoided, thus enabling faster recognition of the ID data from the moving image without using high performance image processing resources.
  • FIGURE 1 illustrates a block representation of the system for processing a moving image, in accordance with an exemplary embodiment of the present invention.
  • FIGURE 2 illustrates a flow diagram of the method for processing a moving image, in accordance with an exemplary embodiment of the present invention.
  • FIGURE 3 illustrates a set of parent contours and child contours thereof, in accordance with an exemplary embodiment of the present invention.
  • the description hereinafter of the specific embodiment will so fully reveal the general nature of the embodiments herein that others can, by applying current knowledge, readily modify and/or adapt such specific embodiment for various applications without departing from the generic concept, and, therefore, such adaptations and modifications should be and are intended to be comprehended within the meaning and range of equivalents of the disclosed embodiments.
  • the phraseology or terminology employed herein is for the purpose of description and not of limitation.
  • the present invention may be embodied as a system, method or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware or programmable instructions) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “unit,” “module,” or “system.”
  • Moving image: A continuous sequence of image frames captured with or without audio using a video camera, closed-circuit television and the like. It includes, but is not limited to, movies and surveillance video.
  • Identification (ID): A physical device attached to an object/asset such as a vehicle, package and the like, or worn by a person, e.g. police, security personnel, etc., for identifying the object carrying the device or the person wearing the device. It includes, but is not limited to, license plates, employee IDs, barcode labels, badges, etc.
  • the present invention provides a system and a method for processing a moving image.
  • the system comprises a parsing unit for parsing an image frame of the moving image into multiple image patches of preset dimensions and a filtering unit for processing each image patch to identify and extract an identification (ID) data if the ID data is captured in the image patch.
  • ID identification
  • the present invention is capable of eliminating a need for processing the entire image frame to recognize the ID data, and thus enabling faster recognition of the ID data from the moving image without using high performance image processing resources.
  • FIGURE 1 illustrates a block representation of the system for processing a moving image, in accordance with an exemplary embodiment of the present invention.
  • the system (10) comprises an input unit (11) for receiving an image frame of the moving image captured using an imaging device (1) such as video camera or closed-circuit television (CCTV) device.
  • the input unit (11) is any device capable of capturing one or more image frames of the moving image from the imaging device (1).
  • the input unit (11) may be a user interface for enabling a user to select an image frame of the moving image.
  • a parsing unit (12) receives the image frame from the input unit (11) and parses the image frame into one or more image patches of preset dimensions.
  • the dimensions are set based on a shape and dimensions of an identification (ID) device to be captured using the imaging device (1).
  • the dimensions are set based on a shape and dimensions of the license plate as prescribed by the corresponding governing body such as a traffic department. Additionally, multiple sets of dimensions may be preloaded into the system (10) and the user is allowed to select the dimensions as per requirements.
  • a filtering unit (13) processes each of the image patches to identify and extract an ID data if the ID data is captured in the image patch.
  • a character recognizing unit (15) such as an optical character recognizer (OCR) recognizes one or more characters in the extracted ID data. If no ID data is identified in the image patch, then the filtering unit (13) stores the image patch as a noise patch in a storage device (14), wherein the storage device (14) stores a set of pre-classified noise patches.
  • the storage device (14) is a database remotely connected to the filtering unit (13).
  • the storage device (14) may be a magnetic, optical or solid-state drive or any other storage means capable of being updated during each cycle of ID recognition.
  • the filtering unit (13) includes three sub-components: an adaptive noise filter (13a), a white noise filter (13b) and a color distance graph based noise filter (13c).
  • the adaptive noise filter (13a) compares each image patch with the pre-classified noise patches in the storage device (14) and passes the image patch to the white noise filter (13b) if the image patch does not match with any of the pre-classified noise patches.
  • the adaptive noise filter (13a) converts the image patch into a first grayscale patch and compares the first grayscale patch with each pre-classified noise patch, wherein the pre-classified noise patches are received from the storage device (14).
  • the adaptive noise filter (13a) computes a structural similarity score for the first grayscale patch with respect to each pre-classified noise patch.
  • a pre-classified noise patch is considered as the most similar noise patch if the computed structural similarity score against the pre-classified noise patch reaches the highest value.
  • the adaptive noise filter (13a) compares the highest structural similarity score with a similarity threshold. If the highest structural similarity score is higher than the similarity threshold, there is a match between the first grayscale patch and the most similar noise patch. Otherwise, there is no match between the first grayscale patch and the most similar noise patch.
  • the similarity threshold is defined manually or automatically based on results of empirical studies or experiments on similar and non-similar matches of pre-classified patches, or a distribution graph plotted from such results.
  • the adaptive noise filter (13a) determines the image patch as a noise patch and discards the image patch if the corresponding grayscale patch matches with one or more of the pre-classified noise patches. Additionally, a reset counter (not shown) in the adaptive noise filter (13a) is incremented by one and all the pre-classified noise patches received from the storage device (14) are deleted if the reset counter reaches a reset threshold set based on a frame rate of the moving image.
  • the adaptive noise filter (13a) passes the image patch to the white noise filter (13b) for further filtering, if the corresponding grayscale patch does not match with any of the pre-classified noise patches.
  • a null-action counter of the most similar noise patch is incremented by one and the most similar noise patch is deleted from the storage device (14) if the null-action counter reaches a null-action threshold.
  • the null-action threshold is user-defined based on one or more streaming factors such as camera frame rate, network speed and the like.
  • the white noise filter (13b) detects if the image patch is a plain and uniform noise patch and passes the image patch to the color distance graph based noise filter (13c) if the image patch is not a plain and uniform noise patch.
  • the white noise filter (13b) updates the storage device (14) with the image patch as a noise patch if the image patch is detected as a plain and uniform noise patch.
  • Upon receiving an image patch from the adaptive noise filter (13a), the white noise filter (13b) converts the image patch into a first binary image and then into a first edge image using a standard edge extraction algorithm.
  • the white noise filter (13b) filters noise in the first edge image using a morphological opening and closing operation, wherein the filtered noise is a small and trivial edge resulting from the conversion process.
  • the white noise filter (13b) inverts the first edge image and computes a total white region in the inverted edge image.
  • the white noise filter (13b) computes a median result by executing an XOR operation between the filtered edge image and the first binary image. Additionally, the white noise filter (13b) computes a coverage ratio between the white region and a total area in the median result and an expansion ratio as a product of the coverage ratio and the white region.
  • if the coverage ratio is less than a first white noise threshold, then the white noise filter (13b) classifies the image patch as a plain and uniform noise patch.
  • the first white noise threshold is defined manually or automatically based on results of empirical studies or experiments on pre-classified patches, or a distribution graph plotted from such results. If the coverage ratio is higher than the first white noise threshold, the white noise filter (13b) compares the coverage ratio with a second white noise threshold. The white noise filter (13b) determines the image patch as not a plain and uniform noise patch if the coverage ratio is less than the second white noise threshold. If the coverage ratio is higher than the second white noise threshold, the white noise filter (13b) compares the corresponding expansion ratio with a third white noise threshold.
  • the second white noise threshold is set manually or automatically based on the first white noise threshold and results of empirical studies or experiments on ID and non-ID patches, or a distribution graph plotted from such results. If the expansion ratio is less than the third white noise threshold, then the white noise filter (13b) classifies the image patch as a plain and uniform noise patch.
  • the third white noise threshold is set manually or automatically based on the first and second white noise thresholds and results of empirical studies or experiments on ID and non-ID patches, or a distribution graph plotted from such results. If the expansion ratio is higher than the third white noise threshold, the white noise filter (13b) determines the image patch as not a plain and uniform noise patch. Upon determining that the image patch is not a plain and uniform noise patch, the white noise filter (13b) passes the image patch to the color distance graph based noise filter (13c) for further filtering.
  • the color distance graph based noise filter (13c) detects if the image patch is a color distance graph based noise patch and extracts the image patch as the ID data if the image patch is not a color distance graph based noise patch.
  • the color distance graph based noise filter (13c) updates the storage device (14) with the image patch as a noise patch if the image patch is detected as a color distance graph based noise patch.
  • a color distance graph is a collection of difference values between the mean color of a parent contour and that of its child contour.
  • the mean color of a contour is the average of the contour's mean red, green and blue values from the original image.
  • the outermost contour (A) is the parent of the intermediate contour (B), which itself is the parent of the innermost contour (C).
  • Upon receiving the image patch from the white noise filter (13b), the color distance graph based noise filter (13c) converts the image patch into a second grayscale image and then converts the second grayscale image into a second binary image using a fixed thresholding method and into a second edge image. Also, the color distance graph based noise filter (13c) converts the image patch into a third binary image using an adaptive thresholding method and computes contours from the third binary image. The color distance graph based noise filter (13c) computes a difference of mean color for each contour and its parent contour and records the difference into a flow graph.
  • the color distance graph based noise filter (13c) creates a first matrix, wherein a value of a cell in the first matrix is set to 1 when the corresponding value of the color distance graph is more than 0, and adds all rows into a first column ‘VertMat1’.
  • the color distance graph based noise filter (13c) creates a second matrix, wherein a value of a cell of the second matrix is set to 1 when the corresponding value of the color distance graph is more than a color distance threshold, and adds all rows into a second column ‘VertMat2’.
  • the color distance threshold is set manually or automatically based on results of empirical studies or experiments on ID and non-ID patches, or a distribution graph plotted from such results.
  • the color distance graph based noise filter (13c) generates an empty image and also computes a filter ratio for each contour for identifying a character region, wherein the filter ratio is the ratio of VertMat2 to VertMat1.
  • the filter ratio is compared to a contour threshold and the character region is plotted on the empty image if the filter ratio is greater than the contour threshold. Otherwise, the color distance graph based noise filter (13c) moves on to the next contour.
  • the contour threshold is set manually or automatically based on results of empirical studies or experiments on ID and non-ID patches, or a distribution graph plotted from such results.
  • After plotting the character region, the color distance graph based noise filter (13c) creates a third edge image by intersecting the second edge image and the empty image.
  • the color distance graph based noise filter (13c) filters noise in the third edge image by performing a morphological opening and closing operation, wherein the filtered noise is a small and trivial edge resulting from the edge extraction algorithm.
  • the color distance graph based noise filter (13c) adds up all row values of the third edge image into a single column vertMat and all column values of the third edge image into a single row horzMat.
  • the color distance graph based noise filter (13c) computes a vertical difference (VDiff) and a horizontal difference (HDiff) by comparing minimum and maximum values in the vertMat and horzMat, respectively. If the VDiff or HDiff is greater than a difference threshold, the color distance graph based noise filter (13c) extracts the image patch as the ID data. Otherwise, the color distance graph based noise filter (13c) proceeds to further processing.
  • the difference threshold is set manually or automatically based on results of empirical studies or experiments on ID and non-ID patches, or a distribution graph plotted from such results.
  • the color distance graph based noise filter (13c) computes an edge ratio of the number of white pixels with respect to a total image area in the third edge image. If the edge ratio is greater than an edge threshold, the color distance graph based noise filter (13c) crops and resizes the image patch based on a non-zero region of the empty image. Otherwise, the color distance graph based noise filter (13c) clones and resizes the image patch and stores the resized patch as ROIImage.
  • the edge threshold is set manually or automatically based on results of empirical studies or experiments on ID and non-ID patches, or a distribution graph plotted from such results.
  • If a prediction probability is greater than a prediction threshold, the color distance graph based noise filter (13c) classifies the ROIImage as the ID data.
  • the prediction threshold is set manually or automatically based on results of empirical studies or experiments on ID and non-ID patches, or a distribution graph plotted from such results. Otherwise, the color distance graph based noise filter (13c) classifies the image patch as a color distance graph based noise and uploads it to the storage device (14).
  • the color distance graph based noise filter (13c) includes a machine learning-based classification module (not shown) for calculating the prediction probability.
  • the classification module may be any conventional image classification model that functions based on a machine learning algorithm.
  • the character recognizing unit recognizes one or more characters in the extracted ID data, wherein the characters are further processed to identify an object or a person carrying the ID device.
  • the present invention is capable of eliminating a need for processing the entire image frame to recognize the ID data, and thus enabling faster recognition of the ID data from the moving image without using high performance image processing resources.
  • FIGURE 2 shows a flow diagram of the method for processing a moving image, in accordance with the exemplary embodiment of the present invention.
  • the method comprises the steps of: receiving an image frame of the moving image (21), parsing the image frame into multiple image patches of preset dimensions (22) and processing each image patch to identify and extract an identification (ID) data if the ID data is captured in the image patch (23).
  • ID identification
  • the image patch is compared with a set of pre-classified noise patches stored in a storage device and passed to a white noise filter if the image patch does not match with any of the pre-classified noise patches.
  • the image patch is checked at the white noise filter to determine whether it is a plain and uniform noise patch. If not, it is determined whether the image patch is a color distance graph based noise patch. If not, the image patch is extracted as the ID data. If the image patch is identified as a plain and uniform noise patch or a color distance graph based noise patch, then the storage device is updated with the image patch as a noise patch. The updated storage device is used in the next image processing cycle for the step of comparing the image patch with the pre-classified noise patches.
  • the image patch is converted into a first grayscale patch and the first grayscale patch is compared with each pre-classified noise patch.
  • a structural similarity score is computed for the first grayscale patch with respect to each pre-classified noise patch.
  • the pre-classified noise patch against which the computed structural similarity score reaches the highest value is considered as the most similar noise patch.
  • the highest structural similarity score is compared with a similarity threshold. If the highest structural similarity score is higher than the similarity threshold, there is a match between the first grayscale patch and the most similar noise patch. Otherwise, there is no match between the first grayscale patch and the most similar noise patch.
  • the similarity threshold is set manually or automatically based on results of empirical studies or experiments on ID and non-ID patches, or a distribution graph plotted from such results.
  • the image patch is determined as a noise patch and is discarded if the corresponding grayscale patch matches with one or more of the pre-classified noise patches. Otherwise, the image patch is passed to the white noise filter for further filtering. Additionally, a null-action counter of the most similar noise patch is incremented by one and the most similar noise patch is deleted from the storage device if the null-action counter reaches a null-action threshold.
  • the null-action threshold is user-defined based on one or more streaming factors such as camera frame rate, network speed and the like.
  • the passed image patch is checked at the white noise filter to determine whether it is a plain and uniform noise patch and is passed to a color distance graph based noise filter if it is not a plain and uniform noise patch.
  • the image patch is converted into a first binary image at the white noise filter and then into a first edge image using a standard edge extraction algorithm. Noise in the first edge image is filtered using a morphological opening and closing operation, wherein the filtered noise is a small and trivial edge resulting from the conversion process.
  • the first edge image is inverted and a total white region in the inverted edge image is computed.
  • a median result is computed by executing an XOR operation between the filtered edge image and the first binary image. Additionally, a coverage ratio between the white region and a total area in the median result is computed and an expansion ratio is calculated as a product of the coverage ratio and the white region.
  • if the coverage ratio is less than a first white noise threshold, the image patch is classified as a plain and uniform noise patch. If the coverage ratio is higher than the first white noise threshold, then the coverage ratio is compared with a second white noise threshold. The image patch is determined as not a plain and uniform noise patch if the coverage ratio is less than the second white noise threshold.
  • the first white noise threshold is set manually or automatically based on results of empirical studies or experiments on ID and non-ID patches, or a distribution graph plotted from such results.
  • the second white noise threshold is set manually or automatically based on the first white noise threshold and results of empirical studies or experiments on ID and non-ID patches, or a distribution graph plotted from such results.
  • if the coverage ratio is higher than the second white noise threshold, the corresponding expansion ratio is compared with a third white noise threshold.
  • the third white noise threshold is set manually or automatically based on the first and second white noise thresholds and results of empirical studies or experiments on ID and non-ID patches, or a distribution graph plotted from such results. If the expansion ratio is less than the third white noise threshold, then the image patch is classified as a plain and uniform noise patch. If the expansion ratio is higher than the third white noise threshold, the image patch is determined as not a plain and uniform noise patch. Upon determining that the image patch is not a plain and uniform noise patch, the image patch is passed to the color distance graph based noise filter for further filtering.
  • Upon receiving the image patch from the white noise filter, the image patch is converted into a second grayscale image at the color distance graph based noise filter. Further, the second grayscale image is converted into a second binary image using a fixed thresholding method and then into a second edge image. Additionally, the image patch is converted into a third binary image at the color distance graph based noise filter using an adaptive thresholding method and contours are computed from the third binary image.
  • a difference of mean color is computed for each contour and its parent contour and the difference is recorded into a flow graph.
  • a first matrix is created at the color distance graph based noise filter, wherein a value of a cell in the first matrix is set to 1 when the corresponding value of the color distance graph is more than 0, and all rows are added into a first column ‘VertMat1’.
  • a second matrix is created at the color distance graph based noise filter, wherein a value of a cell of the second matrix is set to 1 when the corresponding value of the color distance graph is more than a color distance threshold, and all rows are added into a second column ‘VertMat2’.
  • the color distance threshold is set manually or automatically based on results of empirical studies or experiments on ID and non-ID patches, or a distribution graph plotted from such results.
  • An empty image is generated and a filter ratio for each contour is computed for identifying a character region, wherein the filter ratio is the ratio of VertMat2 to VertMat1.
  • the filter ratio is compared to a contour threshold and the character region is plotted on the empty image if the filter ratio is greater than the contour threshold. Otherwise, a next contour is processed.
  • the contour threshold is set manually or automatically based on results of empirical studies or experiments on ID and non-ID patches, or a distribution graph plotted from such results.
  • a third edge image is created at the color distance graph based noise filter by intersecting the second edge image and the empty image.
  • Noise in the third edge image is filtered by performing a morphological opening and closing operation. All row values of the third edge image are added up into a single column vertMat and all column values of the third edge image are added up into a single row horzMat.
  • a vertical difference (VDiff) and a horizontal difference (HDiff) are computed by comparing minimum and maximum values in the vertMat and horzMat, respectively. If the VDiff or HDiff is greater than a difference threshold, the image patch is extracted as the ID data. Otherwise, the process moves on to further steps.
  • the difference threshold is set manually or automatically based on results of empirical studies or experiments on ID and non-ID patches, or a distribution graph plotted from such results.
  • An edge ratio of the number of white pixels with respect to a total image area in the third edge image is computed at the color distance graph based noise filter. If the edge ratio is greater than an edge threshold, the image patch is cropped and resized based on a non-zero region of the empty image. Otherwise, the image patch is cloned, resized and stored as ROIImage.
  • the edge threshold is set manually or automatically based on results of empirical studies or experiments on ID and non-ID patches, or a distribution graph plotted from such results.
  • If a prediction probability is greater than a prediction threshold, the ROIImage is classified as the ID data, wherein the prediction threshold is set manually or automatically based on results of empirical studies or experiments on ID and non-ID patches, or a distribution graph plotted from such results.
  • a machine learning-based classification module is used for calculating the prediction probability, wherein the classification module may be any conventional image classification model that functions based on a machine learning algorithm.
  • the present invention is capable of eliminating a need for processing the entire image frame to recognize the ID data, and thus enabling faster recognition of the ID data from the moving image without using high performance image processing resources.
  • the present invention may also be applied for other applications.
  • Such applications include, but are not limited to, identifying ID tags attached to animals, birds and/or plants/trees in a farm, plantation, nursery and/or forest.
  • the imaging device may also be attached to an unmanned vehicle used for monitoring of such animals, birds and/or plants/trees.
  • An apparatus for practicing various embodiments of the present invention may involve one or more computers (or one or more processors within a single computer) and storage systems containing or having network access to computer program(s) coded in accordance with various methods described herein, and the method steps of the disclosure could be accomplished by modules, routines, subroutines, or subparts of a computer program product.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to a system and method for processing a moving image. The system (10) comprises an input unit (11) for receiving an image frame of the moving image, a parsing unit (12) for parsing the image frame into one or more image patches of preset dimensions, and a filtering unit (13) for processing each image patch to identify and extract an identification (ID) data if the ID data is captured in the image patch. A storage device (14) connected to the filtering unit (13) stores a set of pre-classified noise patches, wherein the filtering unit (13) updates the storage device (14) with image patches identified as noise patches during each process cycle.

Description

SYSTEM AND METHOD FOR PROCESSING MOVING IMAGE
FIELD OF THE DISCLOSURE
The disclosures made herein relate generally to the field of moving image processing and, more particularly, to a system and method for processing a moving image for recognizing identification data from the moving image.
BACKGROUND
Recent developments in imaging technologies have resulted in the ability to quickly and easily process still and moving images, in support of a wide variety of applications. One of the moving image processing applications is surveillance of traffic, workplaces and the like. When imaging systems are used for surveillance of an object or person, it may be highly desirable for the systems to quickly identify the objects and/or people captured in the surveillance images, especially in situations where any image processing delay hinders the intended response, such as a traffic pursuit or an unauthorized person entering a surveillance area.
Numerous approaches have been developed for this purpose, such as radio frequency identification (RFID) tags, smart internet-of-things (IOT) devices and the like. However, these solutions are highly expensive and sophisticated in terms of installation, as they require high level processing. Therefore, they are not suitable for monitoring a huge number of objects/people. For example, connecting all vehicles in traffic to the internet or to any single system is highly complicated. A possible simple solution would be to attach an ID device such as a license plate, ID tag, badge and the like, to the object/person under surveillance.
Even though it is easy to identify the surveyed item by viewing the ID device attached to the item, automation of this process is very complicated, as faster recognition of information printed on the ID device requires extremely complex image processing. Chinese patent number CN 101408942 B discloses a method for locating a vehicle license plate in a complex background, wherein two adjacent frames of the surveillance video are processed for maximum removal of complex background interference and comprehensive application of color and grayscale images to locate a license plate within the image frames. Although this approach locates the license plate effectively, it is time consuming and requires a large amount of processing resources, as two adjacent frames of the video need to be processed.
Hence, there is a need for a system and method for processing a moving image for recognizing identification data from the moving image without using high performance image processing resources while reducing time consumption.
SUMMARY
The present invention relates to a system and method for processing a moving image. The system comprises an input unit for receiving an image frame of the moving image, a parsing unit for parsing the image frame into one or more image patches of preset dimensions, and a filtering unit for processing each image patch to identify and extract an identification (ID) data if the ID data is captured in the image patch. A storage device connected to the filtering unit stores a set of pre-classified noise patches, wherein the filtering unit updates the storage device with image patches identified as noise patches during each process cycle. A character recognizing unit recognizes characters in the extracted ID data.
In a preferred embodiment, the filtering unit includes an adaptive noise filter, a white noise filter and a color distance graph based noise filter. The adaptive noise filter compares each image patch with the pre-classified noise patches in the storage device and passes the image patch to the white noise filter if the image patch does not match with any of the pre-classified noise patches.
The white noise filter detects whether the image patch is a plain and uniform noise patch and passes the image patch to the color distance graph based noise filter if the image patch is not a plain and uniform noise patch. Finally, the color distance graph based noise filter detects whether the image patch is a color distance graph based noise patch and extracts the image patch as the ID data if the image patch is not a color distance graph based noise patch.
The present invention also includes a method for processing a moving image, wherein the method comprises receiving the image frame of the moving image, parsing the image frame into multiple image patches and processing each of the image patches to identify and extract an identification (ID) data if said ID data is captured in the image patch. Further, characters in the extracted ID data are recognized.
In one aspect of the present invention, each image patch is compared with a set of pre-classified noise patches stored in a storage device and is passed to a white noise filter if the image patch does not match with any of the pre-classified noise patches. In the white noise filter, it is identified whether the image patch is a plain and uniform noise patch. If not, it is determined whether the image patch is a color distance graph based noise patch. Then, the image patch is extracted as the ID data if the image patch is not a color distance graph based noise patch.
The present invention accurately identifies image patches containing the ID data, such that the need to process the entire image frame to recognize the ID data is avoided, thus enabling faster recognition of the ID data from the moving image without using high performance image processing resources.
Various objects, features, aspects and advantages of the inventive subject matter will become more apparent from the following detailed description of preferred embodiments, along with the accompanying drawing figures in which like numerals represent like components.
BRIEF DESCRIPTION OF THE ACCOMPANYING DRAWINGS
In the figures, similar components and/or features may have the same reference label. Further, various components of the same type may be distinguished by following the reference label with a second label that distinguishes among the similar components. If only the first reference label is used in the specification, the description is applicable to any one of the similar components having the same first reference label irrespective of the second reference label.
FIGURE 1 illustrates a block representation of the system for processing a moving image, in accordance with an exemplary embodiment of the present invention.
FIGURE 2 illustrates a flow diagram of the method for processing a moving image, in accordance with an exemplary embodiment of the present invention.
FIGURE 3 illustrates a set of parent contours and child contours thereof, in accordance with an exemplary embodiment of the present invention.
DETAILED DESCRIPTION
In accordance with the present invention, there is provided a system and a method for processing a moving image, which will now be described with reference to the embodiment shown in the accompanying drawings. The embodiment does not limit the scope and ambit of the disclosure. The description relates purely to the exemplary embodiment and its suggested applications.
The embodiment herein and the various features and advantageous details thereof are explained with reference to the non-limiting embodiment in the following description. Descriptions of well-known components and processing techniques are omitted so as to not unnecessarily obscure the embodiments herein. The examples used herein are intended merely to facilitate an understanding of ways in which the embodiment herein may be practiced and to further enable those of skill in the art to practice the embodiment herein. Accordingly, the description should not be construed as limiting the scope of the embodiment herein.
The description hereinafter of the specific embodiment will so fully reveal the general nature of the embodiments herein that others can, by applying current knowledge, readily modify and/or adapt such specific embodiment for various applications without departing from the generic concept, and, therefore, such adaptations and modifications should be and are intended to be comprehended within the meaning and range of equivalents of the disclosed embodiments. It is to be understood that the phraseology or terminology employed herein is for the purpose of description and not of limitation. As will be appreciated by one skilled in the art, the present invention may be embodied as a system, method or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware or programmable instructions) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “unit,” “module,” or “system.”
Various terms as used herein are defined below. To the extent a term used in a claim is not defined below, it should be given the broadest definition persons in the pertinent art have given that term as reflected in printed publications and issued patents at the time of filing.
Definitions:
Moving image: A continuous sequence of image frames captured with or without audio using a video camera, closed-circuit television and the like. It includes, but is not limited to, movies and surveillance video.
Identification (ID): A physical device attached to an object/asset such as a vehicle, package and the like, or worn by a person, e.g. police, security personnel, etc., for identifying the object carrying the device or the person wearing the device. It includes, but is not limited to, license plates, employee IDs, barcode labels, badges, etc.
The present invention provides a system and a method for processing a moving image. The system comprises a parsing unit for parsing an image frame of the moving image into multiple image patches of preset dimensions and a filtering unit for processing each image patch to identify and extract an identification (ID) data if the ID data is captured in the image patch. In this way, the present invention is capable of eliminating the need for processing the entire image frame to recognize the ID data, thus enabling faster recognition of the ID data from the moving image without using high performance image processing resources.
Referring to the accompanying drawings, FIGURE 1 illustrates a block representation of the system for processing a moving image, in accordance with an exemplary embodiment of the present invention. The system (10) comprises an input unit (11) for receiving an image frame of the moving image captured using an imaging device (1) such as a video camera or closed-circuit television (CCTV) device. In a preferred embodiment, the input unit (11) is any device capable of capturing one or more image frames of the moving image from the imaging device (1). Alternatively, the input unit (11) may be a user interface for enabling a user to select an image frame of the moving image.
A parsing unit (12) receives the image frame from the input unit (11) and parses the image frame into one or more image patches of preset dimensions. In a preferred embodiment, the dimensions are set based on a shape and dimensions of an identification (ID) device to be captured using the imaging device (1). For example, if the ID device to be captured is a vehicle license plate, then the dimensions are set based on a shape and dimensions of the license plate as prescribed by the corresponding governing body such as a traffic department. Additionally, multiple sets of dimensions may be preloaded into the system (10) and the user is allowed to select the dimensions as per requirements.
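As an illustration only (the patent itself contains no source code), the following Python sketch shows one plausible implementation of the patch parsing step; the function name, the non-overlapping stride and the use of NumPy are assumptions, not part of the disclosure.

    import numpy as np

    def parse_into_patches(frame: np.ndarray, patch_w: int, patch_h: int) -> list:
        # Tile an image frame into patches of preset dimensions.
        # A non-overlapping stride equal to the patch size is assumed here;
        # an overlapping (sliding window) stride is an equally valid reading.
        patches = []
        h, w = frame.shape[:2]
        for y in range(0, h - patch_h + 1, patch_h):
            for x in range(0, w - patch_w + 1, patch_w):
                patches.append(frame[y:y + patch_h, x:x + patch_w])
        return patches

For a license plate application, patch_w and patch_h would follow the plate dimensions prescribed by the relevant governing body, scaled to the camera resolution.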
A filtering unit (13) processes each of the image patches to identify and extract an ID data if the ID data is captured in the image patch. A character recognizing unit (15) such as an optical character recognizer (OCR) recognizes one or more characters in the extracted ID data. If no ID data is identified in the image patch, then the filtering unit (13) stores the image patch as a noise patch in a storage device (14), wherein the storage device (14) stores a set of pre-classified noise patches.
In a preferred embodiment, the storage device (14) is a database remotely connected to the filtering unit (13). Alternatively, the storage device (14) may be a magnetic, optical or solid-state drive or any other storage means capable of being updated during each cycle of ID recognition. The filtering unit (13) includes three sub-components: an adaptive noise filter (13a), a white noise filter (13b) and a color distance graph based noise filter (13c). The adaptive noise filter (13a) compares each image patch with the pre-classified noise patches in the storage device (14) and passes the image patch to the white noise filter (13b) if the image patch does not match with any of the pre-classified noise patches.
The adaptive noise filter (13a) converts the image patch into a first grayscale patch and compares the first grayscale patch with each pre-classified noise patch, wherein the pre-classified noise patches are received from the storage device (14). The adaptive noise filter (13a) computes a structural similarity score for the first grayscale patch with respect to each pre-classified noise patch. A pre-classified noise patch is considered the most similar noise patch if the computed structural similarity score against that pre-classified noise patch reaches the highest value. The adaptive noise filter (13a) compares the highest structural similarity score with a similarity threshold. If the highest structural similarity score is higher than the similarity threshold, there is a match between the first grayscale patch and the most similar noise patch. Otherwise, there is no match between the first grayscale patch and the most similar noise patch. The similarity threshold is defined manually or automatically based on results of empirical studies or experiments on similar and non-similar matches of pre-classified patches, or a distribution graph plotted from such results.
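A minimal sketch of this matching step, assuming OpenCV for the grayscale conversion and scikit-image's structural_similarity for the score; the threshold value 0.8 is a placeholder, since the patent leaves the similarity threshold to empirical tuning.

    import cv2
    from skimage.metrics import structural_similarity

    def matches_known_noise(patch, noise_patches, similarity_threshold=0.8):
        # noise_patches: grayscale noise patches received from the storage
        # device; all patches share the preset dimensions, so shapes match.
        gray = cv2.cvtColor(patch, cv2.COLOR_BGR2GRAY)
        if not noise_patches:
            return False, None
        scores = [structural_similarity(gray, noise) for noise in noise_patches]
        best = max(range(len(scores)), key=scores.__getitem__)  # most similar
        return scores[best] > similarity_threshold, best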
The adaptive noise filter (13a) determines the image patch as a noise patch and discards the image patch if the corresponding grayscale patch matches with one or more of the pre-classified noise patches. Additionally, a reset counter (not shown) in the adaptive noise filter (13a) is incremented by one and all the pre-classified noise patches received from the storage device (14) are deleted if the reset counter reaches a reset threshold set based on a frame rate of the moving image.
On the other hand, the adaptive noise filter (13a) passes the image patch to the white noise filter (13b) for further filtering if the corresponding grayscale patch does not match with any of the pre-classified noise patches. Additionally, a null-action counter of the most similar noise patch is incremented by one and the most similar noise patch is deleted from the storage device (14) if the null-action counter reaches a null-action threshold. In a preferred embodiment, the null-action threshold is user-defined based on one or more streaming factors such as camera frame rate, network speed and the like.
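The reset and null-action bookkeeping could be kept alongside the stored noise patches, for example as sketched below; the class layout and the threshold values are illustrative assumptions (the patent ties them to frame rate and streaming factors, not to specific numbers).

    class NoisePatchStore:
        def __init__(self, reset_threshold=500, null_action_threshold=50):
            self.noise_patches = []      # pre-classified noise patches
            self.null_actions = []       # one null-action counter per patch
            self.reset_counter = 0
            self.reset_threshold = reset_threshold
            self.null_action_threshold = null_action_threshold

        def add_noise_patch(self, patch):
            # Called when a later filter identifies the patch as noise.
            self.noise_patches.append(patch)
            self.null_actions.append(0)

        def on_match(self):
            # A patch matched a stored noise patch and was discarded.
            self.reset_counter += 1
            if self.reset_counter >= self.reset_threshold:
                self.noise_patches.clear()   # delete all received noise patches
                self.null_actions.clear()
                self.reset_counter = 0

        def on_no_match(self, most_similar_idx):
            # The patch was passed on; penalise the most similar stored patch
            # and evict it once it repeatedly fails to produce matches.
            if most_similar_idx is None:
                return
            self.null_actions[most_similar_idx] += 1
            if self.null_actions[most_similar_idx] >= self.null_action_threshold:
                del self.noise_patches[most_similar_idx]
                del self.null_actions[most_similar_idx]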
The white noise filter (13b) detects if the image patch is a plain and uniform noise patch and passes the image patch to the color distance graph based noise filter (13c) if the image patch is not a plain and uniform noise patch. The white noise filter (13b) updates the storage device (14) with the image patch as a noise patch if the image patch is detected as a plain and uniform noise patch.
Upon receiving an image patch from the adaptive noise filter (13a), the white noise filter (13b) converts the image patch into a first binary image and then into a first edge image using a standard edge extraction algorithm. The white noise filter (13b) filters noise in the first edge image using a morphological opening and closing operation, wherein the filtered noise is a small and trivial edge resulting from the conversion process. The white noise filter (13b) inverts the first edge image and computes a total white region in the inverted edge image.
Further, the white noise filter (13b) computes a median result by executing an XOR operation between the filtered edge image and the first binary image. Additionally, the white noise filter (13b) computes a coverage ratio between the white region and a total area in the median result and an expansion ratio as a product of the coverage ratio and the white region.
If the coverage ratio is less than a first white noise threshold, then the white noise filter (13b) classifies the image patch as a plain and uniform noise patch. The first white noise threshold is defined manually or automatically based on results of empirical studies or experiments on pre-classified patches, or a distribution graph plotted from such results. If the coverage ratio is higher than the first white noise threshold, the white noise filter (13b) compares the coverage ratio with a second white noise threshold. The white noise filter (13b) determines the image patch as not a plain and uniform noise patch if the coverage ratio is less than the second white noise threshold. If the coverage ratio is higher than the second white noise threshold, the white noise filter (13b) compares the corresponding expansion ratio with a third white noise threshold. The second white noise threshold is set manually or automatically based on the first white noise threshold and results of empirical studies or experiments on ID and non-ID patches, or a distribution graph plotted from such results. If the expansion ratio is less than the third white noise threshold, then the white noise filter (13b) classifies the image patch as a plain and uniform noise patch. The third white noise threshold is set manually or automatically based on the first and second white noise thresholds and results of empirical studies or experiments on ID and non-ID patches, or a distribution graph plotted from such results. If the expansion ratio is higher than the third white noise threshold, the white noise filter (13b) determines the image patch as not a plain and uniform noise patch. Upon determining that the image patch is not a plain and uniform noise patch, the white noise filter (13b) passes the image patch to the color distance graph based noise filter (13c) for further filtering.
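Putting these steps together, one possible reading of the white noise filter is sketched below, assuming Otsu binarization and Canny edges as the unspecified "standard" algorithms and interpreting "a total area in the median result" as the patch pixel count; the three thresholds t1, t2 and t3 are placeholders.

    import cv2
    import numpy as np

    def is_plain_uniform_noise(patch, t1=0.5, t2=0.7, t3=5000.0):
        gray = cv2.cvtColor(patch, cv2.COLOR_BGR2GRAY)
        _, binary = cv2.threshold(gray, 0, 255,
                                  cv2.THRESH_BINARY | cv2.THRESH_OTSU)
        edges = cv2.Canny(binary, 100, 200)              # first edge image
        kernel = np.ones((3, 3), np.uint8)
        edges = cv2.morphologyEx(edges, cv2.MORPH_OPEN, kernel)
        edges = cv2.morphologyEx(edges, cv2.MORPH_CLOSE, kernel)
        inverted = cv2.bitwise_not(edges)
        white_region = int(np.count_nonzero(inverted))   # total white region
        median = cv2.bitwise_xor(edges, binary)          # median result
        coverage_ratio = white_region / median.size
        expansion_ratio = coverage_ratio * white_region
        if coverage_ratio < t1:
            return True                # plain and uniform noise patch
        if coverage_ratio < t2:
            return False               # not a plain and uniform noise patch
        return expansion_ratio < t3    # final expansion-ratio test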
Finally, the color distance graph based noise filter (13c) detects if the image patch is a color distance graph based noise patch and extracts the image patch as the ID data if the image patch is not a color distance graph based noise patch. The color distance graph based noise filter (13c) updates the storage device (14) with the image patch as a noise patch if the image patch is detected as a color distance graph based noise patch.
A color distance graph is a collection of difference values between the mean color of a parent contour and that of its child contour. The mean color of a contour is the average of the contour's mean red, green and blue values from the original image. Suppose there are three contours (A - C, shown in FIGURE 3). The outermost contour (A) is the parent of the intermediate contour (B), which itself is the parent of the innermost contour (C).
For contour (A), the average value (R,G,B) is (0, 0, 0); mean value = (R+G+B)/3 = 0.
For contour (B), the average value (R,G,B) is (255, 255, 255); mean value = (R+G+B)/3 = 255.
For contour (C), the average value (R,G,B) is (0, 0, 0); mean value = (R+G+B)/3 = 0.
The difference value between each parent contour and the corresponding child contour in FIGURE 3 is shown in Table 1.
Table 1. Difference values between the parent contours and corresponding child contours
Parent contour     Child contour     Difference of mean color
A (mean 0)         B (mean 255)      255
B (mean 255)       C (mean 0)        255
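A sketch of how this color distance graph could be computed with OpenCV's contour hierarchy (RETR_TREE exposes each contour's parent); OpenCV itself and the helper layout are assumptions, since the patent does not prescribe a library.

    import cv2
    import numpy as np

    def color_distance_graph(patch, third_binary):
        contours, hierarchy = cv2.findContours(
            third_binary, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
        if hierarchy is None:
            return []

        def mean_color(idx):
            # Mean of the contour's average B, G and R values in the patch.
            mask = np.zeros(third_binary.shape, np.uint8)
            cv2.drawContours(mask, contours, idx, 255, -1)
            b, g, r = cv2.mean(patch, mask=mask)[:3]
            return (b + g + r) / 3.0

        graph = []
        for i in range(len(contours)):
            parent = hierarchy[0][i][3]          # index of the parent contour
            if parent != -1:
                graph.append(abs(mean_color(i) - mean_color(parent)))
        return graph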
Upon receiving the image patch from the white noise filter (13b), the color distance graph based noise filter (13c) converts the image patch into a second grayscale image and then converts the second grayscale image into a second binary image using a fixed thresholding method and into a second edge image. Also, the color distance graph based noise filter (13c) converts the image patch into a third binary image using an adaptive thresholding method and computes contours from the third binary image. The color distance graph based noise filter (13c) computes a difference of mean color for each contour and its parent contour and records the difference into a flow graph. The color distance graph based noise filter (13c) creates a first matrix, wherein a value of a cell in the first matrix is set to 1 when the corresponding value of the color distance graph is more than 0, and adds all rows into a first column ‘VertMat1’. Similarly, the color distance graph based noise filter (13c) creates a second matrix, wherein a value of a cell of the second matrix is set to 1 when the corresponding value of the color distance graph is more than a color distance threshold, and adds all rows into a second column ‘VertMat2’. The color distance threshold is set manually or automatically based on results of empirical studies or experiments on ID and non-ID patches, or a distribution graph plotted from such results.
The color distance graph based noise filter (13c) generates an empty image and also computes a filter ratio for each contour for identifying a character region, wherein the filter ratio is the ratio of VertMat2 to VertMat1. The filter ratio is compared to a contour threshold and the character region is plotted on the empty image if the filter ratio is greater than the contour threshold. Otherwise, the color distance graph based noise filter (13c) moves on to the next contour. The contour threshold is set manually or automatically based on results of empirical studies or experiments on ID and non-ID patches, or a distribution graph plotted from such results.
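The matrix construction is only loosely specified; assuming the color distance graph values are laid out as a 2-D array, the VertMat1/VertMat2 filter ratio might be computed as follows, with the color distance threshold as a placeholder.

    import numpy as np

    def contour_filter_ratios(graph_values, color_distance_threshold=30.0):
        g = np.asarray(graph_values, dtype=float)
        mat1 = (g > 0).astype(float)                         # first matrix
        mat2 = (g > color_distance_threshold).astype(float)  # second matrix
        vert_mat1 = mat1.sum(axis=1)   # add all rows into column VertMat1
        vert_mat2 = mat2.sum(axis=1)   # add all rows into column VertMat2
        return np.divide(vert_mat2, vert_mat1,
                         out=np.zeros_like(vert_mat2),
                         where=vert_mat1 > 0)                # filter ratios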
After plotting the character region, the color distance graph based noise filter (13c) creates a third edge image by intersecting the second edge image and the empty image. The color distance graph based noise filter (13c) filters noise in the third edge image by performing a morphological opening and closing operation, wherein the filtered noise is a small and trivial edge resulting from the edge extraction algorithm. The color distance graph based noise filter (13c) adds up all row values of the third edge image into a single column vertMat and all column values of the third edge image into a single row horzMat. The color distance graph based noise filter (13c) computes a vertical difference (VDiff) and a horizontal difference (HDiff) by comparing minimum and maximum values in the vertMat and horzMat, respectively. If the VDiff or HDiff is greater than a difference threshold, the color distance graph based noise filter (13c) extracts the image patch as the ID data. Otherwise, the color distance graph based noise filter (13c) proceeds to further processing. The difference threshold is set manually or automatically based on results of empirical studies or experiments on ID and non-ID patches, or a distribution graph plotted from such results.
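The row/column projections and their min-max differences are straightforward; a sketch assuming NumPy:

    import numpy as np

    def projection_differences(third_edge_image):
        img = np.asarray(third_edge_image, dtype=float)
        vert_mat = img.sum(axis=1)   # all row values -> single column vertMat
        horz_mat = img.sum(axis=0)   # all column values -> single row horzMat
        v_diff = vert_mat.max() - vert_mat.min()   # VDiff
        h_diff = horz_mat.max() - horz_mat.min()   # HDiff
        return v_diff, h_diff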
The color distance graph based noise filter (13c) computes an edge ratio of the number of white pixels with respect to the total image area in the third edge image. If the edge ratio is greater than an edge threshold, the color distance graph based noise filter (13c) crops and resizes the image patch based on a non-zero region of the empty image. Otherwise, the color distance graph based noise filter (13c) clones and resizes the image patch. The resized patch is stored as ROIImage. The edge threshold is set manually or automatically based on results of empirical studies or experiments on ID and non-ID patches, or on a distribution graph plotted from such results.
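A corresponding sketch of the edge-ratio decision, continuing the example above (the edge threshold and the resize dimensions are illustrative assumptions):

    def crop_or_clone(patch, third_edge, empty, edge_threshold=0.05,
                      size=(128, 64)):
        # Edge ratio: white pixels in the third edge image over the total area.
        edge_ratio = np.count_nonzero(third_edge) / float(third_edge.size)
        if edge_ratio > edge_threshold and np.any(empty):
            # Crop to the non-zero region of the empty image, then resize.
            ys, xs = np.nonzero(empty)
            roi = patch[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
        else:
            roi = patch.copy()            # clone the whole patch
        return cv2.resize(roi, size)      # stored as ROIImage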
Further, if a prediction probability is greater than a prediction threshold, the color distance graph based noise filter (13c) classifies the ROIImage as the ID data. The prediction threshold is set manually or automatically based on results of empirical studies or experiments on ID and non-ID patches, or on a distribution graph plotted from such results. Otherwise, the color distance graph based noise filter (13c) classifies the image patch as a color distance graph based noise and uploads the same to the storage device (14). The color distance graph based noise filter (13c) includes a machine learning-based classification module (not shown) for calculating the prediction probability. The classification module may be any conventional image classification model that functions based on a machine learning algorithm.
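As a hedged illustration of this final classification step only: the disclosure does not specify the model, so the predict_proba-style interface below is an assumption standing in for any conventional image classification model.

    def classify_roi(roi_image, classifier, prediction_threshold=0.8):
        # 'classifier' is a placeholder for any conventional machine-learning
        # classification model exposing a probability score; the flattening of
        # the ROIImage into a feature vector is likewise an assumption.
        features = roi_image.reshape(1, -1).astype(np.float32)
        probability = float(classifier.predict_proba(features)[0, 1])
        return probability > prediction_threshold  # True: classify as ID data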
Finally, the character recognizing unit (15) recognizes one or more characters in the extracted ID data, wherein the characters are further processed to identify an object or a person carrying the ID device. In this way, the present invention eliminates the need to process the entire image frame to recognize the ID data, thus enabling faster recognition of the ID data from the moving image without requiring high-performance image processing resources.
FIGURE 2 shows a flow diagram of the method for processing a moving image, in accordance with the exemplary embodiment of the present invention. The method comprises the steps of: receiving an image frame of the moving image (21), parsing the image frame into multiple image patches of preset dimensions (22) and processing each image patch to identify and extract an identification (ID) data if the ID data is captured in the image patch (23).
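For illustration, steps (21) and (22) may be sketched as follows, assuming OpenCV; the video source name and the patch dimensions are placeholders, not disclosed values.

    import cv2

    def parse_into_patches(frame, patch_w=128, patch_h=64):
        # Parse the image frame into image patches of preset dimensions.
        patches = []
        height, width = frame.shape[:2]
        for y in range(0, height - patch_h + 1, patch_h):
            for x in range(0, width - patch_w + 1, patch_w):
                patches.append(frame[y:y + patch_h, x:x + patch_w])
        return patches

    cap = cv2.VideoCapture("moving_image.mp4")  # placeholder source
    ok, frame = cap.read()                      # step (21): receive an image frame
    if ok:
        patches = parse_into_patches(frame)     # step (22): parse into patches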
To extract the ID data, the image patch is compared with a set of pre-classified noise patches stored in a storage device and passed to a white noise filter if the image patch does not match with any of the pre-classified noise patches. The image patch is checked at the white noise filter to determine whether it is a plain and uniform noise patch. If it is not, it is determined whether the image patch is a color distance graph based noise patch. If it is not, the image patch is extracted as the ID data. If the image patch is identified as a plain and uniform noise patch or a color distance graph based noise patch, the storage device is updated with the image patch as a noise patch. The updated storage device is used in the next image processing cycle for the step of comparing the image patch with the pre-classified noise patches.
During the comparison step, the image patch is converted into a first grayscale patch and the first grayscale patch is compared with each pre-classified noise patch. A structural similarity score is computed for the first grayscale patch with respect to each pre-classified noise patch. The pre-classified noise patch against which the computed structural similarity score is highest is considered the most similar noise patch. The highest structural similarity score is compared with a similarity threshold. If the highest structural similarity score is higher than the similarity threshold, there is a match between the first grayscale patch and the most similar noise patch. Otherwise, there is no match between the first grayscale patch and the most similar noise patch. The similarity threshold is set manually or automatically based on results of empirical studies or experiments on ID and non-ID patches, or on a distribution graph plotted from such results.
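A minimal sketch of this comparison step, assuming the structural similarity index from scikit-image and grayscale noise patches of matching preset dimensions (the default threshold value is a placeholder):

    import cv2
    import numpy as np
    from skimage.metrics import structural_similarity

    def match_against_noise(patch, noise_patches, similarity_threshold=0.9):
        # First grayscale patch, scored against every pre-classified noise patch.
        gray = cv2.cvtColor(patch, cv2.COLOR_BGR2GRAY)
        scores = [structural_similarity(gray, noise) for noise in noise_patches]
        best = int(np.argmax(scores))  # index of the most similar noise patch
        matched = scores[best] > similarity_threshold
        return matched, best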
The image patch is determined as a noise patch and is discarded if the corresponding grayscale patch matches with one or more of the pre-classified noise patches. Otherwise, the image patch is passed to the white noise filter for further filtering. Additionally, a null-action counter of the most similar noise patch is incremented by one and the most similar noise patch is deleted from the storage device if the null-action counter reaches a null-action threshold. In a preferred embodiment, the null-action threshold is user-defined based on one or more streaming factors such as camera frame rate, network speed and the like.
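The null-action bookkeeping can be sketched as below; the storage layout (a list of per-patch records) and the threshold value are assumptions for illustration only.

    def update_null_action(noise_store, best, null_action_threshold=50):
        # Increment the null-action counter of the most similar noise patch;
        # delete the patch from the storage device once the counter reaches
        # the null-action threshold.
        noise_store[best]["null_action"] += 1
        if noise_store[best]["null_action"] >= null_action_threshold:
            del noise_store[best]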
The passed image patch is checked at the white noise filter to determine whether it is a plain and uniform noise patch, and is passed to a color distance graph based noise filter if it is not. The image patch is converted into a first binary image at the white noise filter and then into a first edge image using a standard edge extraction algorithm. Noise in the first edge image is filtered using a morphological opening and closing operation, wherein the filtered noise consists of small and trivial edges resulting from the conversion process. The first edge image is inverted and a total white region in the inverted edge image is computed.
Further, a median result is computed by executing an XOR operation between the filtered edge image and the first binary image. Additionally, a coverage ratio between the white region and a total area in the median result is computed, and an expansion ratio is calculated as the product of the coverage ratio and the white region.
If the coverage ratio is less than a first white noise threshold, then the image patch is classified as a plain and uniform noise patch. If the coverage ratio is higher than the first white noise threshold, then the coverage ratio is compared with a second white noise threshold. The image patch is determined as not a plain and uniform noise patch if the coverage ratio is less than the second white noise threshold. The first white noise threshold is set manually or automatically based on results of empirical studies or experiments on ID and non-ID patches, or on a distribution graph plotted from such results. The second white noise threshold is set manually or automatically based on the first white noise threshold and on results of such empirical studies or experiments, or on a distribution graph plotted from such results.
If the coverage ratio is higher than the second white noise threshold, the corresponding expansion ratio is compared with a third white noise threshold. The third white noise threshold is set manually or automatically based on the first and second white noise thresholds and on results of such empirical studies or experiments, or on a distribution graph plotted from such results. If the expansion ratio is less than the third white noise threshold, then the image patch is classified as a plain and uniform noise patch. If the expansion ratio is higher than the third white noise threshold, the image patch is determined as not a plain and uniform noise patch. Upon determining that the image patch is not a plain and uniform noise patch, the image patch is passed to the color distance graph based noise filter for further filtering.
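Putting the white noise filter together, a compact illustrative sketch follows, assuming OpenCV/NumPy; the three threshold values are placeholders, and the computation of the coverage ratio reflects one reading of the description.

    import cv2
    import numpy as np

    def is_plain_uniform_noise(patch, t1=0.05, t2=0.10, t3=500.0):
        gray = cv2.cvtColor(patch, cv2.COLOR_BGR2GRAY)
        _, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)  # first binary image
        edges = cv2.Canny(binary, 100, 200)                           # first edge image
        kernel = np.ones((3, 3), np.uint8)
        edges = cv2.morphologyEx(edges, cv2.MORPH_OPEN, kernel)   # filter small,
        edges = cv2.morphologyEx(edges, cv2.MORPH_CLOSE, kernel)  # trivial edges

        inverted = cv2.bitwise_not(edges)
        white_region = np.count_nonzero(inverted)          # total white region

        median = cv2.bitwise_xor(edges, binary)            # median result
        coverage_ratio = np.count_nonzero(median) / float(median.size)
        expansion_ratio = coverage_ratio * white_region

        if coverage_ratio < t1:
            return True              # plain and uniform noise patch
        if coverage_ratio < t2:
            return False             # not plain/uniform; pass to the next filter
        return expansion_ratio < t3  # final comparison with the third threshold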
Finally, it is detected at the color distance graph based noise filter whether the image patch is a color distance graph based noise patch, and the image patch is extracted as the ID data if it is not.
Upon receiving the image patch from the white noise filter, the image patch is converted into a second grayscale image at the color distance graph based noise filter. Further, the second grayscale image is converted into a second binary image using a fixed thresholding method and then into a second edge image. Additionally, the image patch is converted into a third binary image at the color distance graph based noise filter using an adaptive thresholding method and contours are computed from the third binary image.
A difference of mean color is computed between each contour and its parent contour and the difference is recorded into a flow graph. A first matrix is created at the color distance graph based noise filter, wherein a value of a cell in the first matrix is set to 1 when the corresponding value of the color distance graph is more than 0, and all rows are added into a first column ‘VertMat1’. Similarly, a second matrix is created at the color distance graph based noise filter, wherein a value of a cell of the second matrix is set to 1 when the corresponding value of the color distance graph is more than a color distance threshold, and all rows are added into a second column ‘VertMat2’. The color distance threshold is set manually or automatically based on results of empirical studies or experiments on ID and non-ID patches, or on a distribution graph plotted from such results.
An empty image is generated and a filter ratio for each contour is computed for identifying a character region, wherein the filter ratio is the ratio of VertMat2 to VertMat1. The filter ratio is compared to a contour threshold and the character region is plotted on the empty image if the filter ratio is greater than the contour threshold. Otherwise, the next contour is processed. The contour threshold is set manually or automatically based on results of empirical studies or experiments on ID and non-ID patches, or on a distribution graph plotted from such results.
After plotting the character region, a third edge image is created at the color distance graph based noise filter by intersecting the second edge image and the empty image. Noise in the third edge image is filtered by performing morphological opening and closing operations. All row values of the third edge image are added up into a single column vertMat and all column values of the third edge image are added up into a single row horzMat. A vertical difference (VDiff) and a horizontal difference (HDiff) are computed by comparing the minimum and maximum values in vertMat and horzMat, respectively. If VDiff or HDiff is greater than a difference threshold, the image patch is extracted as the ID data. Otherwise, the process moves on to further steps. The difference threshold is set manually or automatically based on results of empirical studies or experiments on ID and non-ID patches, or on a distribution graph plotted from such results.
An edge ratio of the number of white pixels with respect to the total image area in the third edge image is computed at the color distance graph based noise filter. If the edge ratio is greater than an edge threshold, the image patch is cropped and resized based on a non-zero region of the empty image. Otherwise, the image patch is cloned and resized. The resized patch is stored as ROIImage. The edge threshold is set manually or automatically based on results of empirical studies or experiments on ID and non-ID patches, or on a distribution graph plotted from such results.
Further, if a prediction probability is greater than a prediction threshold, the ROIImage is classified as the ID data. Otherwise, the image patch is classified as a color distance graph based noise and is uploaded to the storage device. The prediction threshold is set manually or automatically based on results of empirical studies or experiments on ID and non-ID patches, or on a distribution graph plotted from such results. In a preferred embodiment, a machine learning-based classification module is used for calculating the prediction probability, wherein the classification module may be any conventional image classification model that functions based on a machine learning algorithm.
Finally, one or more characters in the extracted ID data are recognized, wherein the characters are further processed to identify an object or a person carrying the ID device. In this way, the present invention eliminates the need to process the entire image frame to recognize the ID data, thus enabling faster recognition of the ID data from the moving image without requiring high-performance image processing resources.
Applications

Even though the above embodiments show the present invention as being applied to identify vehicles in traffic and humans under surveillance, it is to be understood that the present invention may also be applied to other applications. Such applications include, but are not limited to, identifying ID tags attached to animals, birds and/or plants/trees in a farm, plantation, nursery and/or forest. Further, the imaging device may also be attached to an unmanned vehicle used for monitoring such animals, birds and/or plants/trees.
The terminology used herein is for the purpose of describing particular example embodiments only and is not intended to be limiting. As used herein, the singular forms "a", "an" and "the" may be intended to include the plural forms as well, unless the context clearly indicates otherwise.
The terms "comprises," "comprising," “including,” and “having,” are inclusive and therefore specify the presence of stated features, integers, steps, operations, elements, or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, or groups thereof. The method steps, processes, and operations described herein are not to be construed as necessarily requiring their performance in the particular order discussed or illustrated, unless specifically identified as an order of performance. It is also to be understood that additional or alternative steps may be employed. The use of the expression “at least” or “at least one” suggests the use of one or more elements, as the use may be in one of the embodiments to achieve one or more of the desired objects or results.
Various methods described herein may be practiced by combining one or more machine-readable storage media containing code that perform the steps according to the present invention with appropriate standard computer hardware to execute the code contained therein. An apparatus for practicing various embodiments of the present invention may involve one or more computers (or one or more processors within a single computer) and storage systems containing or having network access to computer program(s) coded in accordance with various methods described herein, and the method steps of the disclosure could be accomplished by modules, routines, subroutines, or subparts of a computer program product.
While the foregoing describes various embodiments of the disclosure, other and further embodiments of the disclosure may be devised without departing from the basic scope thereof. The scope of the disclosure is determined by the claims that follow. The disclosure is not limited to the described embodiments, versions or examples, which are included to enable a person having ordinary skill in the art to make and use the disclosure when combined with information and knowledge available to the person having ordinary skill in the art.


CLAIMS:
1. A system (10) for processing a moving image, comprising:
- at least one input unit (11) for receiving at least one image frame of said moving image;
- at least one parsing unit (12) for parsing said image frame into one or more image patches of preset dimensions;
- at least one filtering unit (13) for processing each of said image patches to identify and extract an identification, ID, data if said ID data is captured in said image patch;
- at least one storage device (14) connected to said filtering unit (13) and storing one or more pre-classified noise patches; and
- at least one character recognizing unit (15) for recognizing at least one character in said extracted ID data, characterized in that said filtering unit (13) includes:
i. an adaptive noise filter (13a) for comparing each image patch with said pre-classified noise patches in said storage device (14) and passing said image patch to a white noise filter (13b) if said image patch does not match with any of said pre-classified noise patches;
ii. said white noise filter (13b) for detecting if said image patch is a plain and uniform noise patch and for passing said image patch to a color distance graph based noise filter (13c) if said image patch is not a plain and uniform noise patch; and
iii. said color distance graph based noise filter (13c) for detecting if said image patch is a color distance graph based noise patch and extracting said image patch as said ID data if said image patch is not a color distance graph based noise patch.
2. The system (10) as claimed in claim 1, wherein said white noise filter (13b) updates said storage device (14) with said image patch if said image patch is detected as a plain and uniform noise patch.
3. The system (10) as claimed in claim 1, wherein said color distance graph based noise filter (13c) updates said storage device (14) with said image patch if said image patch is detected as a color distance graph based noise patch.
4. The system (10) as claimed in claim 1, wherein said adaptive noise filter (13a) compares each image patch with said pre-classified noise patches by:
- converting said image patch into a grayscale patch;
- comparing said grayscale patch with each of said pre-classified noise patches; and
- determining said image patch as a noise patch if said grayscale patch matches with at least one of said pre-classified noise patches.
5. The system (10) as claimed in claim 4, wherein said adaptive noise filter (13a) passes said image patch to said white noise filter (13b) if said grayscale patch does not match with any of said pre-classified noise patches.
6. The system (10) as claimed in claim 1, wherein said ID data is a vehicle license plate data.
7. A method (20) for processing a moving image, comprising:
- receiving at least one image frame of said moving image (21);
- parsing said image frame into one or more image patches of preset dimensions (22); and
- processing each of said image patches to identify and extract an identification, ID, data if said ID data is captured in said image patch (23), characterized in that said step of processing said image patch includes:
i. comparing said image patch with a set of pre-classified noise patches stored in at least one storage device;
ii. passing said image patch to a white noise filter if said image patch does not match with any of said pre-classified noise patches;
iii. identifying if said image patch is a plain and uniform noise patch;
iv. determining whether said image patch is a color distance graph based noise patch if said image patch is not a plain and uniform noise patch; and
v. extracting said image patch as said ID data if said image patch is not a color distance graph based noise patch.
8. The method (20) as claimed in claim 7, further comprising the step of updating said storage device with an image patch:
- if said image patch is identified as a plain and uniform noise patch; or
- if said image patch is determined as a color distance graph based noise patch.
9. The method (20) as claimed in claim 7, wherein said step of comparing each image patch with said pre-classified noise patches includes:
- converting said image patch into a grayscale patch;
- comparing said grayscale patch with each of said pre-classified noise patches; and
- determining said image patch as a noise patch if said grayscale patch matches with at least one of said pre-classified noise patches.
10. The method (20) as claimed in claim 7, wherein said step of passing said image patch to said white noise filter includes passing said image patch to said white noise filter if said grayscale patch does not match with any of said pre-classified noise patches.

Applications Claiming Priority (2)

Application Number: MYPI2019007064; Priority Date: 2019-11-29

Publications (1)

Publication Number: WO2021107763A1; Publication Date: 2021-06-03

Family ID: 76130669

Family Applications (1)

PCT/MY2020/050122 (WO2021107763A1) | Priority Date: 2019-11-29 | Filing Date: 2020-10-28 | System and method for processing moving image

Country Status (1)

WO: WO2021107763A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party

Publication number | Priority date | Publication date | Assignee | Title
JP2005250786A * | 2004-03-03 | 2005-09-15 | Tateyama Machine Kk | Image recognition method
US20140355837A1 * | 2003-02-21 | 2014-12-04 | Accenture Global Services Limited | Electronic Toll Management and Vehicle Identification
US20160299897A1 * | 2015-04-09 | 2016-10-13 | Veritoll, Llc | License plate matching systems and methods
US20180268238A1 * | 2017-03-14 | 2018-09-20 | Mohammad Ayub Khan | System and methods for enhancing license plate and vehicle recognition
KR102008630B1 * | 2018-12-13 | 2019-08-07 | 장경익 | Apparatus and method for increasing image recognition rate

Legal Events

Code 121 (EP): The EPO has been informed by WIPO that EP was designated in this application. Ref document number: 20892496; Country of ref document: EP; Kind code of ref document: A1.
Code NENP: Non-entry into the national phase. Ref country code: DE.
Code 122 (EP): PCT application non-entry in European phase. Ref document number: 20892496; Country of ref document: EP; Kind code of ref document: A1.