WO2023170228A1 - Method and system for identifying rows of objects in images - Google Patents

Method and system for identifying rows of objects in images

Info

Publication number
WO2023170228A1
Authority
WO
WIPO (PCT)
Prior art keywords
objects
rows
image frames
computer
colour
Prior art date
Application number
PCT/EP2023/056057
Other languages
French (fr)
Inventor
Matthias Horst MEIER
Mithun DAS
Oliver Horeth
Thiemo BUCHNER
Antonio Weber
Original Assignee
Continental Automotive Technologies GmbH
Priority date
Filing date
Publication date
Application filed by Continental Automotive Technologies GmbH
Publication of WO2023170228A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/13Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/10Pre-processing; Data cleansing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/30Post-processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/28Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/30Noise filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/457Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by analysing connectivity, e.g. edge linking, connected component analysis or slices
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/64Three-dimensional objects

Definitions

  • the invention relates to a method and system for identifying rows of objects in images, and more specifically, the invention relates to a method and system for generating representative lines for rows of objects in images.
  • Detection of objects organized in rows is used in various applications, including the navigation of vehicles and/or robots through, for example, crop rows, roads, parking lots, warehouses, and shipping yards. Accuracy of detection of such rows of objects is important and any inaccurate or false detections could potentially lead to problems in downstream applications. For example, inaccurate or false detections may distract a steering system of an autonomous vehicle and/or robot and cause the autonomous vehicle and/or robot to crash into objects.
  • Although information from external systems such as a global navigation satellite system (GNSS) may be used to guide such steering systems of autonomous vehicles and/or robots, such guidance may fail if the connection to such external systems is lost, or when predefined points in the GNSS deviate from the actual situation on the ground.
  • Breaks in the rows of objects caused by empty space, obstacles, obstructions, or other unwanted objects or noise may also lead to intensity variations in the rows of objects when captured on images and cause false results in image edge detection. For example, where the rows of objects are plants or crops, the presence of stones or weeds may cause false edge detections.
  • Embodiments of the present invention improve the detection of rows of objects in images by incorporating pre-processing and post-processing methods to generate a representative line for each row of objects detected in an image comprising one or more rows of objects.
  • the generated representative lines may be used for subsequent applications, such as vehicular or robotic path planning, or vehicular or robot navigation.
  • the present invention provides a computer-implemented method for generating lines representing one or more rows of objects over a plurality of image frames, the method comprising: receiving from one or more sensors a plurality of image frames comprising one or more rows of objects, wherein the plurality of image frames is captured sequentially as the one or more sensors travel along the one or more rows of objects; converting each of the received plurality of image frames into a binary image and defining edges in each binary image using a pre-processing method; identifying one or more line segments in each binary image; and generating a representative line over the plurality of image frames based on geometric line post-processing of the one or more identified line segments, wherein each representative line represents a row of objects of the one or more rows of objects.
  • an edge detection algorithm identifies boundaries between objects in an image based on changes or discontinuities in image intensity.
  • the efficiency and accuracy of edge detection are increased when the changes or discontinuities in image intensity are maximised by the pre-processing method.
  • geometric line post-processing refines the identified one or more line segments over a plurality of frames and filters any false identifications.
  • a preferred method of the present invention is a computer-implemented method as described above, wherein the one or more rows of objects comprise at least one of: a road surface marking, a parking lot line marking, a crop row, a plant row, a harvested swath, a crop edge, a transition between a cut crop and uncut crop, a transition between a harvested and unharvested crop, a storage rack row, a row of pallets, a row of containers, or a row of storage boxes.
  • the above-described aspect of the present invention has the advantage that the above-listed rows of objects have generally uniform colours, uniform heights, or colours which are different from a ground surface, which increases the accuracy of the method.
  • a preferred method of the present invention is a computer-implemented method as described above, wherein the one or more sensors comprises a visible light sensor and/or a depth sensor.
  • the above-described aspect of the present invention has the advantage that the visible light sensor generates colour images with pixel values that correspond to the colour of the objects and their surroundings, and the depth sensor generates depth images with pixel values that correspond to the distance of objects from the depth sensor. These colour images and/or depth images can then be converted into binary images through a pre-processing method.
  • a preferred method of the present invention is a computer-implemented method as described above or as described above as preferred, wherein the pre-processing method is selected based on at least one of: a colour difference between the one or more rows of objects and a ground surface when captured by a visible light sensor; and a distance difference between the one or more rows of objects and the ground surface when captured by a depth sensor.
  • a preferred method of the present invention is a computer-implemented method as described above or as described above as preferred, wherein the pre-processing method converts each of the received plurality of image frames into a binary image based on colour of the one or more rows of objects and a ground surface or distance of the one or more rows of objects and a ground surface to the one or more sensors.
  • the above-described aspect of the present invention has the advantage that conversion based on the colour of the objects or the distance of the objects covers a wide range of objects and conditions, such that the method may have various applications.
  • a preferred method of the present invention is a computer-implemented method as described above or as described above as preferred, wherein the pre-processing method comprises: removing noise from each of the plurality of image frames, preferably by applying a Gaussian blur filter; assigning pixels of each of the plurality of image frames to a first group or a second group based on pixel colour, preferably by using k-means clustering; converting pixels of the first group to a first colour and pixels of the second group to a second colour; and defining edges between the first colour and the second colour, preferably using an edge detection algorithm.
  • the above-described aspect of the present invention has the advantage that the edges between the objects and the ground may be easily detected as long as there is a colour difference between the objects and the ground, without using predefined threshold values and without relying on a distance or height difference between the objects and the ground, as any colour difference between the objects and the ground is maximised.
  • a preferred method of the present invention is a computer-implemented method as described above or as described above as preferred, wherein the pre-processing method comprises: removing noise from each of the plurality of image frames, preferably by applying a Gaussian blur filter; maximising a contrast of depth values of each of the plurality of image frames, preferably by applying histogram equalization; selecting areas of each of the plurality of image frames which appear closer to the one or more sensors than a ground surface, preferably by applying a predefined threshold; converting pixels within the selected area to a first colour and pixels within a remaining unselected area to a second colour; removing noise, preferably by applying a morphological transformation, and more preferably morphological opening and closing; and defining edges between the first colour and the second colour, preferably using an edge detection algorithm.
  • the above-described aspect of the present invention has the advantage that the edges between the objects and the ground may be easily detected as long as there is a height difference between the objects and the ground, without using predefined threshold values and without relying on a colour or intensity difference between the objects and the ground, as such a height difference would correspond to a difference in distance or depth value captured by a depth sensor, and such difference in distance or depth value is maximised.
  • a preferred method of the present invention is a computer-implemented method as described above or as described above as preferred, wherein identifying one or more line segments in each binary image comprises using a Hough transform, preferably probabilistic Hough transform.
  • a preferred method of the present invention is a computer-implemented method as described above or as described above as preferred, wherein geometric line post-processing of the one or more identified line segments comprises: merging one or more identified line segments of each binary image that are proximate to each other, preferably based on a predefined threshold; smoothening the merged line segments across the plurality of image frames, preferably by applying an infinite impulse response (IIR) filter; and filtering the smoothened merged line segments across the plurality of image frames to generate the representative line, preferably by: maintaining a counter for each smoothened merged line segment based on a number of image frames each smoothened merged line segment appears in; removing smoothened merged line segments with a counter smaller than a predefined threshold after a predefined number of image frames; and resetting the counters of all remaining smoothened merged line segments after a predefined number of image frames.
  • the above-described aspect of the present invention has the advantage that geometric line post-processing minimises issues of false line segments identified due to noise, duplicate line segments identified for the same row of objects and/or missing line segments due to gaps present in the one or more rows of objects.
  • a preferred method of the present invention is a computer-implemented method as described above or as described above as preferred, further comprising: verifying the generated representative lines, preferably by identifying a deviation between the representative lines and predefined points or lines; and optionally, changing the pre-processing method depending on a result of the verification step, preferably changing the pre-processing method if an identified deviation between the generated representative lines and predefined points or lines is above a predefined threshold.
  • the above-described aspect of the present invention has the advantage that the accuracy of the generated representative lines is improved by verifying against predefined points, such that the representative lines may be generated again using a different pre-processing method if the initial pre-processing method used is not performing well.
  • the invention also relates to a computer-implemented method of planning a vehicle path above at least one row of objects, the method comprising: defining a representative line for each of the at least one row of objects over a plurality of image frames captured by one or more sensors using the computer-implemented method according to the invention, wherein the one or more sensors are preferably mounted on the vehicle, and designating one of the defined representative lines as a middle of a vehicle path above at least one row of objects.
  • the above-described aspect of the present invention has the advantage that the representative line may be used to plan an accurate vehicle path above at least one row of objects without the wheels of the vehicle running over the at least one row of objects.
  • the invention also relates to a computer-implemented method of planning a vehicle path above and/or between at least one row of objects, the method comprising: defining a representative line for each row of objects over a plurality of image frames captured by one or more sensors using the computer-implemented method according to the invention, wherein the one or more sensors are preferably mounted on the vehicle; and defining a line between two adjacent representative lines, wherein the line is preferably equidistant from the two adjacent representative lines; and designating the defined line as a middle of a vehicle path above and/or between rows of objects.
  • the invention also relates to a system comprising one or more sensors, one or more processors and a memory that stores executable instructions for execution by the one or more processors, the executable instructions comprising instructions for performing a computer-implemented method according to the invention.
  • the invention also relates to a vehicle comprising a system according to the invention.
  • the invention also relates to a computer program, a machine-readable storage medium, or a data carrier signal that comprises instructions, that upon execution on one or more processors, cause the one or more processors to perform a computer-implemented method according to the invention.
  • the machine-readable storage medium may be any storage medium, such as for example, a USB stick, a CD, a DVD, a data storage device, a hard disk, or any other medium on which a program element as described above can be stored.
  • vehicle means any mobile agent capable of movement, including cars, trucks, buses, agricultural machines, forklifts, and robots, whether or not such mobile agent is capable of carrying or transporting goods, animals, or humans, and whether or not such mobile agent is autonomous.
  • object means any object, including plants, crops, boxes, paint, and bricks, whether or not such objects protrude from the ground.
  • Fig. 1 is a schematic illustration of a system for generating lines representing one or more rows of objects over a plurality of image frames, in accordance with embodiments of the present disclosure;
  • Fig. 2 is a flowchart of a computer-implemented method for generating lines representing one or more rows of objects over a plurality of image frames, in accordance with embodiments of the present disclosure;
  • Fig. 3 is an example of an image generated by a line segment identification module, in accordance with embodiments of the present disclosure;
  • Fig. 4 is a flowchart of a first pre-processing method, in accordance with embodiments of the present disclosure;
  • Fig. 5 is a flowchart of a second pre-processing method, in accordance with embodiments of the present disclosure;
  • Fig. 6 is a flowchart of geometric post-processing of line segments to generate representative lines, in accordance with embodiments of the present disclosure;
  • Fig. 7 is a flowchart of filtering smoothened merged line segments across a plurality of image frames, in accordance with embodiments of the present disclosure;
  • Fig. 8 is a flowchart of a method of planning a vehicle path above at least one row of objects, in accordance with embodiments of the present disclosure;
  • Fig. 9 is a flowchart of a method of planning a vehicle path above and/or between at least one row of objects, in accordance with embodiments of the present disclosure; and
  • Fig. 10 is a schematic illustration of a computer system within which a set of instructions, when executed, may cause one or more processors of the computer system to perform one or more of the methods described herein, in accordance with embodiments of the present disclosure.
  • Fig. 1 is a schematic illustration of a system for generating lines representing one or more rows of objects over a plurality of image frames, in accordance with embodiments of the present disclosure.
  • System 100 for generating lines representing one or more rows of objects over a plurality of image frames may comprise one or more sensors 104 coupled to one or more processors 108.
  • the one or more sensors 104 are used to collect a plurality of image frames of one or more rows of objects.
  • the one or more rows of objects are arranged transversely or substantially transversely to the one or more sensors 104.
  • the one or more sensors 104 are configured to capture the plurality of image frames as the one or more sensors 104 travel along the one or more rows of objects, preferably in a direction parallel or substantially parallel to the one or more rows of objects.
  • the one or more sensors 104 are mounted on a vehicle such that the plurality of image frames are captured by the one or more sensors 104 as the vehicle travels parallel or substantially parallel along the one or more rows of objects.
  • the one or more sensors 104 are mounted on a vehicle such that the one or more sensors 104 collect the plurality of image frames from the perspective of the vehicle.
  • the plurality of image frames may comprise object image data and ground image data.
  • the object image data may comprise image data of at least one of: a road surface marking, a parking lot line marking, a crop row, a plant row, a harvested swath, a crop edge, a transition between a cut crop and uncut crop, a transition between a harvested and unharvested crop, a storage rack row, a row of pallets, a row of containers, or a row of storage boxes.
  • the ground image data may comprise image data of the ground, soil, floor, or surface on which the one or more rows of objects are positioned on or within.
  • the one or more sensors 104 are positioned such that the field of view (FoV) of the one or more sensors 104 includes only the one or more rows of objects and the ground or surface on which the one or more rows of objects are positioned on or within.
  • the one or more sensors 104 may comprise a visible light sensor and/or a depth sensor.
  • a visible light sensor captures information relating to the colour of objects in a scene.
  • the visible light sensor may be a camera or a video camera.
  • the visible light sensor may generate colour image data wherein each pixel or group of pixels of the colour image data may be associated with a colour or an intensity level measuring the amount of visible light energy observed, reflected and/or emitted from objects and the ground within a scene or within an image representing the scene, or a portion thereof.
  • the colour image data may be RGB (red, green, blue) colour data, CMYK (cyan, magenta, yellow, black) colour data, HSV (hue, saturation, value) colour data, or image data in other colour space.
  • a depth sensor captures information relating to the distances of surfaces of objects and the ground in a scene.
  • the depth sensor may be an infrared camera or video camera.
  • the depth sensor may be a camera with two infrared lenses that calculates the distance to objects and/or the ground based on the disparity between the two infrared streams.
  • the depth sensor may generate depth image data wherein each pixel or group of pixels of the depth image data may be associated with an intensity level based on the distance of surfaces of objects or the ground in a scene.
  • the depth image data may be expressed as RGB (red, green, blue) colour data, CMYK (cyan, magenta, yellow, black) colour data, HSV (hue, saturation, value) colour data, or image data in other colour space.
  • the plurality of image frames collected by the one or more sensors 104 may be processed by one or more processors 108 to generate lines representing one or more rows of objects over the plurality of image frames.
  • the one or more processors 108 may comprise several software modules, such as an image conversion module 112, a line segment identification module 116, and a representative line generator module 120.
  • the image conversion module 112 communicates with the line segment identification module 116 and the line segment identification module 116 communicates with the representative line generator module 120.
  • the one or more processors 108 may further comprise a representative line verification module 124 which communicates with both the representative line generator module 120 and the image conversion module 112.
  • system 100 may be configured to plan a path for a vehicle.
  • the one or more processors 108 may further comprise a path planner module 128 which communicates with the representative line generator module 120.
  • the path planner module 128 may use the generated representative lines from the representative line generator module 120 and/or the verified representative lines from the representative line verification module 124 to find or plan a suitable path for the vehicle to a predefined global target.
  • System 100 configured to plan a path for a vehicle may further comprise a vehicle guidance system 132 which uses the output from the path planner module 128, as well as other parameters such as vehicle dimensions, and/or output from additional sensors mounted on the vehicle, such as wheel or motion sensors, to guide the vehicle towards the predefined global target.
  • Fig. 2 is a flowchart of a computer-implemented method for generating lines representing one or more rows of objects over a plurality of image frames, in accordance with embodiments of the present disclosure.
  • Computer-implemented method 200 for generating lines representing one or more rows of objects over a plurality of image frames may be carried out by the one or more processors 108.
  • the one or more rows of objects may comprise at least one of: a road surface marking, a parking lot line marking, a crop row, a plant row, a harvested swath, a crop edge, a transition between a cut crop and uncut crop, a transition between a harvested and unharvested crop, a storage rack row, a row of pallets, a row of containers, or a row of storage boxes.
  • Method 200 commences at operation 204, wherein the one or more processors 108 receive a plurality of image frames comprising one or more rows of objects from one or more sensors 104.
  • the one or more rows of objects are arranged transversely or substantially transversely to the one or more sensors 104.
  • the plurality of image frames is captured sequentially as the one or more sensors 104 travel along the one or more rows of objects, and preferably in a direction parallel or substantially parallel to the one or more rows of objects.
  • the one or more sensors 104 may comprise a visible light sensor and/or a depth sensor.
  • a visible light sensor may be preferable where there is sufficient visible light reflected off the one or more rows of objects and the ground, such as during the day or when the lights are turned on within an enclosed space.
  • a visible light sensor may be preferable where there is insufficient height difference between the objects and the ground, such as when the objects are embedded within the ground.
  • a depth sensor may be preferable where there is insufficient visible light reflected off objects and the ground, such as during the night or when the lights are turned off within an enclosed space. In some embodiments, a depth sensor may be preferable where there is sufficient height difference between the objects and the ground, such that there is a sufficient difference in the depth values of the object image data and the ground image data.
  • method 200 may continue with operation 208, wherein the image conversion module 112 converts each image frame of the plurality of image frames received in operation 204 into a binary image and defines edges in each binary image using a pre-processing method.
  • a binary image is an image that comprises binary image data, wherein the pixels that comprise the binary image each have one of two states, or one of two colours.
  • the binary image comprises pixels that are either white or black (e.g., minimum intensity value or maximum intensity value), or any other two colours that may be on opposite ends of a colour spectrum.
  • the binary image may represent object image data of the one or more rows of objects as white and the ground image data as black, or vice versa.
  • method 200 may continue with operation 212, wherein the line segment identification module 116 identifies one or more line segments in each binary image generated by the image conversion module 112 in operation 208.
  • the one or more line segments are lines that are fit to the one or more rows of objects.
  • Various techniques for the identification of line segments may be employed.
  • the one or more line segments are identified using a Hough transform method, which is a known method to locate shapes such as lines in images.
  • the Hough transform may be configured with a minimum number of pixels of a first colour required to form a line, and a maximum gap size of pixels of a second colour in between.
  • the Hough transform discards short lines caused by noise and is able to tolerate gaps between the objects forming the one or more rows of objects, such as gaps between plants making up a crop row or gaps in a broken road surface marking.
  • the one or more line segments are identified using a probabilistic Hough transform, which randomly samples edge points of the defined edges to generate the one or more most probable line segments (see the sketch below). The probabilistic Hough transform may be advantageous as it is faster than the standard Hough transform whilst retaining almost comparable accuracy.
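  • As an illustration only, the identification of line segments in operation 212 might be sketched in Python with OpenCV's probabilistic Hough transform, as below; the parameter values (vote threshold, minimum line length, maximum gap) are assumptions chosen for readability, not values taken from this disclosure.

```python
import cv2
import numpy as np

def identify_line_segments(edge_image: np.ndarray) -> np.ndarray:
    """Identify line segments in a binary edge image with the probabilistic
    Hough transform. Returns an array of segments as [x1, y1, x2, y2] rows."""
    segments = cv2.HoughLinesP(
        edge_image,
        rho=1,               # distance resolution of the accumulator, in pixels
        theta=np.pi / 180,   # angular resolution of the accumulator, in radians
        threshold=50,        # minimum number of sampled edge points voting for a line
        minLineLength=100,   # minimum number of first-colour pixels forming a line
        maxLineGap=20,       # maximum gap of second-colour pixels bridged within a line
    )
    # HoughLinesP returns None when no segments are found
    return np.empty((0, 1, 4), dtype=np.int32) if segments is None else segments
```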
  • method 200 may continue with operation 216, wherein the representative line generator module 120 generates a representative line over the plurality of image frames based on geometric line post-processing of the one or more line segments identified by the line segment identification module 116 in operation 212.
  • Each representative line generated by the representative line generator module 120 in operation 216 represents a row of objects of the one or more rows of objects.
  • Geometric post-processing of the line segments identified by the line segment identification module 116 in operation 212 is carried out to remove false line segments identified due to noise and duplicate line segments identified for the same row of objects, and/or to identify and fill in gaps that the line segment identification module 116 may have missed in operation 212.
  • the incorporation of geometric line post-processing is advantageous as geometric line post-processing refines the identified one or more line segments over a plurality of frames and filters any false identifications.
  • method 200 may optionally continue with operation 220, wherein the representative line verification module 124 verifies the representative lines generated by the representative line generator module 120 in operation 216.
  • the generated representative lines may be verified by identifying a deviation between the generated representative lines and predefined points.
  • the predefined points may be points or lines that are representative of the one or more rows of objects that were previously recorded and stored.
  • the predefined points may be Global Navigation Satellite System (GNSS) points.
  • the predefined points may be a floor plan generated of an enclosed space comprising the one or more rows of objects.
  • method 200 may revert to operation 208 wherein the image conversion module 112 converts each image frame of the plurality of image frames into a binary image using a different pre-processing method from the initial pre-processing method used.
  • the pre-processing method is changed if the deviation between the generated representative lines and predefined points or lines identified by the representative line verification module 124 in operation 220 is above a predefined threshold.
  • the predefined threshold may be defined based on several factors, such as the accuracy of the generated representative lines, the accuracy of the predefined points or lines, as well as the state of the objects making up the one or more rows of objects. For example, the predefined threshold may be higher if the one or more rows of objects are wide or uneven.
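  • As an illustration of the verification step of operation 220, the sketch below measures the mean perpendicular distance from predefined reference points (e.g. previously recorded GNSS points, assumed here to be projected into the same coordinate frame as the lines) to a generated representative line; the threshold value is an assumption.

```python
import numpy as np

def deviation_exceeds(rep_line, reference_points, threshold: float = 0.5) -> bool:
    """Sketch of operation 220: report whether the mean perpendicular
    distance from predefined reference points to a representative line,
    given as an endpoint pair [[x1, y1], [x2, y2]], exceeds a threshold."""
    (x1, y1), (x2, y2) = np.asarray(rep_line, dtype=float)
    direction = np.array([x2 - x1, y2 - y1])
    normal = np.array([-direction[1], direction[0]])
    normal /= np.linalg.norm(normal)  # unit normal of the representative line
    # perpendicular distance of each reference point to the infinite line
    distances = np.abs((np.asarray(reference_points, dtype=float)
                        - np.array([x1, y1])) @ normal)
    return float(distances.mean()) > threshold
```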
  • method 200 may continue with operation 224 wherein the path planner module 128 uses the representative lines generated by the representative line generator module 120 in operation 216 or the generated representative lines verified by the representative line verification module 124 in operation 220 to plan a path for a vehicle.
  • path planner module 128 may plan a path or route for a vehicle to a predefined global target.
  • the representative lines generated by the representative line generator module 120 in operation 216 or the generated representative lines verified by the representative line verification module 124 in operation 220 may be used to plan a vehicle path above at least one row of objects.
  • the representative lines generated by the representative line generator module 120 in operation 216 or the generated representative lines verified by the representative line verification module 124 in operation 220 may be used to plan a vehicle path above and/or between rows of objects.
  • Fig. 3 is an example of an image generated by a line segment identification module, in accordance with embodiments of the present disclosure.
  • An image 300 generated by the line segment identification module 116 may comprise black portions 304 representing ground image data, white portions 308 representing object image data, and line segments 312 representing the line segments identified by the line segment identification module 116 in operation 212.
  • Fig. 4 is a flowchart of a first pre-processing method, in accordance with embodiments of the present disclosure.
  • First pre-processing method 400 may be used by the image conversion module 112 on each image frame comprising a colour image collected from a visible light sensor, to convert each image frame into a binary image based on the colour of the one or more rows of objects and a ground surface.
  • first preprocessing method 400 may commence with operation 404 wherein noise is removed from each of the plurality of image frames. Noise may be removed using any known method or algorithm for removal of noise.
  • noise is removed from each of the plurality of image frames by applying a Gaussian blur filter, which is an approximation filter that reduces random image noise by smoothening an image using a Gaussian function.
  • first pre-processing method 400 may continue with operation 408 wherein pixels of each of the plurality of image frames are assigned to a first group or a second group based on pixel colour.
  • the assignment of pixels may be carried out using any known method or algorithm for grouping pixels based on colour of the pixels.
  • the pixels of each of the plurality of image frames are assigned to a first group or a second group based on pixel colour using k-means clustering which is a grouping method with reduced computing power required as compared to other pixel grouping methods.
  • Operation 408 therefore assigns the pixels to one of two groups.
  • the first group may comprise pixels representing object image data
  • the second group may comprise pixels representing ground image data.
  • the first group may comprise pixels representing plants of crop rows
  • the second group may comprise pixels representing the ground or surrounding soil.
  • the first group may comprise pixels representing road surface markings
  • the second group may comprise pixels representing the road surface.
  • first pre-processing method 400 may continue with operation 412 wherein the pixels of the first group are converted to a first colour and the pixels of the second group are converted to a second colour.
  • the first colour and second colour may be any colours.
  • the first colour and the second colour are white and black, which are colours on opposite ends of the colour spectrum with either a maximum or minimum intensity value, which would reduce the computing required in subsequent operations.
  • the larger the difference in intensity, the more efficient the subsequent edge detection steps detailed below.
  • subsequent edge detection operations may be employed without any predefined threshold values, and even in situations where there is only a small colour difference between the objects and the ground, as k-means clustering has already assigned and separated the pixels based on colour.
  • first pre-processing method 400 may continue with operation 416 wherein edges between the first colour and second colour are defined.
  • the defined edges may then be used by the line segment identification module 116 subsequently in operation 212.
  • the edges are defined using an edge detection algorithm, which identifies boundaries between objects in an image through changes or discontinuities in image intensity. As the changes or discontinuities in image intensity are maximised by the pre-processing method, the efficiency and accuracy of edge detection are increased.
  • the edges are defined using a Canny edge detection algorithm in embodiments where a Gaussian blur filter was applied in operation 404, as the application of a Gaussian filter is itself a step in the Canny edge detection algorithm, therefore reducing the overall computing power used in edge detection.
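  • Taken together, operations 404 to 416 might look like the following sketch; the blur kernel size, k-means termination criteria, Canny thresholds, and the assumption of a BGR colour frame are illustrative choices, not prescribed by this disclosure.

```python
import cv2
import numpy as np

def preprocess_colour(frame: np.ndarray) -> np.ndarray:
    """Sketch of first pre-processing method 400 for a BGR colour frame."""
    # Operation 404: remove noise, e.g. with a Gaussian blur filter
    blurred = cv2.GaussianBlur(frame, (5, 5), 0)

    # Operation 408: assign pixels to one of two groups by colour (k-means, k=2)
    pixels = blurred.reshape(-1, 3).astype(np.float32)
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
    _, labels, _ = cv2.kmeans(pixels, 2, None, criteria, 3,
                              cv2.KMEANS_RANDOM_CENTERS)

    # Operation 412: convert one group to black (0) and the other to white (255)
    binary = (labels.reshape(frame.shape[:2]) * 255).astype(np.uint8)

    # Operation 416: define edges between the two colours (Canny)
    return cv2.Canny(binary, 100, 200)
```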
  • Fig. 5 is a flowchart of a second pre-processing method, in accordance with embodiments of the present disclosure.
  • Second pre-processing method 500 may be used by the image conversion module 112 on each image frame comprising a depth image collected from a depth sensor to convert each image frame into a binary image based on distance of the one or more rows of objects and a ground surface to the one or more sensors.
  • second pre-processing method 500 may commence with operation 504 wherein noise is removed from each of the plurality of image frames. Any known methods or algorithms may be used to remove noise.
  • second pre-processing method 500 may continue with operation 508 wherein a contrast of depth values of each of the plurality of image frames is maximized.
  • the contrast of depth values is maximized by applying histogram equalization which is a highly efficient and simple technique of modifying the dynamic range and contrast of an image. Histogram equalization is advantageous as it evenly distributes the intensity values of pixels of an image over an available range.
  • second pre-processing method 500 may continue with operation 512 wherein areas of each of the plurality of image frames which appear closer to the one or more sensors than a ground surface are selected. These selected areas would correspond with object image data and any remaining area that is unselected (i.e., remaining unselected area) would correspond to ground image data as the ground surface would usually be further from the one or more sensors 104 than the one or more rows of objects.
  • the areas are selected by applying a predefined threshold.
  • the predefined threshold may be a distance or depth threshold, such that only pixels with depth or distance values of less than 1 meter from the one or more sensors are selected.
  • second pre-processing method 500 may continue with operation 516 wherein pixels within the selected area are converted to a first colour and pixels within the remaining unselected area are converted to a second colour.
  • the first colour and second colour may be any colours.
  • the first colour and the second colour are white and black, which are colours on opposite ends of the colour spectrum with either a maximum or minimum intensity value, which would reduce the computing required in subsequent operations.
  • the larger the difference in intensity, the more efficient the subsequent edge detection steps detailed below.
  • subsequent edge detection operations may be employed without any predefined threshold values, and even in situations where there is only a small distance difference between the objects and the ground, as histogram equalization is used to maximise the contrast of depth values.
  • second pre-processing method 500 may continue with operation 520 wherein noise is further removed.
  • Any known methods or algorithms may be used to further remove noise.
  • further noise that may be removed includes sporadic objects that may be found between rows of the one of more rows of objects, such as sporadic weeds growing between crop rows.
  • noise is further removed by applying a morphological transformation. Morphological transformations are operations carried out during image processing based on the image shape so that edges can be easily defined in subsequent operations. More preferably, noise is further removed by applying morphological opening and closing. Morphological opening carves out boundaries between objects that may have been merged, while morphological closing fills in small holes or gaps inside objects. Morphological opening and closing may allow subsequent edge detection operations to perform better.
  • second pre-processing method 500 may continue with operation 524 wherein edges between the two colours are defined.
  • the defined edges may then be used by the line segment identification module 116 subsequently in operation 212.
  • the edges are defined using an edge detection algorithm, which identifies boundaries between objects in an image through changes or discontinuities in image intensity. As the changes or discontinuities in image intensity are maximised by the pre-processing method, the efficiency and accuracy of edge detection are increased.
  • the edges are defined using a Canny edge detection algorithm in embodiments where a Gaussian blur filter was applied in operation 504, as the application of a Gaussian filter is itself a step in the Canny edge detection algorithm, therefore reducing the overall computing power used in edge detection.
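  • Operations 504 to 524 might be sketched as follows, assuming an 8-bit single-channel depth frame in which smaller pixel values indicate surfaces closer to the sensor; the threshold, kernel size, and Canny values are illustrative assumptions.

```python
import cv2
import numpy as np

def preprocess_depth(depth_frame: np.ndarray) -> np.ndarray:
    """Sketch of second pre-processing method 500 for an 8-bit,
    single-channel depth frame (closer surfaces = smaller values, assumed)."""
    # Operation 504: remove noise, e.g. with a Gaussian blur filter
    blurred = cv2.GaussianBlur(depth_frame, (5, 5), 0)

    # Operation 508: maximise the contrast of depth values (histogram equalization)
    equalized = cv2.equalizeHist(blurred)

    # Operations 512 and 516: select areas appearing closer than the ground
    # (threshold value assumed) and paint them white, the rest black
    _, binary = cv2.threshold(equalized, 128, 255, cv2.THRESH_BINARY_INV)

    # Operation 520: further remove noise with morphological opening and closing
    kernel = np.ones((5, 5), np.uint8)
    cleaned = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
    cleaned = cv2.morphologyEx(cleaned, cv2.MORPH_CLOSE, kernel)

    # Operation 524: define edges between the two colours (Canny)
    return cv2.Canny(cleaned, 100, 200)
```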
  • the one or more processors 108 of system 100 may execute either first pre-processing method 400 or second pre-processing method 500.
  • the pre-processing method may be selected using a configuration file or a graphical user interface (GUI) setting.
  • the pre-processing method may be selected based on a colour difference between the one or more rows of objects and a ground surface when captured by a visible light sensor.
  • the first pre-processing method 400 may be more appropriate or accurate when there is a larger colour difference between the one or more rows of objects and the ground surface when captured by the visible light sensor, as the larger difference in the intensity of pixels of object image data and ground image data captured by the visible light sensor allows easier separation by the first pre-processing method 400 based on colour.
  • the second pre-processing method 500 may be more appropriate or accurate when there is a negligible colour difference between the one or more rows of objects and the ground surface, as there is only a negligible difference in the intensity of pixels of colour image data captured by the visible light sensor, such that the first pre-processing method 400 may not be as accurate.
  • several factors may affect the colour difference between the one or more rows of objects and a ground surface when captured by a visible light sensor.
  • lighting conditions or other environmental conditions such as fog or haze may affect the colour difference between the one or more rows of objects and a ground surface when captured by a visible light sensor.
  • in good lighting conditions, the colour difference may be more distinct as more visible light is reflected off the objects and the ground surface and captured by the visible light sensor.
  • in low lighting conditions, such as at night, during a cloudy or foggy day, or in an enclosed space with the lights turned off, the colour difference may be less distinct as less visible light is reflected off the objects and the ground surface and captured by the visible light sensor.
  • examples of factors that may affect the colour difference between the one or more rows of objects and a ground surface when captured by a visible light sensor include the plant condition, plant type and/or plant growth stage.
  • plants in a good condition may have more vibrant colours and thus have a large colour difference compared to their surrounding soil.
  • plants in a poor condition may be more brown or dull and thus have a small or negligible colour difference compared to their surrounding soil.
  • different plant types have different colours, and thus the colour difference between the plants and their surrounding soil may vary depending on the plant type.
  • plants may have different colours at different growth stages, and thus the colour difference between the plants and their surrounding soil may vary across the different growth stages.
  • the pre-processing method may be selected based on a distance difference between the one or more rows of objects and a ground surface when captured by a depth sensor.
  • the second pre-processing method 500 may be preferred where the one or more rows of objects protrude from a ground surface, as such protrusion would lead to differences in depth or distance values captured by a depth sensor. For example, where the objects comprising the one or more rows of objects are tall, the object image data of such objects would have a lower depth or distance value compared to the ground image data, as the objects would appear closer to the one or more sensors 104.
  • the second pre-processing method 500 may not be accurate when the one or more rows of objects do not protrude from the ground surface, as there would be a negligible distance difference between the one or more rows of objects and the ground surface when captured by the depth sensor.
  • the first pre-processing method 400 may be more accurate or appropriate as long as there is some colour difference between the one or more rows of objects and the ground surface when captured by the visible light sensor.
  • the second pre-processing method 500 may be more appropriate for objects that are higher or taller, while the first pre-processing method 400 may be more appropriate for objects that are shorter or flatter.
  • examples of factors that may affect the distance difference between the one or more rows of objects and a ground surface when captured by a depth sensor include the plant condition, plant type and/or plant growth stage.
  • plants in a good condition may grow taller and thus have a large distance difference to the depth sensor compared to their surrounding soil.
  • plants in a poor condition may be limp or have poor or stunted growth and thus have a small or negligible distance difference to the depth sensor compared to their surrounding soil.
  • plants may have different heights at different growth stages, and thus the distance difference to the depth sensor may vary across the different growth stages.
  • Fig. 6 is a flowchart of geometric post-processing of line segments to generate representative lines, according to embodiments of the present disclosure.
  • Operation 216 of geometric post-processing of line segments may be carried out by representative line generator module 120 to refine the line segments identified by the line segment identification module 116.
  • operation 216 may commence with operation 604 wherein one or more of the identified line segments that are proximate to each other are merged, preferably based on a predefined threshold.
  • the predefined threshold may be defined based on the types of line segments identified, a distance between the one or more rows of objects, and the type of objects that make up the one or more rows of objects.
  • the predefined threshold may be 1 m, such that adjacent line segments with a distance of less than 1 m between the line segments are merged. Merging proximate line segments together averages line segments that are close to each other and merges multiple detected line segments for the same row of objects into one.
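  • A minimal sketch of the merging step of operation 604 follows; grouping by midpoint distance and averaging endpoints are simplifying assumptions of this illustration (a fuller implementation might also compare segment angles and align endpoint ordering), and the threshold value is illustrative.

```python
import numpy as np

def merge_proximate_segments(segments: np.ndarray, threshold: float = 50.0) -> list:
    """Sketch of operation 604: greedily group line segments whose midpoints
    lie within `threshold` of each other and average each group into one
    segment. Input is an (N, 1, 4) array of [x1, y1, x2, y2] rows, as
    returned by the Hough step above."""
    remaining = [seg.reshape(2, 2).astype(float) for seg in segments[:, 0, :]]
    merged = []
    while remaining:
        group = [remaining.pop(0)]
        centre = group[0].mean(axis=0)  # midpoint of the seed segment
        keep = []
        for seg in remaining:
            if np.linalg.norm(seg.mean(axis=0) - centre) < threshold:
                group.append(seg)  # proximate: fold into this group
            else:
                keep.append(seg)
        remaining = keep
        # averaging endpoints merges duplicate detections into one segment
        merged.append(np.mean(group, axis=0))
    return merged
```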
  • operation 216 may continue with operation 608 wherein the merged line segments from operation 604 are smoothened across the plurality of image frames.
  • the merged line segments are smoothened by applying an infinite impulse response (IIR) filter.
  • An IIR filter finds all merged line segments in a first image frame of a time sequence and saves each merged line segment. The IIR filter then finds all merged line segments in a subsequent image frame. For each merged line segment in the subsequent image frame, the IIR filter searches for a corresponding merged line segment in the first image frame.
  • If the IIR filter finds a matching merged line segment, it merges the matching merged line segments by weighting the merged line segment in the subsequent image frame with a constant alpha and the matching merged line segment in the first image frame with (1 - alpha). The IIR filter then continues across the plurality of image frames, finding and weighting matching merged line segments, thus making the merged line segments smoother and more stable across the plurality of image frames.
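  • The weighting described above might be sketched as follows; keying segments by an identifier produced by a prior matching step, and the value of alpha, are assumptions of this illustration.

```python
import numpy as np

def iir_smooth(previous: dict, current: dict, alpha: float = 0.3) -> dict:
    """Sketch of operation 608: smoothen matching merged line segments
    across frames; `alpha` weights the current frame's segment and
    (1 - alpha) the matching segment from the previous frame."""
    smoothed = {}
    for key, segment in current.items():
        segment = np.asarray(segment, dtype=float)
        if key in previous:
            # weighted blend of matching segments, as an IIR filter update
            smoothed[key] = alpha * segment + (1 - alpha) * np.asarray(previous[key])
        else:
            smoothed[key] = segment  # newly seen segment: save as-is
    return smoothed
```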
  • operation 216 may continue with operation 612 wherein the smoothened merged line segments are filtered across the plurality of image frames to remove any smoothened merged line segments that may be due to noise, such as randomly appearing objects not part of the one or more rows of objects.
  • an example of such noise is randomly appearing weeds among the one or more rows of plants or crops.
  • the filtering ensures that only smoothened merged line segments that appear over multiple image frames are retained. Any smoothened merged line segments remaining are the representative lines, each representative line representing a row of objects.
  • Fig. 7 is a flowchart of filtering smoothened merged line segments across a plurality of image frames, in accordance with embodiments of the present disclosure.
  • Operation 612 of filtering smoothened merged line segments across a plurality of image frames may be carried out by representative line generator module 120.
  • Operation 612 may commence with operation 716 wherein a counter is maintained for each smoothened merged line segment, with each image frame in which the smoothened merged line segment appears adding to the counter.
  • operation 612 may continue with operation 720 wherein smoothened merged lines with a counter smaller than a predefined threshold are removed after a predefined number of image frames.
  • the removal of smoothened merged lines with a counter smaller than a predefined threshold is advantageous as it filters out noise.
  • the predefined threshold and the predefined number of image frames depend on the speed at which the one or more sensors are travelling, as well as the frame rate of the one or more sensors. For example, where a sensor has a frame rate of 30 fps and the sensor is travelling at a speed of 1 m/s, the predefined number of image frames may be set at 60 and the predefined threshold may be set at 5, such that a smoothened merged line is removed if it is seen in fewer than 5 image frames within the 60 image frames captured, the 60 image frames being captured over a distance of 2 meters and within a duration of 2 seconds.
  • operation 612 may continue with operation 724 wherein the counters of all remaining smoothened merged lines are reset after a predefined number of frames. Resetting the counters is advantageous as it reduces the storage space required; if the counters were not reset, a line segment detected once would be stored forever, even if it were no longer visible to the one or more sensors.
  • the predefined number of frames depends on several factors, including the frame rate of the one or more sensors, the speed at which the one or more sensors are travelling, as well as the type of object making up the one or more rows of objects.
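  • Operations 716 to 724 might be sketched as the following stateful filter; the window and count values mirror the 30 fps / 1 m/s example above, and the line identifiers are assumed to come from the earlier matching step.

```python
class LineSegmentFilter:
    """Sketch of operations 716-724: count the frames in which each
    smoothened merged line appears, drop lines seen fewer than `min_count`
    times within a `window` of frames, then reset the remaining counters."""

    def __init__(self, window: int = 60, min_count: int = 5):
        self.window = window        # predefined number of image frames
        self.min_count = min_count  # predefined counter threshold
        self.frame_index = 0
        self.counters: dict = {}    # line identifier -> appearance counter

    def update(self, visible_line_ids: set) -> set:
        """Feed the line identifiers seen in one frame; return retained lines."""
        self.frame_index += 1
        for line_id in visible_line_ids:
            self.counters[line_id] = self.counters.get(line_id, 0) + 1
        if self.frame_index % self.window == 0:
            # operation 720: remove lines with a counter below the threshold
            retained = {k for k, v in self.counters.items() if v >= self.min_count}
            # operation 724: reset the counters of all remaining lines
            self.counters = {k: 0 for k in retained}
            return retained
        return set(self.counters)
```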
  • Fig. 8 is a flowchart of a method of planning a vehicle path above at least one row of objects, according to embodiments of the present disclosure.
  • Method 800 of planning a vehicle path above at least one row of objects may be carried out by path planner module 128 using the representative lines generated by the representative line generator module 120 in operation 216 or the generated representative lines verified by the representative line verification module 124 in operation 220.
  • Method 800 may commence with operation 804 wherein a representative line for each of the at least one row of objects is defined over a plurality of image frames captured by one or more sensors.
  • the generated representative line may be the representative lines generated by the representative line generator module 120 in operation 216 or the generated representative lines verified by the representative line verification module 124 in operation 220.
  • the one or more sensors 104 are mounted on the vehicle for which the vehicle path is being planned.
  • method 800 may continue with operation 808 wherein one of the defined representative lines for a row of objects is designated as a middle of the vehicle path above at least one row of objects.
  • Fig. 9 is a flowchart of a method of planning a vehicle path above and/or between at least one row of objects, according to embodiments of the present disclosure.
  • Method 900 of planning a vehicle path above and/or between at least one row of objects may be carried out by path planner module 128 using the representative lines generated by the representative line generator module 120 in operation 216 or the generated representative lines verified by the representative line verification module 124 in operation 220.
  • Method 900 may commence with operation 904 wherein a representative line for each of the at least one row of objects is defined over a plurality of image frames captured by one or more sensors.
  • the generated representative line may be the representative lines generated by the representative line generator module 120 in operation 216 or the generated representative lines verified by the representative line verification module 124 in operation 220.
  • the one or more sensors 104 are mounted on the vehicle for which the vehicle path is being planned.
  • method 900 may continue with operation 908 wherein a line is defined between two adjacent representative lines.
  • the defined line is equidistant from the two adjacent representative lines.
  • method 900 may continue with operation 912 wherein the defined line is designated as a middle of the vehicle path above and/or between rows of objects.
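  • Operations 908 and 912 might be sketched as below; representative lines are assumed to be given as endpoint pairs, and the coordinates in the usage example are illustrative.

```python
import numpy as np

def midline(line_a, line_b) -> np.ndarray:
    """Sketch of operations 908 and 912: define the line equidistant from
    two adjacent representative lines, each given as an endpoint pair
    [[x1, y1], [x2, y2]], for use as the middle of the vehicle path."""
    return (np.asarray(line_a, dtype=float) + np.asarray(line_b, dtype=float)) / 2.0

# illustrative use: two adjacent crop-row lines in image coordinates
left_row = [[100, 0], [120, 480]]
right_row = [[300, 0], [320, 480]]
path_centre = midline(left_row, right_row)  # [[200, 0], [220, 480]]
```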
  • Fig. 10 is a schematic illustration of a computer system within which a set of instructions, when executed, may cause one or more processors of the computer system to perform one or more of the methods described herein, in accordance with embodiments of the present disclosure.
  • computer system 1000 described herein is only an example of a computer system that may be employed and computer systems with other hardware or software configurations may be employed.
  • computer system 1000 may be connected to one or more sensors, and such connection to the one or more sensors may be wired or wireless.
  • computer system 1000 may comprise a server computer, a laptop, a personal computer, a desktop computer, or any machine capable of executing a set of instructions that specify actions to be taken by the computer system.
  • Computer system 1000 may comprise one or more processors 108 and one or more memories 1004 which communicate with each other via a bus 1008.
  • Computer system 1000 may further comprise a network interface device 1012 which allows computer system 1000 to communicate with a network 1016.
  • computer system 1000 may further comprise a disk drive unit 1020 which may include a machine-readable storage medium 1024 on which is stored one or more sets of instructions 1028 embodying one or more of the methods described herein.
  • the one or more sets of instructions 1028 may also reside in the one or more processors 108 or the one or more memory 1004.
  • the one or more sets of instructions 1028 may be received as a data carrier signal received by computer system 1000.
  • computer system 1000 may further comprise an antenna 1032 which allows computer system 1000 to connect with external sources such as a satellite navigation system.
  • computer system 1000 may be part of a vehicle such that computer system 1000 is used to plan a vehicle path for that vehicle.


Abstract

A method and system for generating lines representing one or more rows of objects over a plurality of image frames, the method comprising receiving from one or more sensors a plurality of image frames comprising one or more rows of objects, wherein the plurality of image frames is captured sequentially as the one or more sensors travel along the one or more rows of objects; converting each of the received plurality of image frames into a binary image and defining edges in each binary image using a pre-processing method; identifying one or more line segments in each binary image; and generating a representative line over the plurality of image frames based on geometric line post-processing of the one or more identified line segments, wherein each representative line represents a row of objects of the one or more rows of objects.

Description

METHOD AND SYSTEM FOR IDENTIFYING ROWS OF OBJECTS IN IMAGES
TECHNICAL FIELD
[0001] The invention relates to a method and system for identifying rows of objects in images, and more specifically, the invention relates to a method and system for generating representative lines for rows of objects in images.
BACKGROUND
[0002] Detection of objects organized in rows is used in various applications, including the navigation of vehicles and/or robots through, for example, crop rows, roads, parking lots, warehouses, and shipping yards. Accuracy of detection of such rows of objects is important and any inaccurate or false detections could potentially lead to problems in downstream applications. For example, inaccurate or false detections may distract a steering system of an autonomous vehicle and/or robot and cause the autonomous vehicle and/or robot to crash into objects. Although information from external systems such as a global navigation satellite system (GNSS) may be used to guide such steering systems of autonomous vehicles and/or robots, such guidance may fail if the connection to such external systems is lost, or when predefined points in the GNSS deviate from the actual situation on the ground.
[0003] Conventional methods for object row detection generally involve using edge detection methods on images of the rows of objects. However, there are disadvantages of only using edge detection on images for row detection. One disadvantage is that if the intensity difference between the objects and their surroundings (e.g., ground surface) is not sufficiently significant, or if there are gaps present in the rows, image edge detection may fail to detect the rows of objects. For example, where the rows of objects are plants or crops, image edge detection may not accurately detect the plant or crop rows if the intensity difference between the plants and the surrounding soil captured on images is insufficiently significant. In another example, where the rows of objects are track line markings, road markings, or parking lot markings, image edge detection may not accurately detect such markings if the markings are faded or broken. Breaks in the rows of objects caused by empty space, obstacles, obstructions, or other unwanted objects or noise may also lead to intensity variations in the rows of objects when captured on images and cause false results in image edge detection. For example, where the rows of objects are plants or crops, the presence of stones or weeds may cause false edge detections.
SUMMARY
[0004] Embodiments of the present invention improve the detection of rows of objects in images by incorporating pre-processing and post-processing methods to generate a representative line for each row of objects detected in an image comprising one or more rows of objects. The generated representative lines may be used for subsequent applications, such as vehicular or robotic path planning, or vehicular or robot navigation.
[0005] To solve the above technical problems, the present invention provides a computer- implemented method for generating lines representing one or more rows of objects over a plurality of image frames, the method comprising: receiving from one or more sensors a plurality of image frames comprising one or more rows of objects, wherein the plurality of image frames is captured sequentially as the one or more sensors travel along the one or more rows of objects; converting each of the received plurality of image frames into a binary image and defining edges in each binary image using a pre-processing method; identifying one or more line segments in each binary image; and generating a representative line over the plurality of image frames based on geometric line post-processing of the one or more identified line segments, wherein each representative line represents a row of objects of the one or more rows of objects.
[0006] The incorporation of a pre-processing method to convert the image frames into binary images and detect edges within the converted binary image for the identification of one or more line segments and subsequent geometric line post-processing of the one or more line segments to generate representative lines overcomes the technical problem of inaccurate or false edge detection that may be due to several factors such as an insignificant intensity difference, presence of gaps in the rows, and/or presence of obstacles, obstructions or other unwanted objects or noise. In particular, the conversion of image frames into binary images using a pre-processing method is advantageous as it increases the accuracy of edge detection and/or accuracy of the one or more line segments identified in a particular image, as well as enables edge detection to be accurately carried out, as any difference between the one or more objects and the ground surface is maximised. As an edge detection algorithm identifies boundaries between objects in an image based on changes or discontinuities in the intensity of an image, the efficiency and accuracy of edge detection is increased when the changes or discontinuities in the intensity of the image are maximised by the pre-processing method. The incorporation of subsequent geometric line post-processing is advantageous as geometric line post-processing refines the identified one or more line segments over a plurality of frames and filters any false identifications.
[0007] A preferred method of the present invention is a computer-implemented method as described above, wherein the one or more rows of objects comprise at least one of: a road surface marking, a parking lot line marking, a crop row, a plant row, a harvested swath, a crop edge, a transition between a cut crop and uncut crop, a transition between a harvested and unharvested crop, a storage rack row, a row of pallets, a row of containers, or a row of storage boxes.
[0008] The above-described aspect of the present invention has the advantage that the above-listed rows of objects have generally uniform colours, uniform heights, or colours which are different from a ground surface, which increases the accuracy of the method.
[0009] A preferred method of the present invention is a computer-implemented method as described above, wherein the one or more sensors comprises a visible light sensor and/or a depth sensor.
[0010] The above-described aspect of the present invention has the advantage that the visible light sensor generates colour images with pixel values that correspond to the colour of the objects and their surroundings, and the depth sensor generates depth images with pixel values that correspond to the distance of objects from the depth sensor. These colour images and/or depth images are then able to be converted into binary images through a pre-processing method.
[0011] A preferred method of the present invention is a computer-implemented method as described above or as described above as preferred, wherein the pre-processing method is selected based on at least one of: a colour difference between the one or more rows of objects and a ground surface when captured by a visible light sensor; and a distance difference between the one or more rows of objects and the ground surface when captured by a depth sensor.
[0012] The above-described aspect of the present invention has the advantage that different pre-processing methods may perform better depending on one or more of the features mentioned above.
[0013] A preferred method of the present invention is a computer-implemented method as described above or as described above as preferred, wherein the pre-processing method converts each of the received plurality of image frames into a binary image based on colour of the one or more rows of objects and a ground surface or distance of the one or more rows of objects and a ground surface to the one or more sensors.
[0014] The above-described aspect of the present invention has the advantage that conversion based on the colour of the objects or the distance of the objects covers a wide range of objects and conditions, such that the method may have various applications.
[0015] A preferred method of the present invention is a computer-implemented method as described above or as described above as preferred, wherein the pre-processing method comprises: removing noise from each of the plurality of image frames, preferably by applying a gaussian blur filter; assigning pixels of each of the plurality of image frames to a first group or a second group based on pixel colour, preferably by using k-means clustering; converting pixels of the first group to a first colour and pixels of the second group to a second colour; and defining edges between the first colour and the second colour, preferably using an edge detection algorithm.
[0016] The above-described aspect of the present invention has the advantage that the edges between the objects and the ground may be easily detected as long as there is a colour difference between the objects and the ground, without using predefined threshold values and without relying on a distance or height difference between the objects and the ground, as any colour difference between the objects and the ground is maximised.
[0017] A preferred method of the present invention is a computer-implemented method as described above or as described above as preferred, wherein the pre-processing method comprises: removing noise from each of the plurality of image frames, preferably by applying a gaussian blur filter; maximising a contrast of depth values of each of the plurality of image frames, preferably by applying histogram equalization; selecting areas of each of the plurality of image frames which appear closer to the one or more sensors than a ground surface, preferably by applying a predefined threshold; converting pixels within the selected area to a first colour and pixels within a remaining unselected area to a second colour; removing noise, preferably by applying a morphological transformation, and more preferably morphological opening and closing; and defining edges between the first colour and the second colour, preferably using an edge detection algorithm.
[0018] The above-described aspect of the present invention has the advantage that the edges between the objects and the ground may be easily detected as long as there is a height difference between the objects and the ground, without using predefined threshold values and without relying on a colour or intensity difference between the objects and the ground, as such height difference would correspond to a difference in distance or depth value captured by a depth sensor, and such difference in distance or depth value is maximised.
[0019] A preferred method of the present invention is a computer-implemented method as described above or as described above as preferred, wherein identifying one or more line segments in each binary image comprises using a Hough transform, preferably probabilistic Hough transform.
[0020] The above-described aspect of the present invention has the advantage that Hough transform discards short lines caused by noise and tolerates gaps between the objects forming the one or more rows of objects. Probabilistic Hough transform also has the advantage of increased efficiency while still retaining accuracy.
[0021] A preferred method of the present invention is a computer-implemented method as described above or as described above as preferred, wherein geometric line post-processing of the one or more identified line segments comprises: merging one or more identified line segments of each binary image that are proximate to each other, preferably based on a predefined threshold; smoothening the merged line segments across the plurality of image frames, preferably by applying an infinite impulse response (IIR) filter; and filtering the smoothened merged line segments across the plurality of image frames to generate the representative line, preferably by: maintaining a counter for each smoothened merged line segment based on a number of image frames each smoothened merged line segment appears in; removing smoothened merged line segments with a counter smaller than a predefined threshold after a predefined number of image frames; and resetting the counter of all remaining smoothened merged line segments after a predefined number of frames, wherein the representative lines are any remaining line segments.
[0022] The above-described aspect of the present invention has the advantage that geometric line post-processing minimises issues of false line segments identified due to noise, duplicate line segments identified for the same row of objects and/or missing line segments due to gaps present in the one or more rows of objects.
[0023] A preferred method of the present invention is a computer-implemented method as described above or as described above as preferred, further comprising: verifying the generated representative lines, preferably by identifying a deviation between the representative lines and predefined points or lines; and optionally, changing the pre-processing method depending on a result of the verification step, preferably changing the pre-processing method if an identified deviation between the generated representative lines and predefined points or lines is above a predefined threshold.
[0024] The above-described aspect of the present invention has the advantage that the accuracy of the generated representative lines is improved by verifying against predefined points, such that the representative lines may be generated again using a different pre-processing method if the initial pre-processing method used is not performing well.
[0025] The above-described advantageous aspects of a computer-implemented method of the invention also hold for all aspects of a below-described computer-implemented method of the invention. All below-described advantageous aspects of a computer-implemented method of the invention also hold for all aspects of an above-described computer- implemented method of the invention.
[0026] The invention also relates to a computer-implemented method of planning a vehicle path above at least one row of objects, the method comprising: defining a representative line for each of the at least one row of objects over a plurality of image frames captured by one or more sensors using the computer-implemented method according to the invention, wherein the one or more sensors are preferably mounted on the vehicle, and designating one of the defined representative lines as a middle of a vehicle path above at least one row of objects.
[0027] The above-described aspect of the present invention has the advantage that the representative line may be used to plan an accurate vehicle path above at least one row of objects without the wheels of the vehicle running over the at least one row of objects.
[0028] The above-described advantageous aspects of a computer-implemented method of the invention also hold for all aspects of a below-described computer-implemented method of the invention. All below-described advantageous aspects of a computer-implemented method of the invention also hold for all aspects of an above-described computer- implemented method of the invention.
[0029] The invention also relates to a computer-implemented method of planning a vehicle path above and/or between at least one row of objects, the method comprising: defining a representative line for each row of objects over a plurality of image frames captured by one or more sensors using the computer-implemented method according to the invention, wherein the one or more sensors are preferably mounted on the vehicle; and defining a line between two adjacent representative lines, wherein the line is preferably equidistant from the two adjacent representative lines; and designating the defined line as a middle of a vehicle path above and/or between rows of objects.
[0030] The above-described aspects of the present invention have the advantage that the representative line may be used to plan an accurate vehicle path above and/or between rows of objects.
[0031] The above-described advantageous aspects of a computer-implemented method of the invention also hold for all aspects of a below-described system of the invention. All below-described advantageous aspects of a system of the invention also hold for all aspects of an above-described computer-implemented method of the invention.
[0032] The invention also relates to a system comprising one or more sensors, one or more processors and a memory that stores executable instructions for execution by the one or more processors, the executable instructions comprising instructions for performing a computer- implemented method according to the invention.
[0033] The above-described advantageous aspects of a computer-implemented method or system of the invention also hold for all aspects of a below-described vehicle of the invention. All below-described advantageous aspects of a vehicle of the invention also hold for all aspects of an above-described computer-implemented method or system of the invention.
[0034] The invention also relates to a vehicle comprising a system according to the invention.
[0035] The above-described advantageous aspects of a computer-implemented method, system or vehicle of the invention also hold for all aspects of a below-described computer program, machine-readable storage medium, or a data signal of the invention. All below- described advantageous aspects of a computer program, machine-readable storage medium, or a data signal of the invention also hold for all aspects of an above-described computer- implemented method, system or vehicle of the invention.
[0036] The invention also relates to a computer program, a machine-readable storage medium, or a data carrier signal that comprises instructions, that upon execution on one or more processors, cause the one or more processors to perform a computer-implemented method according to the invention. The machine-readable storage medium may be any storage medium, such as for example, a USB stick, a CD, a DVD, a data storage device, a hard disk, or any other medium on which a program element as described above can be stored.
[0037] As used in this summary, in the description below, in the claims below, and in the accompanying drawings, the term “vehicle” means any mobile agent capable of movement, including cars, trucks, buses, agricultural machines, forklift, robots, whether or not such mobile agent is capable of carrying or transporting goods, animals, or humans, and whether or not such mobile agent is autonomous.
[0038] As used in this summary, in the description below, in the claims below, and in the accompanying drawings, the term “object” means any object, including plants, crops, boxes, paint, and bricks, whether or not such objects protrude from the ground.
BRIEF DESCRIPTION OF THE DRAWINGS
[0039] These and other features, aspects, and advantages will become better understood with regard to the following description, appended claims, and accompanying drawings where:
[0040] Fig. 1 is a schematic illustration of a system for generating lines representing one or more rows of objects over a plurality of image frames, in accordance with embodiments of the present disclosure;
[0041] Fig. 2 is a flowchart of a computer-implemented method for generating lines representing one or more rows of objects over a plurality of image frames, in accordance with embodiments of the present disclosure;
[0042] Fig. 3 is an example of an image generated by a line segment identification module, in accordance with embodiments of the present disclosure;
[0043] Fig. 4 is a flowchart of a first pre-processing method, in accordance with embodiments of the present disclosure;
[0044] Fig. 5 is a flowchart of a second pre-processing method, in accordance with embodiments of the present disclosure;
[0045] Fig. 6 is a flowchart of geometric post-processing of line segments to generate representative lines, according to embodiments of the present disclosure;
[0046] Fig. 7 is a flowchart of filtering smoothened merged line segments across a plurality of image frames, in accordance with embodiments of the present disclosure;
[0047] Fig. 8 is a flowchart of a method of planning a vehicle path above at least one row of objects, according to embodiments of the present disclosure;
[0048] Fig. 9 is a flowchart of a method of planning a vehicle path above and/or between at least one row of objects, according to embodiments of the present disclosure; and
[0049] Fig. 10 is a schematic illustration of a computer system within which a set of instructions, when executed, may cause one or more processors of the computer system to perform one or more of the methods described herein, in accordance with embodiments of the present disclosure.
[0050] In the drawings, like parts are denoted by like reference numerals.
[0051] It should be appreciated by those skilled in the art that any block diagrams herein represent conceptual views of illustrative systems embodying the principles of the present subject matter. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudo code, and the like represent various processes which may be substantially represented in computer readable medium and executed by a computer or processor, whether or not such computer or processor is explicitly shown.
DETAILED DESCRIPTION
[0052] In the summary above, in this description, in the claims below, and in the accompanying drawings, reference is made to particular features (including method steps) of the invention. It is to be understood that the disclosure of the invention in this specification includes all possible combinations of such particular features. For example, where a particular feature is disclosed in the context of a particular aspect or embodiment of the invention, or a particular claim, that feature can also be used, to the extent possible, in combination with and/or in the context of other particular aspects and embodiments of the invention, and in the invention generally.
[0053] In the present document, the word "exemplary" is used herein to mean "serving as an example, instance, or illustration." Any embodiment or implementation of the present subject matter described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
[0054] While the disclosure is susceptible to various modifications and alternative forms, specific embodiments thereof have been shown by way of example in the drawings and are described in detail below. It should be understood, however, that it is not intended to limit the disclosure to the forms disclosed; on the contrary, the disclosure is to cover all modifications, equivalents, and alternatives falling within the scope of the disclosure.
[0055] Fig. 1 is a schematic illustration of a system for generating lines representing one or more rows of objects over a plurality of image frames, in accordance with embodiments of the present disclosure. System 100 for generating lines representing one or more rows of objects over a plurality of image frames may comprise one or more sensors 104 coupled to one or more processors 108. In some embodiments, the one or more sensors 104 are used to collect a plurality of image frames of one or more rows of objects. Preferably, the one or more rows of objects are arranged transversely or substantially transversely to the one or more sensors 104. The one or more sensors 104 are configured to capture the plurality of image frames as the one or more sensors 104 travel along the one or more rows of objects, preferably in a direction parallel or substantially parallel to the one or more rows of objects. Preferably, the one or more sensors 104 are mounted on a vehicle such that the plurality of image frames are captured by the one or more sensors 104 as the vehicle travels parallel or substantially parallel along the one or more rows of objects. Preferably, the one or more sensors 104 are mounted on a vehicle such that the one or more sensors 104 collect the plurality of image frames from the perspective of the vehicle. In some embodiments, the plurality of image frames may comprise object image data and ground image data. In some embodiments, the object image data may comprise image data of at least one of: a road surface marking, a parking lot line marking, a crop row, a plant row, a harvested swath, a crop edge, a transition between a cut crop and uncut crop, a transition between a harvested and unharvested crop, a storage rack row, a row of pallets, a row of containers, or a row of storage boxes. In some embodiments, the ground image data may comprise image data of the ground, soil, floor, or surface on which the one or more rows of objects are positioned on or within. Preferably, the one or more sensors 104 are positioned such that the field of view (FoV) of the one or more sensors 104 includes only the one or more rows of objects and the ground or surface on which the one or more rows of objects are positioned on or within.
[0056] According to some embodiments, the one or more sensors 104 may comprise a visible light sensor and/or a depth sensor. A visible light sensor captures information relating to the colour of objects in a scene. In some embodiments, the visible light sensor may be a camera or a video camera. In some embodiments, the visible light sensor may generate colour image data wherein each pixel or group of pixels of the colour image data may be associated with a colour or an intensity level measuring the amount of visible light energy observed, reflected and/or emitted from objects and the ground within a scene or within an image representing the scene, or a portion thereof. In some embodiments, the colour image data may be RGB (red, green, blue) colour data, CMYK (cyan, magenta, yellow, black) colour data, HSV (hue, saturation, value) colour data, or image data in other colour space. A depth sensor captures information relating to the distances of surfaces of objects and the ground in a scene. In some embodiments, the depth sensor may be an infrared camera or video camera. In some embodiments, the depth sensor may be a camera with two infrared lenses that calculates the distance to objects and/or the ground based on the disparity between the two infrared streams. In some embodiments, the depth sensor may generate depth image data wherein each pixel or group of pixels of the depth image data may be associated with an intensity level based on the distance of surfaces of objects or the ground in a scene. In some embodiments, the depth image data may be expressed as RGB (red, green, blue) colour data, CMYK (cyan, magenta, yellow, black) colour data, HSV (hue, saturation, value) colour data, or image data in other colour space.
[0057] According to some embodiments, the plurality of image frames collected by the one or more sensors 104 may be processed by one or more processors 108 to generate lines representing one or more rows of objects over the plurality of image frames. In some embodiments, the one or more processors 108 may comprise several software modules, such as an image conversion module 112, a line segment identification module 116, and a representative line generator module 120. The image conversion module 112 communicates with the line segment identification module 116 and the line segment identification module 116 communicates with the representative line generator module 120.
[0058] According to some embodiments, the one or more processors 108 may further comprise a representative line verification module 124 which communicates with both the representative line generator module 120 and the image conversion module 112.
[0059] According to some embodiments, system 100 may be configured to plan a path for a vehicle. In such embodiments, the one or more processors 108 may further comprise a path planner module 128 which communicates with the representative line generator module 120. In some embodiments, the path planner module 128 may use the generated representative lines from the representative line generator module 120 and/or the verified representative lines from the representative line verification module 124 to find or plan a suitable path for the vehicle to a predefined global target. System 100 configured to plan a path for a vehicle may also further comprise a vehicle guidance system 132 which uses the output from the path planner module 128, as well as other parameters such as vehicle dimensions, and/or output from additional sensors mounted on the vehicle, such as wheel or motion sensors, to guide the vehicle towards the predefined global target.
[0060] For the sake of convenience, the operations of the present disclosure are described as interconnected functional blocks or distinct software modules. This is not necessary, however, and there may be cases where these functional blocks or software modules are equivalently aggregated into a single logic device, program, or operation with unclear boundaries. In any event, the functional blocks and software modules or described features can be implemented by themselves, or in combination with other operations in either hardware or software.
[0061] Fig. 2 is a flowchart of a computer-implemented method for generating lines representing one or more rows of objects over a plurality of image frames, in accordance with embodiments of the present disclosure. Computer-implemented method 200 for generating lines representing one or more rows of obj ects over a plurality of image frames may be carried out by the one or more processors 108. In some embodiments, the one or more rows of objects may comprise at least one of: a road surface marking, a parking lot line marking, a crop row, a plant row, a harvested swath, a crop edge, a transition between a cut crop and uncut crop, a transition between a harvested and unharvested crop, a storage rack row, a row of pallets, a row of containers, or a row of storage boxes. Method 200 commences at operation 204, wherein the one or more processors 108 receive a plurality of image frames comprising one or more rows of objects from one or more sensors 104. Preferably, the one or more rows of objects are arranged transversely or substantially transversely to the one or more sensors 104. The plurality of image frames is captured sequentially as the one or more sensors 104 travel along the one or more rows of objects, and preferably in a direction parallel or substantially parallel to the one or more rows of objects. In some embodiments, the one or more sensors 104 may comprise a visible light sensor and/or a depth sensor. In some embodiments, a visible light sensor may be preferable where there is sufficient visible light reflected off the one or more rows of objects and the ground, such as during the day or when the lights are turned on within an enclosed space. In some embodiments, a visible light sensor may be preferable where there is insufficient height difference between the objects and the ground, such as when the objects are embedded within the ground. In some embodiments, a depth sensor may be preferable where there is insufficient visible light reflected off objects and the ground, such as during the night or when the lights are turned off within an enclosed space. In some embodiments, a depth sensor may be preferable where there is sufficient height difference between the objects and the ground, such that there is a sufficient difference in the depth values of the object image data and the ground image data.
[0062] According to some embodiments, method 200 may continue with operation 208, wherein the image conversion module 112 converts each image frame of the plurality of image frames received in operation 204 into a binary image and defines edges in each binary image using a pre-processing method. A binary image is an image that comprises binary image data, wherein the pixels that comprise the binary image each have one of two states, or one of two colours. Preferably, the binary image comprises pixels that are either white or black (e.g., minimum intensity value or maximum intensity value), or any other two colours that may be on opposite ends of a colour spectrum. In some embodiments, the binary image may represent object image data of the one or more rows of objects as white and the ground image data as black, or vice versa. The incorporation of a pre-processing method to convert each image frame of the plurality of image frames into a binary image is advantageous as it increases the accuracy of edge detection and/or accuracy of the one or more line segments identified in a particular image, as well as enables edge detection to be accurately carried out, as any colour difference between the one or more objects and the ground surface is maximised.
[0063] According to some embodiments, method 200 may continue with operation 212, wherein the line segment identification module 116 identifies one or more line segments in each binary image generated by the image conversion module 112 in operation 208. The one or more line segments are lines that are fit to the one or more rows of objects. Various techniques for the identification of line segments may be employed. In some embodiments, the one or more line segments are identified using a Hough transform method, which is a known method to locate shapes such as lines in images. Hough transformation may be configured with a minimum number of pixels of a first colour required to form a line, and a maximum gap size of pixels of a second colour in between. Thus, Hough transform discards short lines caused by noise and is able to tolerate gaps between the objects forming the one or more rows of objects, such as gaps between plants making up a crop row or gaps in a broken road surface marking. Preferably, the one or more line segments are identified using probabilistic Hough transformation, which randomly samples edge points of the defined edges to generate one or more most probable line segments. Probabilistic Hough transformation may be advantageous as it is faster than Hough transformation whilst retaining accuracy that is almost comparable to Hough transformation.
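As an illustration of operation 212, a probabilistic Hough transform applied to the binary edge image produced by the pre-processing method might look like the following sketch, assuming OpenCV in Python; all parameter values are illustrative and are not taken from the disclosure:

```python
import cv2
import numpy as np

def identify_line_segments(edges: np.ndarray) -> np.ndarray:
    """Probabilistic Hough transform on a binary edge image from
    the pre-processing step (operation 212)."""
    segments = cv2.HoughLinesP(
        edges,
        rho=1,                 # distance resolution in pixels
        theta=np.pi / 180,     # angular resolution in radians
        threshold=50,          # minimum votes to accept a line
        minLineLength=40,      # minimum pixels of the first colour forming a line
        maxLineGap=15)         # tolerated gap, e.g. between plants in a row
    # HoughLinesP returns None when nothing is found, else an array of
    # shape (N, 1, 4) with one (x1, y1, x2, y2) segment per row.
    if segments is None:
        return np.empty((0, 4), dtype=np.int32)
    return segments.reshape(-1, 4)
```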
[0064] According to some embodiments, method 200 may continue with operation 216, wherein the representative line generator module 120 generates a representative line over the plurality of image frames based on geometric line post-processing of the one or more line segments identified by the line segment identification module 116 in operation 212. Each representative line generated by the representative line generator module 120 in operation 216 represents a row of objects of the one or more rows of objects. Geometric post-processing of the line segments identified by the line segment identification module 116 in operation 212 is carried out to remove false line segments identified due to noise, duplicate line segments identified for the same row of objects and/or identify and fill in gaps that the line segment identification module 116 may have missed in operation 212. The incorporation of geometric line post-processing is advantageous as geometric line post-processing refines the identified one or more line segments over a plurality of frames and filters any false identifications.
[0065] According to some embodiments, method 200 may optionally continue with operation 220, wherein the representative line verification module 124 verifies the representative lines generated by the representative line generator module 120 in operation 216. In some embodiments, the generated representative lines may be verified by identifying a deviation between the generated representative lines and predefined points. In some embodiments, the predefined points may be points or lines that are representative of the one or more rows of objects that were previously recorded and stored. In some embodiments, the predefined points may be Global Navigation Satellite System (GNSS) points. In other embodiments, the predefined points may be a floor plan generated of an enclosed space comprising the one or more rows of objects. In some embodiments, based on the results of the verification in operation 220, method 200 may revert to operation 208 wherein the image conversion module 112 converts each image frame of the plurality of image frames into a binary image using a different pre-processing method from the initial pre-processing method used. Preferably, the pre-processing method is changed if the deviation between the generated representative lines and predefined points or lines identified by the representative line verification module 124 in operation 220 is above a predefined threshold. The predefined threshold may be defined based on several factors, such as the accuracy of the generated representative lines, the accuracy of the predefined points or lines, as well as the state of the objects making up the one or more rows of objects. For example, the predefined threshold may be higher if the one or more rows of objects is wide or uneven.
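A minimal sketch of the deviation check in operation 220 follows, assuming Python/NumPy, reference points already projected into image coordinates, and an illustrative switching threshold; none of these specifics are prescribed by the disclosure:

```python
import numpy as np

def line_deviation(seg: np.ndarray, ref_points: np.ndarray) -> float:
    """Mean perpendicular distance from predefined reference points
    (e.g. projected GNSS points, shape (M, 2)) to a representative
    line given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = seg.astype(float)
    direction = np.array([x2 - x1, y2 - y1])
    normal = np.array([-direction[1], direction[0]])
    normal /= np.linalg.norm(normal)
    # Perpendicular distance of each reference point to the line.
    dists = np.abs((ref_points.astype(float) - np.array([x1, y1])) @ normal)
    return float(dists.mean())

DEVIATION_THRESHOLD = 20.0  # pixels; illustrative value
# if line_deviation(seg, ref_points) > DEVIATION_THRESHOLD:
#     change the pre-processing method and re-run operation 208
```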
[0066] According to some embodiments, where method 200 is used for path planning, method 200 may continue with operation 224 wherein the path planner module 128 uses the representative lines generated by the representative line generator module 120 in operation 216 or the generated representative lines verified by the representative line verification module 124 in operation 220 to plan a path for a vehicle. In some embodiments, path planner module 128 may plan a path or route for a vehicle to a predefined global target. In some embodiments, the representative lines generated by the representative line generator module 120 in operation 216 or the generated representative lines verified by the representative line verification module 124 in operation 220 may be used to plan a vehicle path above at least one row of objects. In some embodiments, the representative lines generated by the representative line generator module 120 in operation 216 or the generated representative lines verified by the representative line verification module 124 in operation 220 may be used to plan a vehicle path above and/or between rows of objects.
[0067] Fig. 3 is an example of an image generated by a line segment identification module, in accordance with embodiments of the present disclosure. An image 300 generated by the line segment identification module 116 may comprise black portions 304 representing ground image data, white portions 308 representing object image data, and line segments 312 representing the line segments identified by the line segment identification module 116 in operation 212.
[0068] Fig. 4 is a flowchart of a first pre-processing method, in accordance with embodiments of the present disclosure. First pre-processing method 400 may be used by the image conversion module 112 on each image frame comprising a colour image collected from a visible light sensor to convert each image frame into a binary image based on colour of the one or more rows of objects and a ground surface. In some embodiments, first pre-processing method 400 may commence with operation 404 wherein noise is removed from each of the plurality of image frames. Noise may be removed using any known method or algorithm for removal of noise. Preferably, noise is removed from each of the plurality of image frames by applying a gaussian blur filter, which is an approximation filter that reduces random image noise by smoothening an image using a Gaussian function.
[0069] According to some embodiments, first pre-processing method 400 may continue with operation 408 wherein pixels of each of the plurality of image frames are assigned to a first group or a second group based on pixel colour. The assignment of pixels may be carried out using any known method or algorithm for grouping pixels based on colour of the pixels. Preferably, the pixels of each of the plurality of image frames are assigned to a first group or a second group based on pixel colour using k-means clustering which is a grouping method with reduced computing power required as compared to other pixel grouping methods. Operation 408 therefore assigns the pixels to one of two groups. For example, the first group may comprise pixels representing object image data, and the second group may comprise pixels representing ground image data. For example, the first group may comprise pixels representing plants of crop rows, while the second group may comprise pixels representing the ground or surrounding soil. For example, the first group may comprise pixels representing road surface markings, while the second group may comprise pixels representing the road surface.
[0070] According to some embodiments, first pre-processing method 400 may continue with operation 412 wherein the pixels of the first group are converted to a first colour and the pixels of the second group are converted to a second colour. The first colour and second colour may be any colours. Preferably, the first colour and the second colour are white and black, colours on opposite ends of the colour spectrum with either a maximum or minimum intensity value, which reduces the computing required in subsequent operations. In addition, the larger the difference in intensity, the more efficient the subsequent edge detection steps detailed below. Furthermore, by converting each of the plurality of image frames into binary images, subsequent edge detection operations may be employed without any predefined threshold values and even in situations where there is only a small colour difference between the objects and the ground, as k-means clustering has already assigned and separated the pixels based on colour.
[0071] According to some embodiments, first pre-processing method 400 may continue with operation 416 wherein edges between the first colour and second colour are defined. The defined edges may then be used by the line segment identification module 116 subsequently in operation 212. Preferably, the edges are defined using an edge detection algorithm, which identifies boundaries between objects in an image through changes or discontinuities in image intensity. As the changes or discontinuities in the intensity of images are maximised by the pre-processing method, the efficiency and accuracy of edge detection is increased. Preferably, the edges are defined using a canny edge detection algorithm in embodiments where a gaussian blur filter was applied in operation 404, as the application of a gaussian filter is itself a step in the canny edge detection algorithm, therefore reducing the overall computing power used in edge detection.
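Operations 404 to 416 of first pre-processing method 400 may be illustrated with the following sketch, assuming OpenCV in Python; the blur kernel size, k-means settings, and Canny thresholds are illustrative values, not values taken from the disclosure:

```python
import cv2
import numpy as np

def preprocess_colour_frame(frame_bgr: np.ndarray) -> np.ndarray:
    """Colour-based pre-processing: blur, two-cluster k-means on
    pixel colour, binary conversion, Canny edge definition."""
    # Operation 404: suppress random noise with a gaussian blur.
    blurred = cv2.GaussianBlur(frame_bgr, (5, 5), 0)

    # Operation 408: assign every pixel to one of two colour clusters.
    pixels = blurred.reshape(-1, 3).astype(np.float32)
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
    _, labels, _ = cv2.kmeans(pixels, 2, None, criteria, 3,
                              cv2.KMEANS_RANDOM_CENTERS)

    # Operation 412: map the two groups to black (0) and white (255).
    # Which cluster ends up white is arbitrary; edge detection is
    # unaffected by the assignment.
    binary = (labels.reshape(frame_bgr.shape[:2]) * 255).astype(np.uint8)

    # Operation 416: define edges between the two colours.
    return cv2.Canny(binary, 100, 200)
```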
[0072] Fig. 5 is a flowchart of a second pre-processing method, in accordance with embodiments of the present disclosure. Second pre-processing method 500 may be used by the image conversion module 112 on each image frame comprising a depth image collected from a depth sensor to convert each image frame into a binary image based on distance of the one or more rows of objects and a ground surface to the one or more sensors. In some embodiments, second pre-processing method 500 may commence with operation 504 wherein noise is removed from each of the plurality of image frames. Any known methods or algorithms may be used to remove noise. Preferably, noise is removed from each of the plurality of image frames by applying a gaussian blur filter, which is an approximation filter that reduces random image noise by smoothening an image using a Gaussian function.
[0073] According to some embodiments, second pre-processing method 500 may continue with operation 508 wherein a contrast of depth values of each of the plurality of image frames is maximized. Preferably, the contrast of depth values is maximized by applying histogram equalization, which is a highly efficient and simple technique of modifying the dynamic range and contrast of an image. Histogram equalization is advantageous as it evenly distributes the intensity values of pixels of an image over an available range.
[0074] According to some embodiments, second pre-processing method 500 may continue with operation 512 wherein areas of each of the plurality of image frames which appear closer to the one or more sensors than a ground surface are selected. These selected areas would correspond with object image data and any remaining area that is unselected (i.e., remaining unselected area) would correspond to ground image data as the ground surface would usually be further from the one or more sensors 104 than the one or more rows of objects. Preferably, the areas are selected by applying a predefined threshold. For example, the predefined threshold may be a distance or depth threshold, such that only pixels with depth or distance values of less than 1 meter from the one or more sensors are selected.
[0075] According to some embodiments, second pre-processing method 500 may continue with operation 516 wherein pixels within the selected area are converted to a first colour and pixels within the remaining unselected area are converted to a second colour. The first colour and second colour may be any colours. Preferably, the first colour and the second colour are white and black, colours on opposite ends of the colour spectrum with either a maximum or minimum intensity value, which reduces the computing required in subsequent operations. In addition, the larger the difference in intensity, the more efficient the subsequent edge detection steps detailed below. Furthermore, by converting each of the plurality of image frames into binary images, subsequent edge detection operations may be employed without any predefined threshold values and even in situations where there is only a small distance difference between the objects and the ground, as histogram equalization is used to maximise the contrast of depth values.
[0076] According to some embodiments, second pre-processing method 500 may continue with operation 520 wherein noise is further removed. Any known methods or algorithms may be used to further remove noise. Examples of further noise that may be removed include sporadic objects that may be found between rows of the one or more rows of objects, such as sporadic weeds growing between crop rows. Preferably, noise is further removed by applying a morphological transformation. Morphological transformations are operations carried out during image processing based on the image shape so that edges can be easily defined in subsequent operations. More preferably, noise is further removed by applying morphological opening and closing. Morphological opening carves out boundaries between objects that may have been merged, while morphological closing fills in small holes or gaps inside objects. Morphological opening and closing may allow subsequent edge detection operations to perform better.
[0077] According to some embodiments, second pre-processing method 500 may continue with operation 524 wherein edges between the two colours are defined. The defined edges may then be used by the line segment identification module 116 subsequently in operation 212. Preferably, the edges are defined using an edge detection algorithm, which identifies boundaries between objects in an image through changes or discontinuities in image intensity. As the changes or discontinuities in the intensity of images are maximised by the pre-processing method, the efficiency and accuracy of edge detection is increased. Preferably, the edges are defined using a canny edge detection algorithm in embodiments where a gaussian blur filter was applied in operation 504, as the application of a gaussian filter is itself a step in the canny edge detection algorithm, therefore reducing the overall computing power used in edge detection.
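Operations 504 to 524 of second pre-processing method 500 may be illustrated with the following sketch, assuming OpenCV in Python and an 8-bit single-channel depth image in which smaller values mean closer to the sensor; the threshold, kernel size, and Canny values are illustrative:

```python
import cv2
import numpy as np

def preprocess_depth_frame(depth_u8: np.ndarray,
                           near_thresh: int = 128) -> np.ndarray:
    """Depth-based pre-processing on an 8-bit single-channel depth
    image (smaller pixel value = closer to the sensor, assumed)."""
    # Operation 504: remove random noise.
    blurred = cv2.GaussianBlur(depth_u8, (5, 5), 0)

    # Operation 508: maximise the contrast of depth values.
    equalized = cv2.equalizeHist(blurred)

    # Operations 512/516: pixels at or below the threshold (closer than
    # the ground surface) become white; everything else becomes black.
    _, binary = cv2.threshold(equalized, near_thresh, 255,
                              cv2.THRESH_BINARY_INV)

    # Operation 520: morphological opening then closing to drop
    # sporadic objects and fill small holes inside the rows.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
    opened = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
    cleaned = cv2.morphologyEx(opened, cv2.MORPH_CLOSE, kernel)

    # Operation 524: define edges between the two colours.
    return cv2.Canny(cleaned, 100, 200)
```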
[0078] According to some embodiments of the present disclosure, the one or more processors 108 of system 100 may execute either first pre-processing method 400 or second pre-processing method 500. The pre-processing method may be selected using a configuration file or a graphical user interface (GUI) setting. In some embodiments, the pre-processing method may be selected based on a colour difference between the one or more rows of objects and a ground surface when captured by a visible light sensor. For example, the first pre-processing method 400 may be more appropriate or accurate when there is a larger colour difference between the one or more rows of objects and the ground surface when captured by the visible light sensor, as the larger difference in the intensity of pixels of object image data and ground image data captured by the visible light sensor allows easier separation by the first pre-processing method 400 based on colour. In another example, the second pre-processing method 500 may be more appropriate or accurate when there is a negligible colour difference between the one or more rows of objects and the ground surface, as there is only a negligible difference in the intensity of pixels of colour image data captured by the visible light sensor such that the first pre-processing method 400 may not be as accurate. Several factors may affect the colour difference between the one or more rows of objects and a ground surface when captured by a visible light sensor. For example, lighting conditions or other environmental conditions such as fog or haze may affect the colour difference between the one or more rows of objects and a ground surface when captured by a visible light sensor. In bright lighting conditions, such as during the day with ample sunlight or in an enclosed space with the lights turned on, the colour difference may be more distinct as more visible light is reflected off the objects and the ground surface and captured by the visible light sensor. In low lighting conditions, such as at night, during a cloudy or foggy day, or in an enclosed space with the lights turned off, the colour difference may be less distinct as less visible light is reflected off the objects and the ground surface and captured by the visible light sensor.
[0079] In some embodiments, where the objects are plants, examples of factors that may affect the colour difference between the one or more rows of objects and a ground surface when captured by a visible light sensor include the plant condition, plant type and/or plant growth stage. For example, plants in a good condition may have more vibrant colours and thus have a large colour difference as compared to their surrounding soil. On the other hand, plants in a poor condition may be more brown or dull and thus have a small or negligible colour difference as compared to their surrounding soil. In another example, different plant types have different colours, and thus the colour difference between the plants and their surrounding soil may vary depending on the plant type. In another example, plants may have different colours at different growth stages, and thus the colour difference between the plants and their surrounding soil may vary across the different growth stages.
[0080] In some embodiments, the pre-processing method may be selected based on a distance difference between the one or more rows of objects and a ground surface when captured by a depth sensor. The second pre-processing method 500 may be preferred where the one or more rows of objects protrude from a ground surface, as such protrusion would lead to differences in depth or distance values captured by a depth sensor. For example, where the objects making up the one or more rows of objects are tall, the object image data of such objects would have a lower depth or distance value as compared to the ground image data, as the objects would appear closer to the one or more sensors 104. In some embodiments, the second pre-processing method 500 may not be accurate when the one or more rows of objects do not protrude from the ground surface, as there would be a negligible distance difference between the one or more rows of objects and the ground surface when captured by the depth sensor, therefore reducing the accuracy of the second pre-processing method 500. In such cases, the first pre-processing method 400 may be more accurate or appropriate as long as there is some colour difference between the one or more rows of objects and the ground surface when captured by the visible light sensor. Thus, the second pre-processing method 500 may be more appropriate for objects that are higher or taller, while the first pre-processing method 400 may be more appropriate for objects that are shorter or flatter.
[0081] In some embodiments, where the objects are plants, examples of factors that may affect the distance difference between the one or more rows of objects and a ground surface when captured by a depth sensor include the plant condition, plant type and/or plant growth stage. For example, plants in a good condition may grow taller and thus have a large distance difference to the depth sensor as compared to their surrounding soil. On the other hand, plants in a poor condition may be limp or have poor or stunted growth and thus have a small or negligible distance difference to the depth sensor as compared to their surrounding soil. In another example, different plant types grow to different heights, and thus the distance difference to the depth sensor may vary depending on the plant type. In another example, plants may have different heights at different growth stages, and thus the distance difference to the depth sensor may vary across the different growth stages.
[0082] Fig. 6 is a flowchart of geometric post-processing of line segments to generate representative lines, according to embodiments of the present disclosure. Operation 216 of geometric post-processing of line segments may be carried out by representative line generator module 120 to refine the line segments identified by the line segment identification module 116. In some embodiments, operation 216 may commence with operation 604 wherein one or more of the identified line segments that are proximate to each other are merged, preferably based on a predefined threshold. The predefined threshold may be defined based on the types of line segments identified, a distance between the one or more rows of objects, and the type of objects that make up the one or more rows of objects. For example, where there is a minimum distance of 1 m between adjacent rows of objects, the predefined threshold may be 1 m, such that adjacent line segments with a distance of less than 1 m between the line segments are merged. Merging proximate line segments together averages line segments that are close to each other and merges multiple detected line segments for the same row of objects into one.
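Operation 604 might be sketched as follows, assuming Python/NumPy and a greedy grouping by segment midpoints; the disclosure only states that proximate segments are merged based on a predefined threshold, so the grouping strategy and the threshold value are assumptions:

```python
import numpy as np

def merge_proximate_segments(segments: np.ndarray,
                             thresh: float = 30.0) -> np.ndarray:
    """Average line segments (rows of (x1, y1, x2, y2)) whose midpoints
    lie within `thresh` of each other, so that duplicate detections of
    the same row of objects collapse into one segment."""
    merged, used = [], np.zeros(len(segments), dtype=bool)
    mids = (segments[:, :2] + segments[:, 2:]) / 2.0
    for i in range(len(segments)):
        if used[i]:
            continue
        # Group all not-yet-used segments close to segment i (incl. i).
        group = np.linalg.norm(mids - mids[i], axis=1) < thresh
        group &= ~used
        used |= group
        # Averaging endpoints assumes a consistent endpoint ordering.
        merged.append(segments[group].mean(axis=0))
    return np.array(merged)
```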
[0083] According to some embodiments, operation 216 may continue with operation 608 wherein the merged line segments from operation 604 are smoothened across the plurality of image frames. Preferably, the merged line segments are smoothened by applying an infinite impulse response (IIR) filter. An IIR filter finds all merged line segments in a first image frame of a time sequence and saves each merged line segment. The IIR filter then finds all merged line segments in a subsequent image frame. For each merged line segment in the subsequent image frame, the IIR filter searches for a corresponding merged line segment in the first image frame. If the IIR filter finds a matching merged line segment, the IIR filter merges the matching merged line segments by weighting the merged line segment in the subsequent image frame with a constant alpha and the matching merged line segment in the first image frame with (1 - alpha). The IIR filter then continues across the plurality of image frames, finding and weighting matching merged line segments, thus making the merged line segments smoother and more stable across the plurality of image frames.
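The per-segment weighting described above reduces to a first-order IIR update; a minimal sketch, with an assumed value for the constant alpha, follows:

```python
import numpy as np

ALPHA = 0.3  # illustrative weighting constant; not given in the disclosure

def iir_smooth(prev_seg: np.ndarray, new_seg: np.ndarray,
               alpha: float = ALPHA) -> np.ndarray:
    """Weight the matched segment from the new frame with alpha and the
    segment saved from earlier frames with (1 - alpha), as in the IIR
    filtering of operation 608. Segments are (x1, y1, x2, y2) arrays."""
    return alpha * new_seg + (1.0 - alpha) * prev_seg
```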
[0084] According to some embodiments, operation 216 may continue with operation 612 wherein the smoothened merged line segments are filtered across the plurality of image frames to remove any smoothened merged line segments that may be due to noise, such as objects that appear sporadically and are not part of the one or more rows of objects. An example of such noise is weeds appearing randomly among the one or more rows of plants or crops. The filtering ensures that only smoothened merged line segments that appear over multiple image frames are retained. Any smoothened merged line segments remaining are the representative lines, each representative line representing a row of objects.
[0085] Fig. 7 is a flowchart of filtering smoothened merged line segments across a plurality of image frames, in accordance with embodiments of the present disclosure. Operation 612 of filtering smoothened merged line segments across a plurality of image frames may be carried out by representative line generator module 120. Operation 612 may commence with operation 716 wherein a counter is maintained for each smoothened merged line segment, and each image frame in which the smoothened merged line segment appears increments the counter.

[0086] According to some embodiments, operation 612 may continue with operation 720 wherein smoothened merged line segments with a counter smaller than a predefined threshold are removed after a predefined number of image frames. The removal of smoothened merged line segments with a counter smaller than a predefined threshold is advantageous as it filters out noise. The predefined threshold and the predefined number of image frames depend on the speed at which the one or more sensors are travelling, as well as the frame rate of the one or more sensors. For example, where a sensor has a frame rate of 30 fps and is travelling at a speed of 1 m/s, the predefined number of image frames may be set at 60 and the predefined threshold may be set at 5, such that a smoothened merged line segment is removed if it is seen in fewer than 5 of the 60 image frames captured, the 60 image frames being captured over a distance of 2 metres and within a duration of 2 seconds.
[0087] According to some embodiments, operation 612 may continue with operation 724 wherein the counters of all remaining smoothened merged line segments are reset after a predefined number of frames. Resetting the counters is advantageous as it reduces the storage space required; if the counters were not reset, a line segment detected once would be stored indefinitely even after it is no longer visible to the one or more sensors. The predefined number of frames depends on several factors, including the frame rate of the one or more sensors, the speed at which the one or more sensors are travelling, and the type of object making up the one or more rows of objects.
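The counter logic of operations 716, 720 and 724 may be sketched as follows. How segment identity is tracked across frames is assumed to be resolved elsewhere (for example, by the matching used during IIR smoothing), so each segment carries a stable key; the min_count of 5 and window of 60 frames follow the example above:

```python
from collections import defaultdict

class LineSegmentFilter:
    def __init__(self, min_count=5, window_frames=60):
        self.min_count = min_count        # removal threshold (op. 720)
        self.window = window_frames       # evaluation window (op. 720/724)
        self.counters = defaultdict(int)  # per-segment counters (op. 716)
        self.frame_idx = 0

    def update(self, visible_segment_keys):
        for key in visible_segment_keys:
            self.counters[key] += 1       # op. 716: count appearances
        self.frame_idx += 1
        if self.frame_idx % self.window == 0:
            # Op. 720: drop segments seen too rarely within the window --
            # these are likely noise such as randomly appearing weeds.
            survivors = {k: 0 for k, c in self.counters.items()
                         if c >= self.min_count}
            # Op. 724: counters of all surviving segments are reset (here,
            # to zero), bounding the storage required.
            self.counters = defaultdict(int, survivors)
        return list(self.counters.keys())  # currently retained segments
```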
[0088] Fig. 8 is a flowchart of a method of planning a vehicle path above at least one row of objects, according to embodiments of the present disclosure. Method 800 of planning a vehicle path above at least one row of objects may be carried out by path planner module 128. Method 800 may commence with operation 804 wherein a representative line for each of the at least one row of objects is defined over a plurality of image frames captured by one or more sensors. The defined representative lines may be the representative lines generated by the representative line generator module 120 in operation 216, or the generated representative lines verified by the representative line verification module 124 in operation 220. In some embodiments, the one or more sensors 104 are mounted on the vehicle for which the vehicle path is being planned. In some embodiments, method 800 may continue with operation 808 wherein one of the defined representative lines for a row of objects is designated as a middle of the vehicle path above the at least one row of objects.
[0089] Fig. 9 is a flowchart of a method of planning a vehicle path above and/or between at least one row of objects, according to embodiments of the present disclosure. Method 900 of planning a vehicle path above and/or between at least one row of objects may be carried out by path planner module 128. Method 900 may commence with operation 904 wherein a representative line for each of the at least one row of objects is defined over a plurality of image frames captured by one or more sensors. The defined representative lines may be the representative lines generated by the representative line generator module 120 in operation 216, or the generated representative lines verified by the representative line verification module 124 in operation 220. In some embodiments, the one or more sensors 104 are mounted on the vehicle for which the vehicle path is being planned.
[0090] According to some embodiments, method 900 may continue with operation 908 wherein a line is defined between two adjacent representative lines. Preferably, the defined line is equidistant from the two adjacent representative lines. In some embodiments, method 900 may continue with operation 912 wherein the defined line is designated as a middle of the vehicle path above and/or between rows of objects.
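A minimal sketch of operations 908 and 912, assuming each representative line is stored as a pair of endpoints, follows. Taking the point-wise average of the two adjacent lines yields a line equidistant from both:

```python
import numpy as np

def middle_path(line_a, line_b):
    """Return the line equidistant from two adjacent representative lines."""
    return (np.asarray(line_a) + np.asarray(line_b)) / 2.0

# Example: two parallel crop rows 1 m apart; the planned path centre line
# runs midway between them, 0.5 m from each row.
left  = [(0.0, 0.0), (0.0, 10.0)]
right = [(1.0, 0.0), (1.0, 10.0)]
print(middle_path(left, right))  # endpoints (0.5, 0.0) and (0.5, 10.0)
```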
[0091] Fig. 10 is a schematic illustration of a computer system within which a set of instructions, when executed, may cause one or more processors of the computer system to perform one or more of the methods described herein, in accordance with embodiments of the present disclosure. It is noted that computer system 1000 described herein is only an example of a computer system that may be employed, and computer systems with other hardware or software configurations may be employed. In some embodiments, computer system 1000 may be connected to one or more sensors, and such connection may be wired or wireless. In some embodiments, computer system 1000 may comprise a server computer, a laptop, a personal computer, a desktop computer, or any machine capable of executing a set of instructions that specify actions to be taken by the computer system. Computer system 1000 may comprise one or more processors 108 and one or more memories 1004 which communicate with each other via a bus 1008. Computer system 1000 may further comprise a network interface device 1012 which allows computer system 1000 to communicate with a network 1016. In some embodiments, computer system 1000 may further comprise a disk drive unit 1020 which may include a machine-readable storage medium 1024 on which is stored one or more sets of instructions 1028 embodying one or more of the methods described herein. The one or more sets of instructions 1028 may also reside in the one or more processors 108 or the one or more memories 1004. In some embodiments, the one or more sets of instructions 1028 may be received as a data carrier signal received by computer system 1000. In some embodiments, computer system 1000 may further comprise an antenna 1032 which allows computer system 1000 to connect with external sources such as a satellite navigation system.
[0092] According to some embodiments, computer system 1000 may be part of a vehicle such that computer system 1000 is used to plan a vehicle path for the vehicle that computer system 1000 is part of.
[0093] The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that on-going technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope and spirit of the disclosed embodiments. The terms “comprises”, “comprising”, “includes” or any other variations thereof are intended to cover a non-exclusive inclusion, such that a setup, device or method that includes a list of components or steps does not include only those components or steps but may include other components or steps not expressly listed or inherent to such setup, device or method. In other words, one or more elements in a system or apparatus preceded by “comprises... a” does not, without more constraints, preclude the existence of other elements or additional elements in the system or method. It must also be noted that, as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.

[0094] Finally, the language used in the specification has been principally selected for readability and instructional purposes, and it may not have been selected to delineate or circumscribe the inventive subject matter. It is therefore intended that the scope of the invention be limited not by this detailed description, but rather by any claims that issue on an application based hereon. Accordingly, the embodiments of the present invention are intended to be illustrative, but not limiting, of the scope of the invention, which is set forth in the following claims.

Claims

1. A computer-implemented method for generating lines representing one or more rows of objects over a plurality of image frames, the method comprising:
receiving from one or more sensors a plurality of image frames comprising one or more rows of objects, wherein the plurality of image frames is captured sequentially as the one or more sensors travel along the one or more rows of objects;
converting each of the received plurality of image frames into a binary image and defining edges in each binary image using a pre-processing method;
identifying one or more line segments in each binary image; and
generating a representative line over the plurality of image frames based on geometric line post-processing of the one or more identified line segments, wherein each representative line represents a row of objects of the one or more rows of objects.
2. The computer-implemented method of claim 1, wherein the one or more rows of objects comprise at least one of: a road surface marking, a parking lot line marking, a crop row, a plant row, a harvested swath, a crop edge, a transition between a cut crop and uncut crop, a transition between a harvested and unharvested crop, a storage rack row, a row of pallets, a row of containers, or a row of storage boxes.
3. The computer-implemented method of claim 1 or 2, wherein the one or more sensors comprises a visible light sensor and/or a depth sensor.
4. The computer-implemented method of any of the preceding claims, wherein the pre-processing method is selected based on at least one of:
a colour difference between the one or more rows of objects and a ground surface when captured by a visible light sensor; and
a distance difference between the one or more rows of objects and the ground surface when captured by a depth sensor.
5. The computer-implemented method of any of the preceding claims, wherein the pre-processing method converts each of the received plurality of image frames into a binary image based on colour of the one or more rows of objects and a ground surface or distance of the one or more rows of objects and a ground surface to the one or more sensors.
6. The computer-implemented method of any of the preceding claims, wherein the pre-processing method comprises:
removing noise from each of the plurality of image frames, preferably by applying a Gaussian blur filter;
assigning pixels of each of the plurality of image frames to a first group or a second group based on pixel colour, preferably by using k-means clustering;
converting pixels of the first group to a first colour and pixels of the second group to a second colour; and
defining edges between the first colour and the second colour, preferably using an edge detection algorithm.
7. The computer-implemented method of any of the preceding claims, wherein the pre-processing method comprises:
removing noise from each of the plurality of image frames, preferably by applying a Gaussian blur filter;
maximising a contrast of depth values of each of the plurality of image frames, preferably by applying histogram equalization;
selecting areas of each of the plurality of image frames which appear closer to the one or more sensors than a ground surface, preferably by applying a predefined threshold;
converting pixels within the selected area to a first colour and pixels within a remaining unselected area to a second colour;
removing noise, preferably by applying a morphological transformation, and more preferably morphological opening and closing; and
defining edges between the first colour and the second colour, preferably using an edge detection algorithm.
8. The computer-implemented method of any of the preceding claims, wherein identifying one or more line segments in each binary image comprises using a Hough transform, preferably probabilistic Hough transform.
9. The computer-implemented method of any of the preceding claims, wherein geometric line post-processing of the one or more identified line segments comprises:
merging one or more identified line segments of each binary image that are proximate to each other, preferably based on a predefined threshold;
smoothening the merged line segments across the plurality of image frames, preferably by applying an infinite impulse response (IIR) filter; and
filtering the smoothened merged line segments across the plurality of image frames to generate the representative line, preferably by:
maintaining a counter for each smoothened merged line segment based on a number of image frames each smoothened merged line segment appears in;
removing smoothened merged line segments with a counter smaller than a predefined threshold after a predefined number of image frames; and
resetting the counter of all remaining smoothened merged line segments after a predefined number of frames,
wherein the representative lines are any remaining line segments.
10. The computer-implemented method of any of the preceding claims, further comprising:
verifying the generated representative lines, preferably by identifying a deviation between the representative lines and predefined points or lines; and
optionally, changing the pre-processing method depending on a result of the verification step, preferably changing the pre-processing method if an identified deviation between the generated representative lines and predefined points or lines is above a predefined threshold.
11. A computer-implemented method of planning a vehicle path above at least one row of objects, the method comprising:
defining a representative line for each of the at least one row of objects over a plurality of image frames captured by one or more sensors using the computer-implemented method according to any of the preceding claims 1 to 10, wherein the one or more sensors are preferably mounted on the vehicle; and
designating one of the defined representative lines as a middle of a vehicle path above at least one row of objects.
12. A computer-implemented method of planning a vehicle path above and/or between rows of objects, the method comprising:
defining a representative line for each row of objects over a plurality of image frames captured by one or more sensors using the computer-implemented method according to any of the preceding claims 1 to 10, wherein the one or more sensors are preferably mounted on the vehicle;
defining a line between two adjacent representative lines, wherein the line is preferably equidistant from the two adjacent representative lines; and
designating the defined line as a middle of a vehicle path above and/or between rows of objects.
13. A system comprising one or more sensors, one or more processors and a memory that stores executable instructions for execution by the one or more processors, the executable instructions comprising instructions for performing a computer-implemented method according to any of the preceding claims.
14. A vehicle comprising the system of claim 13.
15. A computer program, a machine-readable storage medium, or a data carrier signal comprising instructions that, upon execution on one or more processors, cause the one or more processors to perform a computer-implemented method according to any of the preceding claims 1 to 12.
PCT/EP2023/056057 2022-03-10 2023-03-09 Method and system for identifying rows of objects in images WO2023170228A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB2203314.6 2022-03-10
GB2203314.6A GB2616597A (en) 2022-03-10 2022-03-10 Method and system for identifying rows of objects in images

Publications (1)

Publication Number Publication Date
WO2023170228A1 true WO2023170228A1 (en) 2023-09-14

Family

ID=81254822

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2023/056057 WO2023170228A1 (en) 2022-03-10 2023-03-09 Method and system for identifying rows of objects in images

Country Status (2)

Country Link
GB (1) GB2616597A (en)
WO (1) WO2023170228A1 (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4697480B2 (en) * 2008-01-11 2011-06-08 日本電気株式会社 Lane recognition device, lane recognition method, and lane recognition program
CN110879943B (en) * 2018-09-05 2022-08-19 北京嘀嘀无限科技发展有限公司 Image data processing method and system
CN112001216A (en) * 2020-06-05 2020-11-27 商洛学院 Automobile driving lane detection system based on computer

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20190061153A (en) * 2017-11-27 2019-06-05 (주) 비전에스티 Method for lane detection autonomous car only expressway based on outputting image of stereo camera
CN112307953A (en) * 2020-10-29 2021-02-02 无锡物联网创新中心有限公司 Clustering-based adaptive inverse perspective transformation lane line identification method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ANONYMOUS: "Stereo Vision - 5KK73 GPU 2010", 16 November 2010 (2010-11-16), XP093042060, Retrieved from the Internet <URL:https://sites.google.com/site/5kk73gpu2010/assignments/stereo-vision> [retrieved on 20230425] *

Also Published As

Publication number Publication date
GB2616597A (en) 2023-09-20
GB202203314D0 (en) 2022-04-27

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23710864

Country of ref document: EP

Kind code of ref document: A1