WO2002082181A1 - Corridor mapping system and method - Google Patents


Info

Publication number
WO2002082181A1
Authority
WO
WIPO (PCT)
Prior art keywords
infrastructure
image
pole
corridor
accordance
Prior art date
Application number
PCT/AU2002/000427
Other languages
French (fr)
Inventor
Mark Berman
Changming Sun
Ronald Jones
Hugues Talbot
Xiaoliang Wu
Kevin Cheong
Richard Beare
Michael Buckley
Original Assignee
Commonwealth Scientific And Industrial Research Organisation
Priority date
Filing date
Publication date
Application filed by Commonwealth Scientific And Industrial Research Organisation
Publication of WO2002082181A1

Classifications

    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01C: MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C11/00: Photogrammetry or videogrammetry, e.g. stereogrammetry; Photographic surveying
    • G01C11/04: Interpretation of pictures
    • G01C11/06: Interpretation of pictures by comparison of two or more pictures of the same area

Definitions

  • the infrastructure identification means is preferably arranged to identify the infrastructure poles from the images.
  • the poles are identified and the positions then calculated.
  • the positioning of the infrastructure lines can then be inferred from the position of the poles.
  • the infrastructure identification means is preferably arranged to infer the position of the infrastructure lines from the information on the positioning of the poles.
  • a catenary envelope is calculated.
  • the catenary envelope calculation takes into account the fact that the actual position of the infrastructure lines at any one time may vary depending on such conditions as humidity, temperature and others.
  • the new focal lengths of the two cameras must be the same.
  • Figure 4 shows the process of building the image pyramids.
  • <index_file> is the same index file used for the DSM merging described in the previous section.
  • "mos_ortho" not only reads orthoimage data but also reads its corresponding DSM data.
  • the first file name in the above <index_file> example is "11155610", and so "11155610.ortho" and "11155610.dem" will be read by "mos_ortho".
  • the default output mosaicked orthoimage file is called "mosaick.ortho".
  • This section explains how the distance from the power line to neighbouring trees and vegetation is calculated. It then describes a method of determining if the vegetation lies within an envelope centred about the power line. The calculation of distances from the power line to the underlying surface is performed after the full 3D surface between two power poles is generated, using stereo matching and mosaicking.
  • the method is illustrated using a span of mosaicked images of a landscape which contains a power line suspended between two power poles, ideally situated at each end of the span of images. If this is not so, the power poles are situated within the span of images.
  • where the term "image" is used in the above description, it will be appreciated that the image is usually generated by computer, and the term "image" in this case should be taken to cover not only a produced visual image, but also the data stored in the computer from which the image is produced.

Abstract

The present invention relates to a system and method for the automated mapping of corridors, in particular to a system and method for measuring encroachment of vegetation onto right of way and infrastructure components extending within corridors. Infrastructure components such as power lines, telecommunication lines, and oil and gas pipelines extend for very long distances in corridors. The encroachment of vegetation on the infrastructure components is a problem. The present invention provides a corridor mapping system which utilises stereo image processing means arranged to process a plurality of stereo image pairs taken of the corridor to provide information on the three dimensional arrangement of components within the corridor. The system identifies infrastructure components and determines the status of the infrastructure components from the images.

Description

Corridor Mapping System and Method.
Field of Invention
The present invention relates to a system and method for the automated mapping of corridors, and, particularly, but not exclusively, to a system and method for measuring encroachment of vegetation onto right of way and infrastructure components extending within corridors.
Background of Invention
It is commonly known to run infrastructure components such as power lines, telecommunications lines, and oil and gas pipelines above ground, for very long distances, extending in "corridors". Encroachment of vegetation on the infrastructure components (whether they be power lines, telecommunications lines or pipelines) is undesirable. With power lines and telecommunications lines, encroachment of vegetation can have disastrous results. With power lines, it can lead to bushfires, damage to the lines themselves, and damage to telecommunications.
The same applies to the encroachment of vegetation onto pipelines carrying volatile substances, such as gas and oil pipelines. Further, encroachment of vegetation can make access to the infrastructure for purposes of maintenance difficult, by preventing right of way. Infrastructure corridors therefore need to be inspected, in order to monitor encroachment of vegetation onto the infrastructure components and onto the corridors. Presently, inspection of infrastructure corridors is a cumbersome process requiring many man-hours of manual visual inspection. Manual inspection can include extensive aerial audits of the infrastructure corridor, as well as ground-based manual visual inspection. This process is extremely expensive, time consuming and subject to observer bias or failure to observe potential trouble spots.
Further, human beings are not very good at judging perspective from a distance (such as when viewing ground-based objects from an aircraft, for example). It is therefore quite difficult to tell from a manual inspection whether vegetation is encroaching on infrastructure components or not. For example, when the infrastructure is a power line, if the inspection is taking place from an aircraft so that the inspector is looking down on the power line, it is very difficult for the inspector to tell whether the vegetation that is underneath the power lines is close to or far away from the lines.
It has been previously known to use laser range finding techniques to scan very large objects. These techniques are not suitable for measuring the encroachment of vegetation onto infrastructure components, particularly where the infrastructure components are small, because of lack of resolution. Further, the speed of operation of laser scanning techniques is low.
There is a need for a system and method which enables an automated mapping or inspection of corridor-type infrastructure, and a determination as to encroachment on the infrastructure by vegetation within the corridor or detection of any faults with the infrastructure, for the purposes of maintenance of the infrastructure.
Summary of the Invention
In accordance with a first aspect of the present invention, there is provided a corridor mapping system for automatically determining the status of infrastructure extending within a corridor, the system comprising stereo imaging processing means arranged to process a plurality of stereo image pairs taken of the corridor to provide information on the three dimensional arrangement of components within the corridor, infrastructure identification means for identifying infrastructure components from the images, and status determination means for determining the status of the infrastructure components from the images. Preferably, the status determination means is arranged to determine encroachment of components within the corridor on the infrastructure components.
The components may include vegetation and the status determination means therefore enables a determination of whether vegetation is encroaching on the infrastructure. Alternatively or additionally, the status determination means may be arranged to identify and determine the position of faults associated with the infrastructure components.
The infrastructure may be any type of infrastructure which extends within corridors. It may be oil or gas pipelines, for example, or any pipelines. It may be power lines or telecommunications lines extending between poles within the corridor. For example, oil or gas pipelines, coastal zone mapping, roads and railways, fibre optic communications lines and rivers are all examples of infrastructure which may be mapped by utilising the present invention. In a preferred embodiment, the infrastructure components are power lines or telecommunications lines, which comprise lines carrying the power or telecommunication signals, strung between poles. In this preferred embodiment, the status determination means is preferably arranged to determine encroachment of vegetation on the power or telecommunications lines (which will hereinafter be referred to as "infrastructure lines"), or on right of way to the infrastructure lines. In this embodiment, the infrastructure identification means is preferably arranged to apply an image segmentation process to identify the infrastructure lines from the images.
Preferably, the image segmentation process involves the separate steps of finding candidates for infrastructure lines from the images, then selecting the candidates most likely to be the true infrastructure lines.
In some cases, depending upon the original resolution of the original stereo images, it may not be possible to locate the infrastructure lines. Infrastructure lines generally have narrow profiles, and can therefore be difficult to see. This is particularly the case when the images are captured by imaging apparatus which is positioned at a relatively large distance from the infrastructure lines. There is also a problem where the original images are video images, as video to digital conversion will be required for processing and this can result in a decrease in resolution and quality. In the majority of cases, stereo images will be captured by a pair of image capture devices mounted to an aircraft, and the cameras will therefore be working at relatively large distances from the infrastructure.
In such circumstances, the infrastructure identification means is preferably arranged to identify the infrastructure poles from the images. The poles are identified and the positions then calculated. The positioning of the infrastructure lines can then be inferred from the position of the poles. The infrastructure identification means is preferably arranged to infer the position of the infrastructure lines from the information on the positioning of the poles.
Preferably, a catenary envelope is calculated. The catenary envelope calculation takes into account the fact that the actual position of the infrastructure lines at any one time may vary depending on such conditions as humidity, temperature and others.
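The catenary inference described above can be sketched as follows. This is a minimal illustration rather than the patent's actual computation: the catenary parameter `a`, the symmetric-sag simplification for unequal attachment heights, and the 1.5 m envelope slack are all hypothetical values standing in for the real physical model.

```python
import math

def catenary_height(x, span, h1, h2, a=200.0):
    """Approximate height of a line suspended between two poles.

    x      -- horizontal distance from the first pole (metres)
    span   -- horizontal distance between the poles (metres)
    h1, h2 -- attachment heights at each pole (metres)
    a      -- catenary parameter (tension / weight per length); hypothetical
    """
    x0 = span / 2.0  # lowest point assumed at mid-span (exact for h1 == h2)
    # Sag term is zero at both poles and most negative at mid-span.
    sag = a * (math.cosh((x - x0) / a) - math.cosh(x0 / a))
    # Linear interpolation approximately handles unequal attachment heights.
    return h1 + (h2 - h1) * x / span + sag

def envelope_bounds(y_line, slack=1.5):
    """Envelope about the nominal line position, absorbing variation with
    humidity, temperature etc. (the 1.5 m slack is a hypothetical figure)."""
    return y_line - slack, y_line + slack
```

Vegetation clearance would then be checked against `envelope_bounds` rather than against the single nominal line position.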
In many circumstances, it will be necessary to locate the position of the poles whether or not the positions of the infrastructure lines can be identified directly. As the poles form part of the infrastructure, it is important to be able to determine encroachment of components, such as vegetation, on the poles, as well as on the infrastructure lines, and it is also important to identify any faults in the poles.
The infrastructure identification means preferably applies a mosaicking process, in order to mosaic the digital surface models (DSMs) together between pairs of poles to create a pole-to-pole DSM. Preferably, the infrastructure identification means is arranged to create pole-to-pole DSMs. The DSM is used to locate the position of components such as vegetation.
This information is preferably utilised by the status determination means, together with the information on the positioning of the catenary envelope of the infrastructure lines, to determine the encroachment of components, such as vegetation, on the infrastructure lines.
The infrastructure identification means is preferably arranged to identify poles by, firstly, using segmentation techniques to identify pole candidates in each of the left and right images of the stereo images where the poles are positioned, and then comparing the pole candidates in each stereo image and matching the pole candidates to find the most likely candidates for the poles. Preferably, the most likely candidates are determined by finding the pole candidates from each of the left and right images that when matched form an approximate "V" shape (where preferably, the left arm of the V is from the right image and the right arm of the V is from the left image, due to camera orientation).
Preferably, where infrastructure lines are identified directly, the infrastructure lines are identified by using segmentation techniques to look for "curved" image features in each of the left and right stereo images, the curved features being within a certain height range off the ground. An envelope will preferably still be calculated to take into account likely variation in position of the infrastructure lines depending on such conditions as humidity, temperature and others.
Preferably, absolute position information for the corridor mapping process is obtained by way of global positioning satellite (GPS) information. Preferably, the system also includes image capture apparatus, for capturing the pairs of stereo images. Preferably, the image capture apparatus includes a pair of cameras. The pair of cameras are preferably mounted to an aircraft, preferably an aeroplane, one under each wing. This enables stereo images to be taken. The system may also include a GPS arrangement mounted on the aircraft, and other orientation devices, such as gyroscopes. The system of the present invention, therefore, preferably has the advantage that it is able to automatically provide for the monitoring of corridor-type infrastructure to determine whether any components in the corridor, such as vegetation, are encroaching on the infrastructure, or to determine whether the infrastructure has any faults (e.g. cracks in pipelines, breaks in power lines, cracks or breaks in telecommunication or power poles). This information is obtained from a plurality of pairs of stereo images which are preferably obtained from image capturing apparatus mounted to an aircraft.
Generally, the only "manual" aspect of the system is the initial capture which requires a pilot to fly a plane over a corridor. Even this could be automated, however, using unmanned aircraft. The rest of the process is preferably fully automated. This has the advantage of reducing significantly the number of man hours required to monitor corridor-type infrastructure. Further, because the system is automated and carried out by computing systems, analysis of the image data is objective and less likely to result in errors, as occur in the case of manual inspection.
In accordance with a second aspect of the present invention, there is provided a corridor mapping method, for automatically determining the status of infrastructure extending within a corridor, the method comprising the steps of processing a plurality of stereo image pairs taken of the corridor, in order to provide information on the three dimensional arrangement of components within the corridor, identifying infrastructure components from the images, and determining the status of the infrastructure components from the images. In accordance with a third aspect of the present invention, there is provided a computer program arranged, when loaded into a computing system, to control the computing system to provide stereo image processing means arranged to process a plurality of stereo image pairs taken of an infrastructure corridor, in order to provide information on the three dimensional arrangement of components within the corridor, to provide infrastructure identification means for identifying infrastructure components from the images, and to provide status determination means for determining the status of the infrastructure from the images.
In accordance with a fourth aspect of the present invention, there is provided a computer readable medium, storing instructions for controlling a computing system to provide a stereo image processing means arranged to process a plurality of stereo image pairs taken of an infrastructure corridor, in order to provide information on the three dimensional arrangement of components within the corridor, to provide infrastructure identification means for identifying infrastructure components from the images, and status determination means for determining the status of the infrastructure from the images.
Brief Description of the Drawings
Features and advantages of the present invention will become apparent from the following description of an embodiment thereof, by way of example only, with reference to the accompanying drawings, in which:
Figure 1 is a representation of an aircraft flying above power lines and collecting stereo images;
Figure 2 is a flow chart of a system in accordance with an embodiment of the present invention;
Figure 3 illustrates the camera geometry for image rectification;
Figure 4 illustrates the process of building image pyramids;
Figure 5a illustrates horizontal slices of a correlation volume;
Figure 5b illustrates the maximal connected path across the slice;
Figures 6a to 6f illustrate stages in a background finding process;
Figure 7 illustrates a process for finding candidates for poles;
Figure 8 illustrates a process for locating cross arms;
Figure 9 illustrates results of the pole and cross arm segmentation procedures;
Figure 10 illustrates constraints for determining valid pairs of poles;
Figure 11 illustrates a process for matching cross arms to valid pairs of pole candidates;
Figure 12 illustrates a process for determining a pole marker;
Figure 13 illustrates a process for determining an external marker;
Figure 14 illustrates a process for determining the coordinates of the top of the segmentation result for the pole;
Figure 15 illustrates 3D calculation for each matched point between left and right images;
Figures 16a and 16b illustrate a disparity image and DSM respectively;
Figure 17 illustrates the merging of two DSM grids;
Figure 18 is a plan view of measurement points for near pole and centre clearances;
Figure 19 is a side view of power lines suspended between two power poles and its geometry;
Figures 20a, 20b and 20c are views of a power line and several mosaicked objects in the landscape, from various viewpoints.
The following description of an example embodiment relates to a system and process for corridor mapping power lines by using a strategy which first of all requires obtaining stereo images of the corridor and then locating the power poles within the image, and mapping the position of the power lines by inferring their position from the known position of the power poles. In this embodiment, therefore, the infrastructure components are power lines and power poles. The following is a general strategy for a system and process for mapping the power lines and power poles and for finding the components in the corridor:
- Use stereo vision to obtain a DSM from video images
- Find the power poles automatically from images
- Mosaic the DSMs to create a pole-to-pole DSM
- Infer the catenary envelope using pole positions
- Calculate the distances from the catenary envelope to components in the corridor such as trees and other vegetation

Figure 1 is a pictorial representation of an image capture system for capturing left and right stereo images of the corridor. An aircraft 1 with one camera on each wing tip flies above a corridor containing power lines 2, collecting a stream of stereo images. The angle between the optical axes of the two cameras is about 7 degrees. The cameras are looking down at about 35 degrees (i.e. the angle between the optical axis and the horizontal plane is about 35 degrees). The distance between the two cameras is 10 metres. Figure 2 is the flow chart of the system.
The following steps are carried out by infrastructure identification means and status determination means to produce the relevant information in a pole-to-pole DSM and they are described in detail later. The power lines are not directly identified by the system of this embodiment. Instead, the power poles are identified and the catenary envelope of the power lines is inferred from the pole positions. It will be appreciated that in alternative systems falling within the scope of the present invention, the power lines may be directly identified:
1. Automatically calculate the relative orientation of the stereo cameras for each pair of images. Because of the effect of wind and the vibration of the wing tip, the camera geometry when capturing each pair of images is different.
2. Rectify individual image pairs to generate epipolar images based on the obtained camera relative orientation.
3. Match individual image pairs to obtain disparity maps.
4. Identify power poles using image information.
5. Compute 3D information from matched pairs. This step includes the calculation of the 3D positions of the tops of the power poles and the 3D surface of the 3D scenes.
6. Calculate the catenary envelope of power lines using the 3D points of the tops of the poles within a span.
7. Generate a DSM using the disparity map and camera parameters; generate an orthoimage using the DSM, image and camera information.
8. Mosaic the DSM and orthoimages into a 3D pole-to-pole DSM and pole-to-pole orthoimage.
9. Identify tree boundaries in each 3D pole-to-pole DSM.
10. Measure appropriate distances from the catenary envelope to tree boundaries.
11. Perform an interactive 3D visualisation.
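The eleven steps above can be sketched as a pipeline skeleton. The stage names here simply mirror the numbered steps; every stage body is a placeholder that records its own name, since the real routines are described in the sections that follow.

```python
# Stage names mirroring steps 1-11 above; the order matters, the names are
# paraphrases chosen for this sketch.
PIPELINE = [
    "relative_orientation",    # 1
    "rectification",           # 2
    "stereo_matching",         # 3
    "pole_identification",     # 4
    "triangulation_3d",        # 5
    "catenary_envelope",       # 6
    "dsm_and_orthoimage",      # 7
    "mosaicking",              # 8
    "tree_boundaries",         # 9
    "clearance_measurement",   # 10
    "visualisation",           # 11
]

def run_pipeline(stereo_pairs, stages=None):
    """Thread each stereo pair through the stages in order, accumulating
    intermediate products in a per-pair dictionary. Placeholder stages just
    return their own name; a real system would substitute the processing
    routines described in the text."""
    if stages is None:
        stages = {name: (lambda data, n=name: n) for name in PIPELINE}
    products = []
    for pair in stereo_pairs:
        data = {"input": pair}
        for name in PIPELINE:
            data[name] = stages[name](data)
        products.append(data)
    return products
```

Passing a `stages` dictionary of real callables would turn the skeleton into a working pipeline while keeping the step ordering fixed.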
Input Camera Parameters and Images
Camera Parameters
There are 6 external parameters recorded that describe the position and orientation of the camera when each image was acquired. These are the global positioning system (GPS) and the inertial navigation system (INS) measurements. The parameters recorded are:
X: The X-coordinate (easting) of the camera, units = metres
Y: The Y-coordinate (northing) of the camera, units = metres
Z: The height of the camera above a modelled earth's surface, units = metres
az: The direction the camera is pointing, units = degrees clockwise from north
alt: The angle at which the camera is pointing towards the ground, units = degrees down from horizontal
theta: A rotation angle of the camera about its axis, units = degrees
The X and Y coordinates are relative to a standard map grid. The Z coordinate is the height of the aircraft above an ellipsoid fitted to control points on the earth's surface. This modelled surface does not reflect local terrain variation and therefore is not exactly the height of the aircraft above the ground.
The azimuth angle (az) is related to the flight direction of the aircraft, but includes a contribution from the fact that the camera is aimed towards the centre of the aircraft, rather than exactly parallel to the flight path. The angle alt is always negative when retrieved from the image files and appears equivalent to an angle of elevation, with the negative values signifying below rather than above the horizontal plane.
The internal camera parameters are also known. These parameters, identical for each image, are:
- camera focal length, units = millimetres
- camera lens distortion
- pixel size of the CCD array in the X direction, units = millimetres
- pixel size of the CCD array in the Y direction, units = millimetres
- location of the principal point
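The external and internal parameters can be gathered into a small record. The field names follow the text above; the `look_direction` helper, which converts the az/alt angles to a unit vector, is an added convenience for this sketch and not part of the patent.

```python
import math
from dataclasses import dataclass

@dataclass
class CameraParameters:
    """External (GPS/INS) and internal camera parameters as listed above.
    Units: metres for positions, degrees for angles, millimetres for optics."""
    x: float                # easting
    y: float                # northing
    z: float                # height above the modelled surface
    az: float               # azimuth, degrees clockwise from north
    alt: float              # depression angle, degrees down from horizontal
    theta: float            # rotation about the optical axis, degrees
    focal_length_mm: float
    pixel_size_x_mm: float
    pixel_size_y_mm: float

    def look_direction(self):
        """Unit vector (east, north, up) along the optical axis, treating
        `alt` as a positive depression angle (hypothetical convention)."""
        az, alt = math.radians(self.az), math.radians(self.alt)
        horiz = math.cos(alt)
        return (horiz * math.sin(az), horiz * math.cos(az), -math.sin(alt))
```

Note the text records `alt` as negative in the image files; a loader would negate it before filling this record under the convention assumed here.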
Stereo Images
In this example, the images were initially recorded on SuperVHS tapes, and digitised off-line. Because the aircraft was moving while acquiring the images, the interlaced nature of the video camera reduces the spatial resolution of the obtained images. For each pole-to-pole span, there are about 10 frames of images. They are oblique views of the scenes.
The images are colour images in TIFF format. The colour resolution seems to be 15 bits rather than 24 bits. The image size is 736x560 pixels.
In this example, only lightness information has been used to match image points. The lightness image is obtained by averaging the red, green and blue bands of the original colour image. Colour information is used in the power pole segmentation stage.
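The lightness conversion described above is simply a band average. A one-line sketch (the nested-list image representation stands in for a real raster):

```python
def lightness(rgb_image):
    """Lightness image as the mean of the red, green and blue bands, as
    described above. `rgb_image` is a row-major nested list of (r, g, b)
    tuples used here in place of a real image array."""
    return [[(r + g + b) / 3.0 for (r, g, b) in row] for row in rgb_image]
```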
Stereo Feature Matching by Correlation and Relaxation
An implementation of Zhang's [Z Zhang, R Deriche, O Faugeras, and Q-T Luong. A robust technique for matching two uncalibrated images through the recovery of the unknown epipolar geometry. Technical Report 2273, INRIA, May 1994] technique for matching features in two stereo images by correlation and relaxation is explained in this section.
Extracting Points of Interest
High curvature points are extracted as feature points from each image. The method proposed in Zhang's paper uses a slightly modified version of the Plessey corner detector and a model based approach with sub-pixel precision output [T Blaszka and R Deriche. Recovering and characterising image features using an efficient model based approach. Technical Report 2422, INRIA, November 1994].
Matching Through Correlation
Once the corner features have been extracted for each image, their mean and standard deviation in a given correlation window about each corner point are calculated. This correlation window size is (2n+1)×(2m+1), where n = m = 7.
For each corner point in the first (left) image, we define a search window of size (2u+1)×(2v+1) centred about that corner point in the second (right) image; u and v are set to a quarter of the image width and height respectively. Correlation scores are calculated between a corner point in the first image and all corner points in the second image that lie in this search space. The correlation scores for all corner points in the first image are calculated and thresholded. The corner points in the first image with a correlation score > 0.8 with a corresponding corner point in the second image are candidate matches. Hence we have, for each corner point in the first image, a list of corner points in the second image which are considered candidate matches.
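The scoring and thresholding step can be sketched with a zero-mean normalised cross-correlation. The exact score formula in Zhang's technique differs in detail, so treat this as an illustration of the 0.8-threshold candidate selection only; the `window` accessor, which extracts the correlation window around a corner, is a hypothetical helper.

```python
import math

def zncc(win_a, win_b):
    """Zero-mean normalised cross-correlation of two equal-sized windows
    (given as flat lists of intensities). Returns a score in [-1, 1]."""
    n = len(win_a)
    mean_a = sum(win_a) / n
    mean_b = sum(win_b) / n
    da = [v - mean_a for v in win_a]
    db = [v - mean_b for v in win_b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den if den else 0.0

def candidate_matches(corners_left, corners_right, window, threshold=0.8):
    """For each left corner, keep the right corners whose correlation score
    exceeds the 0.8 threshold used in the text. `window(pt)` returns the
    correlation window around a corner point."""
    matches = {}
    for p in corners_left:
        cands = [q for q in corners_right
                 if zncc(window(p), window(q)) > threshold]
        if cands:
            matches[p] = cands
    return matches
```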
Disambiguating Matches Through Relaxation
The relaxation technique proposed in Zhang's paper disambiguates candidate matches and fine-tunes the matching process further.
For each image and for each corner point in an image, identify a list of corner points in the same image that lie within a circle of radius R, which is one eighth of the image width. Calculate the Euclidean distances between each corner point in an image and neighbouring corner points inside the circular window.
Now for each corner point in the first image and corresponding candidate list corner points in the second image, we calculate the strength of a match using neighbouring corner points centred about each of the corner points under consideration. The match strengths define the support for a match and use the sum of scores from all other corner points in a circular window of the specified radius R. The formula for calculating match strengths is described further in the paper. Heuristics have been defined to ensure symmetry - the match strengths calculated for corner points in the first image with candidate matches in the second image should be the same if the first and second images have been swapped.
Let us define an energy function as the sum of match strengths of all candidate matches. Then the solution for disambiguating matches is equivalent to minimising this energy function. The relaxation process proposed in the paper is an iterative procedure of computing candidate match strengths for the corner points and updating the matches by minimising the total energy. At the first iteration, the match strengths (as described above and in the paper) for all candidate match corner points are calculated and stored in a list T_SM, with an ambiguity score in an associated list T_UA. This ambiguity score is 1 - SM2/SM1, where SM2 is the second best match strength with respect to the best match strength SM1, for a candidate match corner point pair. These two lists are sorted in descending order. Corner points that remain in the top 60% of both sorted lists are considered and labelled as strong matches and are eliminated from further processing. The process is repeated with the remaining corner points and with new match strengths. This iterative procedure stops when no further matches can be found.
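One plausible reading of the relaxation loop above is sketched below. The actual match-strength formula lives in Zhang's paper, so `strength` is supplied as a callback; the 60% keep fraction follows the text, and the termination conditions are this sketch's own.

```python
def relax_matches(candidates, strength, keep_fraction=0.6, max_iter=50):
    """Iteratively accept unambiguous matches, in the spirit of the text.

    candidates -- dict mapping each left corner to its right-corner candidates
    strength   -- function(left, right) -> match strength (from the paper)
    Each iteration ranks every left corner by best match strength (SM1) and by
    ambiguity 1 - SM2/SM1; pairs in the top `keep_fraction` of BOTH rankings
    are accepted and removed, then the loop repeats on the remainder."""
    accepted = {}
    remaining = {p: list(qs) for p, qs in candidates.items()}
    for _ in range(max_iter):
        scored = []
        for p, qs in remaining.items():
            ranked = sorted(qs, key=lambda q: strength(p, q), reverse=True)
            sm1 = strength(p, ranked[0])
            sm2 = strength(p, ranked[1]) if len(ranked) > 1 else 0.0
            ambiguity = 1.0 - (sm2 / sm1 if sm1 else 1.0)
            scored.append((p, ranked[0], sm1, ambiguity))
        if not scored:
            break
        k = max(1, int(keep_fraction * len(scored)))
        top_sm = {s[0] for s in sorted(scored, key=lambda s: s[2], reverse=True)[:k]}
        top_ua = {s[0] for s in sorted(scored, key=lambda s: s[3], reverse=True)[:k]}
        winners = [(p, q) for p, q, _, _ in scored if p in top_sm and p in top_ua]
        if not winners:
            break  # no corner is in the top fraction of both lists
        for p, q in winners:
            accepted[p] = q
            remaining.pop(p)
    return accepted
```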
Camera Orientation and Image Rectification
The rectification process needs the provision of camera parameters. These camera parameters can either be provided directly or they can be worked out from image feature points. Because of the vibration of the aircraft wings, the relative camera parameters will not stay constant. We have observed that the epipolar lines can be mis-matched by up to 14 pixels because of the non-constancy of the camera parameters. Therefore other means of obtaining these camera parameters [C Sun. A robust structure from motion technique applicable to the SIROVISION project. Technical Report DMS-E95/5, CSIRO Division of Mathematics and Statistics, February 1995. C Sun. Camera calibration for the SIROVISION project. Technical Report DMS-E96/30, CSIRO Division of Mathematics and Statistics, March 1996] or modifying the assumed known parameters automatically will be necessary.
Obtaining Relative Camera Orientation Automatically
Because of the vibration of the aircraft wings, the geometry of the two cameras is not fixed. Therefore it is necessary to recover the geometry automatically by just using image information. The obtained geometry will be used for image rectification and for 3D calculation. The parameters related to this geometry include the three rotation angles and the three elements of the translation vector. The automated procedure for recovering the relative camera geometry only uses the intrinsic camera parameters and some matched image points. Because the point matching step is automatic, mis-matches may occur. Therefore it is necessary to identify and remove those mis-matches for the estimation of the camera geometries. Matched feature points are used for the estimation of the relative camera geometries. This robust estimation procedure involves S-estimation [D Ruppert. Computing S estimators for regression and multivariate location/dispersion. Journal of Computational and Graphical Statistics, 1(3):253-270, 1992] and optimisation steps. Our proposed algorithm for robust relative orientation estimation is:
1. Automated feature matching using relaxation.
2. Robust relative orientation using S-estimation:
a. Selection of a random elemental sample of a minimum number of matched points (in our case 5 matched points). Based on the elemental sample and the initial parameter values, the relative orientation parameters can be obtained.
b. Calculation of the residual and standard deviation of the image errors for all the matched points based on the current values of the relative orientation parameters.
c. Scaling the standard deviation.
d. Performing the weighted M-estimator calculations using the standard deviation obtained from the previous step.
e. Recording the relative orientation parameters which give the minimum standard deviation.
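The robust loop in steps a-e above can be sketched as follows. This is an illustrative sketch only: a two-parameter line fit stands in for the five-point relative orientation solver (which is not reproduced here), and the median absolute deviation stands in for the full S-estimation scale step.

```python
import numpy as np

def robust_fit(points, n_trials=200, sample_size=2, seed=0):
    """Illustrative robust estimation loop in the spirit of steps a-e:
    draw minimal random samples, fit the model, score all matched
    points by the spread of their residuals, and keep the parameters
    giving the minimum spread.  A 2D line y = a*x + b is a hypothetical
    stand-in for the relative orientation model."""
    rng = np.random.default_rng(seed)
    x, y = points[:, 0], points[:, 1]
    best_params, best_scale = None, np.inf
    for _ in range(n_trials):
        # Step a: random elemental sample of minimal size.
        idx = rng.choice(len(points), size=sample_size, replace=False)
        A = np.vstack([x[idx], np.ones(sample_size)]).T
        a, b = np.linalg.lstsq(A, y[idx], rcond=None)[0]
        # Step b: residuals of *all* matched points under these parameters.
        res = y - (a * x + b)
        # Step c: robust scale estimate (median absolute deviation).
        scale = 1.4826 * np.median(np.abs(res - np.median(res)))
        # Step e: record the parameters giving the minimum spread.
        if scale < best_scale:
            best_scale, best_params = scale, (a, b)
    return best_params, best_scale
```

Because each elemental sample fits the model exactly, samples drawn entirely from correctly matched points give near-zero residual spread, so mis-matches are effectively ignored.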
Image Rectification
After the relative orientation parameters of the cameras have been obtained, stereo image rectification can be performed.
The operation of rectification ensures a simple epipolar geometry for a stereo pair [O Faugeras. Three-Dimensional Computer Vision: A Geometric Viewpoint. The MIT Press, 1993]. By simple geometry we mean that the epipolar lines are parallel to the image rows, i.e. the rectification process transforms (usually non-horizontal) epipolar lines into horizontal scan lines. One possible way to achieve such a result is to re-project the images onto a single image plane using the same camera centres. The re-projection operates as follows: given a pixel I in the original image, we construct a new pixel by intersecting the line from the camera centre through I with the new image plane. The new image plane is chosen so that it is parallel to the line connecting the two camera centres.
Figure 3 shows the camera geometry for image rectification. Each camera is modelled by its optical centre C and its image plane. I1 and I2 are the image points of the same 3D point, P, projected on to the left and right cameras respectively. The plane defined by the 3D point P and the two camera centres intersects the image planes of the two cameras, producing two lines. These two lines are called epipolar lines, as shown in the figure. Usually these lines make certain angles with the image scan lines. The new optical axes of the two virtual cameras are parallel to each other, as indicated by the dashed lines in Figure 3. The rectification step projects the image points I1 and I2 to I'1 and I'2 respectively in the new image space. This has the effect of transforming the original epipolar lines into horizontal scan lines.
If we let I* = (U, V, S)^T denote the projective coordinates of I, and (X, Y, Z)^T the coordinates of P, the following relation holds:

I* = T (X, Y, Z, 1)^T
where T is a 3x4 matrix usually called the perspective matrix of the camera. The elements of T are related to the internal (focal length, image centre, etc.) and external (position and orientation) camera parameters. In the general case S ≠ 0, the image coordinates of I (in pixels) are given by:

(u, v)^T = (U/S, V/S)^T
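The projection equations above can be sketched as follows; the camera matrix used in the example is hypothetical (focal length 2, image centre at the origin), chosen only to illustrate the arithmetic.

```python
import numpy as np

def project(T, P):
    """Apply a 3x4 perspective matrix T to a 3D point P = (X, Y, Z),
    giving projective coordinates (U, V, S) and, when S != 0, pixel
    coordinates (u, v) = (U/S, V/S)."""
    U, V, S = T @ np.append(P, 1.0)   # I* = T (X, Y, Z, 1)^T
    if S == 0:
        raise ValueError("point projects to infinity (S == 0)")
    return U / S, V / S

# Hypothetical camera: focal length 2, image centre (0, 0).
T = np.array([[2.0, 0.0, 0.0, 0.0],
              [0.0, 2.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
u, v = project(T, np.array([1.0, 2.0, 4.0]))
```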
The principle of rectification is the definition of two new perspective matrices M and N which define the same camera centres C1 and C2 as T1 and T2, but with a new common image plane parallel to the line C1C2. The rectification is the function which computes the new coordinates (u', v') from the old ones (u, v) for both the left and right images.
The constraints for the new perspective matrices M and N are:
1. The camera centres remain unchanged.
2. The new focal lengths of the two cameras must be the same .
3. For any point P not in the focal plane, the image points I'1 and I'2 respectively computed with M and N will lie on the same horizontal scan line. An image point I1 in image 1 (left image) comes from a 3D point P lying on the 3D straight line D defined by I1 and the camera centre C1. The parametric equation of D is: P = C1 + λn, where n, a vector collinear to D, is given by:

n = N1 I*1 with

Ni = (t1^T; t2^T; t3^T)^-1

where tj is the 3-vector obtained from the first 3 elements of the jth row of Ti, i = 1, 2 indicating the left and right images [C Sun. Multi-resolution rectangular subregioning stereo matching using fast correlation and dynamic programming techniques. Technical Report 98/246, CSIRO Mathematical and Information Sciences, Australia, December 1998].
The projective coordinates I*'1 = (U', V', S')^T of the new image point I'1 of P on the rectified left image are computed as follows:

I*'1 = M3 N1 I*1

where M3 is the 3x3 matrix formed from the first 3 elements of each row of M.
Similarly, the other rectified image points in the left image can be obtained. The rectification of the right image is similar to that of the left image. After the rectification of the images, the epipolar lines are parallel to the axes of the image coordinate frames. Therefore potential matches between the left and right images satisfy simple relations. This allows for simpler and more efficient dense stereo matching.
Dense Stereo Matching
Disparity is the difference in pixel locations between matched features of different images. This disparity image together with the camera parameters can be used to calculate the 3D surface. As we have obtained the rectified left and right images, we can use them to achieve fast dense stereo matching in order to obtain a disparity map.
Fast Cross Correlation
Cross correlation is a technique for matching two signals. It is often used in signal processing and has some important applications in stereo reconstruction. The value of the optimised correlation function is a measure of how well the left and right regions match. Direct calculation of cross correlation over the whole image is computationally expensive. We have developed fast algorithms for correlation calculation using the box filtering technique [C Sun. A fast stereo matching method. In Digital Image Computing: Techniques and Applications, pages 95-100, Massey University, Auckland, New Zealand, December 10-12 1997. C Sun. Multi-resolution rectangular subregioning stereo matching using fast correlation and dynamic programming techniques. Technical Report 98/246, CSIRO Mathematical and Information Sciences, Australia, December 1998].
Rather than working with the whole image during the fast image correlation stage, we can work with sub-images to speed up the correlation calculation further and to reduce the memory needed for storing the correlation coefficients.
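The box filtering idea behind the fast correlation can be sketched as follows. This is a hedged illustration, not the authors' implementation: local window sums are obtained from running cumulative sums, so each zero-mean normalised correlation coefficient costs a constant number of operations regardless of window size.

```python
import numpy as np

def box_sum(img, r):
    """Sum of img over a (2r+1)x(2r+1) window at every pixel, computed
    from running cumulative sums (the box filtering idea), so the cost
    per pixel is independent of the window size."""
    n = 2 * r + 1
    pad = np.pad(np.asarray(img, dtype=float), r, mode="edge")
    ii = np.zeros((pad.shape[0] + 1, pad.shape[1] + 1))
    ii[1:, 1:] = pad.cumsum(axis=0).cumsum(axis=1)
    return ii[n:, n:] - ii[:-n, n:] - ii[n:, :-n] + ii[:-n, :-n]

def zncc(left, right, d, r=2, eps=1e-9):
    """Zero-mean normalised cross-correlation between left(x) and
    right(x - d) over (2r+1)^2 windows, using only box sums."""
    L = np.asarray(left, dtype=float)
    R = np.roll(np.asarray(right, dtype=float), d, axis=1)  # align at disparity d
    n = float((2 * r + 1) ** 2)
    sL, sR = box_sum(L, r), box_sum(R, r)
    sLL, sRR = box_sum(L * L, r), box_sum(R * R, r)
    sLR = box_sum(L * R, r)
    num = sLR - sL * sR / n
    den = np.sqrt(np.maximum((sLL - sL ** 2 / n) * (sRR - sR ** 2 / n), 0.0)) + eps
    return num / den
```

A window where the shifted right image exactly matches the left image yields a coefficient of 1; uncorrelated windows yield values near 0.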
Coarse-to-fine Scheme
It has been shown that a multi-resolution or pyramid data structure approach to stereo matching is faster than one without multi-resolution [K S Kumar and U B Desai. New algorithms for 3D surface description from binocular stereo using integration. Journal of the Franklin Institute, 331B(5):531-554, 1994], as the search range in each level is small. Besides fast computation, a more reliable disparity map can be obtained by exploiting the multi-resolution data structure. The upper levels of the pyramids are ideal for obtaining an overview of the image scene; the details can be found down the pyramid at higher resolution. There are three useful properties of the coarse-to-fine scheme [F Ackermann and M Hahn. Image pyramids for digital photogrammetry. In H Ebner, D Fritsch, and C Heipke, editors, Digital Photogrammetric Systems, pages 43-58. Wichmann, 1991]: (a) the pull-in range or search range can be increased, because at a coarse pyramidal level only rough initial values are needed; (b) the convergence speed can be improved; and (c) the reliability of finding correct matches can be increased.
Figure 4 shows the process of building the image pyramids. In the current implementation, for simplicity the lower resolution image is obtained by simply taking the average value of the corresponding rxr (e.g. r=2) pixels in the previous higher resolution level.
During the process of projecting the disparity map from the current level of the pyramid to the next (if the current level is not level 0, the highest image resolution), the disparity image size is scaled up by the value of r, where r is the reduction ratio used when building the image pyramid, and the disparity values are scaled up by the same r. The disparity value at a position (i, j) of the next level image that is not a multiple of r is obtained using bilinear interpolation.
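The pyramid construction and disparity propagation described above can be sketched as follows. This is an illustrative sketch: the propagation step uses nearest-neighbour block replication as a stand-in for the bilinear interpolation described in the text.

```python
import numpy as np

def build_pyramid(img, levels, r=2):
    """Build an image pyramid: each coarser level is the plain average
    of the corresponding r x r block of the finer level (as in Fig. 4)."""
    pyr = [np.asarray(img, dtype=float)]
    for _ in range(1, levels):
        a = pyr[-1]
        h, w = (a.shape[0] // r) * r, (a.shape[1] // r) * r
        pyr.append(a[:h, :w].reshape(h // r, r, w // r, r).mean(axis=(1, 3)))
    return pyr

def propagate_disparity(disp, r=2):
    """Project a disparity map to the next finer level: the map size is
    scaled up by r (nearest-neighbour stand-in for bilinear fill) and
    the disparity values are scaled up by the same r."""
    return r * np.kron(np.asarray(disp, dtype=float), np.ones((r, r)))
```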
Best Path in the Matrix
Most researchers [S D Cochran and G Medioni. 3D surface description from binocular stereo. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(10):981-994, October 1992] choose the position that gives the maximum correlation coefficient as the disparity value for any point in the left image. We instead take a slice of the correlation coefficient volume as a 2D correlation matrix for each scan line of the rectified image and use this matrix to obtain more reliable disparities. The width of the matrix is the same as the length of the horizontal image scan line, and the height of the matrix equals the correlation search range, 2w+1. A typical correlation matrix is shown in Fig. 5(a). We use the correlation matrix to find the disparities for a whole scan line. Rather than choosing the maximum correlation coefficient, we find a best path from left to right through the correlation matrix. The position of the path indicates the best disparity for this scan line.
The algorithm for finding the best path through the correlation matrix uses a dynamic programming technique [M Buckley and J Yang. Regularised shortest-path extraction. Pattern Recognition Letters, 18(7):621-629, 1997. G L Gimel'farb, V M Krot, and M V Grigorenko. Experiments with symmetrized intensity-based dynamic programming algorithms for reconstructing digital terrain model. International Journal of Imaging Systems and Technology, 4:7-21, 1992. S A Lloyd. A dynamic programming algorithm for binocular stereo vision. GEC Journal of Research, 3(1):18-24, 1985]. Fig. 5(a) gives an image containing the correlation coefficients in the correlation matrix. Fig. 5(b) shows the path obtained in the correlation matrix using this technique. The best path gives the maximum sum of the correlation coefficients along the path when certain constraints are imposed. The distance of each point along this path to the middle line in the matrix (from left to right) is the disparity for this point. The disparity gradient limit constraint can easily be implemented during the dynamic programming minimization process. This limit constrains the size of the neighbourhood search, i.e. the steps the path is allowed to take between columns.
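The dynamic programming search for the best path can be sketched as follows. This is an illustrative sketch, with the disparity gradient limit expressed as a maximum row change `max_step` between neighbouring columns (a hypothetical parameterisation of the constraint described above).

```python
import numpy as np

def best_path(corr, max_step=1):
    """Dynamic-programming search for the left-to-right path through a
    correlation matrix `corr` (rows = disparity offsets, columns = scan
    line positions) that maximises the summed coefficients, subject to
    a disparity-gradient limit: between neighbouring columns the path
    row may change by at most max_step."""
    h, w = corr.shape
    score = np.full((h, w), -np.inf)
    back = np.zeros((h, w), dtype=int)
    score[:, 0] = corr[:, 0]
    for x in range(1, w):
        for y in range(h):
            lo, hi = max(0, y - max_step), min(h, y + max_step + 1)
            prev = int(np.argmax(score[lo:hi, x - 1])) + lo
            score[y, x] = corr[y, x] + score[prev, x - 1]
            back[y, x] = prev
    # Backtrack from the best final column entry.
    path = [int(np.argmax(score[:, -1]))]
    for x in range(w - 1, 0, -1):
        path.append(int(back[path[-1], x]))
    return path[::-1]   # one row index (disparity offset) per column
```

The row index of the path at each column, measured from the middle row of the matrix, gives the disparity for that scan line position.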
Algorithm Steps
Our proposed algorithm for stereo matching is:
1. Build image pyramids with K levels (from 0 to K-1), with a reduction ratio of r, from the original left and right images. The upper or coarse resolution levels are obtained by averaging the corresponding rxr pixels in the lower or finer resolution level, as shown in Fig. 4.
2. Initialise the disparity map as zero for level k = K-1 and start stereo matching at this level.
3. Perform image matching using the following steps:
(a) Segment images into rectangular subregions.
(b) Perform fast zero-mean normalised correlation to obtain the correlation coefficients .
(c) Use dynamic programming to find the best path, which will then give the disparity map.
4. If k ≠ 0, propagate the disparity map to the next level using bilinear interpolation, set k = k-1 and then go back to step 3; otherwise go to step 5.
5. Display the disparity map.
Pole Segmentation
The aim of the pole segmentation procedure is to find the pole when it appears in an image pair and then to find the position along the pole where the power line is attached. Our approach is to find either the intersection point of the cross arm with the pole or, in cases where there is no cross arm attached to the pole, to locate accurately the top of the pole. We require as input to the segmentation procedure a pair of left and right rectified colour input images. When the image pair is rectified, we can expect the vertical positions of the top and bottom of the pole to be the same in both the left and right images. This is an important feature which is used in the segmentation procedure to distinguish the pole from other image features.
Note that if the resolution is sufficient it may be possible to locate the pole insulators (which would indicate exactly where the line is attached to the power poles).
The segmentation process follows three basic steps: (i) find candidates for poles and cross arms in the image pair; (ii) find the pole candidate most likely to be the true pole (if any) by matching pole candidates and corresponding cross arms in the image pair; (iii) if no cross arms are present, refine the segmentation of the pole so that the top of the pole can be more accurately determined. These three steps are discussed in the following sections.
Finding candidates for poles and cross arms
In this section, we discuss the procedure for segmenting candidates for poles and cross arms. The procedure begins with a preprocessing step to mark background parts of the image that we know cannot possibly be poles. Next, candidates for poles are found by using a linear feature detector based on the median filter [A K Jain. Fundamentals of Digital Image Processing. Prentice-Hall, Englewood Cliffs, New Jersey, 1989]. Finally, we discuss the procedure used to find candidates for cross arms in the image.
Finding background regions in the image
The first stage in the segmentation process employs the colour information in the input image to find 'background' regions of the image, i.e. regions that we know cannot possibly be poles, for example green vegetation. Two assumptions are made about poles during this processing stage: poles are grey (not green or some other colour) and poles are light (they are not black and there is adequate lighting on the pole). These two assumptions are the basis for a simple segmentation procedure to find background regions in the image. After transforming the colour input image into an HLS (Hue, Lightness and Saturation) image, the saturation component is used to identify regions that are not grey and the lightness component is used to find dark regions.
Fig. 6a shows a portion of an example rectified colour input image on which the process is demonstrated. The detailed steps of this stage are as follows:
- Transform the colour input image (Fig. 6a) into an HLS image
- Threshold the saturation component of the HLS image (Fig. 6b) at a value of 20 to obtain a binary mask of high saturation regions - the background (Fig. 6c)
- Select the lightness component of the HLS image (Fig. 6d) and apply a morphological opening [J Serra. Image Analysis and Mathematical Morphology. Academic Press, 1982] with window size of 50 followed by a morphological closing of size 50 to find the background trend in the lightness component image (Fig. 6e)
- Compute the average pixel value meanres of the background trend, in a region restricted to the binary mask of high saturation regions found above
- Threshold the lightness component image at the value meanres + (255-meanres)/5 to obtain a mask of the background of the image (Fig. 6f).
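The background-finding stage above can be sketched as follows. This is a hedged simplification: the morphological opening/closing trend estimate is replaced by a plain mean, and the final background mask is taken as the union of the high-saturation regions and the dark regions, which is our reading of the steps above.

```python
import numpy as np

def background_mask(rgb, sat_thresh=20, frac=5):
    """Sketch of the colour-based background detector: mark strongly
    coloured (high HLS saturation) regions, then add dark regions by
    thresholding lightness at mean + (255 - mean)/frac."""
    rgb = rgb.astype(float)
    mx, mn = rgb.max(axis=2), rgb.min(axis=2)
    light = (mx + mn) / 2.0                      # HLS lightness, 0..255
    c = mx - mn
    denom = np.where(light <= 127.5, mx + mn, 510.0 - mx - mn)
    sat = np.where(c == 0, 0.0, 255.0 * c / np.maximum(denom, 1e-9))
    high_sat = sat > sat_thresh                  # coloured, e.g. vegetation
    # Mean lightness over the high-saturation region (trend stand-in).
    mean_res = light[high_sat].mean() if high_sat.any() else light.mean()
    dark = light < mean_res + (255.0 - mean_res) / frac
    return high_sat | dark
```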
Finding candidates for poles
The next step in the segmentation process is to identify candidates for poles in the input image. We work solely on the lightness component of the HLS transformation of the input image and no direct use of colour is made during this stage of the procedure.
However, the background region, which was obtained in the process described above using colour information, is used to restrict the segmented pole candidates to regions that are not deemed to be background. Poles are linear features and to identify them we use an approach based on a linear median filter. We use median filters instead of morphological filters such as openings and closings because a median filter is self-dual and can simultaneously find pole objects on both dark and light backgrounds. Moreover, the median filter will tolerate a certain amount of dark shadowing along the pole object in the image, often caused by the cross arm when it is present. The disadvantage of the median filter, however, is that it can connect isolated noise elements with the pole features, thus overestimating the length and width of the pole. This is generally not a problem when cross arms are present, because in this case we do not require an accurate estimate of the height of the pole (just the pole/cross arm intersection point). When cross arms are not present, the segmentation procedure uses a watershed function to correct the overestimation of the height of the pole.
The process is demonstrated on the lightness component image previously shown in Fig. 6d. The sequence of steps is as follows:
- Perform a vertical median filter of length 29 on the lightness component of the HLS image (Fig. 7a)
- Perform a horizontal median filter of length 29 on the vertical median filter result and take the absolute difference between the two images (Fig. 7b)
- Threshold this result at a grey-level of 20 and remove those pixels in the background of the image (Fig. 6f) estimated in the previous section (Fig. 7c)
- Perform a vertical median filter of length 29 on this result to connect up broken line elements and remove isolated noise (Fig. 7d)
- Perform a dynamic line opening (see below for a description of this new filter) and remove line segments less than half of the maximum length (Fig. 7e)
- Remove lines that are of length less than 30 to produce the final image of pole candidates (Fig. 7f).
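The core of the linear-feature detector above (vertical median, horizontal median of the result, absolute difference, threshold) can be sketched as follows; filter lengths and thresholds follow the text, and the background masking and clean-up steps are omitted for brevity.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def median_1d(img, length, axis):
    """Running median of odd `length` along one axis of a 2D image,
    with edge padding so the output has the same shape."""
    r = length // 2
    pad = [(0, 0), (0, 0)]
    pad[axis] = (r, r)
    p = np.pad(img, pad, mode="edge")
    win = sliding_window_view(p, length, axis=axis)
    return np.median(win, axis=-1)

def pole_response(light, length=29, thresh=20):
    """Sketch of the linear-feature detector: a vertical median filter
    preserves thin vertical structures, a horizontal median of that
    result removes them, and the absolute difference highlights
    vertical lines.  Thresholding gives binary pole candidates."""
    v = median_1d(np.asarray(light, dtype=float), length, axis=0)
    h = median_1d(v, length, axis=1)
    return np.abs(v - h) > thresh
```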
As mentioned above, a new filter called the dynamic line opening is used to clean up linear features in binary images. We have found that this filter is very effective at removing jags on the sides of pole objects (which can often connect the pole object with noise in the background). The filter treats each object in the binary image independently (as opposed to a classical opening, which does not change dynamically depending on the objects in the image). The direction θi of the major axis of the best fit ellipse [R M Haralick and L G Shapiro. Computer and Robot Vision, volume I. Addison-Wesley, Reading, Massachusetts, 1992] to the binary object i is calculated and the maximum length Li of the object in direction θi is determined (using a minimum bounding box in direction θi). Then each object is filtered by removing line segments within the object that are shorter than some fraction of Li (in our procedure, this fraction is a half). Effectively, this is just an opening of the object i using a line structuring element of length Li/2 in direction θi. As the result in Fig. 7e shows, the filter tends to remove noise from the side of the object and preserve only the main form of the object.
Finding candidates for cross arms
The process for finding cross arms is quite similar to that for finding pole candidates. Although in principle the pole finding procedure could be applied directly to find cross arms, some different filters have been used and parameter values have been changed in order to improve the success rate of the process on the particular images we had at hand. As before, we work on the lightness component image and use the background region to restrict the cross arm candidates to regions that are not in the background. Again, we use an approach based on a linear median filter, which can simultaneously find objects on dark and light backgrounds and tolerate a certain amount of dark shadowing along the object. The sequence of steps is as follows and is demonstrated on the lightness component image previously shown in Fig. 6d:
- Perform a horizontal median filter of length 31 on the lightness component of the HLS image (Fig. 8a)
- Perform a vertical median filter of length 21 on the horizontal median filter result and take the absolute difference between the two images (Fig. 8b)
- Threshold this result at a grey-level of 25 and remove those pixels in the background of the image estimated in the previous section (Fig. 8c)
- Apply a small vertical closing of length 3 to improve vertical connectivity along the cross arm
- Perform an opening by union [P Soille, E Breen, and R Jones. Recursive implementation of erosions and dilations along discrete lines at arbitrary angles. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(5):562-567, 1996], using 5 structuring elements of length 27 at angles 10, 5, 0 (horizontal), -5 and -10 degrees, thus leaving only strong candidates for cross arms (Fig. 8d).
The main point of note in the above procedure is the final use of an opening by union in order to clean the image, leaving only strong candidates for cross arms. Theoretically, as cross arms should be mostly horizontal in the rectified images, a single horizontal opening should suffice for this purpose. However, due to the complexity of the imaging geometry and the actual variety of cross arm positions and orientations, this is not always the case and a set of openings must be used to allow for some variation in the angle of the cross arm about the horizontal position.
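The opening by union can be sketched as follows. This is an illustrative sketch: each line opening is built from shift-based erosions and dilations along a rounded digital line, and the union (pixelwise maximum) over the set of angles is taken as the result.

```python
import numpy as np

def line_offsets(length, angle_deg):
    """Integer pixel offsets of a centred digital line segment."""
    t = np.arange(length) - length // 2
    dy = np.round(t * np.sin(np.radians(angle_deg))).astype(int)
    dx = np.round(t * np.cos(np.radians(angle_deg))).astype(int)
    return list(zip(dy, dx))

def shifted(img, dy, dx, fill):
    """Translate a 2D array by (dy, dx), filling exposed borders."""
    out = np.full(img.shape, fill, dtype=img.dtype)
    h, w = img.shape
    y0, y1 = max(0, -dy), min(h, h - dy)
    x0, x1 = max(0, -dx), min(w, w - dx)
    out[y0 + dy:y1 + dy, x0 + dx:x1 + dx] = img[y0:y1, x0:x1]
    return out

def line_opening(img, length, angle):
    """Binary opening with a line structuring element: erosion (min over
    shifts along the line) followed by dilation (max over shifts)."""
    offs = line_offsets(length, angle)
    ero = np.minimum.reduce([shifted(img, -dy, -dx, 0) for dy, dx in offs])
    return np.maximum.reduce([shifted(ero, dy, dx, 0) for dy, dx in offs])

def opening_by_union(img, length=27, angles=(10, 5, 0, -5, -10)):
    """Union (pixelwise maximum) of line openings over a set of angles,
    keeping only structures long enough in at least one direction."""
    return np.maximum.reduce([line_opening(img, length, a) for a in angles])
```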
Matching candidates for poles and cross arms
After running the background detection and the pole and cross arm segmentation procedures on both the left and right rectified images, we now have images of pole and cross arm candidates for the image pair. Fig. 9 shows an example set of results for a pair of left and right images (shown in Figs. 9a and b respectively). The corresponding pole candidates found are shown in Figs. 9c and d respectively, and the cross arm candidates are shown in Figs. 9e and f respectively. By combining the results for the left and right images, the matching process checks and discards pairs of pole candidates that are deemed unsuitable, using a set of simple geometric constraints. Then, using the results for the cross arm candidates, the remaining valid pairs of pole candidates are ordered to find the pair most likely to correspond to an actual pole in the input images.
Matching pole candidates
The first step in the pole matching process is to label and obtain summary shape statistics (such as area, bounding box and best fit ellipse statistics) for the pole candidates in the left and right images. Any candidates that have a vertical height less than 45 pixels or do not lie within the central region of the image (defined as the horizontal pixel range [140, 595] for an image of width 735 pixels) are immediately discarded. The latter constraint was introduced in order to tune the segmentation procedure to work well for a particular set of images that had been successfully rectified. More generally, however, we would expect this to be a valid assumption, as the power pole should lie in the central region of the image if the aircraft is indeed flying over the tops of the poles.
Having pruned out some of the pole candidates in the image pair on the basis of these two constraints, we now turn our attention to matching pairs of candidates from the left and right images. We say that a left and a right candidate form a valid pair of poles if they satisfy the following constraints (the parameters below were tuned to a particular set of images to improve the success rate):
- The vertical overlap of the two candidates is at least 40 pixels (as the images are rectified, the top and bottom vertical positions of the pole should be the same in both the left and right rectified images)
- The vertical overlap is at least 40 percent of the height of both the left and the right candidates (this is to make sure that the degree of overlap is significant when compared to the actual size of the candidates, and it also ensures that the two candidates are comparable in size)
- The angular separation between the candidates is in the range [1.5, 15] degrees (due to the geometry of the camera system, we can expect vertical objects to swing anticlockwise a certain amount when considering the left and then the right objects)
- The horizontal distance between the centroid of the right pole candidate and the left pole candidate is in the range [-70, 250] pixels (again due to the geometry of the camera system).
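The four geometric constraints above can be sketched as a single predicate. The dictionary keys ('top', 'bottom', 'angle', 'cx') are a hypothetical format for the summary shape statistics; the thresholds follow the text.

```python
def valid_pair(left, right,
               min_overlap=40, min_overlap_frac=0.40,
               angle_range=(1.5, 15.0), dx_range=(-70, 250)):
    """Check the four geometric constraints for a valid left/right pole
    candidate pair.  Candidates are dicts with 'top' and 'bottom' (row
    coordinates, top < bottom), 'angle' (degrees) and 'cx' (centroid
    column) -- a hypothetical summary-statistics format."""
    # Constraint 1: absolute vertical overlap of at least 40 pixels.
    overlap = min(left["bottom"], right["bottom"]) - max(left["top"], right["top"])
    if overlap < min_overlap:
        return False
    # Constraint 2: overlap is at least 40 percent of each candidate's height.
    for c in (left, right):
        if overlap < min_overlap_frac * (c["bottom"] - c["top"]):
            return False
    # Constraint 3: anticlockwise angular swing from left to right candidate.
    swing = right["angle"] - left["angle"]
    if not (angle_range[0] <= swing <= angle_range[1]):
        return False
    # Constraint 4: horizontal centroid distance, right minus left.
    dx = right["cx"] - left["cx"]
    return dx_range[0] <= dx <= dx_range[1]
```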
These constraints are illustrated in Fig. 10, where a pair of pole candidates from the left and right images in Fig. 9 is shown superimposed. This is in fact the only valid pair of poles for this example. Here, the object on the left is a pole candidate from the left image (refer to Fig. 9c), with a central horizontal position of approximately 315, and the object on the right is a candidate from the right image (Fig. 9d), with a central horizontal position of approximately 355. Note that more typically a pair of pole candidates will form a 'V' shape, where the right pole candidate is actually positioned to the left hand side of the left pole candidate. However, as the figure illustrates, this is not always guaranteed. Fig. 10a shows the constraint imposed on the vertical overlap of the two candidates (this must be at least 40 pixels). The percentage overlap must also be sufficient; in this example, the percentage overlap for the left candidate is 100 percent and that for the right candidate is approximately 93 percent (both of which are sufficient). Fig. 10b illustrates the constraint on angular separation, which requires that the angle of the candidate swing anticlockwise a certain amount when considering the left and then the right pole candidate. In contrast, horizontal objects such as road markings and fence railings swing clockwise a certain amount (which is a very useful fact when we consider that such objects can look very much like a pole). Finally, Fig. 10c illustrates the constraint on horizontal distance, measured between the centroids of the two pole candidates. In this example, the horizontal distance is actually negative because we do not have the typical 'V' shape pair of candidates (hence the lower limit on horizontal distance is set to -70 for such cases).
Matching cross arm candidates
Once a set of valid pairs of poles has been found, the next stage in the matching process is to find and match any cross arms that intersect the pole candidates. This is a fairly simple process which comprises the following steps for each valid pair of poles:
- Fit a line to the left pole candidate in the valid pair that passes through its centroid at an angle given by the direction of the major axis of the best fit ellipse to the pole candidate
- Find all the cross arm candidates in the left image that intersect this line, between two vertical extremes given by the bottom of the pole candidate and the top of the pole candidate plus an extra 80 pixels (we found in practice that this extra amount was needed in cases where the pole candidate was actually an undersegmentation of the true pole in the image, so that the top of the pole candidate was below the height of the cross arms)
- Record the vertical position of each intersecting cross arm at the centroid point where it intersects the line
- Repeat the above process for the pole candidates in the right image to find the right intersecting cross arms
- Match a pair of left and right intersecting cross arms by finding the pair that has the minimum vertical separation (given by the distance between the vertical position of the left cross arm and the vertical position of the right cross arm)
- Repeat the above step, excluding cross arms that have already been matched, until either there are no remaining left cross arms or no remaining right cross arms.
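The greedy cross arm matching in the last two steps above can be sketched as follows, operating on the recorded vertical positions of the intersections.

```python
def match_cross_arms(left_ys, right_ys):
    """Greedy matching of intersecting cross arms: repeatedly pair the
    left/right cross arms with minimum vertical separation, remove the
    matched pair, and continue until one side is exhausted.  Inputs are
    the recorded vertical positions of the intersections; the return
    value is a list of (left_y, right_y) pairs."""
    left, right = list(left_ys), list(right_ys)
    pairs = []
    while left and right:
        # Pick the (i, j) index pair with minimum vertical separation.
        i, j = min(((i, j) for i in range(len(left)) for j in range(len(right))),
                   key=lambda ij: abs(left[ij[0]] - right[ij[1]]))
        pairs.append((left.pop(i), right.pop(j)))
    return pairs
```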
The matching cross arms for the example valid pair in Fig. 10 are shown in Fig. 11. Here, the vertical separation marks where to search for intersecting cross arms along the line fitted to the pole candidate; in this case there are two intersecting cross arms for both the left and right candidates. The vertical position of each intersecting cross arm is marked by a circle, and matching cross arms are indicated by circles of the same colour.
Choosing the best valid pair of pole candidates
We now have matched valid pairs of pole candidates and have found any matching cross arms for each valid pair. The final stage in the process is to select the best valid pair of pole candidates, which will hopefully correspond to the true power pole in the input images. Before we discuss this, however, we first treat the particular case of a valid pair of poles for which no matching cross arms have been found. In fact, we should expect this to happen fairly often, as a certain percentage of power poles do not have cross arms attached to them. In order to improve the success rate of the algorithm when it was trained on a sample set of images, it was necessary to introduce a final constraint on such valid pairs: the height of both the left and right pole candidates must be at least 90 pixels (twice the height constraint that was applied previously in this section). This extra constraint is justifiable if we consider that when the power pole has no cross arms attached to it, it is more likely that we obtain a cleaner segmentation of the pole with no break up (which often occurs due to shadowing near cross arms). Thus, in general we should expect the segmentation results for poles without cross arms to have a greater vertical height.
Once this extra constraint has been applied, we then consider the remaining valid pairs of pole candidates in order to find the best solution. If we consider two valid pairs of poles A and B, we consider that A is a better solution than B if it satisfies one of the following conditions:
- A has the same number of cross arms as B and has a greater vertical height than B
- A has more cross arms than B and has at least half the vertical height of B
- A has fewer cross arms than B and has at least twice the vertical height of B
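The three conditions above can be sketched as a single comparison function, with each valid pair summarised as a (number of cross arms, vertical height) tuple, a hypothetical format chosen for illustration.

```python
def better(a, b):
    """Is valid pair `a` a better solution than valid pair `b`?
    Each pair is a (n_cross_arms, vertical_height) tuple, following
    the three conditions above."""
    na, ha = a
    nb, hb = b
    if na == nb:
        return ha > hb            # same cross arms: taller wins
    if na > nb:
        return ha >= 0.5 * hb     # more cross arms: at least half the height
    return ha >= 2.0 * hb         # fewer cross arms: at least twice the height
```

The best valid pair is then the one that is better than every other remaining pair under this comparison.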
By comparing all the valid pairs, the best solution is determined. At this stage, only this best solution is used for further processing by the procedure. However, we envisage that in future work high level knowledge and three dimensional information could be used to test the validity of the other valid pairs found by the segmentation procedure, perhaps yielding alternative solutions.
Poles with no cross arms
In cases where there is no cross arm attached to the power pole, we must use the position of the top of the pole as a measure of where the power line is attached. The pole segmentation procedure described above may require some refinement in order to obtain the position of the top of the pole. Therefore, we refine the result given by the pole segmentation procedure using a watershed function [L Vincent and P Soille. Watersheds in digital spaces: an efficient algorithm based on immersion simulations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(6):583-598, June 1991]. The watershed function requires as input a marker for the pole and a marker for the background (called the external marker), and also a segmentation function. We discuss how these are obtained in the following three sections. We then discuss how the coordinates of the top of the pole may be obtained from the improved segmentation result.
Obtaining a pole marker
The first stage in the process is to clean and refine the previous segmentation result for the pole so as to obtain a good pole marker. The previous segmentation result for the pole is not acceptable because it often falls over the edge of the pole and we require that the pole marker lie entirely within the pole.
The steps of the process are detailed below and applied to the (left) rectified colour input image in Fig. 12a. Some points of note are the use of the skeleton operation [J Serra. Image Analysis and Mathematical Morphology. Academic Press, 1982] and the rank-max opening operation. Skeletonising the segmented pole using a minimal skeleton will generally guarantee that the pole marker lies entirely within the actual pole, as required. The rank-max opening works in much the same way as a max opening (or union of openings), where a number of different openings are performed on the input image and the maximum (or union) of all the results is taken as the final result. However, instead of each simple opening, a rank opening is performed, which is a simple rank filter followed by a dilation (the result is in fact idempotent, just like a normal opening). This filter is useful because it allows some break up of the pixels within the object being filtered. A final point is that the following process is not always guaranteed to give a result, as the filters used may remove the original segmented pole altogether. In such a case, the procedure will return a blank result and there will be no result for the pole segmentation (the original segmented pole would be considered to have been a bogus pole).
The steps are:
- Apply an 11 by 11 opening to the previous segmentation result for the pole (Fig. 12b) so as to remove any objects that are smaller than this size (including the pole object)
- Take the difference between this result and the original image so as to restore only these small objects (the effect of this operation is to remove objects that are larger than 11 by 11 and could not be true pole objects)
- Apply an area opening [L Vincent. Grayscale area openings and closings, their efficient implementation and applications. In J Serra and P Salembier, editors, Mathematical Morphology and Its Applications to Signal Processing, pages 22-27, UPC Publications, May 1993] of size 300 to remove any small debris from the image
- Apply a rank-max opening using 11 lines of length 51 distributed evenly over the range of angles [80,100] (where 90 degrees is vertical) with a rank of 0.1
- Skeletonise this result using a minimal skeleton and apply a small 3 by 3 dilation to strengthen the pole marker (Fig. 12c)
- Find the direction of the pole marker and apply a median filter to the original colour input image in this direction (Fig. 12d)
- Extract the multispectral gradient from this filtered image and fit it to an 8-bit CHAR image (Fig. 12e)
- Threshold this gradient image to find regions that have a gradient less than 20 and restrict the pole marker found above (Fig. 12c) to this region (Fig. 12f)
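The first two steps above amount to a white top-hat: an opening removes objects narrower than the window, and the difference with the original restores only those narrow objects. A minimal Python sketch of this idea on a binary image, using a 3 by 3 window rather than the 11 by 11 window of the actual procedure (the demo image and window size are illustrative only):

```python
def erode(img, r):
    """Binary erosion with a (2r+1) x (2r+1) square window."""
    h, w = len(img), len(img[0])
    return [[int(all(0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                     for dy in range(-r, r + 1) for dx in range(-r, r + 1)))
             for x in range(w)] for y in range(h)]

def dilate(img, r):
    """Binary dilation with the same square window."""
    h, w = len(img), len(img[0])
    return [[int(any(0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                     for dy in range(-r, r + 1) for dx in range(-r, r + 1)))
             for x in range(w)] for y in range(h)]

def opening(img, r):
    return dilate(erode(img, r), r)

def small_objects(img, r):
    """Opening removes objects narrower than the window; the difference
    with the original restores only those small (pole-like) objects."""
    op = opening(img, r)
    return [[img[y][x] - op[y][x] for x in range(len(img[0]))]
            for y in range(len(img))]

# demo: a 1-pixel-wide vertical bar (pole-like) and a 3x3 blob
demo = [
    [0, 1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 0, 0, 0],
]
small = small_objects(demo, 1)
```

After the top-hat, only the thin bar survives; the 3 by 3 blob is removed, mirroring how large non-pole objects are discarded.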
Obtaining the external marker
Having found the pole marker, we now turn our attention to finding an external marker for the watershed function. The process is similar to that described for finding background regions in the image, although some of the steps and parameters have changed in order to perform well on certain images. In particular, a rank-min closing is used to fill in regions in the external marker that are not linear features such as poles (the rank closing is a simple rank filter followed by an erosion, the result of which is idempotent just like a normal closing) . We also make use of the pole marker found in the previous section to make sure that the external marker does not overlap the pole.
The details are as follows and are illustrated in Fig. 13 using the colour input image in Fig. 13a:
- Transform the colour rectified input image into an HLS image
- Threshold the saturation image at 1/3 the maximum saturation value to find regions of high saturation (typically this will be the background) (Fig. 13a)
- Select the lightness component of the HLS image and apply a morphological opening with window size of 50 followed by a morphological closing of size 50 to find the background trend in the lightness component image
- Find the mean value of the background trend and threshold the lightness component to find pixels less than this value (Fig. 13b)
- Combine the results for the thresholded saturation (Fig. 13a) and lightness images (Fig. 13b) using a boolean OR operation (shown in close-up at the region of the pole in Fig. 13c)
- Take the pole marker that was previously derived (Fig. 12f), dilate it using a 5 by 5 window and remove this dilated area from the combined threshold result (Fig. 13c) so as to remove any trace of the pole from the external marker region (Fig. 13d)
- Erode this result using a 5 by 5 window to further shrink back the external marker from the pole (Fig. 13e)
- Apply a rank-min closing using 11 lines of length 51 distributed evenly over the range of angles [80,100] with a rank of 0.05
- Apply a final small erosion using a 3 by 3 window to clean this result (Fig. 13f).
Applying the watershed function

We now have both the marker for the pole and the external marker required as input to the watershed function. The segmentation function we use is the multispectral gradient that was derived during the process of finding a pole marker (see Fig. 12e). Application of the watershed function gives a good estimate of the top of the pole.
Finding the top of the pole
Now that we have an improved segmentation result for the pole in both the left and the right images, it is necessary to extract the coordinates for the top of the pole so that they may be used to find the position along the pole where the power line is attached. As illustrated in Fig.16, this is done by simply fitting a line to the segmented pole using the best-fit ellipse statistics of centroid and orientation. By inserting the top vertical position into the equation for this line, the corresponding top horizontal position is readily extracted. These are then used as the coordinates for the top of the pole . Note that we obtain coordinates for the segmented pole in both the left and right images, and no effort is made to combine the two results at this stage of the project (such as averaging the two vertical positions) .
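The line fit described above can be sketched as follows, assuming the segmented pole is supplied as a list of (x, y) pixel coordinates. The moment-based centroid/orientation formula is a standard way to obtain best-fit ellipse statistics, not necessarily the exact implementation used in the project:

```python
import math

def top_of_pole(pixels, y_top):
    """Fit a line through the segmented pole using the centroid and the
    orientation of the best-fit ellipse (from second-order moments), then
    read off the horizontal position at the given top vertical position.
    Assumes the pole is not horizontal (sin(theta) != 0)."""
    n = len(pixels)
    cx = sum(x for x, y in pixels) / n
    cy = sum(y for x, y in pixels) / n
    mxx = sum((x - cx) ** 2 for x, y in pixels) / n
    myy = sum((y - cy) ** 2 for x, y in pixels) / n
    mxy = sum((x - cx) * (y - cy) for x, y in pixels) / n
    # orientation of the major axis of the best-fit ellipse
    theta = 0.5 * math.atan2(2 * mxy, mxx - myy)
    # line through the centroid in direction (cos theta, sin theta);
    # for a near-vertical pole sin(theta) is close to 1
    x_top = cx + (y_top - cy) * math.cos(theta) / math.sin(theta)
    return x_top, y_top
```

For a perfectly vertical pole the fitted line passes through the centroid, so the top horizontal position is simply the pole's mean x coordinate.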
3D Calculation

After finding the corresponding points between the left and right 2D images, the 3D position of the same physical point P can be obtained. The parametric equation of the 3D straight line D1 in the left image is

P = C1 + λn1

while the parametric equation of the 3D straight line D2 in the right image is

P = C2 + μn2

The Ci's and ni's in the above equations are defined earlier in the specification. The 3D coordinate of P can be obtained by finding the intersection of D1 and D2 as shown in Figure 15. Because of digitisation error, or inaccurately matched points, the two lines may not meet in space. We choose the midpoint of the shortest line segment L between these two lines as the 3D position of point P. If we calculate the 3D position for every matched point on the image, a 3D surface of the scene can be obtained.
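The midpoint-of-shortest-segment construction can be sketched directly. Assuming each line is given by a point (camera centre) and a direction vector, the closest-approach parameters follow from a small 2x2 linear system:

```python
def midpoint_3d(c1, n1, c2, n2):
    """Midpoint of the shortest segment between the lines P = c1 + lam*n1
    and P = c2 + mu*n2, used as the estimated 3D position of point P.
    Assumes the lines are not parallel."""
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))
    d = [c2[i] - c1[i] for i in range(3)]
    a, b, c = dot(n1, n1), dot(n1, n2), dot(n2, n2)
    p, q = dot(n1, d), dot(n2, d)
    den = a * c - b * b           # zero only for parallel lines
    lam = (p * c - q * b) / den   # parameter on line 1
    mu = (b * p - a * q) / den    # parameter on line 2
    p1 = [c1[i] + lam * n1[i] for i in range(3)]
    p2 = [c2[i] + mu * n2[i] for i in range(3)]
    return [(p1[i] + p2[i]) / 2 for i in range(3)]
```

When the two rays actually intersect, the midpoint coincides with the intersection point; when they are skew, it lies halfway along the common perpendicular.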
The 3D positions of the tops of the power poles are calculated in the same way.
The final 3D information is calculated in the global reference system, and therefore can be related to other geographical data such as those in a GIS system.
DSM and Orthoimage Generation
Since with the resolution available at the moment it is difficult to see power lines themselves, their position is modelled based on the knowledge of where the power poles are. [Note that in the future, it may be necessary to see the power lines so that they can be identified directly, in addition to identification of the power poles . ]
This requires consecutive power poles to appear in a single image. Unfortunately, power poles are several images apart. The fact that the camera is not aimed vertically down below the aircraft can cause complications with the mosaicking of the image sequence. The oblique angle means that the size of the ground area imaged in each image pixel varies from the top to the bottom of the image. The most obvious visual effects of this are that objects near the bottom of the image, closer to the aircraft, appear larger than similarly sized objects near the top of the image and the road appears to narrow (parallel lines converge) towards the top of the image. The camera also points towards the centre of the aircraft, causing similar distortions across the image. To create the pole-to-pole DSM and orthoimage, we need to generate a DSM and orthoimage for each stereo pair.
DSM Generation
In our case the known output image grid will be the Australian Map Grid (this example being carried out in Australia) . The location of each data item is given in easting and northing coordinates which have been interpolated onto a regular grid. This means that information derived from these images can easily be related to other geographical data.
From the 3D calculation stage, each pair of matching points generates a point in the 3D space. Within the field of view of the cameras, all the 3D points which have corresponding image matching points have a 3D location measurement. However, because of complexities such as the oblique camera geometry, image resolution limitations, occlusion etc, the 3D points obtained do not lie on a regular grid.
From these irregular 3D points, interpolation was performed in the local region to obtain the 3D points on a regular grid. The grid spacings or the distances of neighbouring points in the X and Y directions need to be given. The spacing of the grid will relate to the size of the DSM.
DSM Format
VirtuoZo™ regular DSM format is a very simple plain ASCII format and therefore was adopted as our DSM format. A VirtuoZo™ DSM is defined as a grid and each node of the grid has an elevation value. The data type of the elevation is integer. All elevation values in VirtuoZo™ DSM data use the decimeter as their unit. -99999 is set as the elevation value for an empty node, which means it does not have an elevation value (we adapted -99999 to -9999 to suit our short integer storage for an elevation value). The order of elevation data is from left to right (-X to +X) and from bottom to top (-Y to +Y), and each DSM row starts from the beginning of a line.
There are seven numbers in the first line of a VirtuoZo™ DSM file:
1. X0 - X-coord. of the left-bottom node (unit: centimeter)
2. Y0 - Y-coord. of the left-bottom node (unit: centimeter)
3. Angle - The DSM rotation angle in the XY-plane (unit: degree, and anti-clockwise is the positive direction)
4. DX - Spacing interval in the X-direction (unit: centimeter)
5. DY - Spacing interval in the Y-direction (unit: centimeter)
6. NX - Number of points in the X-direction
7. NY - Number of points in the Y-direction
Here is an example of a VirtuoZo™ DSM file (only the beginning of the file is shown) :
6591.732 -1571.581 0.007352 20.00 20.00 66 113
-7567 -7593 -7601 -7574 -7606 -7567 -7608 -7609 -7597 -7602
-7604 -7577 -7621 -7590 -7577 -7633 -7639 -7649 -7614 -7632
-7631 -7632 -7647 -7626 -7600 -7602 -7630 -7603 -7621 -7613
-7620 -7604 -7616 -7620 -7573 -7582 -7574 -7563 -7555 -7530 -7484 -7512 -7436 -7484 -7505 -7492 -7472 -7494 -7572 -7568
-7572 -7586 -7579 -7602 -7614 -7586 -7608 -7603 -7602 -7569 -7585 -99999 -99999 -99999 -99999 -99999
-7555 -7570 -7556 -7583 -7597 -7576 -7602 -7592 -7581 -7587 -7585 -7585 -7587 -7586 -7605 -7612 -7626 -7639 -7609 -7622
-7613 -7632 -7609 -7617 -7602 -7624 -7624 -7617 -7622 -7607
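Reading the seven header numbers can be sketched as follows, assuming the header fields are whitespace-separated as in the example above (the field names follow the list given earlier; this is an illustrative helper, not project source code):

```python
def parse_dsm_header(first_line):
    """Parse the seven numbers in the first line of a VirtuoZo DSM file:
    X0, Y0 (cm), Angle (deg), DX, DY (cm), NX, NY (point counts)."""
    fields = first_line.split()
    x0, y0, angle, dx, dy = (float(v) for v in fields[:5])
    nx, ny = int(fields[5]), int(fields[6])
    return {"X0": x0, "Y0": y0, "Angle": angle,
            "DX": dx, "DY": dy, "NX": nx, "NY": ny}

header = parse_dsm_header("6591.732 -1571.581 0.007352 20.00 20.00 66 113")
```

The remaining lines would then be read as NX x NY integer elevations (decimeters), with -9999 marking empty nodes under our adapted convention.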
Figure 16 (a) shows a disparity map used to calculate the 3D positions of those matched points in the image .
Figure 16 (b) shows an image of the DSM obtained. It gives a view of the terrain as if the camera is looking directly above and without perspective distortions. The black colour around the borders of the image indicates missing values . Regions with low altitudes are shown in blue and regions with high altitudes are shown in red.
Orthoimage Generation
After the DSM of a local region has been obtained, its corresponding orthorectified image or orthoimage can be generated. Generating an orthoimage needs the DSM, camera parameters (internal and external) and the original image .
For each element in the DSM, we know its 3D X, Y, Z position. Using the known camera parameters, its corresponding image location of a DSM position can be obtained using the collinearity equations given below. Therefore we have a corresponding image intensity value related to this DSM position.
This obtained image location is unlikely to be at the exact centre of a pixel in the input image grid. The interpolated image intensity value in the output image is determined by a nominated re-sampling scheme. In the examples presented here, the simplest possible re-sampling scheme is used. The pixel intensity from the input image pixel closest to the location corresponding to the given output pixel is transferred to the output image. This is known as nearest neighbour re-sampling. Other common resampling schemes are bilinear re-sampling (combining the intensities from the surrounding 2x2 pixel region) and cubic convolution re-sampling (combining the intensities from the surrounding 4x4 pixel region).
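The two simplest re-sampling schemes can be sketched as follows, assuming the image is a row-major list of rows and (x, y) are continuous input-image coordinates:

```python
def nearest_neighbour(img, x, y):
    """Transfer the intensity of the input pixel closest to (x, y)."""
    return img[int(round(y))][int(round(x))]

def bilinear(img, x, y):
    """Combine the intensities of the surrounding 2x2 pixel region,
    weighted by the fractional distances to the four neighbours."""
    ix, iy = int(x), int(y)
    fx, fy = x - ix, y - iy
    top = img[iy][ix] * (1 - fx) + img[iy][ix + 1] * fx
    bot = img[iy + 1][ix] * (1 - fx) + img[iy + 1][ix + 1] * fx
    return top * (1 - fy) + bot * fy

grid = [[0, 10], [20, 30]]
```

Nearest neighbour preserves original intensity values exactly (useful when intensities are class labels), while bilinear gives smoother output at the cost of blending neighbouring intensities.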
The appropriate relationship between any two airborne video images or between an airborne video image and the ground is given by the collinearity equations . The derivation of these equations is given in [J Trinder. Course notes from GMAT6512: Principles of image geometry. Technical Report, School of Geomatic Engineering, University of New South Wales, 1995] and they are described as part of the ' 'differential rectification'' process in [K Novak. Rectification of digital imagery. Photogrammetric Engineering and Remote sensing, 58(3) :339- 344, March 1992] . The collinearity equations are:
xj - x0 = -f ( m11(Xj - Xc) + m12(Yj - Yc) + m13(Zj - Zc) ) / ( m31(Xj - Xc) + m32(Yj - Yc) + m33(Zj - Zc) )

yj - y0 = -f ( m21(Xj - Xc) + m22(Yj - Yc) + m23(Zj - Zc) ) / ( m31(Xj - Xc) + m32(Yj - Yc) + m33(Zj - Zc) )
where

xj and yj are the image coordinates of object j in CCD array units

x0 and y0 are the displacement between the actual origin of the image coordinate system and the true origin defined by the principal point

f is the camera focal length

Xj, Yj and Zj are the coordinates of object j in the reference (output) grid

Xc, Yc and Zc are the coordinates of the camera position in the reference grid

m11 ... m33 are the elements of a 3x3 matrix which is a function of the 3 rotations of the camera coordinate system about the X, Y and Z axes.
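A sketch of the collinearity projection follows. The rotation order X, then Y, then Z for building the matrix m is an assumption for illustration; the actual rotation convention used in the project is not specified here:

```python
import math

def rotation_matrix(omega, phi, kappa):
    """Rotation matrix from three camera rotations about the X, Y and Z
    axes (applied in that order; the order is an illustrative assumption)."""
    co, so = math.cos(omega), math.sin(omega)
    cp, sp = math.cos(phi), math.sin(phi)
    ck, sk = math.cos(kappa), math.sin(kappa)
    rx = [[1, 0, 0], [0, co, -so], [0, so, co]]
    ry = [[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]]
    rz = [[ck, -sk, 0], [sk, ck, 0], [0, 0, 1]]
    def matmul(a, b):
        return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
                for i in range(3)]
    return matmul(rz, matmul(ry, rx))

def collinearity(m, f, cam, obj, x0=0.0, y0=0.0):
    """Project object point (Xj, Yj, Zj) into image coordinates (xj, yj)
    using the collinearity equations above."""
    dx, dy, dz = (obj[i] - cam[i] for i in range(3))
    den = m[2][0] * dx + m[2][1] * dy + m[2][2] * dz
    xj = x0 - f * (m[0][0] * dx + m[0][1] * dy + m[0][2] * dz) / den
    yj = y0 - f * (m[1][0] * dx + m[1][1] * dy + m[1][2] * dz) / den
    return xj, yj
```

With no rotation and the camera looking straight down, the projection reduces to the familiar pinhole scaling by f/Z.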
VirtuoZo™ Image Format
Our orthoimage mosaicking program uses VirtuoZo™ image format as the input and output image format. A
VirtuoZo™ image is defined as a raster which runs from left to right (-X to +X) and from bottom to top (-Y to +Y), which means that VirtuoZo™ image data and the DSM data are in the same arrangement and can be easily matched with the coordinate system. There is a 128-byte header at the beginning of each VirtuoZo™ image file, followed by row-by-row image intensity data. The image intensity value is stored using the ''unsigned char'' data type. Two subroutines are used to read and write the header information of a VirtuoZo™ image. These are rd_header() and wr_header(), which can be found in the source code files.
VirtuoZo™ image format can handle both grey (256 intensity levels) and true colour images (24-bit with red, green and blue channels) . The declaration of grey or true colour mode is specified in one of the members of the data structure ImgHeader. If the member ''hi'' is set to 1, a true colour image is defined, otherwise the image is in grey.
Mosaicking of DSMs and Orthoimages

Once the images are aligned on a common grid, mosaicking is performed to overlay the images on each other. In the regions where the images overlap, a decision needs to be made about how to determine what intensity values are transferred to the output image. The options are to place one image on top of (overwrite) the other or to average the intensity values of the two (or more) images. The latter option can produce a visually smoother image by smoothing any overall brightness differences between the images, but if the alignment of the images is not sufficiently accurate, it can also cause spatial blurring of image features in the overlap region. Since the time lapse between successive images is minimal, and hence illumination differences should be minimal, the overwriting option will be used for these examples.

Analysis of a large area requires the creation of a mosaic, which is composed of a series of DSMs or orthoimages. In our case, the orthoimages have been obtained from the original oblique images. The following sections give a brief description of the mosaicking process for the orthorectified oblique images.
DSM Merger

The first step is to merge several DSMs into a single large DSM. The input DSMs were generated as described in previous sections, and all DSMs should have the same spacing intervals and the same rotation angles relative to the local coordinate system before the merging process starts. Figure 17 illustrates two DSM grids which need to be merged. The shaded areas show the overlapping regions. Two DSMs being merged may have different elevation values in the overlapping area. Our goal in merging DSMs is to diminish the elevation differences in the overlapping area. If the grid nodes from the two DSMs do not fall on exactly the same planimetric locations (Figure 17(b)), the second DSM needs to be resampled. This is done using bilinear interpolation. In the simpler case, where the grid nodes match exactly the same planimetric locations (see Figure 17(a)), the averaged elevation values are simply taken as the final elevation values.
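The exact-overlap case (grid nodes at the same planimetric locations) can be sketched as follows: average where both grids have data, otherwise keep whichever value is valid. The resampling case would first interpolate the second DSM bilinearly. The NODATA value follows the adapted -9999 convention described earlier:

```python
NODATA = -9999  # empty-node value used in our DSM files

def merge_dsms(a, b):
    """Merge two DSM grids whose nodes coincide planimetrically:
    average the integer elevations (decimeters) where both have data,
    otherwise keep the single valid value."""
    out = []
    for row_a, row_b in zip(a, b):
        row = []
        for za, zb in zip(row_a, row_b):
            if za == NODATA:
                row.append(zb)
            elif zb == NODATA:
                row.append(za)
            else:
                row.append((za + zb) // 2)  # keep integer elevation
        out.append(row)
    return out

merged = merge_dsms([[-7567, NODATA]], [[-7569, -7600]])
```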
A program called ''mos_dem'' was developed to merge several DSMs into a single large DSM in this project. The usage of mos_dem is as follows:
% mos_dem < index_file >
where < index_file > is a text file which includes the total number of DSMs to be merged and each DSM file name. The contents of an example index file are:
6 11155610 11155621 11155707 11155718 11155804 11155901
where the number in the first line indicates a total of 6 DSMs to be merged. This is followed by the first file name ''11155610'' without ''.dem'' extension and so on. The default output DSM file is called ' 'mosaick.dem' ' .
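Parsing such an index file can be sketched as follows. Splitting on any whitespace covers both the single-line layout shown above and a one-name-per-line layout (this helper is illustrative, not the project's C source):

```python
def read_index(text):
    """Parse an index file: the first token is the number of DSMs,
    followed by that many base file names (no ''.dem'' extension)."""
    tokens = text.split()
    count = int(tokens[0])
    names = tokens[1:1 + count]
    if len(names) != count:
        raise ValueError("index file lists fewer names than its count")
    return names

names = read_index("6 11155610 11155621 11155707 11155718 11155804 11155901")
```

A merging tool would then open, e.g., ''11155610.dem'' for each base name and write the combined result to ''mosaick.dem''.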
Orthoimage Mosaicking

A program called ''mos_ortho'' was developed to mosaic several orthoimages into a single large orthoimage. Since the input orthoimages are in a pure image format, there is no geometric reference information inside the orthoimages themselves; therefore their companion DSMs need to be provided along with the orthoimages in order to mosaic the orthoimages geometrically. The usage of ''mos_ortho'' is as follows:
% mos_ortho < index_file >
where < index_file > is the same Index file used for DSM merger described in the previous section. ' 'mos_ortho' ' not only reads orthoimage data but also reads its corresponding DSM data. For example, the first file name in the above < index_file > example is ''11155610'', and so ' '11155610. ortho ' ' and '' 11155610.dem' ' will be read by ' 'mos_ortho' ' . The default output mosaicked orthoimage file is called ' 'mosaick. ortho' ' .
Format Conversion
A format converter called ''ortho2pgm'' was written since VirtuoZo™ image format cannot be recognised by most popular image display and processing software. The usage of ''ortho2pgm'' is as follows:
ortho2pgm < vir_image > < pgm_image >
where < vir_image > is the input image in VirtuoZo image format and < pgm_image > is the specified output image in raw PGM format.
Similarly, an alternative format converter called ' 'dem2pgm' ' was written in order to ''see'' a DSM visually. ' 'dem2pgm' ' transfers a DSM into a grey image, in which the values represent the elevations . The usage of ' 'dem2pgm' ' is as follows:
% dem2pgm < DSM > < PGM >
where < DSM > is the input DSM file name and < PGM > is the specified output image in raw PGM format.
Discussion
When two or more orthoimages are mosaicked into a single larger image, a problem which frequently arises is the creation of spurious ''edges'' at the seams of the input images. These edges occur when there are perceptible discontinuities in image patterns or intensity differences within the region of overlap. These artificial edges will cause trouble in image interpretation and analysis. Several techniques can be used to solve both types of problem [Y Afek and A Brand. Mosaicking of orthorectified aerial images. Photogrammetric Engineering and Remote Sensing, 64(2):115-125, 1998. S Hummer-Miller. A digital mosaicking algorithm allowing for an irregular join "line". Photogrammetric Engineering and Remote Sensing, 55(1):43-47, 1989. V Sequeira, K Ng, E Wolfart, J G M Goncalves, and D Hogg. Automated reconstruction of 3D models from real environments. ISPRS Journal of Photogrammetry and Remote Sensing, 54:1-22, 1999. Z Wang. Principles of Photogrammetry (with Remote Sensing). Publishing House of Surveying and Mapping, Beijing, 1990. C Slama. Manual of Photogrammetry. American Society of Photogrammetry, fourth edition, 1980].
Since the mosaicking of orthorectified oblique images in this project is more complicated than conventional vertical satellite and aerial image mosaicking, it requires more resources to develop more rigorous algorithms and is therefore beyond the scope of this project.

Measuring the Distance of Trees from Power Lines
This section explains how the distance from the power line to neighbouring trees and vegetation is calculated. It then describes a method of determining if the vegetation lies within an envelope centred about the power line. The calculation of distances from the power line to the underlying surface is performed after the full 3D surface between two power poles is generated, using stereo matching and mosaicking.
The method is illustrated using a span of mosaicked images of landscapes which contains a power line suspended between two power poles, ideally situated at each end of the span of images. If this is not so, the power poles are situated within the span of images.
Input Data (DSM)
The input data depicts the plan view of a landscape (3D surface) stored in DSM data format (described in an earlier section). A pixel in the image has a grey level intensity which corresponds to the height (integer) of the landscape at that pixel point. Vegetation and tall objects are not clearly distinguishable in the input data except that they are higher than the surface (as visualized by the naked eye). This is due to the reconstruction of the 3D surface and the mosaicking of images to form a span.
Now, given the location of two power poles with respect to the coordinates at the bottom left corner of the DSM data format, we can calculate the location of the power lines modelled by a catenary, the envelope that surrounds the power lines and whether vegetation lies within this envelope.
Power Poles and Clearance Spans

The minimum safe distance of an object to the power line is known as a clearance. This measure is given for a specific conductor type and span length of a power line (the Euclidean distance between two power poles when viewed from the top). Clearances to the power line nearer the power pole are smaller than clearances along the centre 2/3 span of the power line (see Figure 18). In this example, a clearance distance of 150cm is used for near spans (1/6 at the beginning and end of the power line) and 200cm for centre spans. Each clearance distance represents the radius of a circular envelope in 2D space centred about the power line when viewed along the power line, and hence forms a cylindrical envelope along the span of the power line in 3D space.
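The choice of clearance by position along the span can be sketched as follows, assuming r is the fractional position (0 at one pole, 1 at the other) of a point's perpendicular foot along the span, and using the example distances from the text:

```python
def clearance_cm(r):
    """Clearance radius at fraction r along the span: 150 cm for the
    near spans (first and last 1/6 of the power line), 200 cm for the
    centre 2/3 (example values from the text)."""
    if r < 1.0 / 6.0 or r > 5.0 / 6.0:
        return 150
    return 200
```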
Power Line and Catenary
A flexible, inelastic power line with constant load (W) per unit of arc length suspended between two power poles assumes the shape of a catenary. The equation of this catenary is given as:

z = C (cosh(y/C) - 1)
where C = [H/W] is a catenary constant, H is the horizontal component of tension (5000 N) and W is the resultant distributed conductor load (0.15 N/m). As shown in Figure 19, z denotes the perpendicular height between the power line and the tangent intersecting the lowest point on the catenary (y-axis). The top and bottom poles correspond to the top and bottom of the input data (DSM). The fixed height of a power pole is 800 cm. By using the catenary equation, we can calculate the horizontal distances from the lowest point on the catenary to each of the power poles, denoted by yt and yb. Given the two poles we rewrite the catenary equations:
zt = C (cosh(yt/C) - 1), zb = C (cosh(yb/C) - 1)
The sum of yt and yb denotes the span length of the power line, L. The absolute difference in height of the 2 power poles, |ht - hb|, is the same as |zt - zb|, which we will call h. Let us assume that zt > zb; then yt > yb. Given these equations we can derive the following pair of equations:

yt = C sinh^-1(K) + L/2, yb = -C sinh^-1(K) + L/2

where K = h / (2C sinh(L/(2C))).
Now if zt < zb, then yt < yb and we obtain:

yt = -C sinh^-1(K) + L/2, yb = C sinh^-1(K) + L/2
For the simple case of zt = zb, we use yt = yb = L/2. We can now calculate yt and yb and hence zt and zb. Using the pole heights, we calculate zd = ht - zt (or hb - zb), the height difference between the tangent intersecting the catenary (y-axis) and ''ground zero''. This value is used to calculate the perpendicular height of a point on the catenary line to ''ground zero''. ''Ground zero'' is a line of reference when determining if an object lies within the catenary envelope.
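The yt/yb calculation and the catenary height can be sketched as follows, with K = h / (2C sinh(L/(2C))), which follows from subtracting the two catenary equations. The example numbers H = 5000 N and W = 0.15 N/m from the text give C = H/W:

```python
import math

def catenary_span(L, h, C, top_higher=True):
    """Horizontal distances yt, yb from the lowest point of the catenary
    to the top and bottom poles, satisfying yt + yb = L and
    z(yt) - z(yb) = +/- h."""
    K = h / (2 * C * math.sinh(L / (2 * C)))
    d = C * math.asinh(K)
    if top_higher:            # zt > zb implies yt > yb
        return L / 2 + d, L / 2 - d
    return L / 2 - d, L / 2 + d

def catenary_z(y, C):
    """Height of the power line above the lowest point of the catenary."""
    return C * (math.cosh(y / C) - 1)

C = 5000 / 0.15           # catenary constant H/W from the example values
yt, yb = catenary_span(100.0, 5.0, C)
```

For equal pole heights (h = 0) the formulas reduce to yt = yb = L/2, matching the simple case in the text.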
Catenary Envelope
The locations of the power poles determine the linear orientation of the suspended power line when viewed from the top (XY axis, 2D) . Using linear coordinate geometry, we can determine the perpendicular distances of all points in the image (landscape) to the power line.
The span length (L) of the power line is given by the Euclidean distance between the 2 power poles . Consider a point C[XC, YC] in the image and the power line TB suspended between 2 power poles T[XT, YT] (top pole) and B [XB, YB] (bottom pole) . Let I[XI, YI] be a point of perpendicular projection of C onto TB and:
r = ( (YT - YC)(YT - YB) - (XT - XC)(XB - XT) ) / L^2

s = ( (YT - YC)(XB - XT) - (XT - XC)(YB - YT) ) / L^2
The distance from T to I is rL and from C to I is sL. These distances are used to determine which clearances to use with respect to the power line span. If 0 ≤ r ≤ 1, then the perpendicular foot of point C falls within the span and C may lie inside the catenary envelope defined by a clearance value. Once a point C is found inside the catenary envelope, we calculate the coordinates of point I, XI = XT + r(XB - XT) and YI = YT + r(YB - YT). All calculations made up to this point have been performed in the XY plane. With these XY coordinates we calculate the height, ZI, of point I on the power line to ''ground zero'' using the given catenary equations. The following calculations are performed in the XZ plane. We calculate the Euclidean distance between points C[XC, ZC] and I[XI, ZI] and determine if it is less than the clearance, which implies that point C (an object) lies within the catenary envelope.
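The r and s ratios and the foot point I can be computed directly; a sketch, with points given as (X, Y) tuples:

```python
def project_onto_line(C, T, B):
    """Perpendicular projection of point C onto the power line TB in the
    XY plane: r is the fractional position of the foot I along the span,
    |s| * L is the perpendicular distance from C to the line."""
    L2 = (B[0] - T[0]) ** 2 + (B[1] - T[1]) ** 2
    r = ((T[1] - C[1]) * (T[1] - B[1]) - (T[0] - C[0]) * (B[0] - T[0])) / L2
    s = ((T[1] - C[1]) * (B[0] - T[0]) - (T[0] - C[0]) * (B[1] - T[1])) / L2
    I = (T[0] + r * (B[0] - T[0]), T[1] + r * (B[1] - T[1]))
    return r, s, I

# a point 3 units to the side of a horizontal span of length 10
r, s, I = project_onto_line((5, 3), (0, 0), (10, 0))
```

A full clearance test would then compare the distance from C to I (using the catenary height ZI) against the clearance radius for position r along the span.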
Final Output
Figure 20 shows a sample output of power pole positions, a power line and catenary envelope, landscape and trees. Figure 20(a) (XY plane) is a top view of the landscape with the position of the power poles and power line represented as green pixels 10. The grey level intensities are proportional to the height of each pixel in the landscape. The catenary envelope is represented as blue pixels 11 and vegetation found within the catenary envelope is represented as red pixels 12. The top power pole corresponds to the top of the image.
Figure 20(b) (YZ plane) is a side view of the landscape, viewed from the right side of the first panel. Each pixel point in the landscape image has a height. The heights are scaled accordingly and plotted as points along the Z-axis. Points along the X-axis (projecting out of the image) are plotted with higher grey-level intensities as they move closer to the viewer. Again, green pixels 10 represent the power line modelled by a catenary and red pixels 12 represent vegetation found within the catenary envelope.

Figure 20(c) (YZ plane) is another side view of the landscape, viewed from the right side of the first panel. It is similar to the last panel except that the landscape points plotted along the X-axis are bounded by the catenary envelope (in the XY plane).

The calculation of distance from the power line to points in the surroundings is based on a 3D surface which is a mosaic of several 3D surfaces between the power poles. The quality of the 3D surface and the precision of the position of the poles will determine the precision of the distance measurement from the conductor. The distance measurement itself does not create inaccuracy. It does not contain any intensive calculations, and thus it is not likely to add significantly to the total processing time.
3D Visualisation
The stereo processing produces a DSM of the terrain. The visualisation software allows interactive viewing of the terrain model overlaid with the original images and estimated power pole positions. This section describes the process used to convert the DSM files to a form that can be displayed by the software, and the use of the software.
Converting the DSM into Triangles
The DSM files are high resolution, dense height maps (approximately 1000x4000 pixels). The simplest way to display such an object in three dimensions is to break it up into triangles. Unfortunately a simple-minded approach produces far too many triangles for typical display hardware to handle in an interactive fashion. Approximations that dramatically reduce the number of triangles while only introducing small errors are essential. A free software package called terra has been used to do this.
Terra attempts to minimize the number of triangles and the error in the approximation. It employs a greedy insertion algorithm that searches for the point with the highest error and inserts a new vertex there. A Delaunay triangulation is used to update the mesh in an efficient manner. Details of the algorithms used are described in a technical report available from http://www.cs.cmu.edu/~garland/scape/scape.ps.gz
Adding Power Poles and Power Lines
Knowing the 3D positions of the tops of the power poles within a span of the images and the 3D terrain or vegetation surface, power poles can be drawn above the surface with the calculated height. Cross arms can also be modelled. From the 3D positions of the tops of the poles, the catenary equations of power lines can be calculated and modelled.
In the above embodiment, the system and method locate the power poles and then predict the position of the power lines. In alternative embodiments, the position of the power lines may be directly obtained. This may be done by looking for curved features which are within a certain height range off the ground.
Note that it may be necessary even where it is possible to directly find the power lines, to continue to identify the power poles.
When referring to "image" in the above description it will be appreciated that the image is usually generated by computer, and the term "image" in this case should be taken to cover not only a produced visual image, but also the data stored in the computer from which the image is produced.
Modifications and variations as would be apparent to a skilled addressee are deemed to be within the scope of the present invention.

Claims
1. A corridor mapping system for automatically determining the status of infrastructure extending within a corridor, the system comprising stereo image processing means, arranged to process a plurality of stereo image pairs taken of the corridor, in order to provide three dimensional information on the arrangement of components within the corridor, infrastructure identification means for identifying infrastructure components from the images, and status determination means for determining status of infrastructure from the images .
2. A corridor mapping system in accordance with claim 1, wherein the status determination means is arranged to determine encroachment of components within the corridor on the infrastructure components.
3. A corridor mapping system in accordance with claim 2 , wherein the components include vegetation and the status determination means is arranged to determine encroachment of vegetation on the infrastructure.
4. A corridor mapping system in accordance with claim 1, 2 or 3 , wherein the status determination means is arranged to identify and determine the geographic position of faults associated with the infrastructure components.
5. A corridor mapping system in accordance with claim 2 or claim 3, wherein the infrastructure components include infrastructure lines, strung between infrastructure poles.
6. A corridor mapping system in accordance with claim 5, wherein the infrastructure identification means is arranged to utilise an image segmentation process in order to identify the infrastructure lines from the images .
7. A corridor mapping system in accordance with claim 5, wherein the infrastructure identification means is arranged to identify the infrastructure poles from the images .
8. A corridor mapping system in accordance with claim 7, wherein the infrastructure identification means is arranged to identify the infrastructure poles by a process of identifying pole candidates in each of the left and right images of the stereo images and then matching the pole candidates and determining the most likely pole candidates to be those that form an approximate "V" shape when matched, the left arm of the V being from the right stereo image and the right arm of the V being from the left image .
9. A corridor mapping system in accordance with claim 7, wherein the infrastructure identification means is arranged to use an image segmentation process to identify the poles.
10. A corridor mapping system in accordance with claim 7, 8 or 9, the infrastructure identification means being arranged to determine the relative positions of the poles and to infer, from knowledge of the position of poles, the positions of the infrastructure lines within the corridor.
11. A corridor mapping system in accordance with claim 10, wherein the infrastructure identification means is arranged to apply a mosaicing process in order to mosaic DSMs between pairs of poles together, in order to create a pole to pole DSM.
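The pole-to-pole DSM mosaicing of claim 11 can be sketched as joining adjacent digital surface model (DSM) strips. The column-overlap layout and the linear blend across the shared region are assumptions for illustration; the claim itself does not fix a blending scheme, and a production mosaic would also resolve geo-referencing and occlusions.

```python
import numpy as np

def mosaic_dsms(dsm_a, dsm_b, overlap):
    """Join two adjacent DSM strips that share `overlap` columns.

    dsm_a, dsm_b: 2D height arrays (rows x cols) covering the corridor
    between successive poles; the last `overlap` columns of dsm_a image
    the same ground as the first `overlap` columns of dsm_b.  Heights in
    the shared region are linearly blended to avoid a visible seam.
    """
    if overlap <= 0:
        return np.hstack([dsm_a, dsm_b])
    # Blend weights across the shared columns: 1 -> 0 for strip A.
    w = np.linspace(1.0, 0.0, overlap)
    blended = dsm_a[:, -overlap:] * w + dsm_b[:, :overlap] * (1.0 - w)
    return np.hstack([dsm_a[:, :-overlap], blended, dsm_b[:, overlap:]])
```

Chaining this over successive pole pairs yields one continuous pole-to-pole DSM of the corridor floor and canopy.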
12. A corridor mapping system in accordance with claim 10, wherein the image identification means is arranged to infer a catenary envelope for an infrastructure line for each pole to pole image information.
13. A corridor mapping method, for automatically determining the status of infrastructure extending within a corridor, the method comprising the steps of processing a plurality of stereo image pairs taken of the corridor, in order to provide three dimensional information on the arrangement of components within the corridor, identifying infrastructure components from the images, and determining the status of the infrastructure components from the images.
14. A method in accordance with claim 13, wherein the step of determining the status of the infrastructure from the images includes the step of determining the encroachment of components within the images on the infrastructure components.
15. A method in accordance with claim 14, wherein the components include vegetation, and the step of determining includes the step of determining encroachment of vegetation components on the infrastructure components.
16. A method in accordance with claim 13, 14 or 15, wherein the step of determining the status of the infrastructure includes the step of determining the geographic position of faults associated with the infrastructure components.
17. A method in accordance with claim 16, wherein the infrastructure components include infrastructure lines strung between infrastructure poles, and the step of identifying infrastructure components includes the step of carrying out an image segmentation process to identify the infrastructure lines from the images.
18. A method in accordance with claim 15, wherein the infrastructure components include infrastructure lines strung between infrastructure poles, and the step of identifying infrastructure components includes the steps of identifying the infrastructure poles.
19. A method in accordance with claim 18, wherein the step of identifying the infrastructure poles includes the step of carrying out an image segmentation process.
20. A method in accordance with claim 18 or claim 19, wherein the step of identifying the infrastructure components includes the step of inferring the positioning of the infrastructure lines from the position of the identified poles.
21. A method in accordance with claim 20, wherein the step of inferring the positions of the infrastructure lines includes the step of mosaicing DSMs together between pairs of poles in order to create a pole to pole DSM.
22. A method in accordance with claim 20 or claim 21, wherein the step of inferring the position of the infrastructure lines includes the step of inferring the catenary envelope.
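The catenary envelope of claims 12 and 22 follows the standard hanging-line equation y = c + a·cosh((x − x0)/a). A minimal sketch, assuming a level span with equal attachment heights at the two poles and a known catenary parameter a (in practice derived from conductor tension and weight per unit length):

```python
import math

def catenary_height(x, a, x0, c):
    """Height of a hanging line at horizontal position x: y = c + a*cosh((x - x0)/a)."""
    return c + a * math.cosh((x - x0) / a)

def symmetric_span(span, attach_h, a):
    """Catenary parameters for a level span of width `span` with both
    attachments at height `attach_h`.  Returns (x0, c) such that
    catenary_height(0) == catenary_height(span) == attach_h.
    """
    x0 = span / 2.0  # lowest point at mid-span for equal attachment heights
    c = attach_h - a * math.cosh((0.0 - x0) / a)
    return x0, c

def min_clearance(span, attach_h, a, ground_h=0.0):
    """Smallest gap between the line and flat ground across the span."""
    x0, c = symmetric_span(span, attach_h, a)
    low_point = catenary_height(x0, a, x0, c)  # cosh(0) == 1, so this is a + c
    return low_point - ground_h
```

An encroachment check in the spirit of claims 2, 3, 14 and 15 would then compare vegetation heights from the pole-to-pole DSM against `catenary_height` minus a safety clearance at each position along the span.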
23. A computer program arranged, when loaded into a computing system, to control the computing system to implement a method in accordance with any one of claims 13 to 22.
24. A computer readable medium, storing instructions for controlling a computing system to implement a method in accordance with any one of claims 13 to 22.
PCT/AU2002/000427 2001-04-04 2002-04-04 Corridor mapping system and method WO2002082181A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
AUPR4199A AUPR419901A0 (en) 2001-04-04 2001-04-04 Automated corridor mapping
AUPR4199 2001-04-04

Publications (1)

Publication Number Publication Date
WO2002082181A1 true WO2002082181A1 (en) 2002-10-17

Family

ID=3828199

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/AU2002/000427 WO2002082181A1 (en) 2001-04-04 2002-04-04 Corridor mapping system and method

Country Status (2)

Country Link
AU (1) AUPR419901A0 (en)
WO (1) WO2002082181A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113627654B (en) * 2021-07-16 2024-02-27 上海市园林科学规划研究院 Urban ecological corridor construction method and device based on fitness and connectivity

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB223090A (en) * 1923-12-27 1924-10-16 Louis Breguet Wire-drawing bench
US5557397A (en) * 1994-09-21 1996-09-17 Airborne Remote Mapping, Inc. Aircraft-based topographical data collection and processing system
JP2000074668A (en) * 1998-09-03 2000-03-14 Asahi Optical Co Ltd Road illustrating device, road illustrating method, and recording medium recording road illustrating program
JP2000074667A (en) * 1998-09-03 2000-03-14 Asahi Optical Co Ltd Road illustrating device, road illustrating method, and recording medium recording road illustrating program

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
PATENT ABSTRACTS OF JAPAN *

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7298869B1 (en) 2003-07-21 2007-11-20 Abernathy Donald A Multispectral data acquisition system and method
EP1610090A2 (en) * 2004-06-23 2005-12-28 Kabushiki Kaisha Topcon Method for associating stereo images and three-dimensional data preparation system
EP1610090A3 (en) * 2004-06-23 2009-12-30 Kabushiki Kaisha Topcon Method for associating stereo images and three-dimensional data preparation system
US7804996B2 (en) 2004-06-23 2010-09-28 Kabushiki Kaisha Topcon Method for associating stereo image and three-dimensional data preparation system
ITBG20090006A1 (en) * 2009-03-05 2010-09-06 Marina Bay S A PRECISION APPARATUS FOR AIR TERRITORY MONITORING
CN101900549A (en) * 2010-07-02 2010-12-01 北京师范大学 Integrated portable remote-control vegetation coverage measuring instrument
US8630755B2 (en) 2010-09-28 2014-01-14 Kabushiki Kaisha Topcon Automatic taking-off and landing system
US8666571B2 (en) 2011-01-04 2014-03-04 Kabushiki Kaisha Topcon Flight control system for flying object
US9020666B2 (en) 2011-04-28 2015-04-28 Kabushiki Kaisha Topcon Taking-off and landing target instrument and automatic taking-off and landing system
US9544575B2 (en) 2011-09-28 2017-01-10 Kabushiki Kaisha Topcon Image acquiring device and image acquiring system
US10602129B2 (en) 2011-09-28 2020-03-24 Kabushiki Kaisha Topcon Image acquiring device and image acquiring system
WO2018213927A1 (en) * 2017-05-23 2018-11-29 Lux Modus Ltd. Automated pipeline construction modelling
US11543080B2 (en) 2017-05-23 2023-01-03 Lux Modus Ltd. Automated pipeline construction modelling
CN107809589A (en) * 2017-11-08 2018-03-16 天津市普迅电力信息技术有限公司 A kind of power circuit unmanned plane inspection operation auxiliary supports instrument
CN108282633A (en) * 2018-01-09 2018-07-13 深圳飞马机器人科技有限公司 Unmanned plane real-time video image coordinate indicating means, system and terminal
CN108986165A (en) * 2018-07-11 2018-12-11 云南电网有限责任公司电力科学研究院 The monitoring method and device of a kind of high-voltage line to tree electric discharge safe spacing
CN109118565A (en) * 2018-07-13 2019-01-01 深圳供电局有限公司 A kind of electric power corridor threedimensional model texture mapping method taking shaft tower power line into account and blocking
CN109118565B (en) * 2018-07-13 2023-05-30 深圳供电局有限公司 Electric power corridor three-dimensional model texture mapping method considering shielding of pole tower power line
US11879732B2 (en) 2019-04-05 2024-01-23 Ikegps Group Limited Methods of measuring structures
US11348191B2 (en) 2020-03-31 2022-05-31 Honda Motor Co., Ltd. System and method for vehicle reporting electrical infrastructure and vegetation twining
CN112669459A (en) * 2020-12-25 2021-04-16 北京市遥感信息研究所 Satellite image optimal mosaic line generation method based on feature library intelligent decision
CN112669459B (en) * 2020-12-25 2023-05-05 北京市遥感信息研究所 Satellite image optimal mosaic line generation method based on feature library intelligent decision
US20220414362A1 (en) * 2021-06-28 2022-12-29 Sharper Shape Oy Method and system for optimizing image data for generating orthorectified image
WO2023275431A1 (en) * 2021-06-28 2023-01-05 Sharper Shape Oy Method and system for optimizing image data for generating orthorectified image

Also Published As

Publication number Publication date
AUPR419901A0 (en) 2001-05-03

Similar Documents

Publication Publication Date Title
US11080911B2 (en) Mosaic oblique images and systems and methods of making and using same
Schenk et al. Fusion of LIDAR data and aerial imagery for a more complete surface description
Werner et al. New techniques for automated architectural reconstruction from photographs
WO2002082181A1 (en) Corridor mapping system and method
Hsieh et al. Performance evaluation of scene registration and stereo matching for artographic feature extraction
Teller et al. Calibrated, registered images of an extended urban area
CN100371952C (en) Video object recognition device and recognition method, video annotation giving device and giving method, and program
Sun et al. Measuring the distance of vegetation from powerlines using stereo vision
US9865045B2 (en) Orthogonal and collaborative disparity decomposition
Pylvanainen et al. Automatic alignment and multi-view segmentation of street view data using 3d shape priors
CN105096386A (en) Method for automatically generating geographic maps for large-range complex urban environment
JP2002157576A (en) Device and method for processing stereo image and recording medium for recording stereo image processing program
KR100904078B1 (en) A system and a method for generating 3-dimensional spatial information using aerial photographs of image matching
Haala et al. Acquisition of 3D urban models by analysis of aerial images, digital surface models, and existing 2D building information
Wendel et al. Automatic alignment of 3D reconstructions using a digital surface model
Hu et al. Building modeling from LiDAR and aerial imagery
Tang et al. Content-based 3-D mosaics for representing videos of dynamic urban scenes
CN111612829B (en) High-precision map construction method, system, terminal and storage medium
Zhao et al. Alignment of continuous video onto 3D point clouds
Nguyen et al. Coarse-to-fine registration of airborne LiDAR data and optical imagery on urban scenes
Hussnain et al. Automatic feature detection, description and matching from mobile laser scanning data and aerial imagery
Qin et al. A coarse elevation map-based registration method for super-resolution of three-line scanner images
PERLANT et al. Scene registration in aerial image analysis
Hammoudi et al. Recovering occlusion-free textured 3D maps of urban facades by a synergistic use of terrestrial images, 3D point clouds and area-based information
Zeng Automated Building Information Extraction and Evaluation from High-resolution Remotely Sensed Data

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG US UZ VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP