AU2022335152A1 - "methods and systems for identifying objects in images" - Google Patents

"methods and systems for identifying objects in images" Download PDF

Info

Publication number
AU2022335152A1
Authority
AU
Australia
Prior art keywords
asset
image
interest
location
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
AU2022335152A
Inventor
Kirk Bloomfield
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Total Drain Group Pty Ltd
Original Assignee
Total Drain Group Pty Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from AU2021902747A
Application filed by Total Drain Group Pty Ltd
Publication of AU2022335152A1
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/587 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using geographical or spatial information, e.g. location
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/103 Workflow collaboration or project management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/20 Administration of product repair or maintenance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0206 Price or cost determination based on market factors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/10 Image acquisition
    • G06V10/12 Details of acquisition arrangements; Constructional details thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Tourism & Hospitality (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Operations Research (AREA)
  • Finance (AREA)
  • Data Mining & Analysis (AREA)
  • Quality & Reliability (AREA)
  • Accounting & Taxation (AREA)
  • Game Theory and Decision Science (AREA)
  • Educational Administration (AREA)
  • Multimedia (AREA)
  • Primary Health Care (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Library & Information Science (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

Described embodiments relate to a computer-implemented method comprising receiving an acquired image of a candidate object to be identified and metadata associated with the image. The metadata includes an image capture location of a device used to capture the image at image capture time. The method further comprises identifying the candidate object in the image and associating the candidate object with a respective object type identifier; determining an estimated object location based on the image capture location of the device; determining a region of interest based on the estimated object location; and querying a database of asset records to determine a set of asset matches for the candidate object in the image based on the region of interest and the respective object type identifier.

Description

"Methods and systems for identifying objects in images"
Technical Field
[0001] Embodiments generally relate to methods, systems, and computer-readable media for identifying objects or assets, such as infrastructure and/or municipal assets, in acquired images.
Background
[0002] Municipalities often receive reports from residents, staff and/or contractors detailing the appearance and/or operational condition of public infrastructure assets. These reports are often generic in nature. For example, they may include only a vague location, and the reporter may not be skilled enough to recognise the asset type or failure mode.
[0003] The asset owner will then typically send a crew out to inspect and service the asset. This can be inefficient as an unqualified or incorrect crew may be dispatched, and/or they may not have the materials necessary to rectify the problem.
[0004] Any discussion of documents, acts, materials, devices, articles or the like which has been included in the present specification is not to be taken as an admission that any or all of these matters form part of the prior art base or were common general knowledge in the field relevant to the present disclosure as it existed before the priority date of each of the appended claims.
[0005] Throughout this specification the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.
Summary
[0006] Some embodiments relate to a computer-implemented method comprising: receiving an acquired image of a candidate object to be identified and metadata associated with the image, the metadata including an image capture location of a device used to capture the image at image capture time; identifying the candidate object in the image and associating the candidate object with a respective object type identifier; determining an estimated object location based on the image capture location of the device; determining a region of interest based on the estimated object location; and querying a database of asset records to determine a set of asset matches for the candidate object in the image based on the region of interest and the respective object type identifier.
[0007] In some embodiments, the metadata further comprises a roll angle of orientation of the device at image capture time and a bearing angle of the device at image capture time, and the method further comprises: determining an estimated height at which the device was held at image capture time, wherein the estimated object location is determined based on the roll angle, the bearing angle, the estimated height and the image capture location.
[0008] In some embodiments, the region of interest is a buffer region about the image capture location.
[0009] In some embodiments, the method further comprises: determining a zone of interest, wherein the zone of interest is a sub-section of the region of interest. The zone of interest may be determined based on the bearing angle, the estimated height, the roll angle and a field of view angle.
[0010] In some embodiments, the method further comprises: responsive to the set of asset matches being a null set, generating an asset record in the database for the candidate object.
[0011] In some embodiments, querying the database of asset records to determine the set of asset matches for the candidate object comprises: querying the database to determine a first subset of asset records of the database with regions of interest corresponding to the determined region of interest of the candidate object; determining a second subset of asset records from the first subset of asset records based on the candidate object type identifier; and determining the set of asset matches from the second subset of asset records.
[0012] In some embodiments, the method further comprises: responsive to the set of asset matches comprising a single asset record, associating the candidate object with the single asset record.
[0013] In some embodiments, the method further comprises: responsive to the set of asset matches comprising a plurality of asset records, determining an asset record with a location closest to the estimated object location as the asset match, and associating the candidate object with the asset match.
[0014] In some embodiments, the method further comprises: determining a third subset of asset records from the first subset of asset records or the second subset of asset records, wherein said determining is based on the zone of interest; and determining the set of asset matches from the third subset of asset records.
[0015] In some embodiments, the image comprises a plurality of objects, and the method further comprises: determining the candidate object from the plurality of objects based on one or more of: (i) a highest predicted fault probability associated with each of the plurality of objects; (ii) closest proximity of each of the plurality of objects to estimated object location; and (iii) context information associated with the image.
[0016] In some embodiments, the method further comprises: saving the acquired image in an asset record of the asset match in the database.
[0017] In some embodiments, the asset match is an identification of a particular infrastructure asset in a network.
[0018] In some embodiments, the method further comprises: generating a work order based on the determined asset match; and transmitting the work order to a user device. The work order may comprise one or more of: (i) an identifier for the asset record; (ii) service information associated with the asset match; and (iii) a parts list to carry out service on the one or more objects.
[0019] In some embodiments, identifying the candidate object in the image comprises: providing a numerical representation of the image to the image classifier, wherein the image classifier is configured to output the predicted object type identifier. The image classifier may be further trained to determine a likelihood of the candidate object being faulty, and is configured to output a fault score.
[0020] In some embodiments, determining an estimated height at which the device was held at image capture time is based on: (i) an average height of a human; or (ii) a height of the user who acquired the image using the device.
[0021] Some embodiments relate to a system comprising: one or more processors; and memory comprising computer executable instructions, which when executed by the one or more processors, cause the system to perform any one of the described methods.
[0022] Some embodiments relate to a computer-readable storage medium storing instructions that, when executed by a computer, cause the computer to perform any one of the described methods.
Brief Description of Drawings
[0023] Various ones of the appended drawings merely illustrate example embodiments of the present disclosure and cannot be considered as limiting its scope.
[0024] Fig. 1 is a block diagram of a network for identification of assets in images, according to some embodiments;
[0025] Fig. 2 is a more detailed block diagram of the network of Fig. 1, according to some embodiments;
[0026] Fig. 3 is a perspective view of a schematic illustration of a user and device being used for capturing images of assets, according to some embodiments;
[0027] Fig. 4 is a top view of the schematic illustration of Fig. 3;
[0028] Figs. 5A, 5B and 5C are further schematic illustrations of Fig. 3;
[0029] Fig. 6 is an example image of an infrastructure asset undergoing asset identification, according to some embodiments; and
[0030] Fig. 7 is a process flow diagram of a method of identifying assets in captured images, according to some embodiments.
Description of Embodiments
[0031] Embodiments generally relate to methods, systems, and computer-readable media for identifying assets, such as infrastructure and/or municipal assets, in acquired images. For example, the asset may be a drain, grate, pipe, meter, or other such utility asset.
[0032] In some embodiments, a user, such as a resident, staff and/or a contractor, may capture an image of an asset with a fault and/or requiring maintenance or replacement, such as an infrastructure asset, using a user device such as a smartphone or tablet. The captured image, along with metadata associated with the image or the user device, taken at the time the image was captured, is then transmitted to an asset fault reporting system or server.
[0033] The metadata includes position information derived from the user device. The position information may include an image capture location of a device used to capture the image at image capture time. The image capture location may be a Global Positioning System (GPS) location, to identify the geographic location of the device when the image was acquired. The position information may include device orientation and/or device direction or bearing. For example, the device orientation may be derived from a gyroscope of the user device, and may include a roll angle orientation. The device bearing may be derived from a geomagnetic field sensor compass of the device.
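By way of a non-limiting illustration only, the image metadata described above could be carried in a simple record such as the following Python sketch; the field names and types are assumptions for illustration and are not a schema defined by the described embodiments.

```python
from dataclasses import dataclass


@dataclass
class CaptureMetadata:
    """Illustrative (hypothetical) container for per-image capture metadata."""
    latitude: float         # image capture location, e.g. GPS latitude in degrees
    longitude: float        # image capture location, e.g. GPS longitude in degrees
    roll_angle_deg: float   # roll angle of the device, e.g. derived from the gyroscope
    bearing_deg: float      # device bearing, e.g. from the geomagnetic field sensor (degrees from North)
    captured_at: str        # image capture time, e.g. an ISO 8601 timestamp
```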
[0034] The asset fault reporting system determines an estimated object location based on the image capture location of the device. The asset fault reporting system may determine an estimated height at which the device was held at image capture time, and the estimated object location may be based on the roll angle, the estimated height and the image capture location. The estimated height may be based on an average height of a human, or a known height of the user who captured the image with the device. For example, user information including a height of the user that acquired the image using the device may be transmitted from the user device to the asset fault reporting system. In other embodiments, the asset fault reporting system may determine the user height from a user profile associated with the user, for example, as may be stored in a database accessible to the asset fault reporting system. In some embodiments, an asset fault reporting application may be deployed on the user device and may be configured to transmit the acquired image, the metadata and in some embodiments, the user data, to the asset fault reporting system.
[0035] The asset fault reporting system may determine a region of interest as a buffer region about the estimated object location. The asset fault reporting system may determine a zone of interest as a subsection of the region of interest. The asset fault reporting system may determine a bearing angle of the device at image capture time, for example, from the received metadata, and the zone of interest may be determined based on the bearing angle, the estimated height, the roll angle and a field of view angle.
[0036] In some embodiments, a fault reporting application running on the device used to capture the image is configured to determine the estimated object location based on the image capture location of the device, the region of interest and/or the zone of interest.
[0037] The asset fault reporting system provides the received image to an image classifier, on a server, to identify the type of asset(s) in the acquired image. In some embodiments, the image classifier determines a likelihood of the object(s) being faulty and outputs a predictive score indicative of the likelihood. The asset fault reporting system associates the image with object type identifier(s) associated with respective asset(s) identified in the image.
[0038] The asset fault reporting system queries an asset database of asset records to determine a set of asset matches for the candidate object in the image based on the region of interest and the respective object type identifier. If no matches are determined, the asset fault reporting system generates an asset record in the database for the candidate object. If a single match is determined, the candidate object is associated with (or identified as being the asset of) the single asset record. If multiple matches are determined, the asset record with a location closest to the estimated object location may be selected as the asset match.
[0039] Once a candidate object has been identified as being associated with a particular asset record, the asset fault reporting system may generate a work order to schedule or organise maintenance of the asset, as may be required.
[0040] Described embodiments allow for a more accurate identification of assets in images of objects reported to the asset fault reporting system. In some embodiments, this is achieved by determining a more precise or accurate location of a candidate object in the image, and/or an accurate identification of the object type and/or an accurate prediction of the likelihood of the object being faulty. This can be particularly valuable where an image comprises more than one object or potentially faulty asset.
[0041] By associating a report relating to a potentially faulty asset with the correct asset identifier and record, the asset's service history may be accurately tracked. Data extracted from the asset record may be used in prediction models to forecast asset lifespan and extrapolate maintenance costs, for example. Assets that routinely fail can be isolated for further consideration of environmental or building issues. Accurate service histories may provide a benefit when required in litigation or freedom of information requests, for example.
[0042] The described embodiments may allow an end user to accurately interact with an asset record without the asset owner having to supply bulk data, or confidential information.
[0043] Accordingly, the disclosure provides an improved method of asset identification, which may allow for increased accuracy and/or efficiency in identifying faults in assets, and may therefore allow for improved response times by maintenance crew. Further, an incident associated with the asset, such as a fault, can be recorded, for example, associated with a unique asset identifier in the database, which may allow for improved outcomes in asset lifecycle planning.
[0044] Referring now to Fig. 1, there is shown a block diagram of a system 102 for performing the disclosed methods, according to some embodiments. The system 102 comprises at least one user device 104 in communication with a server 108 over a network 106. For example, the server 108 may be an asset fault reporting system. The user device 104 and/or server 108 may also be in communication with a database 110 over the network 106. Further client devices 112 may also be in communication with the database 110 and server 108 over the network 106.
[0045] The network 106 may comprise at least a portion of one or more networks having one or more nodes that transmit, receive, forward, generate, buffer, store, route, switch, process, or a combination thereof, etc. one or more messages, packets, signals, some combination thereof, or so forth. The network 106 may include, for example, one or more of: a wireless network, a wired network, an internet, an intranet, a public network, a packet-switched network, a circuit-switched network, an ad hoc network, an infrastructure network, a public-switched telephone network (PSTN), a cable network, a cellular network, a satellite network, a fiber optic network, some combination thereof, or so forth.
[0046] Server 108 may comprise one or more computing devices configured to share data or resources among multiple network devices. Server 108 may comprise a physical server, virtual server, or one or more physical or virtual servers in combination.
[0047] Database 110 may comprise a data store configured to store data from network devices over network 106. Database 110 may comprise a virtual data store in a memory of a computing device, connected to network 106. The server 108 may be directly connected to and in communication with database 110, or server 108 may be connected to database 110 over a communications network, such as 106, or an alternative communications network, which may for example, be a private network.
[0048] User devices 104 and client devices 112 may comprise a smartphone device, smart camera device, computer, laptop, tablet, or other computing device capable of acquiring images, executing instructions from memory, and sending instructions and images over network 106.
[0049] As depicted in Fig. 2, server 108 comprises one or more processors 232 and memory 233 storing instructions (e.g. program code), which, when executed by the processor(s) 232, cause the server 108 to function according to the described methods, such as method 700 below. The processor(s) 232 may comprise one or more microprocessors, central processing units (CPUs), graphics processing units (GPUs), application specific instruction set processors (ASIPs), application specific integrated circuits (ASICs) or other processors capable of reading and executing instruction code.
[0050] Memory 233 may comprise one or more volatile or non-volatile memory types. For example, memory 233 may comprise one or more of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM) or flash memory. Memory 233 is configured to store program code accessible by the processor(s) 232. The program code comprises executable program code modules. In other words, memory 233 is configured to store executable code modules configured to be executable by the processor(s) 232. The executable code modules, when executed by the processor(s) 232 cause the server 108 to perform certain functionality, as described in more detail below.
[0051] The memory 233 may comprise an image classifier 234. The image classifier 234 may be a code module configured to receive, as input, images and/or data and output a determination of the type of asset (e.g. drain, grate, hydrant, etc.). In some embodiments, the image classifier 234 may determine a likelihood of the identified object being faulty, and output a predictive score. The image classifier 234 may comprise a machine learning module.
[0052] In some embodiments, the image classifier 234 is a convolutional neural network based image classifier, such as the “You Only Look Once” or “YOLO” object detection algorithm. The classifier may be trained using a training dataset comprising labelled examples of images. Each labelled example may comprise an image of an asset (for example, an image of a fire hydrant, a stormwater pit, a park bench, etc.), a first label indicative of the asset type, and in some embodiments, a second label indicative of whether or not the asset in the image is faulty. In some embodiments, the examples may include further labels indicative of ownership of the asset, such as a brand logo or other identifier, such as a council logo or business logo.
The memory 233 may comprise an asset identification module 236. The asset identification module 236 may be a code module configured to identify specific infrastructure assets from a database of asset records based on determined asset types and estimated asset locations. Asset identification module 236 may query database 110 in order to compare the determined asset type and position information with asset records stored in database 110, in order to determine a match. The match may comprise a probability indicative of the goodness or quality of the match between the candidate object and the asset records.
[0053] In some embodiments, the asset identification module 236 of server 108 may be configured to determine the estimated object location based on the image capture location of the device, the region of interest and/or the zone of interest, as discussed in further detail below.
[0054] The user device 104 comprises one or more processors 204 and memory 206. Processor 204 may include more than one electronic processing device and additional processing circuitry. For example, the processors 204 may include multiple processing chips, a digital signal processor (DSP), analog-to-digital or digital-to-analog conversion circuitry, or other circuitry or processing chips that have processing capability to perform the functions described herein. The processors 204 may execute all processing functions described herein locally on the user device 104 or may execute some processing functions locally and outsource other processing functions to another processing system, such as server 108.
[0055] The memory 206 may comprise one or more volatile or non-volatile memory types. For example, memory 206 may comprise one or more of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM) or flash memory. Memory 206 is configured to store program code accessible by the processor(s) 204. The program code comprises executable program code modules. In other words, memory 206 is configured to store executable code modules configured to be executable by the processor(s) 204. The executable code modules, when executed by the processor(s) 204 cause the user device 104 to perform certain functionality, as described in more detail below.
[0056] An application, such as a fault reporting application 208, may be stored in memory 206. User device 104 may comprise a wireless communication device 210. User device 104 may comprise imaging device 212. User device 104 may comprise user interface (UI) 220. User device 104 may further comprise a compass 226. User device 104 may further comprise a gyroscope 228. User device 104 may further comprise a GPS unit 230.
[0057] The memory 206 may further comprise executable program code that defines a communication module 222, and/or a user interface (UI) module 224. The memory 206 is arranged to store program code relating to the communication of data from memory 206 over the network 106.
[0058] The application 208 stored in memory 206 may comprise a program code module, which enables the user to acquire and transmit images to server 108 over network 106. In some embodiments, the fault reporting application 208 is configured to acquire images using the imaging device 212. The fault reporting application 208 may determine image metadata including position information (which may include a geographical identifier, such as a GPS identifier, and/or device orientation). The application 208 may determine user data (which may include a user identifier, such as a phone number or username, and/or user height data). The application 208 may transmit the acquired images, metadata, and, in some embodiments, user data including height data, over network 106 by way of wireless communication device 210. The application 208 may perform any combination of determining metadata using position information, determining user data, and transmitting the acquired images, metadata, and/or user data over network 106. For example, the application 208 may be an asset fault reporting application.
[0059] In some embodiments, the application 208 may be configured to determine the estimated object location based on the image capture location of the device, the region of interest and/or the zone of interest, and to provide any such information to the asset fault reporting system 108.
[0060] Communication module 222 may comprise program code, which when executed by the processor 204, implements instructions related to initiating and operating the wireless communication device 210. When initiated by the communication module 222, the wireless communication device 210 may send or receive data over network 106. Communication module 222 may be configured to package and transmit data generated by the UI module 224 and/or retrieved from the memory 206 over network 106 to a client device 112, and/or to server 108.
[0061] UI module 224 may comprise program code, which when executed by the processor 204, implements instructions relating to the operation of user interface 220.
[0062] UI 220 may comprise a display screen, such as a touch screen, configured to allow a user to interact with the user device 104 and access the application 208. In some embodiments, the UI 220 may comprise a display screen and a human machine interface such as a mouse, keyboard, stylus, or other means of interacting with the user device 104.
[0063] Wireless communication device 210 may comprise a wireless Ethernet interface, SIM card module, Bluetooth connection, or other appropriate wireless adapter allowing wireless communication over network 106. Wireless communication device 210 may be configured to facilitate communication with external devices such as client device 112 and server 108. In some embodiments, a wired communication means is used.
[0064] Imaging device 212, may comprise a camera, arranged to capture images, such as of an outside environment. In some embodiments, imaging device 212 comprises a digital camera device. The imaging device 212 may have an image resolution of about 12 megapixels or greater, for example. Other resolutions are contemplated, such as 3.2 megapixels, 16 megapixels or other common pixel resolutions for smartphone devices that enable clear images to be captured.
[0065] Compass 226 may comprise a magnetometer, digital compass, and/or other sensor or combination of sensors configured to allow determination of the orientation of the user device 104 with respect to magnetic north, or other cardinal directions, and output the determined orientation to the processor 204.
[0066] Gyroscope 228 may comprise a micro-electro-mechanical system (MEMS) and/or other sensor or combination of sensors configured to determine an orientation of the user device 104 with respect to gravity, and output the determined orientation to the processor 204.
[0067] GPS 230 may comprise a GPS receiver configured to receive GPS satellite information and determine the GPS location of the user device 104 and output the GPS location of the user device 104 to the processor 204.
[0068] Client device 112 may comprise processor 214, in connection with memory 216. Application 208 may be stored within memory 216. The processor 214, memory 216, and application 208 may be substantially the same as the corresponding components described with respect to the user device 104.
[0069] Fig. 3 is a schematic illustration 300 of a user 302 and user device 104 being used for capturing images of assets, such as infrastructure assets, to be identified. The orientation of the user device relative to a magnetic pole may be determined using the geomagnetic field sensor, or the device's orientation relative to a frame of reference may be determined using the gyroscope 228 of the user device, at the time the image is acquired.
[0070] Fig. 3 also depicts a field of view (FOV) 308 of the imaging device 212 of user device 104. The FOV is indicative of an area visible to the imaging device 212 (area captured by the imaging device 212), and may correlate to an area within which the asset to be identified lies, and the area intended to be captured by the user. In the depicted illustration, object 310 is within the field of view 308, and the user 302 is holding the device 104 at a height 311 above ground and at an angle 312 relative to the incident ray 306 (i.e., the roll angle). The height 311 may be an estimated height at which the device 104 was held at image capture time. The height 311 may be based on an average height of a human or a height of the user who acquired the image using the device, for example.
[0071] FOV 308 may comprise an FOV angle 314: an angle about the incident ray 306 relating to the visible area of the imaging device 212. In some embodiments, this may be about 60 or 70 degrees. In some embodiments, the FOV angle 314 represents a subset of the area visible to the imaging device 212.
[0072] Fig. 4 is a schematic illustration of a top view of the same scenario as depicted in Fig. 3. Fig. 4 further depicts an example cardinal orientation or bearing angle 402 with respect to North. Fig. 4 further depicts an outer boundary 410 and an inner boundary 412, defining therebetween a zone of interest (510, Figs. 5A and 5C).
[0073] Figs. 5A, 5B, and 5C are further illustrations of the scenario of Fig. 3 and Fig. 4. Fig. 5A depicts the scenario of Fig. 4, including the zone of interest 510 defined by the outer boundary 410 and inner boundary 412. As illustrated, the inner boundary and the outer boundary are defined by FOV angle 314. The zone of interest 510 is an area around the object 310, and the determination of the zone of interest 510 allows for more efficient matching between identified objects 310 and those within database 110 during an asset matching process, as discussed in more detail below with reference to Fig. 7.
[0074] Fig. 5B depicts the scenario of Fig. 3, including the inner boundary 412 and outer boundary 410. Inner boundary 412 may be located at a fixed distance from the feet of the user 302. The inner boundary 412 may be calculated based on the FOV angle 314. The inner boundary 412 may be calculated based on bearing angle 402. The inner boundary 412 may be calculated based on estimated object position. The inner boundary 412 may be calculated based on image capture location of the device 104. The inner boundary 412 may be calculated based on any of the above in combination.
[0075] Outer boundary 410 may be located at a fixed distance from the feet of the user 302 and/or the estimated object location. The outer boundary 410 may be calculated based on the FOV angle 314. The outer boundary 410 may be calculated based on bearing angle 402. The outer boundary 410 may be calculated based on estimated object position. The outer boundary 410 may be calculated based on image capture location of the device 104. The outer boundary 410 may be calculated based on any of the above in combination.
[0076] Fig. 5C depicts the scenario of Fig. 3, with distance 512 being the distance to the inner boundary 412, and 514 being the distance to the outer boundary 410. In Fig. 5C, the zone of interest 510 is between the inner boundary 412 and the outer boundary 410.
[0077] Fig. 6 depicts an example image 600 captured by the user device 104 and displayed to a user 302 through user interface 220. This image 600 depicts an infrastructure asset comprising a drain and grate. In some embodiments, the image 600 may be represented by a graphical element on user interface 220, generated by application 208, in order to clearly communicate to a user 302 what object is being identified. In the example of Fig. 6, an identification region 606 may display a label indicating the object or asset type determined by the image classifier.
[0078] In some embodiments, bounding box 608 may represent the area of a matched result within the supplied image, when undergoing an asset identification process such as that described below with reference to Fig. 7. In other embodiments, bounding box 608 represents a tagged image for use in a training dataset. In such embodiments, a human may draw a box around a known object and add descriptive tag(s) corresponding to infrastructure assets and/or infrastructure asset conditions. Examples may include “Pit”, “Side Entry”, “Good Condition”, or some indicator of ownership such as a brand logo or other identifier, such as a council logo or business logo.
[0079] Fig. 7 is a process flow diagram of a method 700 for identifying assets in images. The method 700 is a computer-implemented method, and may be implemented, for example, by the server 108 executing computer executable instructions stored in memory 233 of the server 108, such as image classifier 234 and asset identification module 236.
[0080] However, it will be appreciated that in some embodiments, parts or steps of method 700, or indeed the entire method 700, may be implemented by a fault reporting application 208 deployed on a user device 104 in communication with server 108 across communications network 106. By configuring method 700 to be performed on the server 108 as opposed to the user device 104, however, the need to place excess resource load on the user device 104, and/or to share bulk asset data with the user device 104 (which may be difficult due to network or data capacity issues), may be avoided or mitigated.
[0081] At 702, the server 108 receives an acquired image of a candidate object to be identified and metadata associated with the image. The metadata includes an image capture location of a device 104 used to capture the image at image capture time. In some embodiments, the metadata comprises a roll angle 312 of orientation of the device at image capture time. In some embodiments, the metadata comprises a bearing angle 402 of the device at image capture time.
[0082] At 704, the server 108 identifies the candidate object in the image and associates the candidate object with a respective object type identifier. In some embodiments, the server 108 comprises an image classifier, such as a machine learning based image classifier, to identify or predict an object type of the one or more objects in the image. For example, a numerical representation of the image may be provided as an input to the image classifier, and an object type may be provided as an output. In some embodiments, the image classifier may be further trained to determine a likelihood of the candidate object being faulty, and to output a fault score indicative of the probability of the object being faulty.
[0083] Example object types include fire hydrants, sluice valves, grated pits, maintenance holes, park benches, signs, lighting, etc.
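As a rough sketch of how step 704 might look in code, the following assumes a generic classifier object exposing a predict method that returns an object type label and a fault score; the interface is hypothetical and the described embodiments do not mandate any particular library (a convolutional detector such as YOLO is given above only as an example).

```python
from typing import Protocol, Tuple

import numpy as np


class AssetClassifier(Protocol):
    """Hypothetical interface for the trained image classifier 234."""

    def predict(self, image: np.ndarray) -> Tuple[str, float]:
        """Return (object_type_identifier, fault_score) for the candidate object."""
        ...


def identify_candidate_object(image: np.ndarray, classifier: AssetClassifier) -> dict:
    # Provide a numerical representation of the image to the classifier, then
    # associate the candidate object with the predicted object type identifier
    # and a score indicating the likelihood that the object is faulty.
    object_type, fault_score = classifier.predict(image)
    return {"object_type": object_type, "fault_score": fault_score}
```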
[0084] At 706, the server 108 determines an estimated object location based on the image capture location of the device. In some embodiments, the image capture device, such as user device 104, determines the estimated object location based on the image capture location of the device, and provides the estimated object location to the server 108.
[0085] In some embodiments, the server 108 determines an estimated height at which the device 104 was held at image capture time, and determines the estimated object location based on the roll angle 312, the bearing angle 402, the estimated height 311 and the image capture location. In some embodiments, the image capture device, such as user device 104, determines the estimated height and determines the estimated object location based on the roll angle, the estimated height and the image capture location. The estimated height at which the device was held at image capture time may be based on an average height of a human or a height of the user who acquired the image using the device.
[0086] For example, the distance d of the user's feet from the object 310 may be estimated as d = h·tan(α), where h is the estimated height of the device at image capture time, and α is the roll angle (e.g., the angle between the centre line of the image and the vertical). The estimated object location can then be determined from the image capture location of the device, the distance d and the bearing direction.
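A minimal sketch of the location estimate at step 706, assuming the relationship d = h·tan(α) above and a flat-earth approximation for projecting the distance d along the bearing from the capture location; the function and variable names are illustrative only.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, metres


def estimate_object_location(lat_deg: float, lon_deg: float,
                             height_m: float, roll_deg: float,
                             bearing_deg: float) -> tuple:
    """Estimate the object's latitude/longitude from the image capture location.

    d = h * tan(alpha) gives the horizontal distance from the user's feet to the
    object, where h is the estimated device height and alpha is the roll angle.
    The distance is then projected along the device bearing (clockwise from North).
    """
    d = height_m * math.tan(math.radians(roll_deg))
    # Equirectangular (flat-earth) approximation, adequate for distances of a few metres.
    d_lat = (d * math.cos(math.radians(bearing_deg))) / EARTH_RADIUS_M
    d_lon = (d * math.sin(math.radians(bearing_deg))) / (
        EARTH_RADIUS_M * math.cos(math.radians(lat_deg)))
    return lat_deg + math.degrees(d_lat), lon_deg + math.degrees(d_lon)


# Example: device held ~1.5 m above ground, tilted 60 degrees from vertical, facing due East.
print(estimate_object_location(-37.8136, 144.9631, 1.5, 60.0, 90.0))
```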
[0087] At 708, the server 108 determines a region of interest based on the estimated object location.
[0088] In some embodiments, the server 108 determines or defines a region of interest as a buffer region about the image capture location. For example, the region of interest may be defined by a circle having a particular radius and with the image capture location as the centre point. In some embodiments, the image capture device, such as user device 104, determines the region of interest and provides the determined region of interest to the server 108.
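For step 708, a sketch of the buffer-region test, assuming the region of interest is modelled as a circle of a chosen radius around the image capture location and using the haversine great-circle distance; the default radius below is an arbitrary placeholder, not a value fixed by the embodiments.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two latitude/longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_region_of_interest(asset_lat: float, asset_lon: float,
                              capture_lat: float, capture_lon: float,
                              radius_m: float = 25.0) -> bool:
    # The region of interest is treated as a buffer (circle) about the capture location.
    return haversine_m(asset_lat, asset_lon, capture_lat, capture_lon) <= radius_m
```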
[0089] In some embodiments, the server 108 (or user device 104) determines a zone of interest 510, wherein the zone of interest is a sub-section of the region of interest. The server 108 (or user device 104) may determine a bearing angle 402 of the device at image capture time, and may determine the zone of interest based on the bearing angle 402, the estimated height 311, the roll angle and a field of view (FOV) angle 314. In some embodiments, the zone of interest may include a buffer zone of about one to five metres, and/or a yaw of about zero to five degrees.
[0090] The FOV may depend on the device being used to capture the image. In some embodiments, the metadata may include the FOV angle 314. In some embodiments, the FOV angle 314 is determined to be a predefined value, as may be typical of mobile computing devices with cameras. For example, the FOV angle 314 may be 60 or 70 degrees.
[0091] The zone of interest 510 may be defined by an inner boundary 412 and an outer boundary 410. The distance d_ib of the inner boundary from the user's feet (as depicted in Fig. 5B) may be calculated as d_ib = h·tan(α − β/2), where β is the FOV angle 314. The location of the inner boundary may then be determined based on d_ib, the bearing angle 402, and the image capture location of the device. The location of the outer boundary 410 may be determined based on d_ib, the bearing angle 402, and the estimated object location.
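The boundary distances described above might be computed as in the sketch below. It assumes the inner-boundary relationship d_ib = h·tan(α − β/2) given above and, purely as an illustrative choice, places the outer boundary a fixed margin beyond the estimated object distance, since the embodiments leave that detail open.

```python
import math


def zone_of_interest_distances(height_m: float, roll_deg: float, fov_deg: float,
                               object_distance_m: float, outer_margin_m: float = 2.0):
    """Return (inner_distance_m, outer_distance_m) measured from the user's feet.

    inner: ground intersection of the near edge of the field of view,
           d_ib = h * tan(alpha - beta / 2).
    outer: illustrative only - the estimated object distance plus a margin.
    """
    alpha = math.radians(roll_deg)
    half_fov = math.radians(fov_deg) / 2.0
    inner = height_m * math.tan(max(alpha - half_fov, 0.0))
    outer = object_distance_m + outer_margin_m
    return inner, outer
```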
[0092] At 710, the server 108 queries a database of asset records to determine a set of asset matches for the candidate object 310 in the image based on the region of interest and the respective object type identifier.
[0093] In some embodiments, querying the database of asset records to determine the set of asset matches for the candidate object comprises querying the database to determine a first subset of asset records of the database with regions of interest corresponding to the determined region of interest of the candidate object, determining a second subset of asset records from the first subset of asset records based on the candidate object type identifier, and determining the set of asset matches from the second subset of asset records. In some embodiments, the server 108 determines a third subset of asset records from the first subset of asset records or the second subset of asset records, wherein said determining is based on the zone of interest, and determines the set of asset matches from the third subset of asset records.
[0094] Responsive to the set of asset matches being a null set, the server 108 generates an asset record in the database for the candidate object. Responsive to the set of asset matches comprising a single asset record, the server 108 associates the candidate object with the single asset record. Responsive to the set of asset matches comprising a plurality of asset records, the server 108 may determine an asset record with a location closest to the estimated object location as the asset match, and associate the candidate object with the asset match.
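A sketch of the query-and-resolve logic at step 710, under the assumption that asset records carry a location, an object type and an identifier; the record structure and helper names are hypothetical stand-ins rather than the actual database interface.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AssetRecord:
    """Hypothetical shape of an asset record returned from the database query."""
    asset_id: str
    object_type: str
    latitude: float
    longitude: float


def _approx_distance_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    # Equirectangular approximation; adequate for ranking nearby candidates.
    x = math.radians(lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    y = math.radians(lat2 - lat1)
    return math.hypot(x, y) * 6_371_000.0


def resolve_asset_match(candidate_type: str, est_lat: float, est_lon: float,
                        records_in_region: List[AssetRecord]) -> Optional[AssetRecord]:
    """Resolve the set of asset matches for a candidate object.

    records_in_region is the first subset (records whose region of interest
    corresponds to that of the candidate). Returns None for a null set, in
    which case the caller would generate a new asset record.
    """
    # Second subset: filter the region-of-interest records by object type identifier.
    matches = [r for r in records_in_region if r.object_type == candidate_type]
    if not matches:
        return None        # null set: a new asset record would be created
    if len(matches) == 1:
        return matches[0]  # single match: associate the candidate with it
    # Multiple matches: pick the record closest to the estimated object location.
    return min(matches,
               key=lambda r: _approx_distance_m(r.latitude, r.longitude, est_lat, est_lon))
```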
[0095] If an asset match is determined, the server 108 may generate a work order based on the determined asset identifier, and transmit the work order to a user device 104 or client device 112 over network 106. The work order may comprise the returned asset identifier and, in some embodiments, service information associated with the asset identifier. In situations where a match is found, the work order may be logged in database 110, accessible to client devices 112 and user device 104 over network 106. The acquired image and positional data may then be forwarded to a works supervisor using a client device 112 to assign to the appropriate works crew.
[0096] Should a match with a relatively high degree of certainty be made, a list of replacement parts may be generated and forwarded to the works crew, so that needed materials may be carried.
[0097] The work order may be transmitted by server 108 over network 106 to a work crew using client device 112, and/or to a user 302 using user device 104. Accordingly, a high degree of transparency can be provided, allowing for clear communication to a user 302 that their issue has resulted in a match in the database 110, and that a work order has been generated as a result.
[0098] The work order may further comprise one or more of photos, asset dimensions, and locations, collated to a job sheet for a worker.
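Purely for illustration, a work order might be assembled as a simple mapping like the one below before being logged or transmitted; the keys, and the idea of attaching a parts list only above a confidence threshold, are assumptions rather than details fixed by the described embodiments.

```python
from typing import List, Optional


def build_work_order(asset_id: str, service_info: str,
                     match_confidence: float,
                     parts_list: Optional[List[str]] = None,
                     confidence_threshold: float = 0.8) -> dict:
    """Assemble an illustrative work order for a matched asset record."""
    order = {
        "asset_id": asset_id,          # identifier for the matched asset record
        "service_info": service_info,  # service information associated with the match
    }
    # Only include a parts list when the match is made with a relatively high
    # degree of certainty, so the crew can carry the needed materials.
    if parts_list and match_confidence >= confidence_threshold:
        order["parts_list"] = parts_list
    return order
```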
[0099] It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the above-described embodiments, without departing from the broad general scope of the present disclosure. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive.

Claims (20)

CLAIMS:
1. A computer-implemented method comprising: receiving an acquired image of a candidate object to be identified and metadata associated with the image, the metadata including an image capture location of a device used to capture the image at image capture time; identifying the candidate object in the image and associating the candidate object with a respective object type identifier; determining an estimated object location based on the image capture location of the device; determining a region of interest based on the estimated object location; and querying a database of asset records to determine a set of asset matches for the candidate object in the image based on the region of interest and the respective object type identifier.
2. The method of claim 1, wherein the metadata further comprises a roll angle of orientation of the device at image capture time and a bearing angle of the device at image capture time, and the method further comprises: determining an estimated height at which the device was held at image capture time, wherein the estimated object location is determined based on the roll angle, the bearing angle, the estimated height and the image capture location.
3. The method of claim 1 or claim 2, wherein the region of interest is a buffer region about the image capture location.
4. The method of any one of the preceding claims, further comprising: determining a zone of interest, wherein the zone of interest is a sub-section of the region of interest.
5. The method of claim 4 when dependent directly or indirectly on claim 2, wherein the zone of interest is determined based on the bearing angle, the estimated height, the roll angle and a field of view angle.
6. The method of any one of the preceding claims, further comprising: responsive to the set of asset matches being a null set, generating an asset record in the database for the candidate object.
7. The method of any one of the preceding claims, wherein querying the database of asset records to determine the set of asset matches for the candidate object comprises: querying the database to determine a first subset of asset records of the database with regions of interest corresponding to the determined region of interest of the candidate object; determining a second subset of asset records from the first subset of asset records based on the candidate object type identifier; and determining the set of asset matches from the second subset of asset records.
8. The method of any one of the preceding claims, further comprising: responsive to the set of asset matches comprising a single asset record, associating the candidate object with the single asset record.
9. The method of any one of the preceding claims, further comprising: responsive to the set of asset matches comprising a plurality of asset records, determining an asset record with a location closest to the estimated object location as the asset match, and associating the candidate object with the asset match.
10. The method of claim 4 or any one of claims 5 to 8 when dependent directly or indirectly on claim 4, further comprising: determining a third subset of asset records from the first subset of asset records or the second subset of asset records, wherein said determining is based on the zone of interest; and determining the set of asset matches from the third subset of asset records.
11. The method of any one of the preceding claims, wherein the image comprises a plurality of objects, and the method further comprises: determining the candidate object from the plurality of objects based on one or more of: (i) a highest predicted fault probability associated with each of the plurality of objects; (ii) closest proximity of each of the plurality of objects to estimated object location; and (iii) context information associated with the image.
12. The method according to any one of the preceding claims, further comprising: saving the acquired image in an asset record of the asset match in the database.
13. The method of any one of the preceding claims, wherein the asset match is an identification of a particular infrastructure asset in a network.
14. The method of any one of the preceding claims, further comprising: generating a work order based on the determined asset match; and transmitting the work order to a user device.
15. The method of claim 11 wherein the work order comprises one or more of: (i) an identifier for the asset record; (ii) service information associated with the asset match; and (iii) a parts list to carry out service on the one or more objects.
16. The method of any one of the preceding claims, wherein identifying the candidate object in the image comprises: providing a numerical representation of the image to the image classifier, wherein the image classifier is configured to output the predicted object type identifier.
17. The method of claim 16, wherein the image classifier is further trained to determine a likelihood of the candidate object being faulty, and is configured to output a fault score.
18. The method of claim 2, or any one of claims 3 to 17 when directly or indirectly dependent on claim 2, wherein determining an estimated height at which the device was held at image capture time is based on: (i) an average height of a human; or (ii) a height of the user who acquired the image using the device.
19. A system comprising: one or more processors; and memory comprising computer executable instructions, which when executed by the one or more processors, cause the system to perform the method of any one of claims 1 to 18.
20. A computer-readable storage medium storing instructions that, when executed by a computer, cause the computer to perform the method of any one of claims 1 to 18.
AU2022335152A 2021-08-25 2022-08-23 "methods and systems for identifying objects in images" Pending AU2022335152A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
AU2021902747 2021-08-25
AU2021902747A AU2021902747A0 (en) 2021-08-25 Methods and systems for identifying objects in images
PCT/AU2022/050963 WO2023023736A1 (en) 2021-08-25 2022-08-23 "methods and systems for identifying objects in images"

Publications (1)

Publication Number Publication Date
AU2022335152A1 true AU2022335152A1 (en) 2024-04-04

Family

ID=78888569

Family Applications (2)

Application Number Title Priority Date Filing Date
AU2021107375A Active AU2021107375A4 (en) 2021-08-25 2021-08-25 Methods and systems for identifying objects in images
AU2022335152A Pending AU2022335152A1 (en) 2021-08-25 2022-08-23 "methods and systems for identifying objects in images"

Family Applications Before (1)

Application Number Title Priority Date Filing Date
AU2021107375A Active AU2021107375A4 (en) 2021-08-25 2021-08-25 Methods and systems for identifying objects in images

Country Status (2)

Country Link
AU (2) AU2021107375A4 (en)
WO (1) WO2023023736A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115880293B (en) * 2023-02-22 2023-05-05 中山大学孙逸仙纪念医院 Pathological image identification method, device and medium for bladder cancer lymph node metastasis
CN116843748B (en) * 2023-09-01 2023-11-24 上海仙工智能科技有限公司 Remote two-dimensional code and object space pose acquisition method and system thereof
CN116912238B (en) * 2023-09-11 2023-11-28 湖北工业大学 Weld joint pipeline identification method and system based on multidimensional identification network cascade fusion

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9346168B2 (en) * 2014-05-20 2016-05-24 International Business Machines Corporation Information technology asset type identification using a mobile vision-enabled robot
WO2017171734A1 (en) * 2016-03-29 2017-10-05 United States Infrastructure Management Company Advanced infrastructure management
US11481853B2 (en) * 2018-02-17 2022-10-25 Constru Ltd Selective reporting of construction errors
US20200082561A1 (en) * 2018-09-10 2020-03-12 Mapbox, Inc. Mapping objects detected in images to geographic positions
US11282225B2 (en) * 2018-09-10 2022-03-22 Mapbox, Inc. Calibration for vision in navigation systems

Also Published As

Publication number Publication date
WO2023023736A1 (en) 2023-03-02
AU2021107375A4 (en) 2021-12-16

Similar Documents

Publication Publication Date Title
AU2021107375A4 (en) Methods and systems for identifying objects in images
US20230289903A1 (en) Media management system
Omar et al. Data acquisition technologies for construction progress tracking
US9916588B2 (en) Methods and apparatus for quality assessment of a field service operation based on dynamic assessment parameters
TWI647628B (en) Method and system for leveraging location-based information to influence business workflows and computer program product
Chan et al. Defining a conceptual framework for the integration of modelling and advanced imaging for improving the reliability and efficiency of bridge assessments
Contreras et al. Monitoring recovery after earthquakes through the integration of remote sensing, GIS, and ground observations: The case of L’Aquila (Italy)
US20110022433A1 (en) Methods and apparatus for assessing locate request tickets
US9785897B2 (en) Methods and systems for optimizing efficiency of a workforce management system
US11556995B1 (en) Predictive analytics for assessing property using external data
CN104567985A (en) Bridge information processing method and device
US11023724B2 (en) Method and system for determining a status of a hydrocarbon production site
US8527550B1 (en) Bridge inspection diagnostic system
CA2729590C (en) Methods and apparatus for quality assessment of a field service operation
JPWO2017056804A1 (en) Inspection result retrieval apparatus and method
US11120557B1 (en) System and method for detecting objects in images
Wei et al. 3D imaging in construction and infrastructure management: Technological assessment and future research directions
Cao et al. Posthurricane damage assessment using satellite imagery and geolocation features
US20150245172A1 (en) Centralized database for infrastructure detection and incident reporting
AU2015201957B2 (en) Methods and apparatus for quality assessment of a field service operation
US20220284693A1 (en) Method, apparatus and computer program product for managing identified issues
Chibuye A spartial framework for managing sewer and water networks using sensor networks: a case of the university of Zambia.
Montaser et al. Automated site data acquisition technologies for construction progress reporting
US11928861B1 (en) Generating mapping information based on image locations
US20230099132A1 (en) Mediation system, mediation method, and recording medium