GB2595260A - Image processing in evidence collection

Image processing in evidence collection

Info

Publication number
GB2595260A
GB2595260A (application GB2007456.3A)
Authority
GB
United Kingdom
Prior art keywords
image data
card
length
image
determining
Prior art date
Legal status
Pending
Application number
GB2007456.3A
Other versions
GB202007456D0 (en)
Inventor
Franc Simon
Current Assignee
Anatomap Ltd
Original Assignee
Anatomap Ltd
Priority date
Filing date
Publication date
Application filed by Anatomap Ltd
Priority to GB2007456.3A
Publication of GB202007456D0
Publication of GB2595260A
Status: Pending

Classifications

    • G06T 7/12 - Image analysis; segmentation; edge-based segmentation
    • G06T 7/11 - Image analysis; segmentation; region-based segmentation
    • G06T 7/13 - Image analysis; edge detection
    • G06T 7/60 - Image analysis; analysis of geometric attributes
    • A61B 5/1072 - Measuring physical dimensions of the body, e.g. distances, length, height or thickness
    • G06V 10/25 - Image preprocessing; determination of region of interest [ROI] or volume of interest [VOI]
    • G06V 10/26 - Image preprocessing; segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region
    • G06V 10/44 - Feature extraction; local feature extraction by analysis of parts of the pattern, e.g. detecting edges, contours, loops, corners, strokes or intersections
    • G06V 10/82 - Image or video recognition or understanding using neural networks
    • G06V 20/70 - Scenes; labelling scene content, e.g. deriving syntactic or semantic representations
    • G06V 30/413 - Document-oriented image-based pattern recognition; classification of content, e.g. text, photographs or tables
    • G06T 2207/10048 - Image acquisition modality: infrared image
    • G06T 2207/20081 - Algorithmic details: training; learning
    • G06T 2207/20084 - Algorithmic details: artificial neural networks [ANN]
    • G06T 2207/30088 - Subject of image: skin; dermal
    • G06T 2207/30196 - Subject of image: human being; person

Abstract

An image processing method comprises receiving 302 image data in relation to an object; segmenting 303 the image data to form segmented image data; determining 304 one or more edges of the object from the segmented image data; determining a length of the object using the or each edge; and using the length of the or each edge and a predetermined length of the object to determine 305 a translation between lengths within the image and a length of the object. The segmented image data is segmented into data representing the object or a part of the object. The length of the or each edge is determined in a frame of reference of the image data, and the determined translation is between lengths in the reference frame and a length of the object. The translation may be a scaling factor and may be used to determine the length of another feature, such as a wound, represented by the image data. The method may comprise censoring the image data to hide the details of the object. The object may be a card such as a credit, debit, or identity card.

Description

Title: Image processing in evidence collection
Description of Invention
Embodiments relate to image processing methods for determining a translation between a length within a frame of reference of image data and a length of a feature represented by the image data.
The collection of evidence, for example to support a legal case, is a process which is both important and fraught with difficulties. Such evidence must be accurately recorded in a manner which can be securely retrieved. Furthermore, evidence which is provided by witnesses is notoriously prone to inconsistencies and errors of recollection. The successful collection of evidence is challenging and time consuming.
These issues are particularly pertinent in legal cases (both civil and criminal) which involve injuries, for example, such as domestic abuse cases or indeed in any case in which the scale of an object, scene, or mark, within photographic evidence, is relevant to define attribution or causation.
Photographic evidence, e.g. an image, can be helpful but can be difficult to assess accurately. For example, in an image it can be difficult to determine the actual size of what is represented and there is often no contextual information to determine what is shown in the image. Furthermore, modern photographic manipulation software is easy to use and inexpensive, which can devalue the evidential weight of even photographic evidence in relation to which there may have been an opportunity for tampering.
Embodiments, therefore, seek to alleviate one or more problems associated with the prior art.
An aspect provides an image processing method including: receiving image data in relation to an object; segmenting the image data into data representing the object or a part thereof, to form segmented image data; determining one or more edges of the object from the segmented image data; determining a length, in a frame of reference of the image data, of the object using the or each edge; and using the length of the or each edge in the frame of reference of the image data and a predetermined dimension of the object to determine a translation between lengths within the frame of reference of the image and a length of the object.
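By way of illustration only, the following sketch (not part of the claimed subject matter) shows how such a translation might be computed with OpenCV, assuming simple threshold-based segmentation and assuming that the largest segmented region is the card; the function names, thresholds and constants are illustrative assumptions rather than features of the embodiments.

    # Illustrative sketch only: threshold-based segmentation stands in for the
    # segmentation step; a trained model may be used instead.
    import cv2

    ID1_LONG_EDGE_MM = 85.60  # predetermined dimension of an ID-1 card

    def scale_factor_from_card(image_bgr):
        """Return a mm-per-pixel translation derived from the card's long edge."""
        grey = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
        # Segment the image data into candidate object regions.
        _, mask = cv2.threshold(grey, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        # Determine edges/contours of the object from the segmented data.
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        card = max(contours, key=cv2.contourArea)  # assumption: largest region is the card
        # Determine the edge lengths of the object in the frame of reference of the image.
        (_, _), (w_px, h_px), _ = cv2.minAreaRect(card)
        long_edge_px = max(w_px, h_px)
        # Translation between lengths in the image frame and real-world lengths.
        return ID1_LONG_EDGE_MM / long_edge_px
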
A method may further include: using the translation to determine the length of another feature represented by the image data and annotating the image data with the determined length of the other feature.
Annotating the image data may include overlaying the determined length of the other feature on the image data.
The other feature may be a wound.
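As an illustrative sketch only, the determined translation might then be applied and the image annotated as follows; OpenCV is assumed, and the endpoints p1 and p2 of the other feature (e.g. a wound) are assumed to have been identified already.

    import cv2

    def annotate_feature_length(image_bgr, p1, p2, mm_per_px):
        """Overlay the real-world length of a feature (e.g. a wound) spanning p1 to p2."""
        length_px = ((p1[0] - p2[0]) ** 2 + (p1[1] - p2[1]) ** 2) ** 0.5
        length_mm = length_px * mm_per_px  # apply the translation (scaling factor)
        cv2.line(image_bgr, p1, p2, (0, 0, 255), 2)
        cv2.putText(image_bgr, f"{length_mm:.1f} mm", p1,
                    cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)
        return length_mm
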
Determining one or more edges of the object from the segmented image data may include determining a plurality of edges of the object, and the method may further include: determining one or more further dimensions of the object from the plurality of edges of the object; determining an orientation of the object by comparing the length and the one or more further dimensions with corresponding predetermined dimensions of the object; and generating an instruction for movement of the object to a predetermined orientation based on the determined orientation.
The image data may be captured by a camera and the predetermined orientation is a predetermined orientation with respect to a focal plane of the camera.
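A minimal sketch of such an orientation check is given below, assuming that the detected edge lengths and a target long-edge length (derived, for example, from the guide image described later) are available; the tolerance value and the instruction wording are illustrative assumptions.

    ID1_ASPECT = 85.60 / 53.98  # predetermined aspect ratio of an ID-1 card

    def orientation_instruction(edge_lengths_px, target_long_edge_px, tolerance=0.05):
        """Compare measured edge lengths with the card's known proportions and suggest movement."""
        long_px, short_px = max(edge_lengths_px), min(edge_lengths_px)
        if abs(long_px / short_px - ID1_ASPECT) > tolerance * ID1_ASPECT:
            # Foreshortening of one edge suggests the card is tilted out of the focal plane.
            return "Tilt the card so that it lies flat, facing the camera."
        if long_px < target_long_edge_px * (1 - tolerance):
            return "Move the camera closer to the card."
        if long_px > target_long_edge_px * (1 + tolerance):
            return "Move the camera further from the card."
        return None  # the card is at the predetermined orientation and distance
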
Determining a length, in a frame of reference of the image data, of the object using the or each edge may include extrapolating one or more edges of the object and determining a contour of a candidate object in the image data.
A method may further include modifying the image data within the contour of the candidate object in the image data, or which is segmented as part of the object, so as to blur the image of the object.
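The blurring of the object may be sketched as follows; this is illustrative only, OpenCV and NumPy are assumed, and the kernel size is an arbitrary choice.

    import cv2
    import numpy as np

    def censor_card(image_bgr, card_contour):
        """Blur the image data lying within the contour of the detected card."""
        mask = np.zeros(image_bgr.shape[:2], dtype=np.uint8)
        cv2.drawContours(mask, [card_contour], -1, 255, thickness=cv2.FILLED)
        blurred = cv2.GaussianBlur(image_bgr, (51, 51), 0)
        # Keep the blurred pixels inside the card contour and the original pixels elsewhere.
        return np.where(mask[..., None] == 255, blurred, image_bgr)
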
The object may be a card complying with the ID-1 size specification in ISO/IEC 7810:2003.
Another aspect provides a computer readable medium storing instructions which, when executed by a processor, cause the performance of the method as above.
The method may further include classifying the image data independently of the segmenting of the image data.
Classifying the image data may include determining a bounding box for the object using the image data and generating a confidence factor based on the bounding box, the confidence factor being indicative of the likelihood of the bounding box defining a part of the image data representing the object, and wherein the method may further include applying a threshold to the confidence factor to reject image data defined by the bounding box from one or more further processing steps when the confidence factor does not meet the threshold.
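A sketch of such thresholding is shown below; the detection structure (a list of bounding box and confidence pairs) and the threshold value are assumptions for illustration. Only the retained boxes would then be passed to the further processing steps described above.

    def filter_detections(detections, threshold=0.9):
        """Reject bounding boxes whose confidence factor does not meet the threshold.

        detections: iterable of (bounding_box, confidence) pairs produced by the classifier.
        """
        return [(box, conf) for box, conf in detections if conf >= threshold]
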
Another aspect provides a device configured to: receive image data in relation to an object; segment the image data into data representing the object or a part thereof, to form segmented image data; determine one or more edges of the object from the segmented image data; determine a length, in a frame of reference of the image data, of the object using the or each edge; and use the length of the or each edge in the frame of reference of the image data and a predetermined dimension of the object to determine a translation between lengths within the frame of reference of the image and a length of the object.
A device may be further configured to: use the translation to determine the length of another feature represented by the image data and to annotate the image data with the determined length of the other feature.
Annotating the image data may include overlaying the determined length of the other feature on the image data.
The other feature may be a wound.
Determining one or more edges of the object from the segmented image data may include determining a plurality of edges of the object, and the device may be further configured to: determine one or more further dimensions of the object from the plurality of edges of the object; determine an orientation of the object by comparing the length and the one or more further dimensions with corresponding predetermined dimensions of the object; and generate an instruction for movement of the object to a predetermined orientation based on the determined orientation.
The image data may be captured by a camera and the predetermined orientation is a predetermined orientation with respect to a focal plane of the camera.
Determining a length, in a frame of reference of the image data, of the object using the or each edge may include extrapolating one or more edges of the object and determining a contour of a candidate object in the image data.
A device may be further configured to modify the image data within the contour of the candidate object in the image data, or which is segmented as part of the object, so as to blur the image of the object. The object may be a card complying with the ID-1 size specification in ISO/IEC 7810:2003. The device may be further configured to classify the image data independently of the segmenting of the image data.
Classifying the image data may include determining a bounding box for the object using the image data and generating a confidence factor based on the bounding box, the confidence factor being indicative of the likelihood of the bounding box defining a part of the image data representing the object, and wherein the device may be further configured to apply a threshold to the confidence factor to reject image data defined by the bounding box from one or more further processing steps when the confidence factor does not meet the threshold.
Embodiments are described, by way of example only, with reference to the accompanying drawings, in which:
Figure 1 shows a system according to some embodiments;
Figure 2 shows a client device according to some embodiments;
Figure 3 shows a user interface according to some embodiments;
Figure 4 shows a representation of partially processed image data of some embodiments;
Figure 5 shows a representation of partially processed image data of some embodiments;
Figure 6 shows a screenshot of an annotation tool of some embodiments;
Figure 7 shows a process of some embodiments; and
Figure 8 shows another process of some embodiments.
Embodiments include a system 1 which is configured to process image data for the collection of legal evidence - see figure 1, for example.
The system 1 may include a client device 12 which is communicatively coupled to a server 11 of the system 1 by a network 13.
The client device 12 (see figure 2, for example) could take a number of different forms, some of which are described herein by way of example.
The client device 12 may be a mobile computing device which is configured to be carried by a person, for example. The mobile computing device may be a mobile (i.e. cellular) telephone, which may be a smartphone and which could be a device using an Android(RTM) operating system or iOS(RTM) operating system. The mobile computing device may be a tablet computing device which, again, may be operating using an Android(RTM) operating system or an iOS(RTM) operating system or iPadOS(RTM) operating system, for example. In some embodiments, the mobile computing device may be a wearable device (such as a smartwatch), a personal digital assistant, a laptop computer, a notebook computer, or the like.
In some embodiments, however, the client device 12 may be a generally static computing device such as a desktop computer or workstation -i.e. a device which is not intended to be carried by a user on a regular basis.
In some embodiments, the client device 12 may be an integrated computing device which forms part of another device or system -for example, a smart appliance (such as a refrigerator), a vehicle (such as a car), a doorbell, or security system device.
The client device 12 may include one or more processors 121, a memory 122, a computer readable medium 123, a communication interface 124, and a display screen 125. The client device 12 may further include one or more of: a camera 126, a power supply 127, a location sensor 128, and a clock 129, for example.
The or each processor 121 is communicatively coupled with the memory 122, the computer readable medium 123, the communication interface 124, and the display screen 125. As such, the or each processor 121 may be configured to execute one or more instructions stored on the computer readable medium 123 in the performance of the operations described herein. The computer readable medium 123 may, therefore, store the one or more executable instructions. In the execution of the or each instruction, the or each processor 121 may be configured to store data using the memory 122 and to retrieve stored data from the memory 122 in order to enable the execution of the or each instruction.
The or each processor 121 may be configured to receive information from the communication interface 124 from one or more devices which are remote from the client device 12 (i.e. which are not an integral part of the client device 12). In some embodiments, not being an integral part of the client device 12 means not being provided within or on a housing (i.e. casing) of the client device 12. The communication interface 124 may include one or more interface subsystems and these may include, for example, a wireless network interface subsystem 124a, and/or a wired network interface subsystem 124b, and/or a short range wireless interface subsystem 124c, and/or a wired communication interface subsystem 124d.
The wireless network interface subsystem 124a may be a Wi-Fi(RTM) or WiMax(RTM) subsystem, for example (i.e. using a wireless communication protocol defined in IEEE Standard 802.11x or 802.16y). The wireless network interface subsystem 124a may be configured to communicate with another device or system over a range of more than 10m, for example.
The wireless network interface subsystem 124a may be a mobile (i.e. cellular) telephone subsystem which is configured to communicate using a mobile (i.e. cellular) telephone communication network (such a subsystem 124a may be configured to communicate using an LTE(RTM) communication protocol, for example).
The wired network interface subsystem 124b may be, for example, a local area network interface which may use an Ethernet connection and which may use the Internet Protocol.
The short range wireless interface subsystem 124c may be a subsystem which is configured to communicate over a relatively short range (e.g. less than 10m) and this may use, for example, the Bluetooth(RTM) communication standard.
The wired communication interface subsystem 124d may be a subsystem which is configured to communicate with one or more devices which are coupled with a wired connection and which may use a serial or parallel communication protocol. In some embodiments, the wired communication interface subsystem 124d is a communication bus and may be a Universal Serial Bus (USB(RTM)) subsystem 124d.
The communication interface 124 may be configured to communicate with one or more other devices (e.g. the server 11) via the network 13.
The display screen 125 is configured to present one or more graphical user interfaces to a user of the client device 12. The or each graphical user interface may be, for example, generated by the or each processor 121, or may be generated remotely (e.g. by the server 11) and transmitted to the client device 12 to be displayed (i.e. rendered) on the display screen 125.
The client device 12 may include or be communicatively coupled to a user input apparatus 125a which is configured to receive a user's input. The user input apparatus 125a may include a keyboard and/or mouse, for example. In some embodiments, the user input apparatus 125a is communicatively coupled to the client device 12 via the communication interface (e.g. via the short range wireless interface subsystem 124c or the wired communication interface subsystem 124d, for example). In some embodiments, the user input apparatus 125a may be integrated with the display screen 125 and the display screen 125 may be a touchscreen.
The client device 12 may also, as described, include a power supply 127. This power supply 127 may be in the form of a battery, for example, or a connection to a mains power supply, or both. The power supply 127 is configured to provide electrical power to the client device 12 for use by the various parts thereof (such as the one or more processors 121, the memory 122, the computer readable medium 123, the communication interface 124, the display screen 125, and (if provided) the camera 126).
The client device 12 may include the clock 129, which is configured to provide time information (which may include date information) to one or more other parts of the client device 12. The clock 129 may be in the form of independent hardware or may be implemented in software -i.e. in the form of executable instructions stored on the computer readable storage medium and executable by the or each processor 121. The one or more other parts of the client device 12 may include, for example, the or each processor 121 and/or (if provided) the camera 126.
The client device 12 may include the location sensor 128 which may be a sensor for a satellite-based location system - such as GPS, GLONASS, GALILEO, or BDS. The location sensor 128 is configured to determine its location (which may also be the client device 12 location) and to output the location information to one or more other parts of the client device 12. These one or more other parts may include the or each processor 121 and/or (if provided) the camera 126. The location information may be in the form of a global location (e.g. a longitude and latitude).
In some embodiments, the client device 12 includes the camera 126 -which may be an integral part of the client device 12. In some embodiments, the client device 12 is communicatively coupled to a camera 126 which is not an integral part of the client device 12 -in which case, the camera 126 may be communicatively coupled to the client device 12 by the communication interface 124 and by any of the subsystems 124a-d thereof, for example. For instance, the camera 126 may be communicatively coupled to the client device 12 by a USB(RTM) connection to the wired communication interface subsystem 124d.
The camera 126 may be configured to capture an image of a scene local to the camera 126 (i.e. within a field of view of the camera 126). This image may be a visible light image but may include one or more non-visible components (such as in the infrared or ultraviolet light ranges). The image is output by the camera 126 as image data (so, in other words, the image data represents the image captured by the camera 126).
The camera 126 may have an image capture device, which may be a charge coupled device sensor or a complementary metal oxide semiconductor sensor, for example. The camera 126 may include one or more lenses through which light, from the scene, must pass to reach the image capture device. The camera 126 may include one or more light filters through which light must pass to reach the image capture device (e.g. to filter out specific light frequencies or ranges of frequencies). The camera 126 may include an illumination device (such as a light emitting diode) to provide illumination of the scene, for example.
The camera 126 generates and outputs the image data to the or each processor 121 in embodiments in which the camera 126 is a part of the client device 12.
The camera 126 may be configured to append metadata to the image data. This metadata may include, for example, one or more of:
- an identifier for the camera 126 (which may be a unique identifier such as a serial number, and/or make and/or model identifiers);
- a size of the image data (which may be a size of the image in pixels or a size of the image data in bytes);
- a location at which the image data was generated (which may be obtained from the location information in embodiments in which the camera 126 is communicatively coupled to the location sensor 128); and/or
- a time at which the image data was generated (which may be obtained from the clock 129 in embodiments in which the camera 126 is communicatively coupled to the clock 129).
In some embodiments, the camera 126 outputs the image data to the or each processor 121 and it is the or each processor 121 which is configured to append the metadata to the image data.
In some embodiments, this metadata may include other information and this may include an identifier for the or each processor, an identifier for the client device (which may be a unique identifier such as a serial number, and/or make and/or model identifiers), a checksum for the image data, a checksum for the metadata, a single checksum for both the image data and metadata, a user identifier, or the like.
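By way of a non-limiting sketch, such metadata (including checksums over the image data and the metadata) might be assembled as follows; the field names and checksum scheme are illustrative assumptions.

    import hashlib
    import json
    import time

    def build_metadata(image_bytes, camera_id, device_id, location=None):
        """Assemble illustrative metadata for captured image data, including checksums."""
        metadata = {
            "camera_id": camera_id,
            "device_id": device_id,
            "size_bytes": len(image_bytes),
            "captured_at": time.time(),  # time information, e.g. from the clock 129
            "location": location,        # e.g. (latitude, longitude) from the location sensor 128
            "image_sha256": hashlib.sha256(image_bytes).hexdigest(),
        }
        # A single checksum covering both the image data and the metadata fields.
        combined = image_bytes + json.dumps(metadata, sort_keys=True).encode()
        metadata["combined_sha256"] = hashlib.sha256(combined).hexdigest()
        return metadata
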
In embodiments in which the camera 126 is not part of the client device 12, then the image data (and any associated metadata appended by the camera 126) may be provided to the or each processor 121 via the communication interface 124 - using any of the described subsystems 124a-124d, for example. Accordingly, in such embodiments, the camera 126 is communicatively coupled to the or each processor 121 and, by extension, to the client device 12 - this communicative coupling being via the communication interface 124. As will be appreciated, the communicative coupling may be a wired or wireless communicative coupling (see the various described subsystems 124a-124d of the communication interface 124).
The camera 126 may include its own location sensor 128 and/or clock 129 - which may be as described in relation to the client device 12. The camera 126 may also include its own one or more processors (and may include its own memory and/or its own computer readable medium for storing instructions for execution by its one or more processors).
Therefore, the camera 126 may be configured to append metadata to the image data including location and/or time information from its own location sensor and/or clock, as the case may be.
As will be understood, the image data generated by the camera 126 (whether or not that camera 126 is part of the client device 12 or separate and communicatively coupled thereto) may be associated with metadata.
The server 11 could take a number of different forms and may comprise a number of server systems which may be geographically distributed but in communication with each other, for example.
The server 11 (or each server system) may include one or more processors 111, a memory 112, a computer readable medium 113, and a communication interface 114.
The or each processor 111 is communicatively coupled with the memory 112, the computer readable medium 113, and the communication interface 114. As such, the or each processor 111 may be configured to execute one or more instructions stored on the computer readable medium 113 in the performance of the operations described herein.
The computer readable medium 113 may, therefore, store the one or more executable instructions. In the execution of the or each instruction, the or each processor 111 may be configured to store data using the memory 112 and to retrieve stored data from the memory 112 in order to enable the execution of the or each instruction.
The or each processor 111 may be configured to receive information from the communication interface 114 from one or more devices which are remote from the server 11 and which may include the client device 12 (i.e. one or more devices which may be geographically separated from the server 11).
The communication interface 114 may include one or more interface subsystems and these may include, for example, a wireless network interface subsystem 114a, and/or a wired network interface subsystem 114b.
The wireless network interface subsystem 114a may be a Wi-Fi(RTM) or WiMax(RTM) subsystem, for example (i.e. using a wireless communication protocol defined in IEEE Standard 802.11x or 802.16y). The wireless network interface subsystem 114a may be configured to communicate with another device or system over a range of more than 10m, for example.
The wireless network interface subsystem 114a may be a mobile (i.e. cellular) telephone subsystem which is configured to communicate using a mobile (i.e. cellular) telephone communication network (such a subsystem 114a may be configured to communicate using an LTE(RTM) communication protocol, for example).
The wired network interface subsystem 114b may be, for example, a local area network interface which may use an Ethernet connection and which may use the Internet Protocol.
The communication interface 114 may be configured to communicate with one or more other devices (e.g. the client device 12) via the network 13.
The server 11 may be a server provided by a cloud computing service such as the Microsoft Azure (RTM) cloud service.
The network 13 may be a local area network or a wide area network, or a combination thereof. The network 13 may include the Internet. The network 13 may include a wired network and/or a wireless network.
The communications over the network 13 between the client device 12 and the server 11 may be encrypted.
The system 1 may, in some embodiments, include a plurality of client devices 12 which are each configured to communicate with a single (i.e. one and only one) server 11. The server 11 may include a plurality of server systems which are configured to communicate with a single client device 12 (i.e. one and only one client device 12). In some embodiments, the system 1 may include a plurality of client devices 12 which are each configured to communicate with a single (i.e. one and only one) server system of the server 11 (that server system being located at a single geographical location).
The client devices 12, in embodiments including a plurality of client devices 12, need not be the same type of client device - i.e. different types of client device 12 may (in the same embodiment) be communicatively coupled to the server 11. Different client devices 12 may have different configurations and attributes, for example - such as at least one mobile client device 12 and at least one generally static client device 12.
Users of the system 1 may include one or more user types. For example, the users may include one or more of:
- a victim (which may include a claimant and/or a legal representative of a victim (such as a relative));
- a law enforcement officer;
- a witness;
- a medical practitioner (who may be considered to be a witness in some embodiments); and/or
- a legal services professional.
The victim may be the victim of a criminal or civil legal offence, for example. The victim may, for example, be a claimant in a civil legal case. In some embodiments, the user may be a representative of the victim, such as a legal representative (e.g. a relative).
The law enforcement officer may be a member of the police or other investigatory body entrusted with investigation of a criminal or civil offence. References to the police are not intended to be limited to any particular form of law enforcement officer but are intended to encompass, for example, both local and federal law enforcement officers.
The witness may be a witness of a criminal or civil offence, and may be a direct witness of the events to which the offence relates or may be a witness of the results of the offence. A witness may, for example, be a doctor, a nurse, or other member of a medical service who may have treated the victim (e.g. in the event of a case involving an injury). In some embodiments, a medical practitioner (i.e. member of a medical service) may be a separate type of user from other forms of witness.
The legal services professional may be a lawyer, a legal advisor, or the like. The legal services professional may be acting for the victim or, for example, for a suspect.
The different user types may be categorised into data input users (which may include the victim and/or the witness) and data output users (which may generally include the law enforcement officer and/or the legal services professional).
The type of user may determine, for example, which features of embodiments are made available to the user.
Embodiments include a client-side computer program, which may be referred to as an "app" for example, and a server-side computer program.
In some embodiments, the client-side computer program is in the form of instructions, stored on the computer readable medium 123 for execution by the or each processor 121 of the client device 12. In some embodiments, the server-side computer program is in the form of instructions, stored on the computer readable medium 113 for execution by the or each processor 111 of the server device 11.
The client-side computer program may be configured, on start-up, to present a sign-in interface to a user on the display screen 125. The sign-in interface may provide the means by which the user can authenticate their identity with the client-side computer program and/or the server-side program. As such, the sign-in interface may provide a text entry box into which the user may enter a username and a text entry box into which the user may enter a password. The user input may be via the user input apparatus 125a, for example. In some embodiments, other forms of authentication of a user may be used -for example, the use of biometric information collected by a biometric sensor of the client device 12 or communicatively coupled thereto (e.g. by the communication interface 124).
The biometric sensor may be a fingerprint sensor, for example. The biometric sensor may be the camera 126 (or a second camera in relation to which the description of the camera 126 applies equally). The biometric sensor may be configured to sense, for example, a fingerprint, a face, a retina, or the like (i.e. biometric identification information) associated with the user. These are all other potential forms of authentication information (other examples being the username and password mentioned above, for example).
The client-side program may be configured to receive the authentication information and to compare this to locally stored authentication information (i.e. information stored on the client device 12 - such as on the computer readable medium 123) and/or may be configured to send the authentication information to the server-side program. The server-side program may be configured to compare this to remotely stored authentication information (i.e. information stored on the server 11 - such as on the computer readable medium 113). The server-side program may be configured to send the results of that comparison to the client-side program (to authenticate, or not, the user). If the user is authenticated (e.g. by the client-side program, by the server-side program, or both), then the client-side program may be configured to present one or more account-specific interfaces to the user on the display screen 125.
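A minimal sketch of the comparison of authentication information (for the username-and-password case) is given below; the key-derivation parameters are illustrative assumptions, and other forms of authentication, such as biometrics, would be handled differently.

    import hashlib
    import hmac
    import os

    def hash_password(password, salt=None):
        """Derive a storable digest of a password (illustrative parameters)."""
        salt = salt or os.urandom(16)
        digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
        return salt, digest

    def verify_password(password, salt, stored_digest):
        """Compare submitted authentication information with the stored information."""
        _, digest = hash_password(password, salt)
        return hmac.compare_digest(digest, stored_digest)
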
The one or more account-specific interfaces may be interfaces which are specific to that user and/or which include information which is associated with that user's account.
The one or more account-specific interfaces may include a role selection interface. The role selection interface may include an indication of a number of different possible roles for the user. In particular, a single user may use the client-side program as one of a number of different types of user (each type being a role). The roles may, therefore, match one or more of the types of user described herein (such as a victim, a law enforcement officer, a witness, a medical practitioner, and/or a legal services professional). As will be appreciated, a user may use the client-side program as a victim in relation to one case but may, likewise, use the client-side program as a witness in relation to another case. Similarly, it may be that the same user may have more than one role in relation to the same case (for example, a user may have a law enforcement officer role in relation to which they are investigating one or more aspects of the case, but the same user may also have witnessed an incident associated with the case and so may also be a witness).
The user may select one of the roles (i.e. a single role -one and only one) using the user input apparatus 125a, for example.
In this example, the user may have selected the victim role. On selection of the victim role, the client-side program may be configured to present one or more evidence collection interfaces to the user on the display screen 125, into each of which the user may input information using the user input apparatus 125a, for example.
The one or more evidence collection interfaces may include a victim portrait capture interface in relation to which the camera 126 or second camera may be activated and an image captured by the camera presented to the user via the victim portrait capture interface. The capturing of the image of the user may be triggered by the user, via the user input apparatus 125a, for example, and a plurality of images may be presented in sequence so that the user selects one of the plurality of images for use as a victim portrait image. This image may be associated with metadata as described. The victim portrait image may form evidence of the identity of the victim, for example. The victim portrait may, therefore, be a self-captured image of the victim and this may be called a victim "selfie", for example.
The one or more evidence collection interfaces may include a body part selection interface. The body part selection interface may present, on the display screen 125, a chart schematically representing the human body. The chart may have selectable segments for body parts of the represented human body. For example, the head, left arm, left leg, right arm, right leg, left-side torso, right-side torso, left hand, left foot, right hand, right foot, and the like.
The body part selection interface permits the selection of one or more body parts in relation to which evidence is to be gathered. The selection may be made using the user input apparatus 125a, for example.
On the selection of one or more body parts, the client-side program may be configured to present an image capture interface 100 (see figure 3) to the user via the display screen 125. In embodiments and/or situations in which multiple body parts were selected, then an image capture interface 100 may be presented in relation to each body part, in sequence (i.e. serially).
The image capture interface 100 may include a camera-feed section (present on the display screen 125) in which sequential images captured by the camera 126 are presented to the user (e.g. as a video).
The image capture interface 100 may include one or more instructions 1001 to the user and these instructions may be presented temporarily to the user. The or each instruction 1001 may be presented to the user, on the display screen 125, for a predetermined time period and/or until an input is received via the user input apparatus 125a, for example. The or each instruction 1001 may be presented to a user in an instruction section 1001a of the image capture interface 100 which may be displayed throughout the entire (or substantially the entire) presentation of the image capture interface 100.
The image capture interface 100 may include one or more guide images 1002 overlaid with the sequential images, such that the or each guide image 1002 is visible to the user, on the display screen 125, at the same time as the sequential images. The or each guide image 1002 may be displayed throughout the entire (or substantially the entire) presentation of the image capture interface 100.
In some embodiments, the or each guide image 1002 includes an outline of a rectangle or substantially rectangular shape. The or each guide image 1002 may be a shape representative of the shape of a card, such as a credit card or identity card, and may be a shape representative of a card according to ISO/IEC 7810:2003. The shape may be that of a card according to the ID-1 size specification in ISO/IEC 7810:2003 (e.g. about 85.60 mm by about 53.98 mm).
The shape may be scaled in its representation according to a desired distance of evidence from the camera 126 (the evidence being the evidence of which an image is to be captured -which may be an injury to a body part). The desired distance may be a distance from the camera 126 which is within a focal distance of the camera 126, for example.
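As an illustration of how the guide image might be scaled, a simple pinhole-camera approximation can be used; the focal length in pixels and the desired distance below are assumptions, not values prescribed by the embodiments.

    def guide_edge_length_px(real_edge_mm, desired_distance_mm, focal_length_px):
        """Approximate pixel length of a card edge at the desired distance (pinhole model)."""
        return focal_length_px * real_edge_mm / desired_distance_mm

    # e.g. the long edge of an ID-1 card viewed from 250 mm with a 2800-pixel focal length:
    # guide_edge_length_px(85.60, 250.0, 2800.0) -> roughly 959 pixels
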
The or each guide image 1002 may be an indicator in relation to which the user should seek to align a card or other object (of which a card is used as an example) within the images captured by the camera 126 and presented on the display screen 125.
So, for example, the user may place a card 200 - such as a credit card or other card 200 complying with ISO/IEC 7810:2003 (which may be according to the ID-1 size of that standard) - adjacent the evidence. The user may then position the camera 126 (which may of course require positioning of the client device 12 itself) such that a guide image of the one or more guide images 1002 is generally aligned with the card 200 in the images presented on the display screen 125.
The or each instruction 1001 may include an instruction for the user to position the card 200 adjacent the evidence. This may include an instruction to position the card 200 in a plane in which the evidence is located and which may be a focal plane of the camera 126.
The or each instruction 1001 may include an instruction for the user to position the card and the camera 126 such that the card 200 substantially fills a guide image 1002 of the one or more guide images 1002.
The one or more guide images 1002 may be a broken-line outline of the shape of the card, for example. The one or more guide images 1002 may include a plurality of guide images 1002. Each of these guide images 1002 may be representative of the shape of the card, for example, and may all be substantially the same size. The plurality of guide images 1002 may, however, include at least two guide images 1002 which represent the card in different orientations. Accordingly, the plurality of guide images 1002 may include a first guide image 1002 representing the card in a first orientation and a second guide image 1002 representing the card in a second orientation. The first and second orientations may be perpendicular to each other. In some embodiments, there may be four guide images 1002. The guide images 1002 may be distributed around the image capture interface 100 such that the user can use whichever guide image is most convenient.
In some embodiments, there is an upper guide image 1002 which represents the card in a landscape orientation, there is a lower guide image 1002 which represents the card in a landscape orientation, there is a first side guide image 1002 which represents the card in a portrait orientation, and there is a second side guide image 1002 which represents the card in a portrait orientation.
The evidence may be an injury and so the sequential images may be of a body part (which may be the body part indicated by the user in the body part selection interface, for example). The injury may be a cut, graze, bruise, or the like, for example.
The or each guide image 1002 may be located generally towards an edge (or respective edges) of the image capture interface 100 as displayed -such that there is a central part of the image capture interface 100 in which the evidence (e.g. injury) may be located.
In some embodiments, the client-side program is configured to perform real-time card (this being one example of a suitable object) identification within the image data which is generated by the camera 126 during presentation of the image capture interface 100. The client-side program is, therefore, configured to determine when the card 200 and camera 126 are in the required relative position and/or orientation. This required position and/or orientation may be a position and/or orientation in which the card 200 substantially fills one of the one or more guide images 1002, for example.
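One possible way to decide that the card substantially fills a guide image is an intersection-over-union test between the detected card region and the guide rectangle, as sketched below; the 0.8 threshold is an illustrative assumption rather than a value specified by the embodiments.

    def box_iou(box_a, box_b):
        """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
        ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
        ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
        area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
        return inter / float(area_a + area_b - inter)

    def card_fills_guide(card_box, guide_box, min_iou=0.8):
        """True when the detected card substantially fills the guide image 1002."""
        return box_iou(card_box, guide_box) >= min_iou
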
On identification of the required position and/or orientation being satisfied, the image may be captured and stored as a first captured image, for example. The capturing of the image may be automatic or may require user input using the user input apparatus 125a.
If the required position and/or orientation are not satisfied within a predetermined period and/or following capturing of the image (e.g. by user input), then the one or more instructions 1001 may be presented or re-presented to the user through the display screen 125 indicating a corrective action to be taken (e.g. to move the camera 126 and the card 200 closer together or further apart, whilst maintaining the relative in-plane position of the card 200 and the evidence, for example).
In some embodiments, the captured image (e.g. the first captured image) may be captured on user input using the user input apparatus 125a.
Following capture, the image data representing the captured image may be analysed by the client-side program to determine whether the card 200 and camera 126 are in the required relative position and/or orientation. This required position and/or orientation may be a position and/or orientation in which the card 200 substantially fills one of the one or more guide images 1002, for example. This analysis of the image data may still occur substantially immediately after the image capture and may be whilst the image capture interface 100 is still being presented to the user. Therefore, this analysis may also be considered to be in real-time.
If the required position and/or orientation are not satisfied for the captured image, then the one or more instructions 1001 may be presented or re-presented to the user through the display screen 125 indicating a corrective action to be taken (e.g. to move the camera 126 and the card 200 closer together or further apart, whilst maintaining the relative in-plane position of the card 200 and the evidence, for example).
In some embodiments, the or each instruction 1001 includes the original instructions presented to the user prior to or during capture of the image (i.e. re-iteration of the or each instruction) or may include at least one specific instruction which is determined by the client-side program based on the reason why the required position and/or orientation are not satisfied (as determined by the client-side program, for example).
The client-side program may be configured to present to the user, on the display screen 125, a further image capture interface 100 in which, for example, the size and/or position of the or each guide image 1002 is different (compared to the previous image capture interface 100). For example, the guide image 1002 may be larger in this second image capture interface 100 such that the user must, for instance, move the camera 126 closer to the card 200 (and so also the evidence). This may be a "close-up" shot of the evidence, for example. Again, the relative size of the representation of the card in the or each guide image 1002 of the second image capture interface 100 may be such that the evidence will be within the focal distance of the camera 126 and may be in a focal plane of the camera 126. In some embodiments, the second image capture interface 100 is such that alignment of the camera 126 with respect to the evidence is at a different angle than for the previously acquired image data (i.e. that captured using the earlier (i.e. first) image capture interface 100). Again, the client-side program may provide real-time card identification in the same manner as described above.
The client-side program may be, therefore, configured to determine when the card 200 and camera 126 are in the required relative position and/or orientation for the second image capture interface 100. This required position and/or orientation may be a position and/or orientation in which the card 200 substantially fills one of the one or more guide images 1002, for example -which may, of course, be guide images 1002 of a different size and/or orientation to the guide images 1002 of the earlier, first, image capture interface 100.
On identification of the required position and/or orientation being satisfied, the image may be captured and stored as a second captured image, for example. The capturing of the image may be automatic or may require user input using the user input apparatus 125a.
If the required position and/or orientation are not satisfied within a predetermined period, then, similarly, the one or more instructions 1001 may be presented or re-presented to the user through the display screen 125 indicating a corrective action to be taken (e.g. to move the camera 126 and the card 200 closer together or further apart, whilst maintaining the relative in-plane position of the card 200 and the evidence, for example).
This may be repeated for one or more further image capture interfaces 100. Accordingly, a series of images may be captured (each image being provided as part of the image data) and this series of images may include one or more images of the same evidence from a different angle or at a different distance (of the camera 126 relative to the evidence).
In some embodiments, there is no real-time card identification and, instead, the user may use the user input apparatus 125a to trigger the capture of an image (provided as image data) when the user is of a view that the card 200 is in the correct location (e.g. substantially filling one of the one or more guide images 1002). In such embodiments, the client-side program may be configured to assess the captured image by processing the image data to determine whether the required position and/or orientation are satisfied. If not, then the user may be prompted - e.g. through an instruction of the one or more instructions 1001 and a new image capture interface 100 - to retake the image, and one or more corrective actions may be indicated to the user in the or each instruction 1001.
The first and/or second (and any further) captured images may be stored by the client-side program and this storage may be on the computer readable medium 123 (at least initially) and may be uploaded by interaction of the client-side program and the server-side program to the server 11 (for storage on the computer readable medium 113) -as described herein. The captured images are provided as image data and the image data (i.e. each captured image) may be associated with respective metadata and that metadata may be metadata as described herein.
The capturing of the card 200 within the image data enables processing of the image data with identification of the scale of the evidence represented within the image data - as the card 200 is of a known size.
The client-side program may be, as will be understood, configured to identify the card 200 within the image data. In some embodiments, the image data is sent by the client-side program to the server-side program and the server-side program is configured to identify the card 200 within the image data (and to send this information back to the client-side program).
As will be appreciated, the accurate identification of the card 200 within the image data is a complex problem.
The following is a description of the processes involved in the identification of the card 200 within the image data. These processes may be performed by the server 11 and/or by the client device 12, for example.
Embodiments may use artificial intelligence techniques to identify the card 200 within the image data, such as machine learning and, in particular, deep learning image analysis techniques.
The analysis of the image data may use an implementation of the Mask R-CNN method [of Kaiming He, Georgia Gkioxari, Piotr Dollar, Ross Girshick; Mask R-CNN; 2017 IEEE International Conference on Computer Vision (ICCV); DOI: 10.1109/ICCV.2017.322].
In some embodiments, a pre-trained model is used based on an artificial neural network. The artificial neural network may be a deep neural network, for example, including a large number of layers between the input to the neural network and the output thereof. The pre-trained model may have been pre-trained with natural images (represented by image data) which are not domain-specific (e.g. not exclusively taken with a camera of the type expected to be used, and/or without a card 200 in the image data, and/or without an injury in the image data).
In some embodiments, therefore, the pre-trained model is then subjected to further training. In embodiments in which a pre-trained model is not used, then the process is much the same; however, training must be undertaken initially in order to reach largely the same stage in the training process as provided by the pre-trained model (albeit domain-specific images may be used to arrive at a partially trained model which is then subjected to the further training). In the further training, layers of the model may be frozen and further training (or fine-tuning) of the model is undertaken based on domain-specific images. The domain-specific images may be images which are more closely related to the intended evidence to be collected. In some of the examples this evidence concerns an injury. Therefore, the domain-specific images may include images of injuries. The domain-specific images include images of cards 200 (or any other object used for a similar purpose, as described). The domain-specific images may include images which were captured using a camera of the same type as it is expected will be used in practice (e.g. of the same type as the camera 126).
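A sketch of such fine-tuning, assuming a torchvision implementation of Mask R-CNN, is shown below; the choice of library, the freezing of the backbone only, and the hyper-parameters are illustrative assumptions rather than requirements of the embodiments.

    import torchvision
    from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
    from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

    def build_card_segmentation_model(num_classes=2):  # background + card
        # Start from a model pre-trained on natural, non-domain-specific images.
        model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
        # Freeze backbone layers; only the heads are fine-tuned on domain-specific images.
        for param in model.backbone.parameters():
            param.requires_grad = False
        # Replace the box and mask heads so that they predict the "card" class.
        in_features = model.roi_heads.box_predictor.cls_score.in_features
        model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
        in_channels = model.roi_heads.mask_predictor.conv5_mask.in_channels
        model.roi_heads.mask_predictor = MaskRCNNPredictor(in_channels, 256, num_classes)
        return model
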
The domain-specific images may include images captured under various different conditions likely to be encountered in practice. This may include, for example, images captured using different camera types, images captured in different lighting conditions, images captured with and without the use of the illumination device of the camera, images captured with different distances to the injury, and the like.
The domain-specific images may include the card 200 (or other object being used) and/or may include an object of a similar form, with a view to training the model to distinguish between the card 200 (or other object being used) and objects of a similar appearance.
In some embodiments, the model may be evaluated for accuracy (e.g. by the server-side program) using domain-specific images which may be accompanied by annotations which define a boundary of the card 200 in each of the images. In embodiments in which the images include injuries, these may also be identified with annotations and/or a placeholder annotation may be made to represent a portion of the image which might have included an injury. The placeholder may have a known size, for example.
The annotations may be included with the image data representing the images which are then analysed using the model. For example, the image data including the or each annotation may be assessed by the artificial neural network to determine whether there is a card 200 present in the image data and/or the accuracy of the identification of the location of the card 200 in the image data. Training of the model may be periodically interrupted to undergo this evaluation and the model may be deemed to be adequately trained when a predetermined accuracy threshold has been reached.
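The evaluation against annotated images might be sketched as follows, using intersection-over-union between the predicted card mask and the annotated boundary (rendered as a mask); the thresholds are illustrative assumptions.

    import numpy as np

    def mask_iou(predicted_mask, annotated_mask):
        """IoU between a predicted card mask and an annotated card boundary (boolean masks)."""
        predicted, annotated = predicted_mask.astype(bool), annotated_mask.astype(bool)
        union = np.logical_or(predicted, annotated).sum()
        return np.logical_and(predicted, annotated).sum() / union if union else 0.0

    def adequately_trained(ious, accuracy_threshold=0.9, min_iou=0.75):
        """Deem the model adequately trained when enough evaluation images reach the IoU target."""
        hits = sum(1 for iou in ious if iou >= min_iou)
        return hits / len(ious) >= accuracy_threshold
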
A plurality of domain-specific images may be used for this further training process. In some embodiments, more than one hundred domain-specific images may be used. In some embodiments, two hundred or more domain-specific images may be used.
As will be appreciated, the model may be trained in accordance with a training process. The training process may be undertaken in accordance with instructions which are executable by a processor -such as the or each processor 111 of the server and/or the or each processor of the client device 121.
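By way of illustration only, and not as a definitive implementation of the training process described above, the following Python sketch (which assumes the PyTorch torchvision library and an illustrative two-class problem of background plus card) shows one way in which a pre-trained Mask R-CNN model might have its backbone frozen and its prediction heads replaced before fine-tuning on domain-specific images:

    import torch
    import torchvision
    from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
    from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

    NUM_CLASSES = 2  # background + card (an assumed, illustrative class set)

    # Load a Mask R-CNN model pre-trained on natural, non-domain-specific images.
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)

    # Freeze the backbone layers so that only the heads are fine-tuned.
    for param in model.backbone.parameters():
        param.requires_grad = False

    # Replace the box classification head for the new class set.
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, NUM_CLASSES)

    # Replace the mask prediction head for the new class set.
    in_channels = model.roi_heads.mask_predictor.conv5_mask.in_channels
    model.roi_heads.mask_predictor = MaskRCNNPredictor(in_channels, 256, NUM_CLASSES)

    # Only the unfrozen parameters are passed to the optimiser for fine-tuning
    # on the domain-specific images (cards, injuries, and so on).
    params = [p for p in model.parameters() if p.requires_grad]
    optimiser = torch.optim.SGD(params, lr=0.005, momentum=0.9)

Training then proceeds in the usual supervised manner on the domain-specific images, with periodic evaluation against annotated images as described above.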
In some embodiments, the same model is used by multiple different client devices 12 and/or servers 11. In some embodiments, the model is only generated once (although it may be periodically fine-tuned with new domain-specific images, for example). Such fine-tuning may be required when, for example, a new type of client device 12 (e.g. with a new camera 126) is used in the system 1 or becomes prevalent in its use.
In embodiments, therefore, the model -already created -is provided to the client device 12 and/or the server 11. The model may be stored on the computer readable medium 123,113 of either device 12,11. In some embodiments, the client device 12 may request the model from the server 11 which may then provide the model to the client device 12.
Once trained, the model may be used, in accordance with embodiments, to analyse image data captured by the camera 126, for example.
The analysis of the image data according to some embodiments is depicted in figure 8, for example.
In accordance with the analysis, the image data 401 may be provided.
In some embodiments, the analysis of the image may include use of the model to identify areas of interest 405 within the image data -an area of interest identification process. These areas of interest 405 (of which there may be one or more) are areas 405 which may include image data relating to the card 200 but may not be confined solely to image data representing a card 200. One or more additional image processing operations may be performed in relation to the or each area of interest 405 and different image processing operations may be performed in relation to different ones of the areas of interest 405 (in embodiments including more than one such area). The or each area of interest 405 may include image data representing the card 200 and other image data from the surroundings of the card 200. The or each area of interest 405 may be a rectangular area of interest 405, although other shaped areas of interest 405 are envisaged.
The or each area of interest 405 is, for example, a region of the image data which has been identified for subsequent processing and this is to be distinguished from, for example, a bounding box 409 for the card 200. In particular, the or each area of interest 405 may include portions of the image data which do not represent at least part of a card 200. Indeed, at least one of the or each area of interest 405 may include no image data representative of at least part of the card 200 (in some such instances, if there is a card 200 present in the image data then there would also be at least one area of interest 405 which does include image data representative of the card 200). The or each area of interest 405 may be referred to as a region of interest 405, for example.
In some embodiments a Region Proposal Network 404 may be used in the identification of the or each area of interest 405. The Region Proposal Network 404 may be configured to identify anchor points within a feature map 403 generated by a backbone convolutional neural network 402 (e.g. which may be part of the model, and which may be in the form of a Feature Pyramid Network). The feature map 403 may include one or more areas of interest 405 already but these may require refinement. The Region Proposal Network may generate anchor boxes for the anchor points (i.e. candidate refinements of the areas of interest 405). Convolutional layers may then be used (which may be part of the model) to refine the areas of interest 405. This may include, for example, the use of a regression and classification operation (distinguished from the classification process 407 described herein, although much the same process can be applied for the segmentation operation). The result of the use of the Region Proposal Network 404 may be one or more areas of interest 405 which are effectively refined areas of interest 405.
The identification of one or more areas of interest 405 in the image data provides a reduced dataset for further processing, for example.
The one or more areas of interest 405 may, therefore, form a feature map 403 of areas within the image data which represent features in relation to which it is intended to perform further processing (this might be referred to as a refined feature map 403 in embodiments in which the Region Proposal Network 404 was used; however, as will be appreciated, whether or not the Region Proposal Network 404 was used may impact the accuracy of the feature map 403 but the refined feature map 403 is still a feature map 403 within the meaning of the language used herein).
The identification of the or each area of interest 405 in the image data may be undertaken in accordance with instructions which are executable by a processor -such as the or each processor 111 of the server and/or the or each processor of the client device 121.
Image data from the or each area of interest 405 may be subjected to one or more intermediate layers of processing prior to segmentation 406 and classification 407 (see below) -the use of the or each intermediate layer being an intermediate processing step.
These one or more intermediate layers may prepare the image data for the or each area of interest 405 for further processing and this may include, for example, a data pooling process to provide a fixed size representation of the image data of the area of interest 405. As will be appreciated, this may require some merging of image data from sample points to form the fixed size representation and a number of different methods could be used. In some embodiments, the RoIAlign method is used (see the Mask R-CNN method).
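Purely as an illustration of such a pooling step, and assuming the torchvision library, image data for an area of interest might be resampled to a fixed size along the following lines:

    import torch
    from torchvision.ops import roi_align

    # A feature map from the backbone: (batch, channels, height, width).
    features = torch.randn(1, 256, 200, 200)

    # One area of interest, in (x1, y1, x2, y2) feature-map coordinates.
    areas_of_interest = [torch.tensor([[30.0, 40.0, 120.0, 95.0]])]

    # Resample the area of interest to a fixed 7x7 representation, whatever its
    # original shape, so subsequent layers always receive a fixed-size input.
    pooled = roi_align(features, areas_of_interest, output_size=(7, 7), sampling_ratio=2)
    print(pooled.shape)  # torch.Size([1, 256, 7, 7])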
Accordingly, the or each area of interest 405 may be a refined area of interest 405 and/or may be an area of interest 405 which has been normalised to a fixed size, for use in subsequent processing steps as discussed herein.
Subsequent to the identification of the one or more areas of interest 405, a segmentation process 406 may be performed in relation to the or each area of interest 405. This may result in a pixel-level indication of whether each pixel of the image data represents a part of the card 200.
In accordance with the segmentation process 406, the analysis may include, for example, the segmentation of the image data. As mentioned, the segmentation may identify, for example, parts of the image data which relate to the card 200 and parts which do not relate to the card 200. The segmentation process 406 may be performed on the whole or part of the feature map 403 -i.e. in relation to the or each identified area of interest 405 rather than in relation to the image data as a whole.
The parts may be of any predetermined size and this size may be defined in terms of the number of pixels in the part -so, for example, the image data (e.g. within the or each area of interest 405) may be divided into parts in a substantially uniform manner and the segmentation may then determine, in relation to each part, whether that part represents a card 200 or part thereof.
In some embodiments, each part is a single pixel. The segmentation may, therefore, be a pixel-level segmentation of the image data from the camera 126.
The segmentation of the image data may, therefore, include a categorisation for each image part (e.g. for each pixel) as to whether that image part represents at least part of a card 200 or not. This may include determining a categorisation for each part (e.g. for each pixel in turn) as a binary indication -i.e. whether the image part represents at least part of a card 200 as a yes or no indication.
As described herein, categorisation of objects represented by the image data (such as by the or each area of interest 405) may be performed in parallel with the segmentation process 406. Therefore, in some embodiments, the segmentation process 406 may determine whether each part of the image data (e.g. each pixel) represents part of an object (which may be the card 200) or not. The object may be the card 200 (e.g. in relation to an area of interest 405 which includes image data representing at least part of the card 200). In other words, the segmentation process 406 may be performed without knowledge (by the segmentation process 406) of the classification of an object represented by the image data being processed (e.g. without contemporaneous knowledge at the time of performance of the segmentation process 406). The segmentation process 406 may, therefore, be performed in relation to multiple classes (i.e. in relation to different objects). The categorisation may be used to select the segmentation process 406 result (e.g. a mask) which is most likely to be the result of segmenting the card 200. This may be viewed as a final part of the segmentation process 406, or as a step which is subsequent thereto.
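As a simplified sketch only, with the variable names and the class identifier being assumptions made for the purpose of the example, the per-pixel categorisation and the selection of the mask most likely to correspond to the card 200 might be expressed as:

    import numpy as np

    def select_card_mask(mask_probs, class_ids, scores, card_class_id=1, pixel_threshold=0.5):
        # mask_probs: (num_detections, H, W) per-pixel probabilities;
        # class_ids and scores: the per-detection classification results.
        card_indices = [i for i, c in enumerate(class_ids) if c == card_class_id]
        if not card_indices:
            return None  # no card detected in the image data
        # Use the detection classified as a card with the highest confidence.
        best = max(card_indices, key=lambda i: scores[i])
        # Binary (yes/no) indication, per pixel, of membership of the card.
        return mask_probs[best] > pixel_threshold

    # Illustrative usage with dummy values:
    segmented = select_card_mask(np.random.rand(2, 4, 4), class_ids=[2, 1], scores=[0.9, 0.8])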
The original image data may be stored (e.g. on the computer readable medium 123 of the client device 12 or on the computer readable medium 113 of the server 11, as described herein). The segmentation of the image data may be stored independently or in association therewith. In general, information defining the segmentation of the image data will be referred to as segmented image data 408. The segmented image data 408 may include the image data in some embodiments or may be in the form of information (such as a pixel map) to be assessed in combination with separate image data.
The segmented image data 408 may be stored (e.g. on the computer readable medium 123 of the client device 12 or on the computer readable medium 113 of the server 11, as described herein). The segmented image data 408 may be a mask.
The segmentation process 406 may use a convolutional neural network and may use a fully convolutional neural network (which may be part of the model).
The segmentation of the image data may be undertaken in accordance with instructions which are executable by a processor -such as the or each processor 111 of the server and/or the or each processor of the client device 121.
In parallel with the segmentation process 406, the or each area of interest 405 may be analysed to determine a bounding box 409 (or boxes) for the objects represented by the image data for that area of interest 405 (e.g. through regression) -a bounding box 409 defining a part of the image data representing an object (or thought to represent an object). The or each object within the area of interest 405 may then be classified (as being a card 200 representation or a representation of another object or being unclassifiable).
The determining of the or each bounding box 409 and the classification 410 may generally be part of a classification process 407. The classification 407 of the image data may be undertaken in accordance with instructions which are executable by a processor -such as the or each processor 111 of the server and/or the or each processor of the client device 121. The determining of the or each bounding box 409 and the classification 410 may be independent of the segmentation process 406.
Identification of the or each bounding box 409 may include the use of one or more fully connected layers (e.g. of the model) on the or each area of interest 405 (which may be a refined area of interest 405 and/or may be an area of interest 405 which has been normalised to a fixed size). As described, regression may be used to provide the bounding box 409 (for the object in the area of interest 405, which may be the card 200) within the area of interest 405.
Classification may include the use of one or more fully connected layers (e.g. of the model) on the or each area of interest 405 (which may be a refined area of interest 405 and/or may be an area of interest 405 which has been normalised to a fixed size). The classification may use a softmax classifier, for example.
The classification may include generating a confidence factor based on the bounding box 409 -e.g. based on the shape and/or size of the bounding box relative to the expected shape and/or size of the card 200. The confidence factor may be indicative of the likelihood of the bounding box 409 representing the card 200.
The classification may further include applying a threshold to the confidence factor to reject image data defined by the bounding box 409 from one or more further processing steps when the confidence factor does not meet the threshold (and to perform the one or more further processing steps when the threshold is met or exceeded). The one or more further processing steps may include the performance of edge detection on the segmented image data for the same image data (i.e. segmented image data generated based on the same image data as defined by the bounding box 409).
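The thresholding of the confidence factor may, purely by way of example and with an assumed threshold value and data layout, take a form along the following lines:

    CONFIDENCE_THRESHOLD = 0.7  # illustrative value only

    def candidates_for_edge_detection(detections, threshold=CONFIDENCE_THRESHOLD):
        # detections: an assumed list of dicts with 'box', 'score' and 'mask' entries.
        accepted = []
        for detection in detections:
            if detection["score"] >= threshold:
                accepted.append(detection)  # passed on to the edge detection process
            # otherwise the image data defined by the bounding box is rejected
            # from the one or more further processing steps
        return accepted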
The result may be, therefore, a pixel-level map of at least the or each area of interest 405 (which may encompass all of the image data), and one or more classified objects identified with respective bounding boxes.
At this stage, therefore, the likely location of the image data representing the card 200 may be identified using the classified bounding box or boxes and the associated segmented image data 408 can be selected using the or each bounding box. The analysis may identify from a plurality of bounding boxes being candidate cards (i.e. potentially being the boundaries of cards -e.g. areas of interest 405) the most likely image data representing the card by using the classification, for example.
With reference to figure 4, this figure shows a user holding a card 200 adjacent a wound 300 (as an example). A border marking the edge of the segmented image data 408 can be seen indicated with a broken line 201. As can be seen, part of the card 200 is occluded. In addition, as can also be seen, part of the user's finger has been segmented as part of the card 200 in the image data. The bounding box 409 for the card 200 within the image data is indicated by the solid line 202 and it is this bounding box 202/409 which may be used for classification 410 purposes.
The segmented image data 408 is then subjected to edge detection and this may be referred to as an edge detection process, for example -see figure 5, for example, which was generated using the same image data as figure 4. The edge detection process may be undertaken in accordance with instructions which are executable by a processor -such as the or each processor 111 of the server and/or the or each processor of the client device 121.
In accordance with the edge detection process, the segmented image data 408 is analysed to identify likely edges of the card 200. This may include all four edges of the card 200, for example. In order to find the best quadrilateral, the edge detection process first applies a simple contour detection method. Contours are simply curves joining all the continuous points along the boundary. In the next step, the process approximates this contour with another contour/polygon with a small number of edges. At this point, these edges still do not represent the four edges required to describe the quadrilateral, and they may follow occlusions such as a finger holding the card. Thus, in some embodiments, there is further processing of these edges to heuristically pick among them four edges that are most likely to follow along the four edges of the card 200, and finally extrapolate these specific edges to find their intersection. The intersections of these edges thus represent an approximation of the vertices of the card in the image.
More specifically, the edge detection process may determine a contour of the card in the image using the segmented image data 408 -this contour is shown in figure 5 by the solid line 203. The contour may be described, by the edge detection process, in terms of one or more image vectors. Determining the contour may comprise defining a plurality of curved lines which circumscribe the area of image data identified in the segmentation process 406 as representing the card 200. As will be appreciated, this process may result in a plurality of lines which represent not only the edges of the card 200 but also other features within the segmented image data 408 (which may be the result of inaccuracies and noise in the processes, for example). Therefore, determining the contour may include identifying which of these lines are likely to represent the four edges of the card 200. A number of different methods may be used to achieve this and the approach may be heuristic. In some embodiments, the process uses the Douglas-Peucker algorithm. Determining the contour may include analysing these lines to identify one or more quadrilateral shapes which are likely (e.g. due to size and/or shape) to be a card 200.
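Using the OpenCV library purely by way of illustration (the tolerance factor being an assumed value), the contour detection and Douglas-Peucker approximation might be sketched as follows:

    import cv2

    def approximate_card_polygon(segmented_mask, tolerance=0.02):
        # segmented_mask: a binary (0/255) uint8 mask from the segmentation process 406.
        # (OpenCV 4.x return signature for findContours.)
        contours, _ = cv2.findContours(segmented_mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        if not contours:
            return None
        # Take the largest contour as the candidate outline of the card.
        contour = max(contours, key=cv2.contourArea)
        # Douglas-Peucker approximation: replace the contour with a polygon having
        # a small number of edges, within a tolerance proportional to the perimeter.
        epsilon = tolerance * cv2.arcLength(contour, True)
        return cv2.approxPolyDP(contour, epsilon, True)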
It will be appreciated that one or more parts of the card 200 may be occluded. For example, a user may be holding the card 200 and a finger or thumb of the user may be covering part of the card 200 from the camera 126. Other occlusions may occur due to clothing or the like blocking at least part of the card 200 from the camera 126. Therefore, in some embodiments, determining the contour may include extrapolating one or more of the lines to form the quadrilateral shape which is likely to represent the card 200. In some embodiments, the quadrilateral shape is substantially quadrilateral to account for distortion of the image data, such as a result of the intrinsic parameters of the camera 126. In some embodiments, the client device 12 may have associated software for pre-processing the image data (prior to its processing by the aforementioned client-side program) to reduce the effects of distortion of the image data due to the intrinsic properties of the camera 126.
The analysis may identify from a plurality of bounding boxes 409 being candidate cards (i.e. potentially being the boundaries of cards) the most likely image data representing the card by using the classification 407, for example.
In some embodiments, the earlier analysis (in particular the classification 410 of the or each bounding box 409) may require a confidence threshold to be reached. This confidence threshold may be based on, for example, the size, shape, position, and/or colour of the object represented by that image data. If the confidence threshold is not met then the edge detection process may be omitted and an error message may be presented to the user (e.g. on the display screen 125) which may indicate that no card was detected.
The result of this part of the edge detection process may be stored independently or in association with the segmented image data 408. In general, information defining the card contour of the segmented image data 408 will be referred to as card contour data. The card contour data may include the image data (and/or segmented image data 408) in some embodiments or may be in the form of information (such as image vectors) to be assessed in combination with separate image data (and/or segmented image data 408).
The card contour data may be stored (e.g. on the computer readable medium 123 of the client device 12 or on the computer readable medium 113 of the server 11, as described herein).
The card contour data may then be subjected to a sizing process. The sizing process may be undertaken in accordance with instructions which are executable by a processor -such as the or each processor 111 of the server and/or the or each processor of the client device 121.
The sizing process may identify a dimensional translation between pixels and objects within a plane of the card 200 within the image data.
For example, in accordance with the sizing process, locations -within the image data (segmented image data 408 or card contour data) -may be identified for the corners of the card 200. As will be appreciated one or more of the corners may be occluded or otherwise blocked. Therefore, the identification of the corners may include extrapolation of one or more of the lines of the contour along the same vector path as already defined by that contour (the vector path being the path of the edge within the image data, segmented image data 408, or card contour data, and may be defined by an image vector) -if not already performed, for example. The or each line may be extended in this manner until, for example, intersection with another line or extended line. The length of the extension may be limited to the expected maximum occlusion size within the image data and/or may be limited to the maximum dimension of the card 200 (as it is expected to be represented within the image data).
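A corner which is occluded may, for example, be recovered by extrapolating the two adjoining edge lines and computing their intersection. A minimal geometric sketch, independent of any particular library, is:

    def line_intersection(p1, p2, p3, p4):
        # Intersection of the infinite lines through (p1, p2) and (p3, p4),
        # each point being an (x, y) tuple in pixel coordinates.
        x1, y1 = p1; x2, y2 = p2; x3, y3 = p3; x4, y4 = p4
        denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
        if abs(denom) < 1e-9:
            return None  # the edges are (near) parallel, so no usable corner
        det_a = x1 * y2 - y1 * x2
        det_b = x3 * y4 - y3 * x4
        x = (det_a * (x3 - x4) - (x1 - x2) * det_b) / denom
        y = (det_a * (y3 - y4) - (y1 - y2) * det_b) / denom
        return (x, y)

    # Intersecting each pair of adjacent (extended) card edges in this way yields
    # the four corners of the candidate card, including any occluded corners.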
As will be appreciated, therefore, the sizing process may have identified corners of the card within the card contour data -corners which may not be visible within the original image data (or this information may have already been present in the card contour data).
Using the identified corners, the sizing process may then determine the length and/or width of the candidate card. The card 200 may be rectangular and so determining a length may be determining the longest distance between two opposing sides of the candidate card along a line parallel with an edge of the candidate card. Determining a width may be determining the shortest distance between two opposing sides of the candidate card along a line parallel with another edge of the candidate card. The length and the width of the candidate card may be determined along respective lines which are substantially perpendicular to each other.
In some embodiments, the sizing process may account for distortion of the image data, such as a result of the intrinsic parameters of the camera 126. As described, in some embodiments, the client device 12 may have associated software for pre-processing the image data (prior to its processing by the aforementioned client-side program) to reduce the effects of distortion of the image data due to the intrinsic properties of the camera 126.
In some embodiments, one or more other dimensions may be determined. This may include one or more diagonal dimensions between corners of the candidate card.
These dimensions may be represented, at this stage of the sizing process, in terms of a number of pixels and are, in any event, lengths in the frame of reference of the card contour data (which may also be the same frame of reference as the image data and/or segmented image data 408). This will be referred to as a pixel length, for example.
The card 200 is of predetermined dimensions -as described herein.
Therefore, a translation between dimensions within the image data (i.e. the pixel lengths) and the actual (i.e. real-world) dimensions of the card 200 may be determined by comparing the pixel lengths to the actual card 200 dimensions. This translation may be in the form of a pixel length to centimetre translation, for example. The translation may also be referred to as a scaling factor, for example.
As will be appreciated, there may be various sources of error in the determining of this translation in accordance with the sizing process. For example, the card 200 in the image data is unlikely to be perfectly parallel with a focal plane of the camera 126, occluded parts of the card 200 may have resulted in pixel length inaccuracies, and intrinsic properties of the camera 126 may have distorted the image data in one or more axes.
The sizing process may, therefore, use averaging in the generation of the translation. This averaging may be, for example, determining a length of the candidate card 200 at multiple locations and averaging the pixel lengths. This averaging may be, for example, determining a width of the candidate card 200 at multiple locations and averaging the pixel lengths. These averages may, therefore, be performed prior to determining the translation.
This averaging may be, for example, an average of the translation itself. So, for instance, the averaging may include generating a respective translation based on each of at least two of: a determined length of the candidate card as a pixel length, a determined width of the candidate card as a pixel length, and/or a determined diagonal length of the candidate card as a pixel length.
The translations may then be averaged.
The average may be a mean, median, or modal value.
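Assuming, purely by way of example, a card of the ID-1 size referred to herein (85.60 mm by 53.98 mm), the translation might be derived as an average of several per-dimension translations along the following lines (the corner ordering and the use of millimetres per pixel being assumptions of the sketch):

    import math
    import statistics

    CARD_LENGTH_MM = 85.60  # predetermined ID-1 card dimensions
    CARD_WIDTH_MM = 53.98
    CARD_DIAGONAL_MM = math.hypot(CARD_LENGTH_MM, CARD_WIDTH_MM)

    def pixel_distance(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    def scaling_factor(corners):
        # corners: the four card corners in pixel coordinates, ordered
        # top-left, top-right, bottom-right, bottom-left.
        tl, tr, br, bl = corners
        # One translation (millimetres per pixel) for each measured dimension.
        translations = [
            CARD_LENGTH_MM / statistics.mean([pixel_distance(tl, tr), pixel_distance(bl, br)]),
            CARD_WIDTH_MM / statistics.mean([pixel_distance(tl, bl), pixel_distance(tr, br)]),
            CARD_DIAGONAL_MM / pixel_distance(tl, br),
        ]
        # The averaged translation converts a pixel length to an actual length.
        return statistics.mean(translations)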
The translation may then be stored in association with the image data, and/or segmented image data 408, and/or the card contour data. This may be on the computer readable medium 123 of the client device 12 or on the computer readable medium 113 of the server 11, for example.
The pixel dimensions of the candidate card may be used in an error detection process. The error detection process may be undertaken in accordance with instructions which are executable by a processor -such as the or each processor 111 of the server and/or the or each processor of the client device 121.
In accordance with the error detection process, two or more of the pixel dimensions may be compared in order to determine a likely error in the orientation of the card 200 with respect to the focal plane of the camera 126, for example.
If the error detection process identifies that the card 200 is not sufficiently aligned with the focal plane of the camera 126, then the error detection process may be configured to generate an error message and the error message may include an indication of a corrective action to be taken by the user -which may be presented as one of the one or more instructions 1001, for example. The or each such instruction 1001 may include an instruction regarding the position and/or orientation of the card 200, and/or an instruction to recapture the image (and so to generate new image data). The or each instruction 1001 may be a re-iterated instruction as described.
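One simple, illustrative form of such an error check (the tolerance and message being assumed values for the sketch) compares the measured aspect ratio of the candidate card with the ratio of its predetermined dimensions:

    EXPECTED_RATIO = 85.60 / 53.98  # length/width of the example ID-1 card
    RATIO_TOLERANCE = 0.08          # illustrative tolerance only

    def orientation_error(length_px, width_px,
                          expected_ratio=EXPECTED_RATIO, tolerance=RATIO_TOLERANCE):
        # Returns an instruction when the card appears tilted away from the
        # focal plane of the camera, or None when the measured shape is acceptable.
        measured_ratio = max(length_px, width_px) / min(length_px, width_px)
        if abs(measured_ratio - expected_ratio) / expected_ratio > tolerance:
            return ("The card appears tilted relative to the camera: hold the card "
                    "flat, facing the camera, and recapture the image.")
        return None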
As a result, therefore, it will be understood that a translation may be generated which enables an actual distance to be determined in relation to objects within the plane of the card 200 from a pixel length measured within the image data. Furthermore, instructions 1001 may be provided based on analysis of the image data to enable the user to correct the positioning of the card 200 and camera 126 so that this translation can be accurately determined.
Furthermore, as part of this process, the location of the card 200 within the image data may also be determined.
With reference to figure 7, for example, and with the description herein in mind, some embodiments, may include the capture of image data 301 by a user, receipt 302 of the original image data from the user, the segmentation and classification 303 of the image data, the detection of edges from the segmented image data 304, and the determining of a translation (i.e. scaling factor) 305. The original image data and/or the translation may be stored in a database 307. Censored image data (see herein) may be generated 306 and this may also be stored in the database 307.
Embodiments include methods, therefore, for determining the location of an object, such as the card 200, within the image data. Embodiments may also include using the object, such as the card 200, to determine the translation between a length within the image data frame of reference and the actual length -which may be an object length (such as a dimension of the card 200) or a length of another object or feature (such as a wound 300) within the image data. Embodiments may include determining the translation between a length within the image data frame of reference (e.g. as a number of pixels) and an actual length in a plane defined by the object used (such as the card 200).
The processes of locating the card 200 within the image data and determining the translation may be performed by the client device 12, or by the server 11, or by a combination of both devices 12,11. In these processes, in some embodiments, the user is not required to identify or draw a boundary around the object (such as the card 200) and the identification of the object within the image data is substantially automatic.
In some embodiments, a censored image may be generated -see figure 6. The censored image may be defined by censored image data. The censored image data may be stored in association with the image data, and/or segmented image data 408, and/or the card contour data (and/or the translation). This may be on the computer readable medium 123 of the client device 12 or on the computer readable medium 113 of the server 11, for example.
The censored image data may include all or part of the image data. However, image data relating to the card 200 -e.g. that defined by the candidate card contour -may be blurred or otherwise distorted.
The censored image data may be generated using the card contour data, for example, to define a boundary of the card 200 and then modifying the image data within that boundary. The censored image data may be generated using the segmented image data 408, for example, to define parts of the image data which relate to the card 200 and then modifying the image data which relates to the card 200.
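Using OpenCV and NumPy purely as an illustration, censored image data might be generated by blurring only those pixels which the segmented image data 408 attributes to the card 200, along these lines:

    import cv2
    import numpy as np

    def censor_card(image, card_mask, kernel=(51, 51)):
        # image: the original image data (e.g. BGR); card_mask: boolean array
        # marking the pixels segmented as representing the card 200.
        blurred = cv2.GaussianBlur(image, kernel, 0)
        # Keep the blurred pixels only where the card was segmented, leaving the
        # remainder of the image data (e.g. the injury) unmodified.
        censored = np.where(card_mask[..., None], blurred, image)
        return censored.astype(image.dtype)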
The one or more account specific interfaces may include one or more interfaces for the addition of further information associated with a case to which the image data relates. These may be referred to as, for example, further information interfaces and may be presented to the user using the display screen 125. The or each further information interface may prompt the user to enter specific information regarding the case -this may include time, date, and/or location information and may include identification information for parties involved (e.g. a name and/or address and/or a description of someone else involved). The prompt may include requests for information about medical treatment, for example, and/or the value of property damage and/or a description of an incident associated with the evidence (to which the image data relates), and/or additional medical records, and/or other evidence. Each prompt may be associated with a field into which the user can enter information using the user input apparatus 125a, for example. One or more of the prompts may be associated with an upload option which, when selected, presents to the user a file selection user interface which allows a user (e.g. using the user input apparatus 125a) to select one or more files to associate with the case (e.g. with the image data). In some embodiments, there is a prompt for free-form text not otherwise covered by the more specific prompts for information from the user. In some embodiments, the prompts may be accompanied by multiple choice selection options and the user input, via the user input apparatus 125a, may be a selection of one or more of the multiple choice options. The or each multiple choice option may include an injury categorisation -e.g. a stab wound 300, a bruise, or the like. The or each multiple choice option may include an evidence type such as a fingerprint, shoe mark, bloodstain, fire damage, associated object, weapon, digital media, or the like.
One or more of the account specific interfaces may include a thumbnail image representative of the image data (or censored image data) already captured. In some embodiments, there may be one thumbnail associated with each of a plurality of images captured by the user and represented by image data (or censored image data). One or more of such interfaces may include user selectable options (e.g. through the use of the user input apparatus 125a) to add, remove, or view an image of the plurality of images (which may be an image as defined by the censored image data, for example).
The one or more account specific interfaces may include a case summary interface which may, for example, list the evidence provided and provide prompts for the user to add additional information.
The one or more account specific interfaces (such as the case summary interface) may include an option to add contributors to the case. When selected by a user (e.g. using the user input apparatus 125a) the user may be presented with a contacts interface which lists one or more other users of the system 1 who may be linked to the current user. A link may be established through an invitation and acceptance approach -in which a link invitation is sent by a first user to a second user, acceptance of the link invitation by the second user will then cause the first and second users to be linked.
Link invitations may be sent, for example, through a separate communication system -such as email. Other invitation techniques can also be used.
A contributor may then be presented with interface screens in much the same manner but may, for example, select a different role -such as a witness. The contributor may not be presented with the image capture interface 100 but may be presented with any others of the interfaces described herein.
The data which is stored in accordance with embodiments may be organised by case (using, for example, a suitable identifier for the case). A case may, therefore, be associated with different data and that data may all concern the same case but may have been input by one or more different users.
The data may include metadata and this metadata may include metadata associated with image data, censored image data, segmented image data 408, and/or card contour data. Metadata associated with the original image data may be copied to each dataset at its creation and may be supplemented during each processing stage. Accordingly, the metadata provides information which links the datasets and which can be used to determine that the correct dataset is being considered and/or can be used to help to determine if there has been tampering with any of the datasets (e.g. through the identification of inconsistent metadata).
In some embodiments, all datasets which are generated based on particular image data are stored, along with the original image data.
A case stored in accordance with embodiments described herein may be saved in draft form, with the datasets stored on the computer readable medium 123 of the client device 12 or on the computer readable medium 113 of the server 11, for example.
When a user is ready to submit the case to a law enforcement agency or authority, for example, the user may select the case using an interface of the one or more account-specific interfaces and then select an option to submit the case.
On submission, the case (and the associated datasets) may be made available to a law enforcement agency or authority. In some embodiments, users from the law enforcement agency or authority may access the case with the case stored on the computer readable medium 113 of the server 11. In some embodiments, the case -including one or more of the datasets thereof -may be transmitted to a computing device of the law enforcement agency or authority.
The law enforcement agency or authority may access the case and may perform one or more investigatory tasks in relation thereto.
In some embodiments, a law enforcement officer may be able to access the case through a user interface on a client device 12, generally as described, by the selection of the law enforcement officer role.
In some embodiments, an annotation tool 500 is provided -see figure 6 which shows an example tool 500. The annotation tool 500 may operate in accordance with instructions which are executable by a processor -such as the or each processor 111 of the server and/or the or each processor of the client device 121.
The annotation tool 500 may provide an interface by which the image data associated with a case can be accessed. This may be original image data and/or may be censored image data, for example.
The annotation tool 500 is configured to present (i.e. render) to a user on a display screen (which may be the display screen 125) the image represented by the image data (or censored image data). The interface may include one or more annotation selection options which the user is able to select in order to apply respective annotations to the image data (and/or censored image data). The user may, if using a client device 12, use the user input apparatus 125a to make the selection, for example.
The annotations include a length indicator (see A and B in figure 6). For example, the user may identify a start point for the length and an end point, and the interface may then use the translation (as described herein) to convert a pixel length distance between the start point and the end point to determine an actual distance. This actual distance may then be indicated in the user interface. For example, the actual distance may be overlaid on the image as presented (i.e. rendered) or may be located in a separate part of the interface. The start and end point may be indicated in the user interface and this indication may be overlaid on the image as presented (i.e. rendered), for example. In some embodiments, a line is also indicated (in the same manner) between the start and end points -indeed, the start and end points may be indicated by the respective ends of such a line.
In some embodiments, the start and/or end points may be moved and this may cause the actual distance to be recalculated based on the pixel length. The moving of the start and/or end point may be through a drag-and-drop action (e.g. with the user using the user input apparatus 125a). The indication of the actual distance may be updated in substantially real-time as the start and/or end point changes location. Indeed, in some embodiments, the actual distance is updated in substantially real-time as the user may, for example, move a cursor (using the user input apparatus 125a) to set the end point.
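A minimal sketch of the length annotation, using the translation (scaling factor) described herein and with names chosen only for illustration, is:

    import math

    def annotated_length_cm(start_point, end_point, translation_mm_per_px, zoom=1.0):
        # start_point / end_point: (x, y) positions selected on the rendered image;
        # translation_mm_per_px: the stored translation for this image data;
        # zoom: the current zoom factor of the rendering, if any.
        pixel_length = math.hypot(end_point[0] - start_point[0],
                                  end_point[1] - start_point[1]) / zoom
        actual_mm = pixel_length * translation_mm_per_px
        return f"{actual_mm / 10:.1f} cm"  # e.g. overlaid next to the drawn line

    # Re-evaluated on each drag event, so the displayed distance updates in
    # substantially real time as the start and/or end point is moved.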
In some embodiments, other annotations may be possible. For example, the annotation interface may include a user selectable option to overlay a grid (as shown in figure 6, for example) on the image data (or censored image data). The spacing of the grid lines of the overlaid grid may be defined in terms of actual distances (determined using the pixel distances and the translation, as described). The annotation interface may include an option for the distance between the grid lines to be changed (e.g. between 0.5cm, 1.0cm, 1.5cm, and/or 2cm, for example) and/or for an orientation of the grid to be changed.
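Similarly, and again only by way of example, the pixel spacing of the overlaid grid may be derived from the selected real-world spacing and the translation:

    def grid_spacing_px(spacing_cm, translation_mm_per_px):
        # Convert a user-selected grid spacing (e.g. 0.5, 1.0, 1.5 or 2.0 cm)
        # into a spacing in pixels of the underlying image data.
        return (spacing_cm * 10.0) / translation_mm_per_px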
The annotation interface may also, or alternatively, provide the user with options to zoom into parts of the image data (or censored image data) and, in particular, a rendering thereof. The scaling factor may be adjusted to take into account any zoom option which is being used.
The annotation interface may also, or alternatively, provide the user with options to crop parts of the image data (or censored image data) and, in particular, a rendering thereof.
The cropped image may be adjusted automatically to fill a predetermined space defined by the annotation interface. This is, as will be appreciated, a zoom function and, again, this may cause the scaling factor to be adjusted accordingly.
The annotation interface may also, or alternatively, provide the user with options to rotate the image data (or censored image data) and, in particular, a rendering thereof.
The annotation interface may be configured, as will be understood, to allow a plurality of these options to be applied in sequence.
The annotation interface may also, or alternatively, provide the user with options to undo or redo annotations (and/or the changes implemented by the options mentioned above).
The annotation interface may also, or alternatively, provide the user with options to generate censored image data from image data -using the techniques described herein.
The annotation interface may provide an option for the user to save the annotated image. The annotated image may be saved as annotated image data (which may be accompanied by metadata, which may mirror the metadata associated with the image data (and which may include additional information regarding the annotation)). The annotated image may be stored, as annotated image data, on the computer readable medium 113 of the server 11, for example (and/or the computer readable medium 123 of the client device 12 if the annotation tool 500 is being used on the client device 12).
Some or all of the data stored in accordance with embodiments may be extracted and/or used in the automatic population of submissions in legal proceedings, for example.
As will be appreciated, therefore, embodiments provide mechanisms by which evidence can be securely and easily recorded. Embodiments also provide image data in relation to which scale information for what is shown in the image data is determined and may be used to annotate the image data. This process entails image processing and manipulation.
When used in this specification and claims, the terms "comprises" and "comprising" and variations thereof mean that the specified features, steps or integers are included. The terms are not to be interpreted to exclude the presence of other features, steps or components.
The features disclosed in the foregoing description, or the following claims, or the accompanying drawings, expressed in their specific forms or in terms of a means for performing the disclosed function, or a method or process for attaining the disclosed result, as appropriate, may, separately, or in any combination of such features, be utilised for realising the invention in diverse forms thereof.
Although certain example embodiments of the invention have been described, the scope of the appended claims is not intended to be limited solely to these embodiments. The claims are to be construed literally, purposively, and/or to encompass equivalents.

Claims (23)

  1. An image processing method including: receiving image data in relation to an object; segmenting the image data into data representing the object or a part thereof, to form segmented image data; determining one or more edges of the object from the segmented image data; determining a length, in a frame of reference of the image data, of the object using the or each edge; and using the length of the or each edge in the frame of reference of the image data and a predetermined dimension of the object to determine a translation between lengths within the frame of reference of the image and a length of the object.
  2. A method according to claim 1, further including: using the translation to determine the length of another feature represented by the image data and annotating the image data with the determined length of the other feature.
  3. A method according to claim 2, wherein annotating the image data includes overlaying the determined length of the other feature on the image data.
  4. A method according to claim 2 or 3, wherein the other feature is a wound.
  5. A method according to any preceding claim, wherein determining one or more edges of the object from the segmented image data includes determining a plurality of edges of the object, and the method further includes: determining one or more further dimensions of the object from the plurality of edges of the object; determining an orientation of the object by comparing the length and the one or more further dimensions with corresponding predetermined dimensions of the object; and generating an instruction for movement of the object to a predetermined orientation based on the determined orientation.
  6. A method according to claim 5, wherein the image data is captured by a camera and the predetermined orientation is a predetermined orientation with respect to a focal plane of the camera.
  7. A method according to any preceding claim, wherein determining a length, in a frame of reference of the image data, of the object using the or each edge includes extrapolating one or more edges of the object and determining a contour of a candidate object in the image data.
  8. A method according to any preceding claim, further including modifying the image data within the contour of the candidate object in the image data, or which is segmented as part of the object, so as to blur the image of the object.
  9. A method according to any preceding claim, wherein the object is a card complying with the ID-1 size specification in ISO/IEC 7810:2003.
  10. A method according to any preceding claim, further including classifying the image data independently of the segmenting of the image data.
  11. A method according to claim 10, wherein classifying the image data includes determining a bounding box for the object using the image data and generating a confidence factor based on the bounding box, the confidence factor being indicative of the likelihood of the bounding box defining a part of the image data representing the object, and wherein the method further includes applying a threshold to the confidence factor to reject image data defined by the bounding box from one or more further processing steps when the confidence factor does not meet the threshold.
  12. A computer readable medium storing instructions which, when executed by a processor, cause performance of the method of any preceding claim.
  13. A device configured to: receive image data in relation to an object; segment the image data into data representing the object or a part thereof, to form segmented image data; determine one or more edges of the object from the segmented image data; determine a length, in a frame of reference of the image data, of the object using the or each edge; and use the length of the or each edge in the frame of reference of the image data and a predetermined dimension of the object to determine a translation between lengths within the frame of reference of the image and a length of the object.
  14. A device according to claim 13, further configured to: use the translation to determine the length of another feature represented by the image data and annotate the image data with the determined length of the other feature.
  15. A device according to claim 14, wherein annotating the image data includes overlaying the determined length of the other feature on the image data.
  16. A device according to any of claims 13 to 15, wherein the other feature is a wound.
  17. A device according to any of claims 12 to 16, wherein determining one or more edges of the object from the segmented image data includes determining a plurality of edges of the object, and the device is further configured to: determine one or more further dimensions of the object from the plurality of edges of the object; determine an orientation of the object by comparing the length and the one or more further dimensions with corresponding predetermined dimensions of the object; and generate an instruction for movement of the object to a predetermined orientation based on the determined orientation.
  18. A device according to claim 17, wherein the image data is captured by a camera and the predetermined orientation is a predetermined orientation with respect to a focal plane of the camera.
  19. A device according to any of claims 13 to 18, wherein determining a length, in a frame of reference of the image data, of the object using the or each edge includes extrapolating one or more edges of the object and determining a contour of a candidate object in the image data.
  20. A device according to any of claims 13 to 19, further configured to modify the image data within the contour of the candidate object in the image data, or which is segmented as part of the object, so as to blur the image of the object.
  21. A device according to any of claims 13 to 20, wherein the object is a card complying with the ID-1 size specification in ISO/IEC 7810:2003.
  22. A device according to any of claims 13 to 21, further configured to classify the image data independently of the segmenting of the image data.
  23. A device according to claim 22, wherein classifying the image data includes determining a bounding box for the object using the image data and generating a confidence factor based on the bounding box, the confidence factor being indicative of the likelihood of the bounding box defining a part of the image data representing the object, and wherein the device is further configured to apply a threshold to the confidence factor to reject image data defined by the bounding box from one or more further processing steps when the confidence factor does not meet the threshold.
GB2007456.3A 2020-05-19 2020-05-19 Image processing in evidence collection Pending GB2595260A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB2007456.3A GB2595260A (en) 2020-05-19 2020-05-19 Image processing in evidence collection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB2007456.3A GB2595260A (en) 2020-05-19 2020-05-19 Image processing in evidence collection

Publications (2)

Publication Number Publication Date
GB202007456D0 GB202007456D0 (en) 2020-07-01
GB2595260A true GB2595260A (en) 2021-11-24

Family

ID=71135276

Family Applications (1)

Application Number Title Priority Date Filing Date
GB2007456.3A Pending GB2595260A (en) 2020-05-19 2020-05-19 Image processing in evidence collection

Country Status (1)

Country Link
GB (1) GB2595260A (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112907549B (en) * 2021-03-01 2023-09-05 大连海事大学 Method and system for detecting and describing breaking characteristics of shoe print patterns

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014107592A1 (en) * 2013-01-07 2014-07-10 Wexler Ronald M System and method of measuring distances related to an object
WO2014159726A1 (en) * 2013-03-13 2014-10-02 Mecommerce, Inc. Determining dimension of target object in an image using reference object
US20140314276A1 (en) * 2013-01-07 2014-10-23 Wexenergy Innovations Llc System and method of measuring distances related to an object
US20160313576A1 (en) * 2015-04-22 2016-10-27 Kurt Matthew Gardner Method of Determining Eyeglass Frame Measurements from an Image by Executing Computer-Executable Instructions Stored On a Non-Transitory Computer-Readable Medium
US9911237B1 (en) * 2016-03-17 2018-03-06 A9.Com, Inc. Image processing techniques for self-captured images
CN110298876A (en) * 2019-05-29 2019-10-01 北京智形天下科技有限责任公司 A kind of interactive approach for the measurement of intelligent terminal picture size

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014107592A1 (en) * 2013-01-07 2014-07-10 Wexler Ronald M System and method of measuring distances related to an object
US20140314276A1 (en) * 2013-01-07 2014-10-23 Wexenergy Innovations Llc System and method of measuring distances related to an object
WO2014159726A1 (en) * 2013-03-13 2014-10-02 Mecommerce, Inc. Determining dimension of target object in an image using reference object
US20160313576A1 (en) * 2015-04-22 2016-10-27 Kurt Matthew Gardner Method of Determining Eyeglass Frame Measurements from an Image by Executing Computer-Executable Instructions Stored On a Non-Transitory Computer-Readable Medium
US9911237B1 (en) * 2016-03-17 2018-03-06 A9.Com, Inc. Image processing techniques for self-captured images
CN110298876A (en) * 2019-05-29 2019-10-01 北京智形天下科技有限责任公司 A kind of interactive approach for the measurement of intelligent terminal picture size

Also Published As

Publication number Publication date
GB202007456D0 (en) 2020-07-01

Similar Documents

Publication Publication Date Title
JP7075085B2 (en) Systems and methods for whole body measurement extraction
JP6424293B1 (en) Body imaging
EP3520045B1 (en) Image-based vehicle loss assessment method, apparatus, and system, and electronic device
WO2019218621A1 (en) Detection method for living being, device, electronic apparatus, and storage medium
CN111886842B (en) Remote user authentication using threshold-based matching
EP3063731B1 (en) Image cache for replacing portions of images
CN110046600B (en) Method and apparatus for human detection
US9412017B1 (en) Methods systems and computer program products for motion initiated document capture
WO2020134238A1 (en) Living body detection method and apparatus, and storage medium
CN111914775B (en) Living body detection method, living body detection device, electronic equipment and storage medium
Saboia et al. Eye specular highlights telltales for digital forensics: A machine learning approach
US9690980B2 (en) Automatic curation of digital images
CN108229375B (en) Method and device for detecting face image
CN111008935B (en) Face image enhancement method, device, system and storage medium
WO2019214321A1 (en) Vehicle damage identification processing method, processing device, client and server
US11144779B2 (en) Real-time micro air-quality indexing
CN111325698A (en) Image processing method, device and system and electronic equipment
GB2595260A (en) Image processing in evidence collection
WO2019071663A1 (en) Electronic apparatus, virtual sample generation method and storage medium
CN110334590B (en) Image acquisition guiding method and device
CN115115552B (en) Image correction model training method, image correction device and computer equipment
CN115019364A (en) Identity authentication method and device based on face recognition, electronic equipment and medium
US11281851B2 (en) Method and system for gathering and tagging content from documents
KR102027786B1 (en) Method and system for recognizing face of user based on multiple images
CN108446653B (en) Method and apparatus for processing face image