WO2022115045A1 - Video analytics for industrial safety - Google Patents
- Publication number
- WO2022115045A1 (PCT Application No. PCT/SG2021/050739)
- Authority
- WO
- WIPO (PCT)
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/18—Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
- H04N7/181—Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast for receiving images from a plurality of remote sources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Definitions
- the present invention relates, in general terms, to systems and methods for improving industrial safety using video analytics.
- the present disclosure relate to a system for detecting an adverse event, comprising: one or more image capture devices, each device being positioned to capture images of a respective zone; and a server comprising a plurality of neural networks, each neural network receiving, and analysing, at least a portion of said images to identify an object or objects in the respective zone, the server being configured to detect if the respective zone comprises an adverse event based on the identified object or objects; and an alert server for identifying, for each detected adverse event, one or more stakeholders for the respective zone and transmitting an alert to the one or more stakeholders, the alert comprising identification information for the detected adverse event.
- the one or more image capture devices may comprise a plurality of image capture devices. Each image capture device may be configured to send video feed comprising the images of the respective zone.
- Each image capture device may be positioned based on: a relationship between a boundary of the respective zone and a field of view of the respective image capture device; a distance from the respective zone, a resolution of the respective image capture device and a size of a smallest said object sought to be identified in the respective zone.
- a minimum distance of the image capture device from the respective zone may be determined based on the boundary being wholly within the field of view, and a maximum distance of the image capture device from the respective zone is determined based on the resolution enabling detection of the smallest said object at a furthest point in the field of view.
- Each neural network may be configured to identify a different object.
- At least one neural network may be configured to detect a human, and the server may be configured to detect that the respective zone comprises an adverse event if a human is identified by the at least one neural network.
- the server may be configured to detect if the zone comprises an adverse event by calculating a relationship between a human and another identified object.
- the identified object may comprise equipment for use by a human, the server being configured to calculate the relationship by determining if the equipment is present in the images with the at least one human.
- the server may be configured to detect that the respective zone comprises an adverse event when either: the equipment is absent from the images; or a human and the safety equipment are present in the images, but not in association with one another.
- the equipment and human may be considered to be in association with one another if the equipment is being used as it is intended to be used.
- the identification information may comprise at least a location of the adverse event and one or more said images in which the adverse event is visible.
- Each neural network may be configured to label said at least a portion of the images analysed by the respective neural network, based on the respective neural network identifying the object or objects.
- the identification information may comprise said label.
- the alert server may be configured to send the alert to the human, the identification information specifying the equipment that is sought to be identified in association with the human.
- the system may further comprise an intermediary layer, the intermediary layer specifying particular neural networks of the plurality of neural networks for identifying a predetermined said adverse event.
- the system may further comprise an intermediary layer, the intermediary layer specifying particular neural networks of the plurality of neural networks for use with images from each respective image capture device.
- the disclosure also relates to a method for detecting an adverse event, comprising: capturing one or more images using one or more capture devices positioned to capture images of a respective zone; receiving at least a portion of said images at a plurality of neural networks, the neural networks being configured to identify an object or objects in the respective zone; detecting if the respective zone comprises an adverse event based on the identified object or objects; identifying, for each detected adverse event, one or more stakeholders for the respective zone; and transmitting an alert to the one or more stakeholders, the alert comprising identification information for the detected adverse event.
- the disclosure also relates to a system for detecting an adverse event, comprising: one or more image capture devices, each device being positioned to capture images of a respective zone; and a server comprising a plurality of neural networks, each neural network receiving, and analysing, at least a portion of said images to identify an object or objects in the respective zone, the server being configured to detect if the respective zone comprises an adverse event based on the identified object or objects; and the server further identifying, for each detected adverse event, one or more stakeholders for the respective zone and transmitting an alert to the one or more stakeholders, the alert comprising identification information for the detected adverse event.
- Figure 1 illustrates a system for detecting an adverse event
- Figure 2 illustrates a method for detecting an adverse event
- Figure 3 illustrates an exemplary computer system for implementing the server or alert server of the system for detecting an adverse event.
- the disclosure relates to systems and methods for detecting adverse events. These systems and methods can be implemented in workplaces and other areas that have visually identifiable safety risks, to reduce reliance on the alertness of humans that would otherwise monitor those workplaces and other areas.
- the systems and methods rely on video analytics for adverse event detection.
- the disclosed systems advantageously enable the automatic detection of hazards or adverse events such as absence of barricades, use of unsuitable equipment, non use of necessary equipment and other such safety hazards at worksites.
- the disclosed systems also advantageously generate notifications regarding the detected hazards and transmit the notifications to relevant stakeholders or personnel enabling a timely response to the detected adverse event.
- System 100 comprises one or more image capture devices (cameras) 110, each device being positioned to capture images of a respective zone 105.
- the video analytics server 120 comprises a plurality of neural networks 127.
- Each neural network 127 analyses some of the images captured by the camera 110 to identify an object or objects in the respective zone 105 - e.g. each neural network may analyse a subset of the images, or a portion of an image or images in which a particular safety hazard is likely to occur.
- Each neural network 127 may also be configured to label at least a portion of the images analysed by the respective neural network 127, the label indicating a class identifier of the object identified by the neural network 127.
- Based on the inferences generated by the neural networks 127, the video analytics server 120 identifies various adverse events. Adverse events could be detected based on the presence or absence of objects in images captured by the cameras 110. Adverse events could also be detected based on the relative position of objects in images captured by the cameras 110.
- System 100 also comprises an alert server 130 for identifying, for each detected adverse event, one or more stakeholders for the respective zone and transmitting an alert to the one or more stakeholder computing device 140, the alert comprising identification information for the detected adverse event.
- the identification information comprises a label or class information associated with each object detected in images associated with the adverse event.
- the video analytics server 120 comprises at least one processor 122 in communication with memory 124.
- Memory 124 comprises a neural networks module 125, an image labelling module 128 and a user interface module 129.
- the alert server 130 comprises at least one processor 132 and a memory 134.
- Memory 134 comprises a stakeholder database 136.
- the stakeholder database 136 comprises information including stakeholder computing device identifiers and associations between stakeholders and the one or more respective zones 105.
- the stakeholder database 136 allows the direction of alerts regarding adverse events to a specific stakeholder computing device 140 belonging to a stakeholder responsible for the safety of the respective zone.
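The zone-to-stakeholder lookup described above can be sketched as a simple mapping. This is a minimal illustration only: the patent does not specify a schema, so all field names, zone identifiers, and device identifiers below are assumptions.

```python
# Hypothetical stakeholder database mapping each zone 105 to the
# stakeholders responsible for it; all names/IDs are illustrative.
STAKEHOLDER_DB = {
    "zone-105-1": [{"name": "Site Supervisor A", "device_id": "dev-140-1"}],
    "zone-105-2": [{"name": "Safety Officer B", "device_id": "dev-140-2"}],
}

def stakeholders_for_zone(zone_id):
    """Return the stakeholder records responsible for a zone (empty list if none)."""
    return STAKEHOLDER_DB.get(zone_id, [])
```

An alert for an adverse event in a zone would then be directed to the `device_id` of each returned record.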
- the alert server 130 and the video analytics server 120 could be implemented as a common server, with the common server performing the functionality of both the alert server 130 and the video analytics server 120.
- the stakeholder computing device 140 may include an end-user computing device such as a smartphone, or personal computing device accessible to a stakeholder responsible for one or more respective zones 105.
- Each camera 110 is positioned to capture images of a respective zone 105(1)-105(N). Each camera is configured to send a video feed comprising the images of the respective zone 105 to the video analytics server 120.
- the cameras 110 are positioned to capture images with sufficient level of detail or at a sufficient resolution to accurately detect adverse events in the respective zones.
- the cameras are also positioned such that their field of view covers all objects or areas of interest for detection of adverse events.
- Each camera 110 is positioned based on a relationship between a boundary of the respective zone 105 and a field of view of the respective camera 110. Also relevant for positioning of cameras 110 is the distance from the respective zone 105, a resolution of the respective camera 110 and a size of a smallest said object sought to be identified in the respective zone 105.
- a minimum distance of the camera 110 from the respective zone 105 is determined based on the boundary of an area of interest being wholly within the field of view.
- a maximum distance of the camera 110 from the respective zone 105 is determined based on the resolution enabling detection of the smallest said object at a furthest point in the field of view of the camera 110. For example, a camera 110 capturing images of a higher resolution is placed farther away from the observed area, while a camera 110 capturing images of a lower resolution is placed closer to the observed area.
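The minimum/maximum placement constraints above can be sketched with a simple pinhole-camera model. This is a hedged illustration, not the patent's method: the focal length in pixels and the minimum pixel span needed for detection are assumed parameters.

```python
import math

def min_distance_m(zone_width_m, horizontal_fov_deg):
    """Closest distance at which the whole zone boundary fits within the field of view."""
    return (zone_width_m / 2) / math.tan(math.radians(horizontal_fov_deg) / 2)

def max_distance_m(focal_length_px, smallest_object_m, min_pixels):
    """Farthest distance at which the smallest object of interest still spans
    at least `min_pixels` pixels (assumed detection threshold), under a
    pinhole model where pixel span = focal_length_px * size / distance."""
    return focal_length_px * smallest_object_m / min_pixels
```

For example, a 20 m wide zone viewed through a 90-degree lens requires the camera to be at least 10 m away, while a 1000-pixel focal length, a 0.5 m smallest object, and a 20-pixel detection threshold cap the distance at 25 m.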
- the camera 110 may be an IP camera that streams a video feed from the area of interest (respective zone 105).
- the respective zone 105 may correspond to a zone where high-risk activities may occur.
- the high-risk activities may relate to activities occurring at construction sites.
- the video footage captured by the camera 110 is transmitted through a network 115 to a Network Video Recorder (NVR) 117.
- a video analytics server 120 in communication with the Network Video Recorder 117 then processes the video feed with the video analytics algorithms to identify safety hazards. Based on the video analytics performed by the video analytics server 120 when a safety violation or adverse event is observed, an alert is sent to a relevant stakeholder computing device 140 by an alert server 130 operating in communication with the video analytics server 120.
- the alert could be transmitted using a messaging platform such as the Telegram messaging platform.
- the alert may consist of a short video of the detected violation or adverse event with details on the type of violation as well as the location at which the violation was detected. Details of the violations may be stored in the video analytics server 120 or the alert server 130 and may be used to generate dashboards and/or graphs to analyse construction site safety.
- the system 100 is directed to particular applications that are focused on behavioural safety and industrial safety.
- the system 100 also generates real-time or near real-time alerts which immediately notify stakeholders such as project managers or safety officers to take action that could potentially prevent a hazard from occurring.
- the video analytics server 120 comprises neural networks that are specifically trained on images and hazards relevant and specific to construction.
- the video analytics server 120 also provides a user interface through an image labelling module 128 that allows a user to demarcate or label zones of interest in images of a video feed captured by the camera for the algorithms for adverse event detection to work effectively.
- a Neural network 127 is configured to detect a human, and the video analytics server 120 could determine if the respective zone 105 comprises an adverse event if a human is identified by the neural network 127 in images of the respective zone 105.
- the video analytics server 120 could detect if zone 105 comprises an adverse event by calculating a relationship between a human and another identified object detected in images of the respective zone 105. Calculating the relationship between a human and another identified object includes calculating a perceived proximity or distance or overlap of the human to the other identified object based on the images of the respective zone 105.
- the other identified object may comprise equipment for use by a human.
- the video analytics server 120 calculates the relationship by determining if the equipment is present in the images with at least one human.
- the identified object may include a safety harness and the video analytics server 120 could determine whether the human is wearing the safety harness to detect the occurrence of an adverse event.
- the video analytics server 120 could detect that the respective zone 105 comprises an adverse event when either: a necessary equipment is absent from the images, or a human and the safety equipment are present in the images, but not in association with one another.
- the equipment and human may be considered to be in association with one another if the equipment is being used as it is intended to be used, for example a safety harness or a mask must be worn by a human.
- the video analytics server 120 operating in concert with the alert server 130 could also transmit an alert to the human detected in the adverse event as not operating safely, for example operating without the requisite safety equipment.
- the system 100 detects various adverse events in various industrial or work settings. The following section illustrates examples of some adverse events.
- System 100 could detect the presence of barricades and trigger an alert when a barricade is absent from a location where one is required.
- the alert can be sent to notify workers or supervisors of workers that the worksite is unsafe and a barricade has not been detected.
- Detection of the presence of platform ladders
- Platform ladders have a larger surface area such that workers have a more stable area to step onto before performing a task.
- the system 100 could detect the presence of a platform ladder to evaluate whether a safer type of ladder is chosen, and identify the occurrence of an adverse event if a platform ladder is not detected.
- A-Frame ladders have a small area to work on and may cause workers to fall off while using the ladder, therefore the system 100 could detect if the right type of ladder is used and identify the occurrence of an adverse event if an unsuitable ladder such as an A-Frame ladder is used.
- Excavators are dangerous vehicles on construction sites, and system 100 could detect where excavators are located within a worksite to identify whether work is conducted safely and to determine the occurrence of adverse events. For example, the detection of an excavator in a restricted zone on a worksite may be identified as an adverse event.
- the system 100 could detect the non-use or improper use of safety harnesses and identify such non-use or improper use as an adverse event.
- the video analytics server 120 comprises a neural network module 125 implementing deep learning algorithms that process the video feed or images captured by the camera 110 to detect the occurrence of adverse events.
- the neural network module 125 comprises a plurality of neural networks 127.
- Each neural network 127 could detect a specific object, such as a person or a safety harness or a specific kind of equipment.
- the use of neural networks implementing deep learning algorithms provides the flexibility of detection of adverse events associated with various use-cases mentioned above and a high level of accuracy of the detection.
- the neural network module 125 comprises additional logic or additional neural networks such as the intermediary layer 126 embodying the logic to determine the occurrence of adverse events.
- Each object detection neural network 127 detects a specific object in an image in a video feed captured by the camera 110.
- the object detection neural network 127 could also determine a bounding box around the detected object in an image.
- the multiple neural networks 127 detect multiple objects of interest in a single image and define a bounding box around each detected object.
- the relative position of the detected objects as demarcated by the respective bounding boxes is analysed by the neural network module 125 or other logic implemented by the video analytics server 120 to determine the occurrence of an adverse event.
- a first neural network 127 could detect a person and a second neural network 127 could detect safety harnesses in a common image. If the bounding boxes identified around the detected person and the safety harness sufficiently overlap, then this overlap is inferred by the neural network module 125 as a safe operation, i.e. no adverse event is detected.
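The overlap test described above can be sketched as an intersection-over-union (IoU) check between the person box and the harness box. This is an illustrative sketch: the patent does not specify an overlap metric or threshold, so IoU and the 0.1 threshold below are assumptions.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) bounding boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def is_adverse_event(person_box, harness_box, threshold=0.1):
    """Adverse event if the harness is absent, or present but not sufficiently
    overlapping the detected person (i.e. not 'in association'). Threshold is
    an assumed value for illustration."""
    if harness_box is None:
        return True
    return iou(person_box, harness_box) < threshold
```

With sufficient overlap the pair is treated as safe operation; a missing or non-overlapping harness box is flagged as an adverse event.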
- the neural network module 125 is trained using a training dataset comprising images with labelled objects and bounding boxes or segments defining objects of interest in the training images.
- the training dataset also includes a label associated with each training image indicating whether the training image corresponds to an adverse event or not.
- the various neural networks of the neural network module 125 are trained to both detect objects and to classify or identify adverse events based on the examples in the training dataset.
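One training record in the dataset described above might take the following shape: an image reference, labelled object bounding boxes, and an image-level adverse-event label. The field names are assumptions for illustration; the patent does not define a record format.

```python
def make_record(image_path, boxes, adverse_event):
    """Assemble one hypothetical training record; `boxes` holds
    (label, x1, y1, x2, y2) tuples for each annotated object."""
    return {
        "image": image_path,
        "objects": [
            {"label": lbl, "bbox": (x1, y1, x2, y2)}
            for (lbl, x1, y1, x2, y2) in boxes
        ],
        # image-level label: does this image depict an adverse event?
        "adverse_event": bool(adverse_event),
    }
```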
- This system 100 addresses several critical industrial safety related problems. Firstly, every construction site needs safety officers to observe the safety of operations, but only a limited number of trained safety officers may be available to fully monitor a worksite. The safety officers may spend a significant amount of time ensuring safety regulations are complied with instead of focusing on their core tasks. Adverse event detection by the system 100 reduces the workload a safety officer typically carries.
- Figure 2 illustrates a method 200 for detecting an adverse event that may be performed by the system 100 of Figure 1.
- Step 210 Capture Video
- Step 210 comprises capturing one or more images using one or more capture devices 110 positioned to capture images of a respective zone 105.
- One or more images are captured in the form of a video.
- the video is captured through a video acquisition module which is in the form of a hardware device or camera 110 that is used to capture video.
- the hardware module consists of a camera that may be placed at an angle of inclination of between 30 and 80 degrees and overlooks a region of high risk in an industrial setting or a worksite. The camera is placed at a distance far enough to view the entire, or a substantial part of, the high-risk activity that could potentially occur.
- the video streams are captured from multiple hardware modules of multiple image capture devices 110 overlooking various types of high-risk activities in industrial sites. They are continually streamed to the video analytics server 120, which performs the analytics.
- the video streams from various sources are stored in the network video recorder 117.
- the video analytics server 120 could query the network video recorder 117 to retrieve video streams associated with a specific camera 110 and captured over a specific period on a particular date.
- the video streams stored in the network video recorder 117 could be queried on an ad-hoc basis by the stakeholder computing device 140, and the results of the query are transmitted by the video analytics server 120 to the stakeholder computing device 140.
- the video feeds captured by the camera 110 could be streamed in real-time or near real-time to the stakeholder computing device 140.
- the video stream captured by the camera 110 may be of a resolution equal to or greater than 480p.
- the video feed may be captured at any predefined FPS in an H.264 format.
- hardware modules that capture these kinds of video streams include IP cameras (Hikvision, Dahua, Pelco, etc.), mobile phones, and wearable cameras.
- Step 220 Receiving at least a portion of said images at a plurality of neural networks to detect objects of interest
- the at least a portion of the images captured by the cameras 110 are received at the processor 122 of the video analytics server 120 and processed by the neural networks embodied in the neural network module 125 to identify an object or objects in the respective zone.
- the video streams originating from the camera 110 are transmitted over a 3G/4G/5G or a WIFI or a wired network to the video analytics server 120 located physically on-site or in the cloud.
- the video stream may comply with the Real-Time Streaming Protocol (RTSP), Hypertext Transfer Protocol Live Streaming (HLS) Protocol, Session Description Protocol (SDP), or Audio Video Interleave (AVI) Protocol, etc.
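An RTSP feed such as those above is typically addressed by URL. The sketch below composes the common `rtsp://[user:pass@]host:port/path` form; the path pattern and credentials are illustrative assumptions, since real IP cameras (Hikvision, Dahua, etc.) each use vendor-specific stream paths.

```python
def rtsp_url(host, port=554, path="stream1", user=None, password=None):
    """Compose an RTSP URL of the common rtsp://[user:pass@]host:port/path form.
    554 is the standard RTSP port; `path` is a placeholder stream path."""
    auth = f"{user}:{password}@" if user and password else ""
    return f"rtsp://{auth}{host}:{port}/{path}"

# Opening the stream (requires opencv-python and a live camera):
#   import cv2
#   cap = cv2.VideoCapture(rtsp_url("192.0.2.10", user="admin", password="secret"))
#   ok, frame = cap.read()
```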
- the intermediary layer 126 specifies one or more neural networks 127 of the plurality of neural networks for identifying an adverse event.
- the video analytics server 120 comprises the following 14 custom-trained exemplary neural networks 127:
- This neural network is trained on several custom obtained images from various open-source locations and photos from various industrial sites.
- The neural network detects the presence of an excavator from any angle and at a distance of more than 80 m away.
- This neural network is trained to detect the presence of workers at any location in the industrial site video frame.
- 3) Neural network for Detection of Personal Protective Equipment
- This neural network is trained on a dataset that allows the detection of the helmet, safety vest, boots, gloves, and goggles present on a worker detected in the industrial site.
- This neural network enables the detection of the presence of barricades of different colours in the zone of interest, which is defined by the front-end user interface of the video analytics platform.
- 6) Neural network for Detection of guardrails
- This neural network enables the detection of the presence of guardrails of different colours in the zone of interest, which is defined by the front-end user interface of the video analytics platform.
- 7) Neural network for Detection of A-Frame ladders
- This neural network enables the detection of A-frame ladders present in the video stream.
- This neural network enables the detection of platform ladders present in the video stream.
- This neural network enables the detection of various types of traffic conditions present in the industrial site, such as dense and sparse traffic, and also detects accidents.
- 10) Neural network for Detection of stagnant water
- This neural network enables the detection of the presence of stagnant water in the video stream.
- This neural network enables the detection of the presence of potholes in the video stream.
- This neural network enables the detection of the presence of housekeeping hazards such as cluttered piping and over-stacked materials.
- 13) Neural network for Detection of forklifts
- This neural network is trained on several custom obtained images from various open-source locations and photos from various industrial sites.
- The neural network detects the presence of a forklift from any angle and at a distance of more than 80 m away.
- This neural network enables the detection of masks present on or worn by workers.
- the above 14 neural networks 127 are utilized in combination with one another to identify safety risks present in the industrial site.
- a front-end user interface provided by the user interface module 129 allows a user to select images captured by a specific camera 110 and view which of the above-noted safety hazards have been detected.
- Step 230 Detecting if the respective zone comprises an adverse event based on the identified object or objects
- the video analytics server 120 detects if the respective zone comprises an adverse event based on the object or objects identified by the neural networks 127.
- the intermediary layer 126 between the input layer receiving the images captured by the cameras 110 and the 14 neural networks acts as the channel through which the safety hazard details are broken down and the relevant neural network is activated to identify the presence of a specific safety hazard.
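The routing role of the intermediary layer 126 described above can be sketched as a table mapping each safety hazard to the detector networks needed to identify it. This is a minimal illustration under assumed names; the patent does not specify how the layer is implemented.

```python
# Hypothetical routing table: hazard -> detector networks to activate.
# Hazard and detector names are illustrative, not from the patent.
HAZARD_TO_DETECTORS = {
    "missing_barricade": ["barricade_detector"],
    "harness_not_worn": ["worker_detector", "harness_detector"],
    "excavator_in_restricted_zone": ["excavator_detector"],
}

def detectors_for(hazard):
    """Return the neural networks the intermediary layer would activate
    for a given safety hazard (empty list if the hazard is unknown)."""
    return HAZARD_TO_DETECTORS.get(hazard, [])
```

For instance, checking for an unworn harness activates both the worker detector and the harness detector, whose outputs are then combined downstream.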
- the following is a list of examples of safety hazards or adverse events:
- the neural networks 127 work in cohesion to identify the presence of each of the above exemplary safety hazards or adverse events and other such adverse events. Each detected safety hazard or adverse event contributes to a safety score, and the scores are added together to provide an overall industrial site safety score.
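The additive scoring just described can be sketched as a weighted sum over detected hazards. The per-hazard weights below are illustrative assumptions; the patent does not give a scoring scheme.

```python
# Hypothetical per-hazard contributions to the site safety score.
HAZARD_WEIGHTS = {"missing_barricade": 5, "harness_not_worn": 10, "pothole": 2}

def site_safety_score(detected_hazards):
    """Sum the contributions of all detected hazards into one overall
    site-level score; unknown hazards contribute zero."""
    return sum(HAZARD_WEIGHTS.get(h, 0) for h in detected_hazards)
```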
- Step 240 Identifying, for each detected adverse event, one or more stakeholders for the respective zone
- Memory 124 of the video analytics server 120 comprises a stakeholder database 132.
- the stakeholder database 132 could be stored on a computing device other than the video analytics server 120 (such as the alert server 130) and may be accessible to the video analytics server 120.
- the stakeholder database 132 comprises a mapping of the respective zones 105 with one or more stakeholder or stakeholder identities.
- the stakeholder database 132 also comprises information to enable the specific direction of alerts or messages to a particular stakeholder computing device 140.
- the video analytics server 120 identifies one or more stakeholders responsible for the zone in which the adverse event was detected.
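A minimal sketch of the stakeholder lookup described above, assuming a hypothetical dictionary-backed stakeholder database 132; the zone and device identifiers are invented for illustration and a real deployment would use a proper datastore:

```python
# Hypothetical stakeholder database: zone identifier -> responsible
# stakeholders and their computing-device identifiers.
STAKEHOLDER_DB = {
    "zone-105-1": [{"name": "Safety Officer A", "device_id": "device-140-1"}],
    "zone-105-2": [{"name": "Site Manager B", "device_id": "device-140-2"}],
}

def stakeholders_for_zone(zone_id):
    """Look up the stakeholders responsible for the zone of an adverse event."""
    return STAKEHOLDER_DB.get(zone_id, [])
```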
- Step 250 Transmitting an alert to the one or more stakeholders
- an alert is transmitted to the identified stakeholders.
- the alert comprises identification information associated with the adverse event.
- the identification information comprises at least a location of the adverse event and one or more said images in which the adverse event is visible.
- the identification information could also comprise the label or class information associated with objects detected/not detected in relation to the identified adverse event.
- the alert is sent to the stakeholder computing device 140.
- the alert could be in the form of a telegram channel message or WhatsApp group message or WeChat message or Email or text message, for example.
- the alert comprises information regarding the adverse event enabling decision making in response to the detected adverse event by the stakeholder.
- the notification or alert comprises information identifying the zone in which the violation or adverse event is detected, the timing of the violation, a short video of the violation and the risk level of the violation.
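The alert contents listed above can be assembled as a simple payload before transmission. This Python sketch uses hypothetical field names and is illustrative only:

```python
from datetime import datetime, timezone

def build_alert(zone_id, violation_type, risk_level, video_uri):
    """Assemble an alert payload with zone, timing, risk level and video clip."""
    return {
        "zone": zone_id,
        "violation": violation_type,
        "detected_at": datetime.now(timezone.utc).isoformat(),
        "risk_level": risk_level,
        "video": video_uri,
    }
```

Such a payload could then be serialised and sent over any of the messaging channels mentioned above.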
- the alert could be transmitted by the alert server 130.
- the transmission of the alert by alert server 130 may be initiated by the video analytics server 120.
- Data regarding the violations or adverse events detected by the video analytics server 120 is stored in memory 124 or a storage device accessible to the video analytics server 120.
- Data regarding the adverse events is used to populate a dashboard of adverse events.
- the dashboard is served or made available to the stakeholder computing device 140 through the user interface module 129 of the video analytics server.
- the dashboard contains details of the violation or adverse event, location, timing, risk level and a video of the violation.
- the dashboard comprises graphs containing historical information of the site safety conditions, broken down on a daily, weekly, or monthly basis.
- a safety/unsafety score may be allocated to each detected adverse event.
- the safety scores may be added together to produce a safety score for an industrial site or worksite.
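The score aggregation described above might be sketched as follows; the per-risk weights are assumptions chosen for illustration only:

```python
# Hypothetical per-event risk weights; higher totals indicate a less safe site.
RISK_WEIGHTS = {"low": 1, "medium": 3, "high": 5}

def site_safety_score(detected_events):
    """Sum the per-event scores into an overall worksite safety score."""
    return sum(RISK_WEIGHTS[event["risk_level"]] for event in detected_events)
```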
- Figure 3 illustrates an example computer system 300.
- the video analytics server 120, alert server 130 and the stakeholder computing device 140 may comprise one or more components described with reference to the computer system 300 to provide the requisite functionality.
- one or more computer systems 300 perform one or more steps of one or more methods described or illustrated herein.
- one or more computer systems 300 provide functionality described or illustrated herein.
- software running on computer system/systems 300 performs steps of the methods described or illustrated herein or provides functionality described or illustrated herein.
- Particular embodiments include one or more portions of one or more computer systems 300.
- reference to a computer system may encompass a computing device, and vice versa, where appropriate.
- reference to a computer system may encompass one or more computer systems, where appropriate.
- computer system 300 may be an embedded computer system, a system-on-chip (SOC), a single-board computer system (SBC), a desktop computer system, a laptop or notebook computer system, a mesh of computer systems, a mobile telephone, a server, a tablet computer system, or a combination of two or more of these.
- computer system 300 may include one or more computer systems 300; be unitary or distributed; span multiple locations; span multiple machines; span multiple data centres; or reside in a cloud, which may include one or more cloud components in one or more networks.
- one or more computer systems 300 may perform without substantial spatial or temporal limitation one or more steps of one or more methods described or illustrated herein. As an example and not by way of limitation, one or more computer systems 300 may perform in real-time or in batch mode one or more steps of one or more methods described or illustrated herein. One or more computer systems 300 may perform at different times or at different locations one or more steps of one or more methods described or illustrated herein, where appropriate.
- Computer system 300 includes a processor 302, memory 304, storage 306, an input/output (I/O) interface 308, a communication interface 310, and a bus 312.
- Processor 302 may include hardware for executing instructions, such as those making up a computer program. As an example and not by way of limitation, to execute instructions, processor 302 may retrieve (or fetch) the instructions from an internal register, an internal cache, memory 304, or storage 306; decode and execute them; and then write one or more results to an internal register, an internal cache, memory 304, or storage 306. Processor 302 may include one or more internal caches for data, instructions, or addresses. Instructions in the instruction caches of processor 302 may be copies of instructions in memory 304 or storage 306, and the instruction caches may speed up retrieval of those instructions by processor 302.
- Data in the data caches may be copies of data in memory 304 or storage 306 for instructions executing at processor 302 to operate on; the results of previous instructions executed at processor 302 for access by subsequent instructions executing at processor 302 or for writing to memory 304 or storage 306; or other suitable data.
- Processor 302 may include one or more arithmetic logic units (ALUs); be a multi-core processor; or include one or more processors 302. Although this disclosure describes and illustrates a particular processor, this disclosure contemplates any suitable processor.
- Memory 304 includes main memory for storing instructions for processor 302 to execute or data for processor 302 to operate on.
- computer system 300 may load instructions from storage 306 or another source (for example, another computer system 300) to memory 304.
- Processor 302 may then load the instructions from memory 304 to an internal register or internal cache.
- processor 302 may retrieve the instructions from the internal register or internal cache and decode them.
- processor 302 may write one or more results (which may be intermediate or final results) to the internal register or internal cache.
- Processor 302 may then write one or more of those results to memory 304.
- One or more memory buses (which may each include an address bus and a data bus) may couple processor 302 to memory 304.
- Bus 312 may include one or more memory buses.
- one or more memory management units (MMUs) reside between processor 302 and memory 304 and facilitate access to memory 304 requested by processor 302.
- Storage 306 may include mass storage for data or instructions.
- storage 306 may include a hard disk drive (HDD), a floppy disk drive, flash memory, an optical disc, a magneto-optical disc, magnetic tape, or a Universal Serial Bus (USB) drive or a combination of two or more of these.
- Storage 306 may include removable or non-removable (or fixed) media, where appropriate.
- Storage 306 may be internal or external to computer system 300, where appropriate. This disclosure contemplates mass storage 306 taking any suitable physical form.
- Storage 306 may include one or more storage control units facilitating communication between processor 302 and storage 306, where appropriate. Where appropriate, storage 306 may include one or more storages 306.
- I/O interface 308 includes hardware, software, or both, providing one or more interfaces for communication between computer system 300 and one or more I/O devices.
- Computer system 300 may include one or more of these I/O devices, where appropriate.
- One or more of these I/O devices may enable communication between a person and computer system 300.
- I/O interface 308 may include one or more device or software drivers enabling processor 302 to drive one or more of these I/O devices.
- I/O interface 308 may include one or more I/O interfaces 308, where appropriate.
- Communication interface 310 may include hardware, software, or both providing one or more interfaces for communication (such as, for example, packet-based communication) between computer system 300 and one or more other computer systems 300 or one or more networks.
- communication interface 310 may include a network interface controller (NIC) or network adapter for communicating with an Ethernet or other wire-based network or a wireless NIC (WNIC) or wireless adapter for communicating with a wireless network, such as a WI-FI network.
- Bus 312 may include hardware, software, or both coupling components of computer system 300 to each other.
- adverse event refers to an event giving rise to a potential for injury, equipment damage, idle worker(s) and other undesirable events.
- the intermediary layer governs the use of particular neural networks on images from particular image capture devices - i.e. cameras.
- This ensures neural networks used for detecting adverse events are limited to those relevant to one or both of (i) adverse events (e.g. fall hazards, human-equipment interactions and others) anticipated to occur in the respective zone for the particular image capture device, and (ii) adverse events expected to occur in the workplace in which the image capture device is located.
- For example, for a camera positioned to detect proper use of harnesses on an elevated platform, there is no need to use a neural network trained to recognise heavy equipment such as excavators.
- the neural networks may be limited to those identifying personal protective equipment (PPE), forklifts and other potential adverse events occurring in that factory setting.
- a user may also specify particular hazards they wish to identify in the zone associated with a particular image capture device.
Abstract
Systems for detecting adverse events. The system comprises a server comprising a plurality of neural networks for analysing images to identify an object or objects and detect if the respective zone comprises an adverse event based on the identified object or objects. The server identifies, for each detected adverse event, stakeholders and transmits an alert to the stakeholders.
Description
Video Analytics for Industrial Safety
Technical Field The present invention relates, in general terms, to systems and methods for improving industrial safety using video analytics.
Background Worksites such as construction, building and industrial sites have several safety hazards that present a risk to operational health and safety. Safety breaches may not only impact the health of workers but also the productivity of the worksite. The hazards may originate from various sources such as dangerous machinery or an inherently dangerous work environment such as a mining operation. One approach to addressing such safety hazards is to place IoT sensors on dangerous machinery and workers and trigger an alarm when a worker is in unsafe proximity to the dangerous machinery. The IoT sensor-based approach requires a large number of IoT sensors on large worksites. It further necessitates the calibration and management of a large number of IoT sensors and the networking infrastructure to support operation of the IoT sensors.
It would be desirable to overcome or alleviate at least one of the above-described problems, or at least to provide a useful alternative.
Summary
The present disclosure relates to a system for detecting an adverse event, comprising:
one or more image capture devices, each device being positioned to capture images of a respective zone; and a server comprising a plurality of neural networks, each neural network receiving, and analysing, at least a portion of said images to identify an object or objects in the respective zone, the server being configured to detect if the respective zone comprises an adverse event based on the identified object or objects; and an alert server for identifying, for each detected adverse event, one or more stakeholders for the respective zone and transmitting an alert to the one or more stakeholders, the alert comprising identification information for the detected adverse event.
The one or more image capture devices may comprise a plurality of image capture devices. Each image capture device may be configured to send video feed comprising the images of the respective zone.
Each image capture device may be positioned based on: a relationship between a boundary of the respective zone and a field of view of the respective image capture device; a distance from the respective zone; a resolution of the respective image capture device; and a size of a smallest said object sought to be identified in the respective zone.
A minimum distance of the image capture device from the respective zone may be determined based on the boundary being wholly within the field of view, and a maximum distance of the image capture device from the respective zone is determined based on the resolution enabling detection of the smallest said object at a furthest point in the field of view.
- Each neural network may be configured to identify a different object. At least one neural network may be configured to detect a human, and the server may be configured to detect that the respective zone comprises an adverse event if a human is identified by the at least one neural network.
The server may be configured to detect if the zone comprises an adverse event by calculating a relationship between a human and another identified object.
The identified object may comprise equipment for use by a human, the server being configured to calculate the relationship by determining if the equipment is present in the images with the at least one human.
The server may be configured to detect that the respective zone comprises an adverse event when either: the equipment is absent from the images; or a human and the safety equipment are present in the images, but not in association with one another.
The equipment and human may be considered to be in association with one another if the equipment is being used as it is intended to be used.
The identification information may comprise at least a location of the adverse event and one or more said images in which the adverse event is visible.
Each neural network may be configured to label said at least a portion of the images analysed by the respective neural network, based on the respective neural network identifying the object or objects. The identification information may comprise said label.
The alert server may be configured to send the alert to the human, the identification information specifying the equipment that is sought to be identified in association with the human.
The system may further comprise an intermediary layer, the intermediary layer specifying particular neural networks of the plurality of neural networks for identifying a predetermined said adverse event.
The system may further comprise an intermediary layer, the intermediary layer specifying particular neural networks of the plurality of neural networks for use with images from each respective image capture device.
- The disclosure also relates to a method for detecting an adverse event, comprising: capturing one or more images using one or more capture devices positioned to capture images of a respective zone; receiving at least a portion of said images at a plurality of neural networks, the neural networks being configured to identify an object or objects in the respective zone; detecting if the respective zone comprises an adverse event based on the identified object or objects; identifying, for each detected adverse event, one or more stakeholders for the respective zone; and transmitting an alert to the one or more stakeholders, the alert comprising identification information for the detected adverse event.
The disclosure also relates to a system for detecting an adverse event, comprising:
one or more image capture devices, each device being positioned to capture images of a respective zone; and a server comprising a plurality of neural networks, each neural network receiving, and analysing, at least a portion of said images to identify an object or objects in the respective zone, the server being configured to detect if the respective zone comprises an adverse event based on the identified object or objects; and the server further identifying, for each detected adverse event, one or more stakeholders for the respective zone and transmitting an alert to the one or more stakeholders, the alert comprising identification information for the detected adverse event.
Brief description of the drawings
Embodiments of the present invention will now be described, by way of non-limiting example, with reference to the drawings in which:
Figure 1 illustrates a system for detecting an adverse event;
Figure 2 illustrates a method for detecting an adverse event; and
Figure 3 illustrates an exemplary computer system for implementing the server or alert server of the system for detecting an adverse event.
Detailed description
The disclosure relates to systems and methods for detecting adverse events. These systems and methods can be implemented in workplaces and other areas that have visually identifiable safety risks, to reduce
reliance on the alertness of humans that would otherwise monitor those workplaces and other areas. The systems and methods rely on video analytics for adverse event detection. The disclosed systems advantageously enable the automatic detection of hazards or adverse events such as absence of barricades, use of unsuitable equipment, non-use of necessary equipment and other such safety hazards at worksites. The disclosed systems also advantageously generate notifications regarding the detected hazards and transmit the notifications to relevant stakeholders or personnel enabling a timely response to the detected adverse event.
- One such system 100 is shown in Figure 1. System 100 comprises one or more image capture devices (cameras) 110, each device being positioned to capture images of a respective zone 105. The video analytics server 120 comprises a plurality of neural networks 127. Each neural network 127 analyses some of the images captured by the camera 110 to identify an object or objects in the respective zone 105 - e.g. each neural network may analyse a subset of the images, or a portion of an image or images in which a particular safety hazard is likely to occur. Each neural network 127 may also be configured to label at least a portion of the images analysed by the respective neural network 127, the label indicating a class identifier of the object identified by the neural network 127. Based on the inferences generated by the neural networks 127, the video analytics server 120 identifies various adverse events. Adverse events could be detected based on the presence or absence of objects in images captured by the camera 110. Adverse events could also be detected based on the relative position of objects in images captured by the camera 110.
System 100 also comprises an alert server 130 for identifying, for each detected adverse event, one or more stakeholders for the respective zone and transmitting an alert to the one or more stakeholder computing
device 140, the alert comprising identification information for the detected adverse event. The identification information comprises a label or class information associated with each object detected in images associated with the adverse event. The video analytics server 120 comprises at least one processor 122 in communication with memory 124. Memory 124 comprises a neural networks module 125, an image labelling module 128 and a user interface module 129. The alert server 130 comprises at least one processor 132 and a memory 134. Memory 134 comprises a stakeholder database 136. The stakeholder database 136 comprises information including identification of stakeholder computing device identifiers and associations between stakeholders and the one or more respective zones 105. The stakeholder database 136 allows the direction of alerts regarding adverse events to a specific stakeholder computing device 140 belonging to a stakeholder responsible for the safety of the respective zone. The alert server 130 and the video analytics server 120 could be implemented as a common server, with the common server performing the functionality of both the alert server 130 and the video analytics server 120. The stakeholder computing device 140 may include an end-user computing device such as a smartphone, or personal computing device accessible to a stakeholder responsible for one or more respective zones 105.
Each camera 110 is positioned to capture images of a respective zone 105(1)-105(N). Each camera is configured to send a video feed comprising the images of the respective zone 105 to the video analytics server 120.
The cameras 110 are positioned to capture images with sufficient level of detail or at a sufficient resolution to accurately detect adverse events in the respective zones. The cameras are also positioned such that their
field of view covers all objects or areas of interest for detection of adverse events. Each camera 110 is positioned based on a relationship between a boundary of the respective zone 105 and a field of view of the respective camera 110. Also relevant for positioning of cameras 110 is the distance from the respective zone 105, a resolution of the respective camera 110 and a size of a smallest said object sought to be identified in the respective zone 105.
- A minimum distance of the camera 110 from the respective zone 105 is determined based on the boundary of an area of interest being wholly within the field of view. A maximum distance of the camera 110 from the respective zone 105 is determined based on the resolution enabling detection of the smallest said object at a furthest point in the field of view of the camera 110. For example, a camera 110 capturing images of a higher resolution may be placed farther away from the observed area, while a camera 110 capturing images of a lower resolution is placed closer to the observed area.
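Under a simple pinhole-camera assumption, the minimum and maximum placement distances described above can be estimated as in this illustrative Python sketch; the function and parameter names are assumptions, not values given in the disclosure:

```python
import math

def distance_bounds(zone_width_m, fov_deg, sensor_px, min_object_m, min_object_px=20):
    """Estimate camera placement bounds under a pinhole model.

    Minimum distance: the whole zone boundary fits in the horizontal field
    of view. Maximum distance: the smallest object of interest still spans
    at least min_object_px pixels across the sensor.
    """
    half_fov = math.tan(math.radians(fov_deg / 2))
    # Visible width at distance d is 2 * d * tan(fov/2), so the zone fits
    # when d >= (zone_width / 2) / tan(fov/2).
    min_dist = (zone_width_m / 2) / half_fov
    # An object of size s at distance d spans s * sensor_px / (2 * d * tan(fov/2))
    # pixels; requiring at least min_object_px pixels bounds d from above.
    max_dist = (min_object_m * sensor_px) / (2 * min_object_px * half_fov)
    return min_dist, max_dist
```

For a hypothetical 90-degree lens, 1920-pixel-wide sensor, 20 m zone and 0.5 m smallest object, this gives a placement window of roughly 10 m to 24 m.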
- The camera 110 may be an IP camera that streams a video feed from the area of interest (respective zone 105). The respective zone 105 may correspond to a zone where high-risk activities may occur. The high-risk activities may relate to high-risk activities in construction sites. The video footage captured by the camera 110 is transmitted through a network 115 to a Network Video Recorder (NVR) 117. The Network Video
Recorder 117 may be located on the same site that is being monitored by system 100. A video analytics server 120 in communication with the Network Video Recorder 117 then processes the video feed with the video analytics algorithms to identify safety hazards. Based on the video analytics performed by the video analytics server 120 when a safety violation or adverse event is observed, an alert is sent to a relevant stakeholder computing device 140 by an alert server 130 operating in
communication with the video analytics server 120. The alert could be transmitted using a messaging platform such as the Telegram messaging platform.
The alert may consist of a short video of the detected violation or adverse event with details on the type of violation as well as the location at which the violation was detected. Details of the violations may be stored in the video analytics server 120 or the alert server 130 and may be used to generate dashboards and/or graphs to analyse construction site safety.
- The system 100 is directed to particular applications that are focused on behavioural safety and industrial safety. The system 100 also generates real-time or near real-time alerts which immediately notify stakeholders such as project managers or safety officers to take action that could potentially prevent a hazard from occurring. The video analytics server 120 comprises neural networks that are specifically trained on images and hazards relevant to construction sites. The video analytics server 120 also provides a user interface through an image labelling module 128 that allows a user to demarcate or label zones of interest in images of a video feed captured by the camera, so that the algorithms for adverse event detection work effectively. A neural network 127 is configured to detect a human, and the video analytics server 120 could determine that the respective zone 105 comprises an adverse event if a human is identified by the neural network 127 in images of the respective zone 105. The video analytics server 120 could detect if zone 105 comprises an adverse event by calculating a relationship between a human and another identified object detected in images of the respective zone 105. Calculating the relationship between a human and another identified object includes calculating a perceived proximity or distance or overlap of the human to the other identified
object based on the images of the respective zone 105. The other identified object may comprise equipment for use by a human. The video analytics server 120 calculates the relationship by determining if the equipment is present in the images with at least one human. For example, the identified object may include a safety harness and the video analytics server 120 could determine whether the human is wearing the safety harness to detect the occurrence of an adverse event.
- The video analytics server 120 could detect that the respective zone 105 comprises an adverse event when either: necessary equipment is absent from the images, or a human and the safety equipment are present in the images, but not in association with one another. The equipment and human may be considered to be in association with one another if the equipment is being used as it is intended to be used, for example a safety harness or a mask must be worn by a human. The video analytics server 120 operating in concert with the alert server 130 could also transmit an alert to the human detected in the adverse event as not operating safely, for example operating without the requisite safety equipment. The system 100 detects various adverse events in various industrial or work settings. The following section illustrates examples of some adverse events.
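The two adverse-event conditions above (equipment absent, or present but not in association with a human) can be captured in a small decision function; this is an illustrative sketch only, with invented names:

```python
def detect_equipment_event(humans_present, equipment_present, associated):
    """Flag an adverse event per the two conditions described:
    required equipment absent from the images, or human and equipment
    both present but not in association with one another."""
    if humans_present and not equipment_present:
        return True   # equipment absent from the images
    if humans_present and equipment_present and not associated:
        return True   # present, but not used as intended
    return False
```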
Detection of the presence of barricades
- Barricades are present in industrial settings to prevent workers from falling from heights. System 100 could detect the presence of barricades and trigger the sending of an alert when a barricade is not present where it is supposed to be. The alert can be sent to notify workers or supervisors of workers that the worksite is unsafe and a barricade has not been detected.
Detection of the presence of platform ladders
Platform ladders have a larger surface area such that workers have a more stable area to step onto before performing a task. The system 100 could detect the presence of a platform ladder to evaluate whether a safer option of a ladder is chosen and identify the occurrence of an adverse event if a platform ladder is not detected.
Detection of the presence of A-Frame ladders
A-Frame ladders have a small area to work on and may cause workers to fall off while using the ladder, therefore the system 100 could detect if the right type of ladder is used and identify the occurrence of an adverse event if an unsuitable ladder such as an A-Frame ladder is used.
Detection of the presence of excavators
Excavators are dangerous vehicles in the construction sites and system 100 could detect where excavators are located within a worksite to identify if work is conducted safely and determine the occurrence of adverse events. For example, the detection of an excavator in a restricted zone on a worksite may be identified as an adverse event.
Detection of the presence of safety harness
- Workers working at heights must always wear a safety harness. Working without a harness may cause them to fall from height. The system 100 could detect the non-use or improper use of safety harnesses and identify such non-use or improper use as an adverse event.
Once any of the safety hazards based on the described applications or other such applications are detected by the video analytics server 120, notifications to a stakeholder computing device 140 are sent by the alert server 130. The alerts may be in the form of a Telegram message,
- WhatsApp message, WeChat message, Email or SMS or any other suitable messaging or communication technology. The messages could be sent in real-time or near real-time with respect to the occurrence or detection of the adverse event. Memory 124 of the video analytics server 120 comprises a neural network module 125 implementing deep learning algorithms that process the video feed or images captured by the camera 110 to detect the occurrence of adverse events. The neural network module 125 comprises a plurality of neural networks 127. Each neural network 127 could detect a specific object, such as a person or a safety harness or a specific kind of equipment. The use of neural networks implementing deep learning algorithms provides the flexibility of detection of adverse events associated with the various use-cases mentioned above and a high level of accuracy of the detection. While each neural network 127 is configured to detect a specific object or a class of objects, the neural network module 125 comprises additional logic or additional neural networks such as the intermediary layer 126 embodying the logic to determine the occurrence of adverse events. Each object detection neural network 127 detects a specific object in an image in a video feed captured by the camera 110. The object detection neural network 127 could also determine a bounding box around the detected object in an image. The multiple neural networks 127 detect multiple objects of interest in a single image and define a bounding box around each detected object. The relative position of the detected objects as demarcated by the respective bounding boxes is analysed by the neural network module 125 or other logic implemented by the video analytics server 120 to determine the occurrence of an adverse event.
For example, a first neural network 127 could detect a person and a second neural network 127 could detect safety harnesses in a common image. If the bounding boxes identified around the detected person and the safety harness sufficiently overlap, then this overlap is inferred by the neural network module 125 as a safe operation, i.e. no adverse event is detected. However, if the bounding boxes identified around the detected person and the safety harness do not sufficiently overlap or no safety harness is detected, then this condition or outcome is flagged as an adverse event by the neural network module 125. The neural network module 125 is trained using a training dataset comprising images with labelled objects and bounding boxes or segments defining objects of interest in the training images. In addition, the training dataset also includes a label associated with each training image indicating whether the training image corresponds to an adverse event or not. The various neural networks of the neural network module 125 are trained to both detect objects and to classify or identify adverse events based on the examples in the training dataset.
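The bounding-box overlap test described above is commonly computed as intersection-over-union (IoU). In this illustrative Python sketch, the overlap threshold is a hypothetical value, not one specified in the disclosure:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) bounding boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def harness_worn(person_box, harness_box, threshold=0.3):
    """Infer 'harness worn' when the two boxes sufficiently overlap;
    the 0.3 threshold is an assumed example value."""
    return iou(person_box, harness_box) >= threshold
```

If no harness box is produced at all, or `harness_worn` returns False, the condition would be flagged as an adverse event as described above.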
Commercial Applications
This system 100 addresses several critical industrial safety related problems. Firstly, every construction site needs safety officers to observe the safety of operations. Only a limited number of trained safety officers may be available to fully monitor a worksite. The safety officers may spend a significant amount of time ensuring safety regulations are complied with instead of focusing on their core task of detecting adverse events. Adverse event detection by the system 100 reduces the workload a safety officer is typically responsible for.
If adverse events are not proactively detected, they may lead to a safety hazard that results in the issuance of a stop-work order. Such stop-work orders may leave the hired manpower, machinery and equipment on site idle. As a result of the stop-work order, there is also an increased chance of delay in the delivery of a project. With adverse event detection by the system 100 in place, safety officers can spend their time more effectively. System 100 provides around-the-clock safety monitoring and sends real-time alerts to safety managers so that rectification actions can be taken before the adverse events result in the issuance of non-compliance or stop-work orders.
Figure 2 illustrates a method 200 for detecting an adverse event that may be performed by the system 100 of Figure 1.
Step 210: Capture Video
Step 210 comprises capturing one or more images using one or more capture devices 110 positioned to capture images of a respective zone 105. The one or more images are captured in the form of a video. The video is captured through a video acquisition module in the form of a hardware device or camera 110. The hardware module consists of a camera that may be placed at an angle of inclination of between 30 and 80 degrees, overlooking a region of high risk in an industrial setting or a worksite. The camera is placed at a distance far enough to view all, or a substantial part, of the region in which the high-risk activity could potentially occur.
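The placement distance can be bounded with a simple pinhole-camera sketch: the minimum distance keeps the zone boundary wholly within the field of view, while the maximum distance keeps the smallest object of interest large enough in pixels for reliable detection. The 32-pixel detector requirement and the function name are assumptions, not part of the disclosure:

```python
import math

def placement_range(zone_width_m, fov_deg, horiz_pixels,
                    smallest_object_m, min_pixels_on_target=32):
    """Rough min/max camera distances for a zone (pinhole model;
    min_pixels_on_target is an assumed detector requirement)."""
    half_fov = math.radians(fov_deg / 2)
    # Min distance: the zone boundary must fit wholly in the field of view.
    d_min = (zone_width_m / 2) / math.tan(half_fov)
    # Max distance: the smallest object must still span enough pixels.
    # pixels on target ~ horiz_pixels * object_size / (2 * d * tan(half_fov))
    d_max = (horiz_pixels * smallest_object_m) / (
        2 * min_pixels_on_target * math.tan(half_fov))
    return d_min, d_max
```

For a 20 m zone viewed at 60 degrees with a 1920-pixel sensor and a 0.5 m smallest object, this places the camera between roughly 17 m and 26 m from the zone.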
The video streams are captured from the hardware modules of multiple image capture devices 110 overlooking various types of high-risk activities in industrial sites. They are continually streamed to the video analytics server 120, which performs the analytics. The video streams from the various sources are stored in the network video recorder 117. The video analytics server 120 could query the network video recorder 117 to retrieve video streams associated with a specific camera 110 and captured over a specific period on a particular date. The video streams stored in the network video recorder 117 could also be queried on an ad-hoc basis by the stakeholder computing device 140, with the results of the query transmitted by the video analytics server 120 to the stakeholder computing device 140. The video feeds captured by the camera 110 could be streamed in real-time or near real-time to the stakeholder computing device 140.
The video stream captured by the camera 110 may be of a resolution equal to or greater than 480p. The video feed may be captured at any predefined frame rate (FPS) in an H.264 format. Examples of hardware modules that capture these kinds of video streams include IP cameras (Hikvision, Dahua, Pelco, etc.), mobile phones and wearable cameras.
Step 220: Receiving at least a portion of said images at a plurality of neural networks to detect objects of interest

At step 220, at least a portion of the images captured by the cameras 110 is received at the processor 122 of the video analytics server 120 and processed by the neural networks embodied in the neural network module 125 to identify an object or objects in the respective zone.
The video streams originating from the camera 110 are transmitted over a 3G/4G/5G, WiFi or wired network to the video analytics server 120, located physically on-site or in the cloud. The video stream may comply with the Real-Time Streaming Protocol (RTSP), Hypertext Transfer Protocol Live Streaming (HLS), the Session Description Protocol (SDP) or the Audio Video Interleave (AVI) format, for example. The video streams are then processed by the video analytics server 120. The video streams first reach an intermediary layer 126 which acts as an intermediary between the video stream data and the various neural networks 127 present in the neural network module 125.
The intermediary layer 126 specifies one or more neural networks 127 of the plurality of neural networks for identifying an adverse event. The video analytics server 120 comprises the following 14 custom-trained exemplary neural networks 127:
1) Neural network for Detection of excavators
This neural network is trained on several custom-obtained images from various open-source locations and photos from various industrial sites. The neural network detects the presence of an excavator from any angle and at distances of more than 80 m.
2) Neural network for Detection of workers
This neural network is trained to detect the presence of workers at any location in the industrial site video frame.

3) Neural network for Detection of Personal Protective Equipment
This neural network is trained on a dataset that allows the detection of the helmet, safety vest, boots, gloves, and goggles present on a worker detected in the industrial site.
4) Neural network for Detection of safety harnesses

This neural network enables the detection of the presence of safety harnesses on workers working at height.
5) Neural network for Detection of barricades
This neural network enables the detection of the presence of barricades of different colours in the zone of interest which is defined by the front-end user interface of the video analytics platform.
6) Neural network for Detection of guardrails
This neural network enables the detection of the presence of guardrails of different colours in the zone of interest which is defined by the front-end user interface of the video analytics platform.

7) Neural network for Detection of A-Frame ladders
This neural network enables the detection of A-frame ladders present in the video stream.
8) Neural network for Detection of platform ladders
This neural network enables the detection of platform ladders present in the video stream.
9) Neural network for Detection of industrial traffic congestion
This neural network enables the detection of various traffic conditions present in the industrial site, including dense and sparse traffic, and also detects accidents.

10) Neural network for Detection of stagnant water
This neural network enables the detection of the presence of stagnant water in the video stream.
11) Neural network for Detection of potholes
This neural network enables the detection of the presence of potholes in the video stream.
12) Neural network for Detection of housekeeping hazards
This neural network enables the detection of the presence of housekeeping hazards such as cluttered piping, and over-stacked materials.
13) Neural network for Detection of forklifts
This neural network is trained on several custom-obtained images from various open-source locations and photos from various industrial sites. The neural network detects the presence of a forklift from any angle and at distances of more than 80 m.
14) Neural network for Detection of masks
This neural network enables the detection of masks present on or worn by workers.
The above 14 neural networks 127 are utilized in combination with one another to identify safety risks present in the industrial site. A front-end user interface provided by the user interface module 129 allows a user to select images captured by a specific camera 110 and view which of the above-noted safety hazards have been detected.
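The per-camera selection performed by the intermediary layer 126 can be sketched as a simple routing table that runs only the networks relevant to a camera's zone. The camera identifiers and detector groupings below are hypothetical examples, not part of the disclosure:

```python
# Hypothetical camera-to-detector routing; the real system uses the
# 14 custom-trained networks listed above.
CAMERA_DETECTORS = {
    "cam-elevated-platform": ["workers", "safety_harnesses", "guardrails"],
    "cam-excavation-pit":    ["workers", "excavators", "barricades"],
    "cam-factory-floor":     ["workers", "ppe", "forklifts", "masks"],
}

def route(camera_id, frame, detectors):
    """Run only the networks registered for this camera's zone and
    collect their detections keyed by detector name."""
    detections = {}
    for name in CAMERA_DETECTORS.get(camera_id, []):
        detections[name] = detectors[name](frame)
    return detections
```

This reflects the point made later in the description: a camera watching harness use on an elevated platform never invokes the excavator network.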
Step 230: Detecting if the respective zone comprises an adverse event based on the identified object or objects
At step 230, based on an output of the neural networks 127, the video analytics server 120 detects if the respective zone comprises an adverse event based on the object or objects identified by the neural networks 127. The intermediary layer 126, between the input layer receiving the images captured by the cameras 110 and the 14 neural networks, acts as the channel through which the safety hazard details are broken down and the relevant neural network is activated to identify the presence of a specific safety hazard. The following is a list of examples of safety hazards or adverse events:
1) Worker is closer than 1 m to an excavator or a forklift
2) An A-frame ladder is used instead of a platform ladder
3) Worker is not wearing personal protective equipment when necessary
4) Worker is not wearing a mask when necessary
5) Worker is not wearing a safety harness when working at heights

6) Worker is leaning on guardrails or barricades at heights
7) Barricades are not present at open edges at heights
8) Barricades are not present around dangerous machinery
9) Accidents or heavy traffic exists around the industrial site
10) Worker crosses a restricted zone in the industrial site

11) Stagnant water is present around the industrial site
12) Potholes and uneven surfaces are present on the site
13) Poor housekeeping in the form of cluttered pipes, cables and over-stacked material is present on the site
The neural networks 127 work in cohesion to identify the presence of each of the above exemplary safety hazards or adverse events, and other such adverse events. Each detected safety hazard or adverse event contributes to a safety score, and these scores are added together to provide an overall industrial site safety score.
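A minimal sketch of this aggregation follows, assuming illustrative per-hazard weights and a 0-100 site score; neither the weights nor the scale are specified in the disclosure:

```python
# Assumed per-hazard risk weights; the disclosure does not specify values.
RISK_WEIGHTS = {
    "missing_harness": 10,
    "missing_ppe": 5,
    "worker_near_excavator": 8,
    "stagnant_water": 2,
}

def site_safety_score(detected_events, base=100):
    """Aggregate per-event penalties into an overall site score
    (unknown hazards get an assumed default weight of 1)."""
    penalty = sum(RISK_WEIGHTS.get(e, 1) for e in detected_events)
    return max(0, base - penalty)
```

A site with no detected adverse events keeps the full base score; repeated serious hazards drive the score towards zero.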
Step 240: Identifying, for each detected adverse event, one or more stakeholders for the respective zone
Memory 124 of the video analytics server 120 comprises a stakeholder database 132. The stakeholder database 132 could be stored on a computing device other than the video analytics server 120 (such as the alert server 130) and may be accessible to the video analytics server 120.
The stakeholder database 132 comprises a mapping of the respective zones 105 to one or more stakeholders or stakeholder identities. The stakeholder database 132 also comprises information to enable the direction of alerts or messages to a particular stakeholder computing device 140. At step 240, based on the adverse event detected at step 230, the video analytics server 120 identifies one or more stakeholders responsible for the zone in which the adverse event was detected.
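A minimal sketch of the zone-to-stakeholder mapping held in the stakeholder database 132; the zone identifiers, contact records and schema below are hypothetical, as the disclosure does not prescribe a format:

```python
# Hypothetical stakeholder database 132: zone id -> contact records.
STAKEHOLDER_DB = {
    "zone-3-scaffold": [
        {"name": "Site Safety Officer", "channel": "telegram",
         "address": "@zone3_safety"},
        {"name": "Project Manager", "channel": "email",
         "address": "pm@example.com"},
    ],
}

def stakeholders_for(zone_id):
    """Look up the stakeholders mapped to a zone (step 240)."""
    return STAKEHOLDER_DB.get(zone_id, [])
```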
Step 250: Transmitting an alert to the one or more stakeholders
At step 250 an alert is transmitted to the identified stakeholders. The alert comprises identification information associated with the adverse event. The identification information comprises at least a location of the adverse event and one or more said images in which the adverse event is visible. The identification information could also comprise the label or class information associated with objects detected/not detected in relation to the identified adverse event.
Once any safety hazard or violation is detected by the video analytics server 120, the alert is sent to the stakeholder computing device 140. The alert could be in the form of a Telegram channel message, WhatsApp group message, WeChat message, email or text message, for example. The alert comprises information regarding the adverse event, enabling decision making by the stakeholder in response to the detected adverse event. The notification or alert comprises information identifying the zone in which the violation or adverse event is detected, the timing of the violation, a short video of the violation and the risk level of the violation.
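The alert fields described above (zone, timing, clip and risk level) could be assembled as follows; the field names and the `build_alert` helper are illustrative assumptions rather than a prescribed message format:

```python
from datetime import datetime, timezone

def build_alert(zone_id, event_type, risk_level, clip_url, labels):
    """Assemble the alert contents described above: zone identity,
    timing, a short clip reference and the risk level."""
    return {
        "zone": zone_id,
        "event": event_type,
        "risk_level": risk_level,
        # Timestamp recorded in UTC at detection time.
        "detected_at": datetime.now(timezone.utc).isoformat(),
        "clip": clip_url,
        "labels": labels,  # object classes detected/not detected
    }
```

The resulting dictionary could then be serialised and dispatched by the alert server 130 over any of the messaging channels mentioned above.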
The alert could be transmitted by the alert server 130. The transmission of the alert by alert server 130 may be initiated by the video analytics server 120.
Data regarding the violations or adverse events detected by the video analytics server 120 is stored in memory 124 or a storage device accessible to the video analytics server 120. Data regarding the adverse events is used to populate a dashboard of adverse events. The dashboard is served or made available to the stakeholder computing device 140 through the user interface module 129 of the video analytics server. The dashboard contains details of the violation or adverse event, location, timing, risk level and a video of the violation. The dashboard comprises graphs containing historical information of the site safety conditions, broken down on a daily, weekly, or monthly basis. A safety/unsafety score may be allocated to each detected adverse event. The safety scores may be added together to produce a safety score for an industrial site or worksite.
Figure 3 illustrates an example computer system 300. The video analytics server 120, alert server 130 and the stakeholder computing device 140 may comprise one or more components described with reference to the computer system 300 to provide the requisite functionality. In particular embodiments, one or more computer systems 300 perform one or more steps of one or more methods described or illustrated herein. In particular embodiments, one or more computer systems 300 provide functionality described or illustrated herein. In particular embodiments, software running on computer system/systems 300 performs steps the methods described or illustrated herein or provides functionality described or illustrated herein. Particular embodiments include one or more portions of one or more computer systems 300. Herein, reference to a computer system may encompass a computing device, and vice versa, where appropriate. Moreover, reference to a computer system may encompass one or more computer systems, where appropriate.
This disclosure contemplates any suitable number of computer systems 300. This disclosure contemplates computer system 300 taking any suitable physical form. As an example and not by way of limitation, computer system 300 may be an embedded computer system, a system-on-chip (SOC), a single-board computer system (SBC), a desktop computer system, a laptop or notebook computer system, a mesh of computer systems, a mobile telephone, a server, a tablet computer system, or a combination of two or more of these. Where appropriate, computer system 300 may include one or more computer systems 300; be unitary or distributed; span multiple locations; span multiple machines; span multiple data centres; or reside in a cloud, which may include one or more cloud components in one or more networks. Where appropriate, one or more computer systems 300 may perform without substantial spatial or temporal limitation one or more steps of one or more methods described or illustrated herein. As an example and not by way of limitation, one or more computer systems 300 may perform in real-time or in batch mode one or more steps of one or more methods described or illustrated herein. One or more computer systems 300 may perform at different times or at different locations one or more steps of one or more methods described or illustrated herein, where appropriate.
Computer system 300 includes a processor 302, memory 304, storage 306, an input/output (I/O) interface 308, a communication interface 310, and a bus 312. Although this disclosure describes and illustrates a particular computer system having a particular number of particular components in a particular arrangement, this disclosure contemplates any suitable computer system having any suitable number of any suitable components in any suitable arrangement.
Processor 302 may include hardware for executing instructions, such as those making up a computer program. As an example and not by way of
limitation, to execute instructions, processor 302 may retrieve (or fetch) the instructions from an internal register, an internal cache, memory 304, or storage 306; decode and execute them; and then write one or more results to an internal register, an internal cache, memory 304, or storage 306. Processor 302 may include one or more internal caches for data, instructions, or addresses. Instructions in the instruction caches of processor 302 may be copies of instructions in memory 304 or storage 306, and the instruction caches may speed up retrieval of those instructions by processor 302. Data in the data caches may be copies of data in memory 304 or storage 306 for instructions executing at processor 302 to operate on; the results of previous instructions executed at processor 302 for access by subsequent instructions executing at processor 302 or for writing to memory 304 or storage 306; or other suitable data. Processor 302 may include one or more arithmetic logic units (ALUs); be a multi-core processor; or include one or more processors 302. Although this disclosure describes and illustrates a particular processor, this disclosure contemplates any suitable processor.
Memory 304 includes main memory for storing instructions for processor 302 to execute or data for processor 302 to operate on. As an example and not by way of limitation, computer system 300 may load instructions from storage 306 or another source (for example, another computer system 300) to memory 304. Processor 302 may then load the instructions from memory 304 to an internal register or internal cache. To execute the instructions, processor 302 may retrieve the instructions from the internal register or internal cache and decode them. During or after execution of the instructions, processor 302 may write one or more results (which may be intermediate or final results) to the internal register or internal cache. Processor 302 may then write one or more of those results to memory 304. One or more memory buses (which may
each include an address bus and a data bus) may couple processor 302 to memory 304. Bus 312 may include one or more memory buses. In particular embodiments, one or more memory management units (MMUs) reside between processor 302 and memory 304 and facilitate access to memory 304 requested by processor 302.
Storage 306 may include mass storage for data or instructions. As an example and not by way of limitation, storage 306 may include a hard disk drive (HDD), a floppy disk drive, flash memory, an optical disc, a magneto-optical disc, magnetic tape, a Universal Serial Bus (USB) drive or a combination of two or more of these. Storage 306 may include removable or non-removable (or fixed) media, where appropriate. Storage 306 may be internal or external to computer system 300, where appropriate. This disclosure contemplates mass storage 306 taking any suitable physical form. Storage 306 may include one or more storage control units facilitating communication between processor 302 and storage 306, where appropriate. Where appropriate, storage 306 may include one or more storages 306. Although this disclosure describes and illustrates particular storage, this disclosure contemplates any suitable storage.

In particular embodiments, I/O interface 308 includes hardware, software, or both, providing one or more interfaces for communication between computer system 300 and one or more I/O devices. Computer system 300 may include one or more of these I/O devices, where appropriate. One or more of these I/O devices may enable communication between a person and computer system 300. This disclosure contemplates any suitable I/O devices and any suitable I/O interfaces 308 for them. Where appropriate, I/O interface 308 may include one or more device or software drivers enabling processor 302 to
drive one or more of these I/O devices. I/O interface 308 may include one or more I/O interfaces 308, where appropriate.
Communication interface 310 may include hardware, software, or both providing one or more interfaces for communication (such as, for example, packet-based communication) between computer system 300 and one or more other computer systems 300 or one or more networks. As an example and not by way of limitation, communication interface 310 may include a network interface controller (NIC) or network adapter for communicating with an Ethernet or other wire-based network or a wireless NIC (WNIC) or wireless adapter for communicating with a wireless network, such as a WI-FI network. This disclosure contemplates any suitable network and any suitable communication interface 310 for it. Bus 312 may include hardware, software, or both coupling components of computer system 300 to each other.
The term "adverse event" refers to an event giving rise to a potential for injury, equipment damage, idle worker(s) and other undesirable events.
The intermediary layer governs the use of particular neural networks on images from particular image capture devices - i.e. cameras. This ensures neural networks used for detecting adverse events are limited to those relevant to one or both of (i) adverse events (e.g. fall hazards, human-equipment interactions and others) anticipated to occur in the respective zone for the particular image capture device, and (ii) adverse events expected to occur in the workplace in which the image capture device is located. For example, for a camera positioned to detect proper use of harnesses on an elevated platform, there is no need to use a neural network trained to recognise heavy equipment such as excavators. Similarly, in a factory setting, the neural networks may be limited to those identifying personal protective equipment (PPE), forklifts and other
potential adverse events occurring in that factory setting. A user may also specify particular hazards they wish to identify in the zone associated with a particular image capture device.
It will be appreciated that many further modifications and permutations of various aspects of the described embodiments are possible. Accordingly, the described aspects are intended to embrace all such alterations, modifications, and variations that fall within the spirit and scope of the appended claims.
Throughout this specification and the claims which follow, unless the context requires otherwise, the word "comprise", and variations such as "comprises" and "comprising", will be understood to imply the inclusion of a stated integer or step or group of integers or steps but not the exclusion of any other integer or step or group of integers or steps.
The reference in this specification to any prior publication (or information derived from it), or to any matter which is known, is not, and should not be taken as an acknowledgment or admission or any form of suggestion that that prior publication (or information derived from it) or known matter forms part of the common general knowledge in the field of endeavour to which this specification relates.
Claims
1. A system for detecting an adverse event, comprising: one or more image capture devices, each device being positioned to capture images of a respective zone; and a server comprising a plurality of neural networks, each neural network receiving, and analysing, at least a portion of said images to identify an object or objects in the respective zone, the server being configured to detect if the respective zone comprises an adverse event based on the identified object or objects; and an alert server for identifying, for each detected adverse event, one or more stakeholders for the respective zone and transmitting an alert to the one or more stakeholders, the alert comprising identification information for the detected adverse event.
2. The system of claim 1, wherein the one or more image capture devices comprise a plurality of image capture devices.
3. The system of claim 1 or claim 2, wherein each image capture device is configured to send video feed comprising the images of the respective zone.
4. The system of any one of claims 1 to 3, wherein each image capture device is positioned based on: a relationship between a boundary of the respective zone and a field of view of the respective image capture device; a distance from the respective zone, a resolution of the respective image capture device and a size of a smallest said object sought to be identified in the respective zone.
5. The system of claim 4, wherein a minimum distance of the image capture device from the respective zone is determined based on the boundary being wholly within the field of view, and a maximum distance of the image capture device from the respective zone is determined based on the resolution enabling detection of the smallest said object at a furthest point in the field of view.
6. The system of any one of claims 1 to 5, wherein each neural network is configured to identify a different object.
7. The system of claim 6, wherein at least one neural network is configured to detect a human, and wherein the server is configured to detect if the respective zone comprises an adverse event if a human is identified by the at least one neural network.
8. The system of claim 7, wherein the server is configured to detect if the zone comprises an adverse event by calculating a relationship between a human and another identified object.
9. The system of claim 8, wherein said another identified object comprises equipment for use by a human, the server being configured to calculate the relationship by determining if the equipment is present in the images with the at least one human.
10. The system of claim 9, wherein the server is configured to detect that the respective zone comprises an adverse event when either: the equipment is absent from the images; or a human and the safety equipment are present in the images, but not in association with one another.
11. The system of claim 10, wherein the equipment and human are in association with one another if the equipment is being used as it is intended to be used.
12. The system of any one of claims 1 to 11, wherein the identification information comprises at least a location of the adverse event and one or more said images in which the adverse event is visible.
13. The system of claim 12, wherein each neural network is configured to label said at least a portion of the images analysed by the respective neural network, based on the respective neural network identifying the object or objects.
14. The system of claim 13, wherein the identification information comprises said label.
15. The system of claim 10, wherein the alert server is configured to send the alert to the human, the identification information specifying the equipment that is sought to be identified in association with the human.
16. The system of any one of claims 1 to 15, further comprising an intermediary layer, the intermediary layer specifying particular neural networks of the plurality of neural networks for identifying a predetermined said adverse event.
17. The system of any one of claims 1 to 15, further comprising an intermediary layer, the intermediary layer specifying particular neural networks of the plurality of neural networks for use with images from each respective image capture device.
18. A method for detecting an adverse event, comprising: capturing one or more images using one or more capture devices positioned to capture images of a respective zone; and receiving at least a portion of said images at a plurality of neural networks, the neural networks being configured to identify an object or objects in the respective zone; detecting if the respective zone comprises an adverse event based on the identified object or objects; identifying, for each detected adverse event, one or more stakeholders for the respective zone; and transmitting an alert to the one or more stakeholders, the alert comprising identification information for the detected adverse event.
19. A system for detecting an adverse event, comprising: one or more image capture devices, each device being positioned to capture images of a respective zone; and a server comprising a plurality of neural networks, each neural network receiving, and analysing, at least a portion of said images to identify an object or objects in the respective zone, the server being configured to detect if the respective zone comprises an adverse event based on the identified object or objects; and the server further identifying, for each detected adverse event, one or more stakeholders for the respective zone and transmitting an alert to the one or more stakeholders, the alert comprising identification information for the detected adverse event.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
SG10202011938P | 2020-11-30 | ||
SG10202011938P | 2020-11-30 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022115045A1 true WO2022115045A1 (en) | 2022-06-02 |
Family
ID=81756300
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/SG2021/050739 WO2022115045A1 (en) | 2020-11-30 | 2021-11-30 | Video analytics for industrial safety |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2022115045A1 (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120146789A1 (en) * | 2010-12-09 | 2012-06-14 | Nicholas De Luca | Automated monitoring and control of safety in a production area |
US20140307076A1 (en) * | 2013-10-03 | 2014-10-16 | Richard Deutsch | Systems and methods for monitoring personal protection equipment and promoting worker safety |
US20190385430A1 (en) * | 2018-06-15 | 2019-12-19 | American International Group, Inc. | Hazard detection through computer vision |
RU2724785C1 (en) * | 2020-02-20 | 2020-06-25 | ООО "Ай Ти Ви групп" | System and method of identifying personal protective equipment on a person |
CN111753705A (en) * | 2020-06-19 | 2020-10-09 | 神思电子技术股份有限公司 | Detection method for intelligent construction site safety operation based on video analysis |
Legal Events

Date | Code | Title | Description
---|---|---|---
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21898822; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 21898822; Country of ref document: EP; Kind code of ref document: A1 |