AU2021104681A4 - An Autonomous Early Detection analysis of User Drowsiness through Facial Recognition using Artificial Intelligence - Google Patents


Info

Publication number
AU2021104681A4
AU2021104681A4 (application AU2021104681A)
Authority
AU
Australia
Prior art keywords
eye
user
page
drowsiness
detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU2021104681A
Inventor
Bosubabu Sambana
Ramesh Yagireddi
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yagireddi Ramesh Dr
Original Assignee
Yagireddi Ramesh Dr
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yagireddi Ramesh Dr filed Critical Yagireddi Ramesh Dr
Application granted granted Critical
Publication of AU2021104681A4 publication Critical patent/AU2021104681A4/en
Ceased legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)

Abstract

An Autonomous Early Detection analysis of User Drowsiness through Facial Recognition using Artificial Intelligence

ABSTRACT

The present invention, "An Autonomous Early Detection analysis of User Drowsiness through Facial Recognition using Artificial Intelligence", addresses drowsy driving, a major problem in INDIA and across the globe. The risk, danger, and often tragic results (06) of drowsy driving are alarming. Drowsy driving is the dangerous (01) combination of driving and sleepiness or fatigue. This usually happens when a driver has not slept enough, but it can also happen because of untreated sleep disorders (03), medications, alcohol, or shift work. No one knows the exact moment when sleep (05) comes over their body. Drowsiness detection (05) is a safety technology that can prevent accidents caused by drivers who fall asleep while driving (04), and drowsy driver detection is one of the potential applications of intelligent vehicle systems (07). Existing drowsiness detection (11) methods rely primarily on pre-assumptions about blink rate (02) and eye closure (09). Here we employ machine learning to determine actual human behavior (04) during drowsiness episodes (10). The proposed artificial intelligence approach combines detection analysis with an awakening system (06) that provides secure authentication (12) during user money transactions, together with a user driving module, using facial recognition mapping techniques (08). This mechanism can serve various applications and significant sectors according to technological needs.

Description

[Figure 12 (Flow of Project): adjust brightness and contrast; face detection; eye detection; extract eye region; eye feature extraction from eye regions; determine whether eyes are open/closed; drowsiness calculation and judging function; alarm signal]
AUSTRALIA Patents Act 1990
COMPLETE SPECIFICATION INNOVATION PATENT
" An Autonomous Early Detection analysis of User Drowsiness through Facial Recognition using Artificial Intelligence "
The following specification particularly describes the invention and the manner in which it is to be performed, including the best method of performing it known to me:
An Autonomous Early Detection analysis of User Drowsiness through Facial Recognition using Artificial Intelligence
FIELD OF INNOVATION
[001]. The present invention relates to an artificial intelligence approach to detection analysis (03) and an awakening system that provides secure authentication (11) during user money transactions, together with a user driving module (02), using facial recognition mapping techniques (10). This mechanism applies to various applications and significant sectors for social impact. A convolutional neural network is a special type of deep neural network which performs extremely well for image classification purposes.
OBJECT OF INVENTION
[002]. The object of the invention is to develop an easier payment method using the existing secure authentication method.
[003]. Yet another object of the invention is to provide a more secure personal alert system that safeguards the stored personal information and the alert mechanism.
BACKGROUND OF INVENTION
[004]. The invention proposes an artificial intelligence mechanism that builds on existing approaches. Countless people drive (01) on highways and country roads every day, often for hours. Truck drivers, bus drivers (01) and taxi drivers depend on driving for their livelihood, and ordinary people also drive for long hours. As a result, they run a high risk of falling asleep while driving, due to stress or lack of sleep (05). Road traffic injuries and deaths have a terrible impact on individuals, communities and countries.
[005]. They involve massive costs to often overburdened health care systems, occupy scarce hospital beds, consume resources and result (12) in significant losses of productivity and prosperity, with deep social and economic repercussions. According to the 2018 report of the World Health Organization, 299,091 road traffic deaths occurred in India, making road traffic injury the number one cause of death among those aged 15-29 years. Globally, the toll is predicted to increase to around 1.9 million deaths by 2030 and to become the seventh leading cause of death if no action is taken. According to the National Highway Traffic Safety Administration, every year about 100,000 police-reported crashes involve drowsy driving. These crashes result in more than 1,550 fatalities and 71,000 injuries. However, these numbers are underestimates, and up to 6,000 fatal crashes each year may be caused by drowsy drivers.
[006]. The main idea behind this project is to develop a non-intrusive system which can detect (05) fatigue in any human and issue a timely warning (06). Drivers who do not take regular breaks when driving long distances run a high risk of becoming drowsy, a state which they often fail to recognize (03) early enough. Real-time drowsiness behaviors (09) related to fatigue appear in the form of eye closing, head nodding, or changes in brain activity.
[007]. Hence, we can either measure changes in physiological signals, such as brain waves, heart rate and eye blinking, to monitor drowsiness (05), or consider physical changes such as sagging posture, leaning of the driver's head and the open/closed state of the eyes (09). The former technique, while more accurate, is not realistic, since highly sensitive electrodes would have to be attached directly to the driver's body, which can be annoying and distracting to the driver. In addition, prolonged use results in perspiration on the sensors, diminishing their ability to monitor accurately (11). The second technique, measuring physical changes (i.e. open/closed eyes) to detect fatigue, is well suited to real-world conditions, since it is non-intrusive and uses a video camera to detect changes. In addition, micro-sleeps, short periods of sleep lasting 2 to 3 minutes, are good indicators of fatigue (05). Thus, by continuously monitoring the eyes of the driver, one can detect the sleepy state of the driver and (12) issue a timely warning.
[008].Benefits of system:
• It ensures the safety of the driver.
• The system will provide an alert when driver is in state of drowsiness.
• Transportation business where almost daily accidents occur due to driver fatigue can be saved.
• Military applications where high intensity monitoring of soldier is needed.
• Operators at nuclear power plants where continuous monitoring is necessary.
• In classrooms where students feel drowsy and inattentive during the class.
[009]. Algorithm Used:
Convolution Neural Network (CNN): A Convolution Neural Network (CNN) comprises one or more convolution layers (often with a subsampling step) followed by one or more fully connected layers, as in a standard multilayer neural network (07). The architecture of a CNN is designed to take advantage of the 2D structure of an input image (or other 2D input such as a speech signal). This is achieved with local connections and tied weights followed by some form of pooling, which results in translation-invariant features. Another benefit of CNNs (08) is that they are easier to train and have many fewer parameters than fully connected networks with the same number of hidden units. Below we discuss the architecture of a CNN and the backpropagation algorithm (07) used to compute the gradient with respect to the parameters of the model, in order to apply gradient-based optimization.
CNNs have two main parts:
1. A convolution/pooling mechanism that breaks up the image into features and analyzes them
2. A Classifier that takes the output of convolution/pooling and predicts the best label to describe the image
CNN Concepts:
CNNs have an associated terminology and a set of concepts that is unique to them and that sets them apart from other types of neural network architectures. The main ones are explained as follows:
Input / Output Volumes: CNNs are usually applied to image data. Every image is a matrix of pixel values. The range of values that can be encoded in each pixel depends upon its bit size. Most commonly we have 8-bit (1-byte) pixels, so the possible range of values a single pixel can represent is [0, 255]. However, with colored images, particularly RGB (Red, Green, Blue) images, the presence of separate color channels (3 in the case of RGB) introduces an additional 'depth' dimension to the data, making the input 3-dimensional. Hence, for a given RGB image of size, say, 255x255 (width x height) pixels, we will have 3 matrices associated with the image, one for each color channel.
Filters (Convolution Kernels): A filter (or kernel) is an integral component of the layered architecture. The kernels are convolved with the input volume to obtain so-called 'activation maps'. Activation maps indicate 'activated' regions, i.e. regions where features specific to the kernel have been detected in the input. The real values of the kernel matrix change with each learning iteration over the training set, indicating that the network is learning to identify which regions are of significance for extracting features from the data.
Convolution layer: Also referred to as Conv. layer, it forms the basis of the CNN and performs the core operations of training and consequently firing the neurons of the network. It performs the convolution operation over the input volume as specified in the previous section, and consists of a 3 dimensional arrangement of neurons (08) (a stack of 2-dimensional layers of neurons, one for each channel depth). Each neuron is connected to a certain region of the input volume called the receptive field (explained in the previous section).
For example, for an input image of dimensions 28x28x3, if the receptive field is 5x5, then each neuron in the Conv. layer is connected to a region of 5x5x3 in the input volume (the region always comprises the entire depth of the input, i.e. all the channel matrices). Hence each neuron will have 75 weighted inputs. For a particular value of R (receptive field), we have a cross-section of neurons entirely dedicated to taking inputs from this region. Such a cross-section is called a 'depth column'. It extends through the entire depth of the Conv. layer.
The ReLu (Rectified Linear Unit) Layer:
ReLu refers to the Rectified Linear Unit, the most commonly deployed activation function for the outputs of CNN neurons. Mathematically, it is described as:
Eq. 3: f(x) = max(0, x)
Unfortunately, the ReLu function is not differentiable at the origin, which makes it hard to use with backpropagation training. Instead, a smoothed version called the Softplus function is used in practice:
Eq. 4: f(x) = ln(1 + e^x)
The derivative of the softplus function is the sigmoid function:
Eq. 5: f'(x) = d/dx ln(1 + e^x) = e^x / (1 + e^x) = 1 / (1 + e^(-x))

The Pooling Layer: The pooling layer is usually placed after the convolution layer. Its primary utility lies in reducing the spatial dimensions (width x height) of the input volume for the next convolution layer. It does not affect the depth dimension of the volume. The operation performed by this layer is also called 'down-sampling', as the reduction of size leads to loss of information as well. However, such a loss is beneficial for the network for two reasons:
• the decrease in size leads to less computational overhead for the upcoming layers of the network;
• it works against over-fitting.
The Fully Connected Layer:
The Fully Connected layer is configured exactly the way its name implies: it is fully connected with the output of the previous layer. Fully-connected layers are typically used in the last stages of the CNN to connect to the output layer and construct the desired number of outputs (08).
CNN Design Principles: Given the aforementioned building blocks, the last detail before implementing a CNN is to specify its design end to end, and to decide on the layer dimensions of the Convolution layers.
For each (i-th) dimension of the input volume, pick:

1. W_out = (W − R + 2P) / S + 1, where W is the input size along that dimension, R is the receptive field (kernel) size, P is the zero padding and S is the stride.
To better understand how this works, let's consider the following example (a code sketch of the same calculation follows the layer-arrangement summary below):
1. Let the dimensions of the input volume be 288x288x3, with a stride of 2 (along both horizontal and vertical directions).
2. Now, since W = 288 and S = 2, (2P − R) must be an even integer for the calculated value to be an integer. If we set the padding P = 0 and R = 4, we get W_out = (288 − 4 + 2·0)/2 + 1 = 284/2 + 1 = 143. As the spatial dimensions are symmetrical (same value for width and height), the output dimensions are going to be 143 x 143 x K, where K is the depth of the layer. K can be set to any value, with increasing values for every Conv. layer added. For larger networks, values of 512 are common.
3. The output volume from a Conv. layer either has the same spatial dimensions as the Conv. layer (143x143 for the example considered above) or the same as the input volume (288x288x3 for the example above). The generic arrangement of layers can thus be summarized as follows:
Input → [([Conv → ReLu] x N) → Pool?] x M → [Fully Connected → ReLu] x K → Fully Connected

where N usually takes values between 0 and 3, M >= 0 and K ∈ [0, 3). The expression indicates multiple layers, with or without per-layer pooling. The final layer is the fully-connected output layer.
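The output-size calculation in item 1 above is easy to sanity-check in code. Below is a minimal sketch (our own illustration; the function name conv_output_size is not part of the specification):

```python
def conv_output_size(w: int, r: int, p: int, s: int) -> int:
    """Output size along one spatial dimension of a convolution layer.

    w: input size, r: receptive field (kernel) size,
    p: zero padding, s: stride.
    """
    out, rem = divmod(w - r + 2 * p, s)
    if rem != 0:
        raise ValueError("kernel/stride/padding do not tile the input evenly")
    return out + 1

# The worked example from the text: 288x288 input, R=4, P=0, S=2 -> 143.
print(conv_output_size(288, 4, 0, 2))  # 143
```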
[0010]. PROPOSED SYSTEM AND METHODOLOGY
Methodology: There are different types of methodologies that have been developed to determine whether a person is in a state of drowsiness or not.
Physiological level approach: This technique is an intrusive method wherein electrodes are used to obtain pulse rate, heart rate and brain activity information. ECG is used to calculate the variations in heart rate and detect different conditions of drowsiness. Correlations between different signals, such as the ECG (electrocardiogram), EEG (electroencephalogram), and EMG (electromyogram), are computed, and the output states whether the person is drowsy or not.
Behavioral based approach: In this technique the eye blinking frequency, head pose, etc. of a person are monitored through a camera, and the person is alerted if any of these drowsiness symptoms are detected.
[0011]. PROPOSED SYSTEM:
Image processing has been used for many decades to process videos and images for different real-time applications. With the advent of processors of high processing capability and high-definition cameras, it has become much easier to develop real-time applications that can perform similarly to humans, with better accuracy and at lower expense. To support this, many software packages and libraries have been introduced to help researchers and developers build their systems faster and more easily.
[0012]. The system we are proposing uses the behavioral-based approach. Just as humans detect drowsiness from the state of another person's eyes, we are going to detect it in a similar way, except that instead of human eyes we will use a webcam, and instead of human judgment we will use machine learning algorithms.
[0013]. The drowsiness detection system captures, processes, recognizes and provides results to the user, who can take action on the events. Detection of fatigue involves a sequence of images of a face and the observation of eye movements and blink patterns. By monitoring the eyes, it is believed that the symptoms of driver fatigue can be detected early enough to avoid a car accident.
The analysis of face images is a popular research area with applications such as face recognition, virtual tools, and human identification security systems. A Region of Interest (ROI) is estimated depending on the eye state (open/closed).
[0014]. DATA COLLECTION
The Drowsiness Dataset (DD): The DD dataset was created for the task of multistage drowsiness detection, targeting only face and eye visibility. Detection of these subtle cases can be important for detecting drowsiness at an early stage, so as to activate drowsiness prevention mechanisms. The DD dataset is the largest realistic drowsiness dataset to date. The detection of eyes and their parts, gaze estimation, and eye-blinking frequency are important tasks in computer vision (06).
Capturing the driver's behavior requires acquiring a large amount of testing data in real conditions. Therefore, we introduce the MRL Eye Dataset, a large-scale dataset of human eye images. This dataset contains infrared images in low and high resolution, all captured in various lighting conditions and by different devices.
The dataset is suitable for testing several features or trainable classifiers. In order to simplify the comparison of algorithms, the images are divided into several categories, which also makes them suitable for training and testing classifiers [4].
In the dataset, we annotated the following properties (the properties are indicated in the following order):
• subject ID; the dataset contains data from 37 different persons (33 men and 4 women)
• image ID; the dataset consists of 84,898 images
• gender [0 - man, 1 - woman]; gender information is provided for each image
• glasses [0 - no, 1 - yes]; whether the eye image contains glasses is provided for each image (with and without glasses)
• eye state [0 - closed, 1 - open]; this property contains the information about the two eye states (open, closed)
• reflections [0 - none, 1 - small, 2 - big]; we annotated three reflection states based on the size of the reflections
• lighting conditions [0 - bad, 1 - good]; each image has one of two states (bad, good) based on the amount of light during video capture
• sensor ID [01 - RealSense, 02 - IDS, 03 - Aptina]; at this moment, the dataset contains images captured by three different sensors (Intel RealSense RS 300 sensor with 640 x 480 resolution, IDS Imaging sensor with 1280 x 1024 resolution, and Aptina sensor with 752 x 480 resolution)
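A small sketch of how such annotations might be consumed in code is shown below. The underscore-separated file-stem layout used here is purely a hypothetical illustration; the actual naming scheme of the dataset should be checked against its documentation.

```python
from dataclasses import dataclass

@dataclass
class EyeAnnotation:
    subject_id: int
    image_id: int
    gender: int       # 0 = man, 1 = woman
    glasses: int      # 0 = no, 1 = yes
    eye_state: int    # 0 = closed, 1 = open
    reflections: int  # 0 = none, 1 = small, 2 = big
    lighting: int     # 0 = bad, 1 = good
    sensor_id: int    # 1 = RealSense, 2 = IDS, 3 = Aptina

def parse_annotation(stem: str) -> EyeAnnotation:
    """Parse an annotation encoded in a file stem such as
    's0001_00123_0_0_1_0_1_01' (hypothetical layout; fields in the
    order listed in the specification above)."""
    parts = stem.lstrip("s").split("_")
    return EyeAnnotation(*(int(p) for p in parts))

print(parse_annotation("s0001_00123_0_0_1_0_1_01").eye_state)  # 1 (open)
```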
[0015]. DESIGN Objective:
A countless number of people drive on the highway day and night. Taxi drivers, bus drivers, truck drivers and people traveling long distances suffer from lack of sleep, which makes driving very dangerous when feeling sleepy. The majority of accidents happen due to the drowsiness of the driver. So, to prevent these accidents, we will build a system using Python, OpenCV, and Keras which will alert the driver when he feels sleepy.
The objective of this project is to build a drowsiness detection system that detects when a person's eyes have been closed for a few seconds and alerts the driver when drowsiness is detected. The purpose of the design phase is to develop a clear understanding of what the developer wants people to gain from the project. As the developer works on the project, the test for every design decision should be: "Does this feature fulfill the ultimate purpose of the project?"
The design document will verify that the current design meets all of the explicit requirements contained in the system model as well as the implicit requirements desired.
[0016]. System Architecture:
In this project, we will be using OpenCV to gather images from the webcam and feed them into a Deep Learning model which will classify whether the person's eyes are 'Open' or 'Closed'. The approach we will be using for this project is as follows:
Step 1 - Take image as input from a camera.
Step 2 - Detect the face in the image and create a Region of Interest (ROI).
Step 3 - Detect the eyes from the ROI and feed them to the classifier.
Step 4 - Classifier will categorize whether eyes are open or closed.
Step 5 - Calculate score to check whether the person is drowsy.
[0017]. Model Architecture:
The model we used is built with Keras using Convolutional Neural Networks (CNN). A convolutional neural network is a special type of deep neural network which performs extremely well for image classification purposes. A CNN basically consists of an input layer, an output layer and hidden layers, of which there can be multiple.
A convolution operation is performed on these layers using a filter that performs elementwise 2D multiplication between the filter and each local patch of the layer.
The CNN model architecture consists of the following layers:
• Convolutional layer; 32 nodes, kernel size 3
• Convolutional layer; 32 nodes, kernel size 3
• Convolutional layer; 64 nodes, kernel size 3
• Fully connected layer; 128 nodes
The final layer is also a fully connected layer, with 2 nodes. A ReLU activation function is used in all layers except the output layer, in which we used Softmax.
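A minimal sketch of this architecture in Keras is shown below. The layer sizes follow the list above; the 24x24x1 input shape is an assumption based on the 24x24 grayscale eye crops described in the implementation section, and the pooling layers between convolutions are our own addition, since the text does not state where pooling is placed.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential([
    # Three convolutional layers as listed: 32, 32, 64 nodes, kernel size 3.
    Conv2D(32, kernel_size=3, activation='relu', input_shape=(24, 24, 1)),
    MaxPooling2D(pool_size=2),
    Conv2D(32, kernel_size=3, activation='relu'),
    MaxPooling2D(pool_size=2),
    Conv2D(64, kernel_size=3, activation='relu'),
    Flatten(),
    Dense(128, activation='relu'),          # fully connected layer, 128 nodes
    Dense(2, activation='softmax'),         # output: eyes open / closed
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
```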
[0018]. UML DIAGRAMS
UML is a method for describing the system architecture in detail using a blueprint. UML represents a collection of best engineering practices that have proven successful in the modeling of large and complex systems. UML is a very important part of developing object-oriented software and the software development process. UML mostly uses graphical notations to express the design of software projects. Using UML helps project teams communicate, explore potential designs, and validate the architectural design of the software.
Relationships: Relationships tie things together. The relationships in UML are Dependency, Association, Generalization, and Specialization.
Use Case Diagram: A use case diagram is a graph of actors, a set of use cases enclosed by a system boundary, associations between actors and users. In general, it shows a set of use cases and actors and their relationships. The creation of a use case model is an excellent vehicle for elicitation of functional requirements
Sequence Diagram: Sequence diagrams are an easy and intuitive way of describing the behavior of a system by viewing the interaction between the system and its environment. A sequence diagram has two dimensions: the vertical dimension represents time, and the horizontal dimension represents the different objects. The vertical line is called the object's lifeline.
Collaboration Diagram: Collaboration diagrams represent a combination of information taken from class, sequence, and use case diagrams describing both the static structure and dynamic behavior of a system. The collaboration diagram represents interactions among objects in terms of sequenced messages.
Activity Diagram: The purpose of an activity diagram is to provide a view of the flows and of what is going on inside a use case or among several classes. An activity is shown as a rounded box containing the name of the operation.
[0019]. Technological Description
Python is a high-level, interpreted, interactive and object-oriented scripting language. Python is designed to be highly readable. It uses English keywords frequently, whereas other languages use punctuation, and it has fewer syntactical constructions than other languages.
Python is a must for students and working professionals who want to become great software engineers, especially when they are working in the web development domain. I will list down some of the key advantages of learning Python:
• Python is Interpreted - Python is processed at runtime by the interpreter. You do not need to compile your program before executing it. This is similar to PERL and PHP.
• Python is Interactive - you can actually sit at a Python prompt and interact with the interpreter directly to write your programs.
• Python is Object-Oriented - Python supports the object-oriented style or technique of programming that encapsulates code within objects.
• Python is a Beginner's Language - Python is a great language for the beginner-level programmers and supports the development of a wide range of applications from simple text processing to WWW browsers to games.
Important characteristics of Python Programming :
• It supports functional and structured programming methods as well as OOP.
• It can be used as a scripting language or can be compiled to byte-code for building large applications.
• It provides very high-level dynamic data types and supports dynamic type checking.
• It supports automatic garbage collection.
• It can be easily integrated with C, C++, COM, ActiveX, CORBA, and Java.

Python is one of the most widely used languages on the web, and its applications benefit from the following characteristics:
• Easy-to-learn - Python has few keywords, a simple structure, and a clearly defined syntax. This allows the student to pick up the language quickly.
• Easy-to-read - Python code is more clearly defined and visible to the eyes.
• Easy-to-maintain - Python's source code is fairly easy to maintain.
• A broad standard library - the bulk of Python's library is very portable and cross-platform compatible on UNIX, Windows, and Macintosh.
• Interactive mode - Python has support for an interactive mode which allows interactive testing and debugging of snippets of code.
• Portable - Python can run on a wide variety of hardware platforms and has the same interface on all platforms.
• Extendable - you can add low-level modules to the Python interpreter. These modules enable programmers to add to or customize their tools to be more efficient.
• Databases - Python provides interfaces to all major commercial databases.
• GUI programming - Python supports GUI applications that can be created and ported to many system calls, libraries and windowing systems, such as Windows MFC, Macintosh, and the X Window system of Unix.
• Scalable - Python provides better structure and support for large programs than shell scripting.
Module in Python:
A module allows you to logically organize your Python code. Grouping related code into a module makes the code easier to understand and use. A module is a Python object with arbitrarily named attributes that you can bind and reference. Simply, a module is a file consisting of Python code. A module can define functions, classes and variables. A module can also include runnable code.
[0020].IMPLEMENTATION
Methodology: Eye detection is a prerequisite stage for many applications such as human-computer interfaces, iris recognition, driver drowsiness detection, security, and biology systems. Here, template-based eye detection is described. The template is correlated with different regions (04) of the face image, and the region of the face which gives maximum correlation with the template is taken as the eye region. The method is simple and easy to implement.
The effectiveness of the method is demonstrated for both cases, open eyes as well as closed eyes, through various simulation results. A novel and simple eye detection scheme is proposed: an eye template is used to detect the eye region in the face image. The template is matched with the eye region using a cross-correlation technique. The method does not require any complex mathematical calculation or prior knowledge about the eye. It is a simple method and can easily be implemented in hardware.
Face Detection: For face detection the system uses Haar feature-based cascade classifiers, an effective object detection method proposed by Paul Viola and Michael Jones in their 2001 paper, "Rapid Object Detection using a Boosted Cascade of Simple Features". It is a machine learning based approach where a cascade function is trained from a large number of positive and negative images and is then used to detect objects in other images. Here we work with face detection (03).
Initially, the algorithm needs a lot of positive images (images of faces) and negative images (images without faces) to train the classifier. Then we need to extract features from them. For this, the Haar features shown in the accompanying figure are used (06). Each feature is a single value obtained by subtracting the sum of pixels under the white rectangle from the sum of pixels under the black rectangle.
A cascaded AdaBoost classifier with Haar-like features is exploited to find the face region. First, the compensated image is segmented into a number of rectangular areas, at any position and scale within the original image. Because of the differences among facial features, Haar-like features are efficient for real-time face detection. They can be calculated from the difference of the sums of pixel values (05) within rectangular areas. The features can be represented by different compositions of the black region and white region.
A cascaded AdaBoost classifier is a strong classifier formed from a combination of several weak classifiers, each trained by the AdaBoost algorithm. If a candidate sample passes through the cascaded AdaBoost classifier, the face region has been found: almost all face samples pass through, while non-face samples are rejected.
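As a concrete illustration, a minimal OpenCV sketch of this cascade-based face detection follows; the cascade file is the stock frontal-face model shipped with OpenCV, and the variable names are our own.

```python
import cv2

# Stock frontal-face Haar cascade shipped with OpenCV.
cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
face_cascade = cv2.CascadeClassifier(cascade_path)

img = cv2.imread("driver.jpg")  # any test image containing a face
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# The cascade slides windows at multiple scales; each window passes
# through the boosted stages and is rejected as soon as one stage fails.
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (100, 100, 100), 1)
```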
Eye detection and Eyes Location:
Eye detection: In the system we have used facial landmark prediction for eye detection. Facial landmarks are used to localize and represent salient regions of the face, such as the eyes, eyebrows, nose, mouth and jawline. Facial landmarks have been successfully applied to face alignment, head pose estimation, face swapping, blink detection and much more. In the context of facial landmarks, our goal is to detect important facial structures on the face using shape prediction methods. Detecting facial landmarks is therefore a two-step process:
1. Localize the face in the image.
2. Detect the key facial structures on the face ROI.
Detect the key facial structures on the face ROI: There are a variety of facial landmark detectors, but all methods essentially try to localize and label the following facial regions:
• Mouth
• Right eyebrow
• Left eyebrow
• Right eye
• Left eye
• Nose
This method starts by using:
1. A training set of labeled facial landmarks on images. These images are manually labeled with specific (x, y)-coordinates of the regions surrounding each facial structure.
2. Priors, or more specifically the probability of the distance between pairs of input pixels.
The pre-trained facial landmark detector inside the dlib library is used to estimate the location of 68 (x, y)-coordinates that map to facial structures on the face.
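A minimal sketch of this dlib-based landmark step is given below. It assumes the standard 68-point model file shape_predictor_68_face_landmarks.dat has been downloaded separately, and the eye index ranges follow the usual 68-point layout.

```python
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
# Pre-trained 68-point model, downloaded separately from dlib's site.
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

gray = cv2.cvtColor(cv2.imread("driver.jpg"), cv2.COLOR_BGR2GRAY)
for face in detector(gray):
    shape = predictor(gray, face)
    pts = [(shape.part(i).x, shape.part(i).y) for i in range(68)]
    # In the 68-point layout, indices 36-41 and 42-47 are the two eyes.
    eye_one = pts[36:42]
    eye_two = pts[42:48]
```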
Eyes Location: In order to smooth the image, some treatments are applied before eye location, including image denoising and enhancement (11); this is a prerequisite for ensuring precise eye location and achieving a better result.
Locating the eye region roughly
The edge feature analysis method uses the vertical gray-scale projection curve of the image to determine the left and right borders of the face according to the width of the convex peak, and then uses the horizontal gray-scale projection curve of the resulting region to determine roughly the upper and lower borders of the eye location region [04]. Observation of the vertical gray-scale projection curves of a number of different single-face images shows that the region corresponding to a face is a convex peak with a certain width.
The left and right borders of the convex peak generally represent the left and right borders of the face [12]. Once the left and right borders of the face are established, take the face region between them as the study object and compute the horizontal gray-scale projection curve of the image; the following can then be observed (07): the first minimum point of the horizontal gray-scale projection curve corresponds to the crown of the head, the maximum point corresponds to the forehead, and the secondary maximum point corresponds to the center of the nose. The region between the center of the nose and the crown of the head is taken as the roughly located region (04).
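A small numpy sketch of these gray-scale projection curves is given below (our own illustration; the peak-selection heuristic is deliberately simplified).

```python
import numpy as np

def projection_curves(gray: np.ndarray):
    """Column-wise (vertical) and row-wise (horizontal) intensity sums
    of a grayscale image, as used for rough face/eye localization."""
    vertical = gray.sum(axis=0)    # one value per column -> face left/right
    horizontal = gray.sum(axis=1)  # one value per row -> crown/forehead/nose
    return vertical, horizontal

# Example heuristic: columns whose summed intensity exceeds the mean form
# the convex peak; its extent approximates the face's left/right borders.
gray = np.random.randint(0, 256, (480, 640), dtype=np.uint8)  # stand-in image
v, _ = projection_curves(gray)
cols = np.flatnonzero(v > v.mean())
left, right = cols.min(), cols.max()
```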
Sifting the similar eye points collection:
The primary problem is selecting an appropriate template prior to template matching [10]. In the follow-up algorithm, it is necessary to use the relative position between the two eyes to locate them among a number of similar points, so it must be ensured that the two real eye points are among the candidate similar eye points. In order to reduce the sensitivity to the eye template and improve robustness, the system adopts a synthetic eye template of the two eyes.
In order to select the similar eye points, it is desirable first to establish a similarity metric. The general way is to correlate the local image (01) with the image template; the cross-correlation coefficient obtained in this way is regarded as the similarity metric (see Formula 1). Two parameters describe the synthetic template: the template height M and width N.

Formula 1:
P(x, y) = Σ_{i=1..M} Σ_{j=1..N} [T(i, j) − T̄][S(x + i, y + j) − S̄_xy] / sqrt( Σ_{i=1..M} Σ_{j=1..N} [T(i, j) − T̄]² · Σ_{i=1..M} Σ_{j=1..N} [S(x + i, y + j) − S̄_xy]² )

where T is the synthetic eye template of size M x N; T̄ is the average of the eye template image; S̄_xy is the average of the local image patch matched against the template in the candidate face image; and (x, y) are the coordinates of the search point in the face image. By construction |P(x, y)| <= 1, and the greater P(x, y), the better the match (09).
Page 16 of 25
However, because the synthetic eye template contains a certain error and image acquisition is affected by external conditions and interference, the point of greatest similarity may not be the real eye point, so the eye point cannot be located by the size of the similarity alone. In order not to miss the real eye points, a similar eye point collection A = {(Xi, Yi) | i = 1, 2, ..., n} containing the two real eye points is first selected roughly (see Figure 2.4), and the two real eye points are then obtained through prior-knowledge calibration; n is an optional coefficient.
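In OpenCV, this normalized cross-correlation matching can be sketched as follows (our own illustration; cv2.matchTemplate with TM_CCOEFF_NORMED computes a coefficient of the same form as Formula 1, and keeping the n best responses mirrors the "similar eye point collection" described above).

```python
import cv2
import numpy as np

gray = cv2.cvtColor(cv2.imread("face.jpg"), cv2.COLOR_BGR2GRAY)
template = cv2.imread("eye_template.png", cv2.IMREAD_GRAYSCALE)

# Each entry of `scores` is the normalized cross-correlation coefficient
# of the template against the patch at that position, in [-1, 1].
scores = cv2.matchTemplate(gray, template, cv2.TM_CCOEFF_NORMED)

# Keep the n best candidate eye points instead of only the maximum,
# since the strongest response may not be a real eye point.
n = 10
flat = np.argsort(scores, axis=None)[-n:]
candidates = [np.unravel_index(i, scores.shape) for i in flat]  # (y, x) pairs
```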
Recognition of Eye's State:
The eye area can be estimated from optical flow, by sparse tracking, or by frame-to-frame intensity differencing and adaptive thresholding, and finally a decision is made as to whether the eyes are covered by the eyelids. A different approach is to infer the state of the eye opening (02) from a single image, e.g. by correlation matching with open-eye and closed-eye (03) templates, a heuristic horizontal or vertical image intensity projection over the eye region, a parametric model fitted to find the eyelids, or active shape models.
A major drawback of the previous approaches is that they usually implicitly impose too strong requirements on the setup, in the sense of a relative face-camera pose (head orientation), image resolution, illumination, motion dynamics, etc. Especially the heuristic methods that use raw image intensity are likely to be very sensitive despite their real-time performance.
Therefore, we propose a simple but efficient algorithm to detect eye blinks using a recent facial landmark detector (11). A single scalar quantity that reflects the level of eye opening is derived from the landmarks. Finally, given a per-frame sequence of eye-opening estimates, the eye blinks are found by an SVM classifier trained on examples of blinking and non-blinking patterns.
Eye Aspected Ratio Calculation:
For every video frame, the eye landmarks are detected. The eye aspect ratio (EAR) between the height and width of the eye is computed:
EAR = (||p2 − p6|| + ||p3 − p5||) / (2 ||p1 − p4||)   (1)
where p1, ..., p6 are the 2D landmark locations. The EAR is mostly constant while an eye is open and approaches zero as the eye closes. It is partially insensitive to the person and to head pose. The aspect ratio of the open eye has a small variance among individuals, and it is fully invariant to uniform scaling of the image and to in-plane rotation of the face. Since eye blinking is performed by both eyes synchronously, the EAR of both eyes is averaged (see Figure 5).
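A direct transcription of Eq. (1) into Python might look as follows (our own sketch; the landmark ordering p1..p6 follows the convention of the equation above).

```python
import numpy as np

def eye_aspect_ratio(eye: np.ndarray) -> float:
    """EAR from six 2D eye landmarks ordered p1..p6 as in Eq. (1)."""
    a = np.linalg.norm(eye[1] - eye[5])  # ||p2 - p6||
    b = np.linalg.norm(eye[2] - eye[4])  # ||p3 - p5||
    c = np.linalg.norm(eye[0] - eye[3])  # ||p1 - p4||
    return (a + b) / (2.0 * c)

# Both eyes blink synchronously, so the two EARs are averaged.
def average_ear(left_eye, right_eye) -> float:
    return (eye_aspect_ratio(np.asarray(left_eye)) +
            eye_aspect_ratio(np.asarray(right_eye))) / 2.0
```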
Eye State Determination:
Finally, the decision on the eye state is made based on the EAR calculated in the previous step. If the distance is zero or close to zero, the eye state is classified as "closed"; otherwise the eye state is identified as "open".
Process of system:
Step 1 - Take Image as Input from a Camera:
With a webcam, we take images as input. To access the webcam, we make an infinite loop that captures each frame. We use the method provided by OpenCV, cv2.VideoCapture(0), to access the camera and set the capture object (cap). cap.read() reads each frame, and we store the image in a frame variable.
Step 2 - Detect Face in the Image and Create a Region of Interest (ROI):
To detect the face in the image, we first need to convert the image into grayscale, as the OpenCV algorithm for object detection takes gray images as input; we don't need color information to detect the objects. We will be using a haar cascade classifier to detect faces. The following line is used to set our classifier:
face = cv2.CascadeClassifier('path to our haar cascade xml file')
Then we perform the detection using
faces = face.detectMultiScale(gray)
It returns an array of detections with the x, y coordinates and the height and width of each object's boundary box. Now we can iterate over the faces and draw a boundary box for each face.
for (x,y,w,h) in faces:
    cv2.rectangle(frame, (x,y), (x+w, y+h), (100,100,100), 1)
Step 3 - Detect the eyes from ROI and feed it to the classifier:
The same procedure used to detect faces is used to detect eyes. First, we set the cascade classifiers for the left and right eyes in leye and reye respectively, then detect the eyes using left_eye = leye.detectMultiScale(gray).
Now we need to extract only the eye data from the full image. This can be achieved by extracting the boundary box of the eye; we can then pull the eye image out of the frame with this code:
l_eye = frame[ y : y+h, x : x+w ]
l_eye contains only the image data of the eye. This will be fed into our CNN classifier, which will predict whether the eyes are open or closed. Similarly, we extract the right eye into r_eye.
Step 4 - Classifier will Categorize whether Eyes are Open or Closed:
We are using a CNN classifier for predicting the eye status. To feed our image into the model, we need to perform certain operations because the model needs the correct dimensions to start with. First, we convert the color image into grayscale using r_eye = cv2.cvtColor(r_eye, cv2.COLOR_BGR2GRAY). Then we resize the image to 24x24 pixels, as our model was trained on 24x24 pixel images: r_eye = cv2.resize(r_eye, (24,24)). We normalize our data for better convergence: r_eye = r_eye/255 (all values will be between 0 and 1).
Expand the dimensions to feed into our classifier. We loaded our model using
model = load_model('models/cnnCat2.h5')
Now we predict each eye with our model
lpred = model.predict_classes(l_eye)
If the value of lpred[0] is 1, the eyes are open; if the value of lpred[0] is 0, the eyes are closed.
Page 19 of 25
Step 5 - Calculate Score to Check whether Person is Drowsy
The score is basically a value we use to determine how long the person has had their eyes closed. So, if both eyes are closed, we keep increasing the score, and when the eyes are open, we decrease it. We draw the result on the screen using the cv2.putText() function, which displays the real-time status of the person: cv2.putText(frame, "Open", (10, height-20), font, 1, (255,255,255), 1, cv2.LINE_AA)
A threshold is defined; for example, if the score becomes greater than 15, the person's eyes have been closed for a long period of time. This is when we sound the alarm using sound.play().
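Putting the five steps together, a condensed sketch of the main loop might look like this. This is our own assembly of the fragments above: the cascade choices, the alarm file name and the class ordering (0 = closed) are assumptions, and model.predict with argmax is used in place of the older predict_classes API.

```python
import cv2
import numpy as np
from pygame import mixer
from tensorflow.keras.models import load_model

mixer.init()
sound = mixer.Sound("alarm.wav")  # assumed alarm sound file
face = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
leye = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_lefteye_2splits.xml")
model = load_model("models/cnnCat2.h5")  # path as given in the text

cap = cv2.VideoCapture(0)
score = 0
while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    closed = False
    for (fx, fy, fw, fh) in face.detectMultiScale(gray):      # Step 2: face ROI
        roi = gray[fy:fy + fh, fx:fx + fw]
        for (x, y, w, h) in leye.detectMultiScale(roi):       # Step 3: eye crop
            eye = cv2.resize(roi[y:y + h, x:x + w], (24, 24)) / 255.0
            pred = model.predict(eye.reshape(1, 24, 24, 1), verbose=0)
            closed = int(np.argmax(pred)) == 0                # Step 4: classify
            break
        break
    score = score + 1 if closed else max(score - 1, 0)        # Step 5: score
    label = "Closed" if closed else "Open"
    cv2.putText(frame, f"{label}  Score: {score}", (10, frame.shape[0] - 20),
                cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 1, cv2.LINE_AA)
    if score > 15:  # eyes closed for a prolonged period
        sound.play()
    cv2.imshow("Drowsiness detection", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```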
[021]. TESTING/VALIDATIONS
Testing is the process of evaluating a system or its component(s) with the intent of finding whether it satisfies the specified requirements or not. This activity produces the actual results, the expected results, and the differences between them. Software testing is a critical element of software quality assurance and represents the ultimate review of specification, design and coding (11). Testing is the process of detecting errors, and the results of testing are also used later on during maintenance.
Generally, the testing phase involves testing the developed system using various test data. Preparation of the test data plays a vital role in system testing. After preparing the test data, the system under study was tested using it. While testing the system, errors were found and corrected by following the testing steps, and the corrections were noted for future use. Thus, a series of tests was performed on the proposed system before it was ready for implementation.
Thus, the aim of testing is to demonstrate that a program works by showing that it has no errors. The fundamental purpose of testing phase is to detect the errors that may be present in the program. Thus, testing allows developers to deliver software that meets expectations, prevents unexpected results, and improves the long term maintenance of the application.
Depending upon the purpose of testing and the software requirements, the appropriate methodologies are applied. Wherever possible, testing can also be automated.
Page 20 of 25
Testing Strategies:
Unit Testing: Unit testing is a level of the software testing process where individual units or components of a software system are tested. Also known as component testing, it refers to tests that verify the functionality of a specific section of code. The purpose is to validate that each unit of the software performs as designed. All modules must pass the unit test before integration testing begins. The goal of unit testing is to isolate each part of the program and show that the individual parts are correct in terms of requirements and functionality.
Unit testing focuses verification effort on the smallest unit of software design, the form; this is known as form testing. In this testing step, each module is found to be working satisfactorily with regard to the expected output from the module. Each module has been tested by giving different sets of inputs, both while developing the module and after finishing its development, so that each module works without any error. The inputs are validated when accepted from the user.
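As a small illustration of unit testing applied to a module of this project, the sketch below tests a hypothetical eye-state helper with pytest; both the helper and its threshold value are our own assumptions, not part of the specification.

```python
# test_eye_state.py -- run with: pytest test_eye_state.py
def classify_eye_state(ear: float, threshold: float = 0.2) -> str:
    """Hypothetical helper: label an eye 'closed' when the eye aspect
    ratio falls to (or below) the threshold, else 'open'."""
    return "closed" if ear <= threshold else "open"

def test_open_eye():
    assert classify_eye_state(0.30) == "open"

def test_closed_eye():
    assert classify_eye_state(0.05) == "closed"

def test_boundary_is_closed():
    assert classify_eye_state(0.2) == "closed"
```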
Each module can be tested using the following two strategies:

White Box Testing: White box testing is a software testing method in which the internal structure/design/implementation of the item being tested is known to the tester. The tester chooses inputs to exercise paths through the code and determines the appropriate outputs. White box testing goes beyond the user interface and into the nitty-gritty of a system. This is a unit testing method where one unit is taken at a time and tested thoroughly at the statement level to find the maximum possible number of errors.
We tested step wise every piece of code, taking care that every statement in the code is executed at least once. White-box testing can be applied at the unit, integration and system levels of the software testing process. It is usually done at the unit level. It can test paths within a unit, paths between units during integration, and between subsystems during a system-level test.
White box testing is also called glass box testing. We generated a list of test cases and sample data, which is used to check all possible combinations of execution paths through the code at every module level. This testing has been used to find errors in the following categories: exercising internal data structures to ensure their validity; guaranteeing that all independent paths have been executed; and exercising all logical decisions on their true and false sides.
Page 21 of 25
Black Box Testing: The black-box approach is a testing method in which test data are derived from the specified functional requirements without regard to the final program structure. It treats the software as a "black box", examining functionality without any knowledge of the internal implementation. This testing method considers a module as a single unit and checks the unit at its interface and in its communication with other modules, rather than getting into details at the statement level; the module is treated as a black box that takes some input and generates output, and the output for a given set of input combinations is forwarded to other modules. This testing has been used to find errors in the following categories: initialization and termination errors; incorrect or missing functions; and performance errors.
Integration Testing: Integration Testing is a level of the software testing process where individual units are combined and tested as a group. The purpose of this level of testing is to expose faults in the interaction between integrated units. After the unit testing, we have to perform integration testing. The goal here is to see if modules can be integrated properly, the emphasis being on testing interfaces between modules. It works to expose defects in the interfaces and interaction between integrated components (modules). Progressively larger groups of tested software components corresponding to elements of the architectural design are integrated and tested until the software works as a system.
The main goal of integration testing is to determine whether the combined parts of an application function correctly together. There are two methods of integration testing: bottom-up integration testing and top-down integration testing. All modules are combined in this testing step, and then the entire program is tested as a whole.
Validation Testing: At the culmination of the integration testing, the software is completely assembled as a package, interfacing errorshave been uncovered and corrected and final series of software validation testing begins. Validation checks that the product design satisfies or fits the intended use (high-level checking), i.e., the software meets the user requirements.
System Testing: System testing is a level of the software testing process where a complete, integrated system or software product is tested. Once all the components are integrated, the application as a whole is tested rigorously to see that it meets quality standards. Here the entire software system is tested. The reference document for this process is the requirements document, and the goal is to see whether the software meets its requirements. The entire project has been tested against the project requirements, and it has been checked whether all requirements are satisfied.
Page 22 of 25
Output Testing: After validation testing, the next step is output testing of the proposed system, since no system can be useful if it does not produce the desired output in the specified format. The outputs generated are displayed by the system under consideration or checked by asking the users about the format they require. The output format is considered in two ways: on the screen and in printed form.
Acceptance Testing: Acceptance Testing is a level of the software testing process where a system is tested for acceptability. The purpose of this test is to evaluate the system's compliance with the business requirements and assess whether it is acceptable for delivery. Acceptance test is performed with realistic data of the client to demonstrate that the software is working satisfactorily. Testing here is focused on external behavior of the system; the internal logic of program is not emphasized.
In this project we have collected some data and tested whether the project works correctly. Test cases should be selected so that the largest number of attributes of an equivalence class is exercised at once. The testing phase is an important part of software development. It is the process of finding errors and missing operations, and also a complete verification to determine whether the objectives are met and the user requirements are satisfied. User acceptance of a system is the key factor for the success of any system.
The system under consideration was tested for user acceptance by constantly keeping in touch with the prospective system users during development and making changes whenever required. This was done with regard to the following points: input screen design, output screen design, and the menu-driven system.
Test Approach: Testing can be done in two ways: the bottom-up approach and the top-down approach.
Bottom-up Approach: Testing can be performed starting from the smallest and lowest-level modules, proceeding one at a time. For each module in bottom-up testing, a short program executes the module and provides the needed data, so that the module is asked to perform the way it will when embedded within the larger system. When the bottom-level modules have been tested, attention turns to those on the next level that use the lower-level ones; these are tested individually and then linked with the previously examined lower-level modules.
Top-down Approach: This type of testing starts from the upper-level modules. Since the detailed activities usually performed in the lower-level routines are not provided, stubs are written. A stub is a module shell called by an upper-level module which, when reached properly, returns a message to the calling module indicating that proper interaction occurred. No attempt is made to verify the correctness of the lower-level module.
Validation: The system has been tested and implemented successfully, thus ensuring that all the requirements listed in the software requirements specification are completely fulfilled. In case of erroneous input, corresponding error messages are displayed.
Test cases: Testing is a set of activities that can be planned in advance and conducted systematically. The underlying motivation of program testing is to affirm software quality.
Environment setup: First, the screenshots of the input video are shown, followed by the screens of the output video obtained in the SPYDER environment. The Spyder environment opens a new window in which it allows the webcam to start and takes the live feed, processing it to give the required output.
Input and output responses: The input information (the live feed of the user) is gathered by a webcam and sent to the proposed system for processing.
[022]. CONCLUSION
In this Python project, we have built a drowsy driver alert system that can be implemented in numerous ways. We used OpenCV to detect faces and eyes using a Haar cascade classifier, and then used a CNN model to predict the eye status. This work will be extended to real-time video streams to recognize driver drowsiness and alert the driver. The system is capable of accurately positioning the eye point, and using four parameters of eye state it can effectively detect the driver's fatigue status. In order to improve accuracy further, the system should use other methods as supplementary means, such as:
• road tracking,
• head position,
• the rotation rate and the grip force of the steering wheel,
which are the main directions for improving system accuracy.
SUMMARY OF INVENTION
[0021]. The present invention relates to an artificial intelligence approach to detection analysis and an awakening system that provides secure authentication during user money transactions, together with a user driving module, using facial recognition mapping techniques. This mechanism applies to various applications and sectors. A convolutional neural network is a special type of deep neural network which performs extremely well for image classification purposes.
BRIEF DESCRIPTION OF DRAWINGS
FIGURE 1 describes the proposed input and scoring when the state of the eye is open (01) on the GUI of the environment.
FIGURE 2 describes the input and scoring when the state (02) of the eye is open, wearing spectacles.
FIGURE 3 describes the scoring when the state of the eye is open, the scoring when the state of the eye is closed, and when the driver is looking to his left.
FIGURE 4 describes the schematic of the eye template sifting (03).
FIGURE 5 represents open and closed eyes with landmarks p(i) automatically detected (02).
FIGURE 6 describes the block diagram of the Drowsiness Detection System (05).
FIGURE 7 describes the visualization of the 68 facial landmark coordinates (07).
FIGURE 8 represents the results (12) of projection (06) in the vertical and horizontal directions.
FIGURE 9 describes the model structure representation (07).
FIGURE 10 shows images of open and closed eyes from the dataset.
FIGURE 11 describes the general architecture (09) of drowsiness detection.
FIGURE 12 describes the flow of the project (10).
FIGURE 13 describes the (02) use case diagram.
FIGURE 14 describes the sequence diagram (04).
FIGURE 15 describes the collaboration diagram of the system (04).
FIGURE 16 describes the activity diagram (11).

Claims (1)

    I Claim,
1. The invention "An Autonomous Early Detection analysis of User Drowsiness through Facial Recognition using Artificial Intelligence" comprises:
a) a system which uses detection or unique facial recognition identification during travelling and whenever any transaction is made;
b) a payment method which is easier to access than the existing user transaction identification pattern matching method;
c) a detection analysis method which safeguards the user's details during transaction identification and while the user is driving.
2. The invention as claimed in claim 1, wherein a biometric system with an existing database is linked to the alert system; this method awakens the user to reduce accidents and removes the need for the customer or user to enter personal details during any transaction, through secure authentication.
3. The invention as claimed in claim 1, wherein the proposed authentication payment method is much easier to access in all cases and can be used in all payment transactions linked with secure identification, and wherein any operations such as money transactions and other government or private operations can be performed under one method with existing resources.
4. The invention as claimed in claim 3, wherein the implementation of a user identification number within the existing identification secures the personal details, as the user's facial or biometric details must match those in the database during any transaction.
5. The invention as claimed in claim 1, wherein the drowsiness mechanism detects the user's facial expressions and raises an alert when the user's eyes droop and/or the user sleeps while travelling in a particular driving mode. The autonomous detection alert system observes and alerts frequently for a safer journey and for secure authentication purposes.
6. The invention as claimed in claim 1, wherein, for user safeguarding and authentication purposes, this facial recognition mechanism predicts and analyses results as desired by the system without any interruption.
Drawings: An Autonomous Early Detection analysis of User Drowsiness through Facial Recognition using Artificial Intelligence (28 Jul 2021, 2021104681)
    Figure.1: Input and scoring when the state of eye is open on GUI of the environment
    Figure.2: Input and scoring when the state of eye is open wearing spectacles.
Figure.3: Scoring when the state of eye is open, scoring when the state of eye is closed, and when the driver is looking to his left
    Figure.4: The schematic of the eye template Sifting
    Figure.5:Open and closed eyes with landmarks p(i) automatically detected
    Figure.6: Block diagram of Drowsiness Detection System
    Figure.7: Visualizing the 68 facial landmark coordinates
    Figure.8: The results of projection from vertical and horizontal direction
    Figure.9: Model structure representation
Figure.10: Images from this dataset of open and closed eyes are shown
    Figure.11: General Architecture of drowsiness detection
    Figure.12: Flow of Project
    Figure.13: Use case diagram
    Figure.14: Sequence diagram
    Figure.15: Collaboration diagrams of the system
    Figure.16: Activity diagram
AU2021104681A 2021-01-02 2021-07-28 An Autonomous Early Detection analysis of User Drowsiness through Facial Recognition using Artificial Intelligence Ceased AU2021104681A4 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
IN202141000006 (priority date 2021-01-02, filing date 2021-01-02)
IN202141006206 (priority date 2021-02-14, filing date 2021-02-14)

Publications (1)

Publication Number Publication Date
AU2021104681A4 true AU2021104681A4 (en) 2022-05-05

Family

ID=81455686

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2021104681A Ceased AU2021104681A4 (en) 2021-01-02 2021-07-28 An Autonomous Early Detection analysis of User Drowsiness through Facial Recognition using Artificial Intelligence

Country Status (1)

Country Link
AU (1) AU2021104681A4 (en)

Similar Documents

Publication Publication Date Title
KR20190025564A (en) System and method for facial expression recognition and annotation processing
Sikander et al. A novel machine vision-based 3D facial action unit identification for fatigue detection
Tabassum et al. Non-intrusive identification of student attentiveness and finding their correlation with detectable facial emotions
Maheswari et al. Driver drowsiness prediction based on multiple aspects using image processing techniques
Dipu et al. Real-time driver drowsiness detection using deep learning
Pauly et al. Non intrusive eye blink detection from low resolution images using HOG-SVM classifier
Pandey et al. Temporal and spatial feature based approaches in drowsiness detection using deep learning technique
Bartlett et al. Towards automatic recognition of spontaneous facial actions
Husain et al. Development and validation of a deep learning-based algorithm for drowsiness detection in facial photographs
Rahman et al. Computer vision-based approach to detect fatigue driving and face mask for edge computing device
Pandey et al. Dumodds: Dual modeling approach for drowsiness detection based on spatial and spatio-temporal features
Akshay et al. Drowsy driver detection using eye-tracking through machine learning
AU2021104681A4 (en) An Autonomous Early Detection analysis of User Drowsiness through Facial Recognition using Artificial Intelligence
Faisal et al. Systematic development of real-time driver drowsiness detection system using deep learning
Sarma et al. Facial expression based emotion detection-a review
Zurita et al. Fitness-for-Duty Classification using Temporal Sequences of Iris Periocular images
Bhatia Multimodal sensing of affect intensity
Ratliff Active appearance models for affect recognition using facial expressions
Seong et al. Face Recognition and Physiological Signal for Impaired Drivers: A Review
Abd El-Aziz et al. A Deep Learning Model for Face Mask Detection
You et al. R2DS: A novel hierarchical framework for driver fatigue detection in mountain freeway
Mishra et al. UNLOCKING THE MAGIC OF FACIAL RECOGNITION: EMPOWERING SECURITY AND EMOTIONS WITH SVM
Ren Computer Vision for Facial Analysis Using Human-Computer Interaction Models
Okon Detection of Driver Drowsiness and Distraction Using Computer Vision and Machine Learning Approaches
Praditsangthong Development of facial expression detection model for stroke patients

Legal Events

Date Code Title Description
FGI Letters patent sealed or granted (innovation patent)
MK22 Patent ceased section 143a(d), or expired - non payment of renewal fee or expiry