CN114170573A - School library seat occupation detection method based on YOLO v5

Info

Publication number: CN114170573A
Application number: CN202111545932.0A
Authority: CN
Inventors: 陈国栋; 陈文铿; 林榆翔; 赵志峰; 黄立萱; 方莉; 严铮; 林鸿强; 边根成
Original assignee: Fuzhou University
Current assignee: Fuzhou University
Priority date: 2021-12-17
Filing date: 2021-12-17
Publication date: 2022-03-11

Abstract

The invention relates to a campus library seat occupation detection method based on YOLO v5. A camera and the YOLO v5 target detection algorithm monitor seat availability in the library in real time. If a person is detected on a seat, the seat is judged to be in use and the lamp on the desk is lit red. If no person is detected on the seat, it must be determined whether the seat is being occupied maliciously, for example because the previous user has left for a long time or because one student is holding several seats. The method therefore first detects whether a person is on the seat; if not, it detects whether a seat-occupying object such as a book or a bag is on the table and measures how long the object has been left there. If this time exceeds 30 minutes, the occupation is considered malicious, the lamp on the desk changes to green, and a student without a seat can sit down with confidence.

Description

School library seat occupation detection method based on YOLO v5

Technical Field

The invention relates to a campus library seat occupation detection method based on YOLO v5.

Background

In recent years, as the number of examination candidates, including those sitting public-institution examinations, has grown, the limited number of seats in a school library can no longer give every student a seat. Many students hold seats for themselves or for friends by leaving books or schoolbags on them, so a seat may stay empty for a whole morning. Disputes between students over seat occupation even occur. A device is therefore needed to detect seat-occupying behavior: leaving the judgment to a machine reduces disputes and greatly improves seat utilization, thereby preventing malicious seat occupation.

Disclosure of Invention

The invention aims to provide a campus library seat occupation detection method based on YOLO v5: a monitoring system that uses a YOLO v5 detection model together with a K-means clustering algorithm, which yields more suitable anchor boxes, to quickly locate key points and identify desks, chairs, people, books, bags and other objects.

To achieve this purpose, the technical scheme of the invention is as follows: a campus library seat occupation detection method based on YOLO v5, which adopts a YOLO v5 detection method fused with K-means to obtain more suitable anchor boxes and improve the accuracy of classified target detection. The method is implemented by the following steps:

step S1, obtaining scene images from the monitoring video of the area covered by each camera in the library, and detecting tables and chairs in the images with the deep learning target detection framework YOLO v5;

step S2, obtaining the 9 more suitable anchor boxes required by YOLO v5 by using a K-means clustering algorithm;

step S3, after a chair and a person are detected, calculating the IoU (intersection over union) of the chair box and the person box to measure how far the two overlap;

step S4, judging from a threshold on this overlap measure whether a person is on the seat; if so, lighting a red lamp; if not, continuing to detect books and bags;

step S5, if no book or bag is detected, judging that the seat is unoccupied and available, and lighting a green lamp;

step S6, if a book or bag is detected, starting a timer; if the seat remains without a person for more than 30 minutes, judging that the seat is occupied maliciously, and lighting a green lamp.
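The lamp logic of steps S4 to S6 can be sketched as a small per-seat state update. This is a minimal illustration, not the patented implementation: the names `SeatState` and `update_seat` are hypothetical, the two boolean inputs are assumed to come from the detector, and keeping the lamp red while the 30-minute timer is still running is an assumption drawn from the abstract's statement that the lamp later changes to green.

```python
from dataclasses import dataclass
from typing import Optional

OCCUPATION_LIMIT_S = 30 * 60  # 30-minute limit from step S6


@dataclass
class SeatState:
    """Remembers when belongings were first seen on a seat with no person."""
    unattended_since: Optional[float] = None


def update_seat(state: SeatState, person_present: bool,
                items_present: bool, now: float) -> str:
    """Return the lamp colour ("red" or "green") for one seat at time `now` (seconds)."""
    if person_present:
        # Step S4: someone is sitting, so the seat is in use.
        state.unattended_since = None
        return "red"
    if not items_present:
        # Step S5: no person and no belongings, so the seat is free.
        state.unattended_since = None
        return "green"
    # Step S6: belongings without a person, so start or continue timing.
    if state.unattended_since is None:
        state.unattended_since = now
    if now - state.unattended_since > OCCUPATION_LIMIT_S:
        return "green"  # judged malicious occupation, seat released
    return "red"  # assumption: lamp stays red until the limit is exceeded
```

Calling `update_seat` once per processed frame with the current timestamp keeps the timer tied to wall-clock time rather than to the frame rate.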

In an embodiment of the present invention, in step S1, the training of YOLO v5 includes the following steps:

step A1, first constructing a data set containing tables, chairs and students, and then augmenting the data set by affine transformation and rotation to generate a data set that is large enough and of high picture quality;

step A2, building the neural network model required by the YOLO v5 framework, adopting the PyTorch framework and the ReLU activation function, to address the problem of neuron death when an input value is negative;

step A3, dividing the data into training set : validation set : test set = 8 : 1 : 1, and adding Auto Learning Bounding Box Anchors (adaptive anchor boxes): the custom data set is analyzed with k-means and a genetic learning algorithm to obtain preset anchor boxes suited to predicting object bounding boxes in the custom data set, the anchor boxes being learned automatically from the training data.
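The K-means part of the anchor search can be sketched as follows. This is a simplified illustration under stated assumptions: it clusters the (width, height) pairs of the labelled boxes using 1 - IoU as the distance, a common choice for anchor clustering, and it omits the genetic refinement mentioned in step A3. The function names `wh_iou` and `kmeans_anchors` are hypothetical.

```python
import random
from typing import List, Tuple

Box = Tuple[float, float]  # (width, height) of a labelled bounding box


def wh_iou(a: Box, b: Box) -> float:
    """IoU of two boxes aligned at the same top-left corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    union = a[0] * a[1] + b[0] * b[1] - inter
    return inter / union


def kmeans_anchors(boxes: List[Box], k: int = 9,
                   iters: int = 100, seed: int = 0) -> List[Box]:
    """Cluster box sizes into k anchors; requires len(boxes) >= k and positive sizes."""
    rng = random.Random(seed)
    centers = rng.sample(boxes, k)
    for _ in range(iters):
        clusters: List[List[Box]] = [[] for _ in range(k)]
        for box in boxes:
            # Highest IoU is the same as smallest (1 - IoU) distance.
            best = max(range(k), key=lambda i: wh_iou(box, centers[i]))
            clusters[best].append(box)
        new_centers = []
        for i, members in enumerate(clusters):
            if members:
                w = sum(m[0] for m in members) / len(members)
                h = sum(m[1] for m in members) / len(members)
                new_centers.append((w, h))
            else:
                new_centers.append(centers[i])  # keep the centre of an empty cluster
        if new_centers == centers:
            break  # assignments have stabilised
        centers = new_centers
    # Sort by area so the smallest anchors come first.
    return sorted(centers, key=lambda c: c[0] * c[1])
```

Using IoU instead of Euclidean distance keeps large boxes from dominating the clustering, since the measure depends on relative rather than absolute size differences.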

In an embodiment of the invention, Darknet-53 is used as the backbone network, and the 3 feature layers of different scales are 13 × 13, 26 × 26 and 52 × 52 respectively. YOLO v5 sets 3 prior boxes for each downsampling scale, so that clustering yields prior boxes of 9 sizes in total. The entire YOLO v5 structure contains no pooling layers or fully connected layers, and downsampling in the network is achieved by setting the stride of the convolution to 3. The DeepSORT algorithm is fused with target recognition based on the current YOLO v5 algorithm, and prediction heads are integrated and applied to unmanned aerial vehicle footage, so that targets can be accurately located and continuously tracked in high-density scenes.
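How 9 prior boxes map onto 3 feature layers can be sketched as below. The helper `assign_anchors` is hypothetical, and the pairing of the smallest boxes with the finest 52 × 52 layer and the largest with the coarsest 13 × 13 layer is an assumption that follows the usual convention for Darknet-53 based detectors.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float]  # (width, height) of a prior box


def assign_anchors(anchors: List[Box],
                   grids: Tuple[int, ...] = (52, 26, 13)) -> Dict[int, List[Box]]:
    """Split anchors, sorted by area, evenly across the feature-map sizes."""
    if len(anchors) % len(grids) != 0:
        raise ValueError("anchor count must be a multiple of the number of scales")
    per_scale = len(anchors) // len(grids)
    ordered = sorted(anchors, key=lambda a: a[0] * a[1])
    # Finer grids (more cells) receive the smaller anchors.
    return {
        grid: ordered[i * per_scale:(i + 1) * per_scale]
        for i, grid in enumerate(grids)
    }
```

Finer grids have smaller receptive fields per cell, which is why small objects such as books are typically predicted from the 52 × 52 layer and large objects such as tables from the 13 × 13 layer.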

Compared with the prior art, the invention has the following beneficial effects: monitoring cameras arranged in the library and a YOLO v5 model are used to monitor whether seats at the library tables are being held. Video frames acquired by the cameras are fed into the YOLO v5 model; the DeepSORT algorithm is fused with target recognition based on the current YOLO v5 algorithm, and prediction heads are integrated and applied to unmanned aerial vehicle footage, so that targets can be accurately located and continuously tracked in high-density scenes, achieving real-time monitoring.

Drawings

FIG. 1 is a schematic diagram of the network structure of YOLO v5;

FIG. 2 is the formula for the loss function of the YOLO v5 model;

FIG. 3 is a schematic workflow diagram of an embodiment of the present invention;

FIG. 4 illustrates the identification, according to the present invention, of whether a chair is empty in a restaurant;

FIG. 5 shows the recognition result for seat-occupying articles when an empty seat is detected.

Detailed Description

The technical scheme of the invention is specifically explained below with reference to the accompanying drawings.

The invention relates to a campus library seat occupation detection method based on YOLO v5, which adopts a YOLO v5 detection method fused with K-means to obtain more suitable anchor boxes and improve the accuracy of classified target detection. The method is implemented by steps S1 to S6 set out above.

The following are specific embodiments of the present invention.

The campus library seat occupation detection method based on YOLO v5 uses the end-to-end deep learning target detection algorithm YOLO v5 to detect students, tables, chairs, books and bags, and optimizes the model with the K-means algorithm to improve detection performance. The YOLO v5 network model is shown in FIG. 1.

Specifically, the YOLO v5 model adopts ReLU as the activation function and is trained end to end. The loss function used by the model during gradient descent is shown in FIG. 2 and consists of five parts:

The first and second parts are responsible for predicting the bounding box (bbox) of an object: the first part is the error between the center-point coordinates obtained by forward propagation of the image through the neural network and the ground-truth center-point coordinates, and the second part is the error between the predicted box width and height and the ground-truth width and height. The third part is the confidence error for prediction boxes that contain a target object; training drives the confidence of such boxes toward 1. The fourth part is the confidence error for prediction boxes that contain no target object; training drives the confidence of such boxes toward 0. The fifth part is the classification error term for grid cells that contain a target object.
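FIG. 2 is not reproduced in this text. The five parts described above match the structure of the sum-of-squares loss of the original YOLO detector, so, as an assumption about the form FIG. 2 takes, the loss can be written as follows. Here S² is the number of grid cells, B the number of boxes predicted per cell, hatted symbols are predictions, the indicator for "obj" is 1 when box j of cell i is responsible for an object, and the two λ weights are hyperparameters.

```latex
\begin{aligned}
L ={}& \lambda_{\text{coord}} \sum_{i=1}^{S^2} \sum_{j=1}^{B} \mathbb{1}_{ij}^{\text{obj}}
       \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] \\
     &+ \lambda_{\text{coord}} \sum_{i=1}^{S^2} \sum_{j=1}^{B} \mathbb{1}_{ij}^{\text{obj}}
       \left[ \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2 + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right] \\
     &+ \sum_{i=1}^{S^2} \sum_{j=1}^{B} \mathbb{1}_{ij}^{\text{obj}} \left( C_i - \hat{C}_i \right)^2 \\
     &+ \lambda_{\text{noobj}} \sum_{i=1}^{S^2} \sum_{j=1}^{B} \mathbb{1}_{ij}^{\text{noobj}} \left( C_i - \hat{C}_i \right)^2 \\
     &+ \sum_{i=1}^{S^2} \mathbb{1}_{i}^{\text{obj}} \sum_{c \in \text{classes}} \left( p_i(c) - \hat{p}_i(c) \right)^2
\end{aligned}
```

The five lines correspond in order to the five parts: center coordinates, width and height, confidence with an object, confidence without an object, and classification.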

The specific implementation steps will be described with reference to the workflow diagram of fig. 3:

step 1, obtaining a scene image from the monitoring video of the area covered by a monitoring camera;

step 2, generating the 9 initial anchor boxes required by the YOLO v5 model by using the K-means algorithm;

step 3, detecting students and chairs in the image with the deep learning target detection framework YOLO v5, and marking and outputting each detection as a rectangular box;

step 4, judging whether a person is sitting on the chair according to the IoU of the chair box and the person box;

step 5, if the chair is judged to be occupied, turning on a red lamp, and if the chair is judged to be unoccupied, detecting books and bags;

step 6, if a book or bag is detected, starting a timer, and if no person is detected for more than 30 minutes, judging the behavior to be malicious seat occupation.

FIG. 4 shows the detection of whether a person is on a seat: the person and the chair are detected with separate detection boxes, the IoU of the person box and the chair box is computed, and whether the person is sitting on the chair is judged from this ratio.

FIG. 5 shows the result of detecting whether a seat-occupying object is on the table when no person is present.
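The overlap test of step 4, illustrated in FIG. 4, can be sketched as follows. Boxes are assumed to be given as corner coordinates (x1, y1, x2, y2) in pixels; the helper name `seat_is_taken` and the threshold value 0.3 are hypothetical, since the source does not state the threshold it uses.

```python
from typing import Iterable, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), with x1 < x2 and y1 < y2


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def seat_is_taken(chair: Box, persons: Iterable[Box],
                  threshold: float = 0.3) -> bool:
    """True if any detected person overlaps the chair by at least `threshold` IoU."""
    return any(iou(chair, person) >= threshold for person in persons)
```

In practice the threshold would need tuning against the camera angle, because a seated person's box is usually much taller than the chair's box and their IoU is correspondingly well below 1.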

The above are preferred embodiments of the present invention. All changes made according to the technical scheme of the present invention whose functional effects do not exceed the scope of that technical scheme fall within the protection scope of the present invention.

Claims

1. A campus library seat occupation detection method based on YOLO v5, characterized in that a YOLO v5 detection method fused with K-means is adopted to obtain more suitable anchor boxes and improve the accuracy of classified target detection, the method being implemented by the following steps:

2. The campus library seat occupation detection method based on YOLO v5 as claimed in claim 1, wherein in step S1, the training of YOLO v5 includes the following steps:

3. The campus library seat occupation detection method based on YOLO v5 as claimed in claim 2, wherein Darknet-53 is used as the backbone network, and the 3 feature layers of different scales are 13 × 13, 26 × 26 and 52 × 52 respectively; YOLO v5 sets 3 prior boxes for each downsampling scale, so that clustering yields prior boxes of 9 sizes in total; the entire YOLO v5 structure contains no pooling layers or fully connected layers, and downsampling in the network is achieved by setting the stride of the convolution to 3; the DeepSORT algorithm is fused with target recognition based on the current YOLO v5 algorithm, and prediction heads are integrated and applied to unmanned aerial vehicle footage, so that targets can be accurately located and continuously tracked in high-density scenes.