Method and system for recognizing hand raising by bank counter personnel
Technical Field
The invention relates to the field of image recognition, and in particular to a method and a system for recognizing hand raising by bank counter personnel.
Background
At the checkout counter of a place of business such as a supermarket, hotel, bank, or shopping mall, a cashier uses a cash register to settle a customer's purchases. The merchant can then check whether the cashier's conduct complies with the relevant cashiering rules by reviewing the video monitored or recorded by the surveillance system.
In banks that offer manual counter service, customers are handled in the same undifferentiated way whether they need a low-authority or a high-authority service: counter staff handle the entire process. Whether the counter staff follow the code of conduct during manual service, however, cannot be detected in time through human monitoring alone. Such monitoring also raises labor costs, and the conduct of counter staff cannot be supervised and managed promptly.
Meanwhile, conventional gesture recognition techniques are usually time-consuming, and keypoint data for indoor surveillance cameras that look down at an angle is difficult to collect and label. A method and a system capable of automatically recognizing the hand-raising action of staff are therefore lacking.
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art by providing a method and a system for recognizing hand raising by bank counter personnel. An overlap threshold is set, and the overlap between the position box of each raised hand and the position box of each person in the image is calculated and compared with the threshold. If the calculated value exceeds the threshold, the person is considered to have raised a hand; otherwise, the person currently being judged is not considered to have raised a hand.
The purpose of the invention is achieved by the following technical solution:
A method for recognizing hand raising by bank counter personnel comprises the following steps:
Step one: image data acquisition, namely capturing working videos and pictures of staff at a bank counter with an image acquisition device, and preprocessing the captured videos and pictures;
Step two: behavior recognition and inference, namely inputting the preprocessed videos and pictures into a target detection model for behavior analysis, recognition, and inference, and outputting the positions of the staff in the videos and pictures and the positions of their hands when raised;
Step three: behavior comparison and judgment, namely setting a hand-raising overlap threshold, calculating the hand-raising overlap between the position box of each staff member and the raised hands in the videos and pictures, and comparing the calculated overlap with the set threshold; if the calculated overlap is greater than the set threshold, the staff member is considered to have raised a hand; otherwise, the staff member is determined not to have raised a hand.
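The text does not define the hand-raising overlap precisely. One plausible reading, stated here only as an assumption, is the fraction of the raised-hand box that lies inside the person box. The symbols $B_p$ (person position box), $B_h$ (raised-hand position box), and $\tau$ (threshold) are introduced for illustration and do not appear in the original:

```latex
\[
\operatorname{overlap}(B_p, B_h)
  = \frac{\operatorname{area}(B_p \cap B_h)}{\operatorname{area}(B_h)},
\qquad
\text{hand raised} \iff \operatorname{overlap}(B_p, B_h) > \tau .
\]
```

Intersection over union would be another reasonable measure. Normalizing by the hand area has the property that a small hand box fully contained in a large person box scores 1.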
Specifically, step two includes the following substeps:
S201: capturing working videos and pictures of bank staff through monitoring equipment, preprocessing them, and inputting the preprocessed videos and pictures into a CNN backbone network to extract feature maps;
S202: passing the feature maps extracted by the CNN backbone network through a PANet structure to speed up the flow of feature information, and using a detection head similar to the YOLOv3 head to generate detection-result feature maps at three different scales;
S203: spatially filtering the detection-result feature maps with the adaptive ASFF module to filter out conflicting information, then transforming and concatenating the detection-result feature maps to output the body positions of the staff and the positions of their hands when raised in the videos and pictures.
Specifically, the preprocessing of the captured working videos and pictures in step one includes:
(1) Adaptive scaling of the input image: first, the scaling coefficients for the width and the height are calculated from the input width and height of the model. The smaller of the two is then chosen as the overall scaling coefficient, and the original width and height are multiplied by it to obtain the scaled size. When the original image is wider than it is tall, black borders are added along the height; because the model downsamples by a factor of 32, the height is padded to the multiple of 32 that differs least from the scaled height. Once this difference has been calculated, the black borders are added at the top and bottom of the picture.
(2) The pixel values of the picture are divided by 255 for normalization, and the channels are then exchanged to obtain the input values for the model.
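The arithmetic of the adaptive scaling in (1) can be sketched as follows. This is a minimal illustration, not the invention's implementation: the function name, the returned field names, and the choice to split the padding evenly between the two sides are assumptions.

```python
import math
from typing import NamedTuple


class LetterboxGeometry(NamedTuple):
    scaled_w: int
    scaled_h: int
    padded_w: int
    padded_h: int
    top: int
    bottom: int
    left: int
    right: int


def letterbox_geometry(orig_w, orig_h, model_w, model_h, stride=32):
    """Compute the scaled size and black-border padding for one image.

    `stride` is the downsampling factor of the model (32 in the text).
    """
    # One scaling coefficient per side; the smaller one keeps the whole
    # image inside the model input without distorting it.
    scale = min(model_w / orig_w, model_h / orig_h)
    scaled_w = round(orig_w * scale)
    scaled_h = round(orig_h * scale)

    # Pad each side up to the nearest multiple of the stride, which is
    # the multiple that differs least from the scaled size.
    padded_w = math.ceil(scaled_w / stride) * stride
    padded_h = math.ceil(scaled_h / stride) * stride

    # Assumption: split the border evenly; an odd remainder goes to the
    # bottom or right edge.
    pad_w, pad_h = padded_w - scaled_w, padded_h - scaled_h
    top, left = pad_h // 2, pad_w // 2
    return LetterboxGeometry(scaled_w, scaled_h, padded_w, padded_h,
                             top, pad_h - top, left, pad_w - left)


# A 1920x1080 frame for a 640x640 model input is scaled to 640x360 and
# padded to 640x384, with 12 black rows above and 12 below.
geometry = letterbox_geometry(1920, 1080, 640, 640)
```

Padding only to the next multiple of 32, rather than to the full square model input, keeps the number of black pixels the model has to process small.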
A system for recognizing hand raising by bank counter personnel comprises an image preprocessing module, a target detection module, a central calculation module, a comparison module, and an alarm module;
the image preprocessing module is used for capturing working videos and pictures of staff at a bank counter in real time and preprocessing the captured videos and pictures;
the target detection module is used for performing inference and analysis on the preprocessed videos and pictures and outputting the body positions of the staff and the positions of their hands when raised in the videos and pictures;
the central calculation module is used for calculating the overlap between the position box of each staff member in the monitoring image and the hand-raising action of each staff member in the videos and pictures;
the comparison module is used for comparing the calculated hand-raising overlap of each staff member with a preset hand-raising overlap threshold, and finally outputting the comparison result.
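One way the five modules might be wired together is sketched below. All class, field, and method names are hypothetical, and each module is represented by a caller-supplied function. The text does not describe the alarm policy, so the sketch only hands the comparison results to an alarm callback and leaves the decision to the caller.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Assumed box format: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class HandRaiseSystem:
    preprocess: Callable[[object], object]                   # image preprocessing module
    detect: Callable[[object], Tuple[List[Box], List[Box]]]  # target detection module
    overlap: Callable[[Box, Box], float]                     # central calculation module
    threshold: float                                         # used by the comparison module
    alarm: Callable[[List[bool]], None]                      # alarm module (policy not specified)

    def process(self, frame) -> List[bool]:
        """Return one hand-raised flag per detected person."""
        persons, hands = self.detect(self.preprocess(frame))
        results = []
        for person in persons:
            # Best overlap between this person and any raised-hand box.
            best = max((self.overlap(person, hand) for hand in hands),
                       default=0.0)
            # Comparison module: strictly greater than the threshold.
            results.append(best > self.threshold)
        self.alarm(results)
        return results
```

Passing the modules in as functions keeps the sketch independent of any particular detector or camera interface.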
The beneficial effects of the invention are:
1. The simple model architecture makes it easy to port to a variety of edge devices.
2. The training data required by the model is simple to collect and produce, which saves effort.
3. The target-detection-based model performs well: its speed can reach the 10+ ms level, which meets industrial requirements.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
FIG. 2 is a schematic diagram of the hand-raising recognition process according to an embodiment of the present invention.
FIG. 3 is a schematic structural diagram of the hand-raising recognition model according to an embodiment of the present invention.
Detailed Description
In order to more clearly understand the technical features, objects, and effects of the present invention, embodiments of the present invention will now be described with reference to the accompanying drawings.
In this embodiment, as shown in FIG. 1, a method for recognizing hand raising by bank counter personnel includes the following steps:
Step one: image data acquisition, namely capturing working videos and pictures of staff at a bank counter with an image acquisition device, and preprocessing the captured videos and pictures;
Step two: behavior recognition and inference, namely inputting the preprocessed videos and pictures into a target detection model for behavior analysis, recognition, and inference, and outputting the positions of the staff in the videos and pictures and the positions of their hands when raised;
Step three: behavior comparison and judgment, namely setting a hand-raising overlap threshold, calculating the hand-raising overlap between the position box of each staff member and the raised hands in the videos and pictures, and comparing the calculated overlap with the set threshold; if the calculated overlap is greater than the set threshold, the staff member is considered to have raised a hand; otherwise, the staff member is determined not to have raised a hand.
Specifically, step two includes the following substeps:
S201: capturing working videos and pictures of bank staff through monitoring equipment, preprocessing them, and inputting the preprocessed videos and pictures into a CNN backbone network to extract feature maps;
S202: passing the feature maps extracted by the CNN backbone network through a PANet structure to speed up the flow of feature information, and using a detection head similar to the YOLOv3 head to generate detection-result feature maps at three different scales;
S203: spatially filtering the detection-result feature maps with the adaptive ASFF module to filter out conflicting information, then transforming and concatenating the detection-result feature maps to output the body positions of the staff and the positions of their hands when raised in the videos and pictures.
Specifically, the preprocessing of the captured working videos and pictures in step one includes:
(1) Adaptive scaling of the input image: first, the scaling coefficients for the width and the height are calculated from the input width and height of the model. The smaller of the two is then chosen as the overall scaling coefficient, and the original width and height are multiplied by it to obtain the scaled size. When the original image is wider than it is tall, black borders are added along the height; because the model downsamples by a factor of 32, the height is padded to the multiple of 32 that differs least from the scaled height. Once this difference has been calculated, the black borders are added at the top and bottom of the picture.
(2) The pixel values of the picture are divided by 255 for normalization, and the channels are then exchanged to obtain the input values for the model.
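Step (2) can be illustrated without any image library. In this sketch, "exchanging channels" is read, as an assumption, as reordering the data from row-column-channel layout to channel-first layout. The optional reversal of the channel order (for example BGR to RGB) is a second assumption that is common in camera pipelines but is not stated in the text. The function name is hypothetical.

```python
def normalize_and_swap(image_hwc, reverse_channels=False):
    """Scale 0..255 pixel values to 0..1 and return channel-first data.

    image_hwc: nested lists indexed as [row][column][channel].
    Returns nested lists indexed as [channel][row][column].
    """
    height = len(image_hwc)
    width = len(image_hwc[0])
    channels = len(image_hwc[0][0])
    # Optionally walk the channels back to front (e.g. BGR -> RGB).
    order = range(channels - 1, -1, -1) if reverse_channels else range(channels)
    return [
        [[image_hwc[y][x][c] / 255 for x in range(width)]
         for y in range(height)]
        for c in order
    ]


# A 1x2 image with three channels per pixel.
tiny = [[[255, 0, 51], [0, 255, 102]]]
planes = normalize_and_swap(tiny)
# planes[0] holds the first channel of every pixel: [[1.0, 0.0]]
```

A production pipeline would normally perform the same two operations on an array type rather than on nested lists.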
A system for recognizing hand raising by bank counter personnel comprises an image preprocessing module, a target detection module, a central calculation module, a comparison module, and an alarm module;
the image preprocessing module is used for capturing working videos and pictures of staff at a bank counter in real time and preprocessing the captured videos and pictures;
the target detection module is used for performing inference and analysis on the preprocessed videos and pictures and outputting the body positions of the staff and the positions of their hands when raised in the videos and pictures;
the central calculation module is used for calculating the overlap between the position box of each staff member in the monitoring image and the hand-raising action of each staff member in the videos and pictures;
the comparison module is used for comparing the calculated hand-raising overlap of each staff member with a preset hand-raising overlap threshold, and finally outputting the comparison result.
The target detection model of the invention adopts the YOLOv4 model structure to output the position boxes of persons and raised hands; the specific structure is shown in FIG. 3. YOLOv4 uses CSPDarknet53 as the backbone network to extract feature maps, applies SPP for multi-scale pooling, uses PANet to speed up the flow of feature information, and uses the YOLOv3 head structure to obtain the required output.
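In the standard YOLOv3 and YOLOv4 configurations, the head predicts on three grids that correspond to downsampling strides of 8, 16, and 32. The sketch below computes the grid sizes for a given padded input; the function name and the 640x384 example size are illustrative assumptions rather than values taken from the text.

```python
def detection_grid_sizes(input_w, input_h, strides=(8, 16, 32)):
    """Return the (columns, rows) of each detection grid, finest first."""
    largest = max(strides)
    if input_w % largest or input_h % largest:
        # The input must be padded to a multiple of the largest stride.
        raise ValueError(f"input size must be a multiple of {largest}")
    return [(input_w // s, input_h // s) for s in strides]


# A 640x384 input yields grids of 80x48, 40x24, and 20x12 cells.
grids = detection_grid_sizes(640, 384)
```

The coarsest grid is the reason the preprocessing pads the image to a multiple of 32.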
In the embodiment of the invention, the hand-raising recognition process is as shown in FIG. 2. To recognize hand raising by staff, the input videos and pictures are preprocessed, the preprocessed images are input into the target detection model for inference, and the positions of the persons and of any raised hands are output. An overlap threshold is set, and the overlap between each raised-hand position box and each person's position box in the image is calculated and compared with the threshold. If the calculated value exceeds the threshold, the person is considered to have raised a hand; otherwise, the person currently being judged is not considered to have raised a hand.
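The judgment described above can be sketched as follows. The text does not say how the overlap is measured, so this sketch assumes it is the fraction of the raised-hand box that falls inside the person box. The box format (x1, y1, x2, y2), the function names, and the example threshold of 0.3 are likewise assumptions.

```python
def box_area(box):
    """Area of an (x1, y1, x2, y2) box; empty boxes have area 0."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def overlap_degree(person_box, hand_box):
    """Fraction of the hand box lying inside the person box (assumed measure)."""
    intersection = (
        max(person_box[0], hand_box[0]),
        max(person_box[1], hand_box[1]),
        min(person_box[2], hand_box[2]),
        min(person_box[3], hand_box[3]),
    )
    hand_area = box_area(hand_box)
    return box_area(intersection) / hand_area if hand_area > 0 else 0.0


def persons_raising_hand(person_boxes, hand_boxes, threshold):
    """One flag per person: True if any raised-hand box exceeds the threshold."""
    return [
        any(overlap_degree(person, hand) > threshold for hand in hand_boxes)
        for person in person_boxes
    ]


persons = [(100, 100, 300, 500), (400, 100, 600, 500)]
hands = [(250, 50, 320, 150)]
flags = persons_raising_hand(persons, hands, threshold=0.3)
# flags == [True, False]: only the first person overlaps the raised hand.
```

If no raised hand is detected in the frame, every person is reported as not raising a hand.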
The hand-raising recognition model used by the invention is shown in FIG. 3. The model input is a preprocessed image, which is first passed into a CNN backbone network to extract feature maps. The feature maps pass through a PANet structure to speed up the flow of feature information, and a detection head similar to the YOLOv3 head generates detection-result feature maps at three different scales. An adaptive ASFF module learns to spatially filter conflicting information, which suppresses inconsistency and improves the scale invariance of the features. Finally, the predictions are transformed and concatenated to obtain the required output.
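In the ASFF formulation, each output location is a weighted sum of the features from the three scales after they have been resized to a common resolution. The per-location weights pass through a softmax so that they sum to one. The sketch below shows that fusion step for single-channel maps. It assumes the maps are already resized and that the weight logits are given; in a real network the logits come from learned convolution layers. The function name is hypothetical.

```python
import math


def asff_fuse(features, weight_logits):
    """Fuse three same-sized 2D feature maps with per-location softmax weights.

    features, weight_logits: lists of three 2D lists with identical shapes.
    """
    rows, cols = len(features[0]), len(features[0][0])
    fused = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            logits = [w[i][j] for w in weight_logits]
            # Subtract the maximum before exponentiating for numerical stability.
            peak = max(logits)
            exps = [math.exp(v - peak) for v in logits]
            total = sum(exps)
            # Weighted sum of the three scales; the weights sum to one.
            fused[i][j] = sum(e / total * f[i][j]
                              for e, f in zip(exps, features))
    return fused
```

Where one scale's logit dominates at a location, the output there is taken almost entirely from that scale, so conflicting information from the other scales is filtered out.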
The foregoing shows and describes the general principles, principal features, and advantages of the invention. Those skilled in the art will understand that the invention is not limited to the embodiments described above. The embodiments and the specification merely illustrate the principle of the invention, and various changes and modifications may be made without departing from its spirit and scope, all of which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and their equivalents.