CN115100732A

CN115100732A - Fishing detection method, device, computer equipment and storage medium

Info

Publication number: CN115100732A
Application number: CN202110250476.0A
Authority: CN
Inventors: 顾林松; 王京
Original assignee: Shenzhen Intellifusion Technologies Co Ltd
Current assignee: Shenzhen Intellifusion Technologies Co Ltd
Priority date: 2021-03-08
Filing date: 2021-03-08
Publication date: 2022-09-23

Abstract

The invention discloses a fishing detection method, a fishing detection device, computer equipment and a storage medium. The method comprises the following steps: acquiring a target detection box containing a pedestrian and a fishing rod; performing fishing behavior recognition on the target detection box with a fishing behavior detection model, and outputting a first result whose recognition result is "fishing behavior"; performing fishing behavior recognition on the target detection box with a human-body and fishing-rod key point detection model, and outputting a second result whose recognition result is "fishing behavior"; and voting on the first result and the second result to output a final recognition result. The method can detect and manage illegal fishing around the clock, in both daytime and nighttime scenes, which greatly reduces labor cost, allows illegal fishing in large water areas to be identified 24 hours a day at low cost, and improves the management of waters where fishing is prohibited. Because the final recognition result is obtained by voting on the mean of the first result and the second result, detection accuracy is improved.

Description

Fishing detection method, device, computer equipment and storage medium

Technical Field

The present invention relates to the technical field of computer vision recognition, and in particular to a fishing detection method, device, computer equipment and storage medium.

Background

Fishing is a popular outdoor pastime, and anglers can often be seen along rivers, lakes and reservoirs. For management reasons and commercial interests, however, fishing is not allowed in many places, such as lakes in scenic areas and privately contracted reservoirs. Illegal fishing nevertheless still occurs from time to time. It creates hidden dangers for the safety management of the waters concerned, can cause personal injury to the illegal fishers themselves (for example falling into the water, drowning, or touching high-voltage electricity), and the waste it produces is very likely to pollute the water.

Existing measures to stop illegal fishing in controlled waters mainly consist of putting up no-fishing warning signs in the relevant areas or assigning dedicated staff to patrol them. In practice, however, warning signs do not effectively drive illegal fishers away, while dedicated patrols consume a great deal of manpower and still cannot provide round-the-clock monitoring.

In view of this, some existing techniques use computer vision to detect, recognize and manage illegal fishers. Object detection is one of the basic tasks in computer vision and has been studied in academia for nearly two decades. In recent years, with the rapid development of deep learning, object detection algorithms have shifted from traditional algorithms based on hand-crafted features to detection techniques based on deep neural networks.

However, existing computer-based illegal fishing recognition mainly relies on the traditional method of comparing pixels between consecutive frames. This method captures real-time images of the area to be monitored and compares successive frames of the captured images. Specifically, the pixel differences between the images at time k and time k+1 are computed to detect how the image changes between the two moments. The probability distribution of the differing pixel values is then computed and compared with an existing database of fishing images to decide whether fishing behavior is present. Because the fishing image data that can be collected is limited, i.e. the reference distributions available for comparison are limited, while the distribution of the captured and computed pixel differences is random and varied, this method has two problems: (1) pedestrians are often mistaken for anglers; (2) fishing behavior goes undetected. In addition, the images collected by existing techniques are mainly visible-light images, which are sensitive to illumination, and many anglers like to fish at night. Under cover of darkness, illegal fishers can therefore still evade monitoring, so illegal fishing cannot be monitored in real time 24 hours a day.

Summary of the Invention

To solve the above technical problems, embodiments of the present invention provide a fishing detection method, device, computer equipment and storage medium that can detect and manage illegal fishing 24 hours a day in both daytime and nighttime scenes, with high detection accuracy.

A fishing detection method, comprising:

acquiring a target detection box, wherein the target detection box contains a pedestrian and a fishing rod;

performing fishing behavior recognition on the target detection box with a fishing behavior detection model, and outputting a first result whose recognition result is "fishing behavior";

performing fishing behavior recognition on the target detection box with a human-body and fishing-rod key point detection model, and outputting a second result whose recognition result is "fishing behavior";

voting on the first result and the second result to output a final recognition result.

Preferably, in the above fishing detection method, acquiring the target detection box comprises:

extracting pedestrian boxes and fishing rod boxes from an image to be detected;

matching each fishing rod box against the pedestrian boxes in the same image to be detected;

if the intersection over union (IoU) obtained by matching is greater than a preset IoU threshold, determining that the pedestrian in that pedestrian box is associated with the fishing rod in that fishing rod box;

generating the target detection box from the pedestrian and the fishing rod.

Preferably, in the above fishing detection method, a pre-trained object detection model is used to extract the pedestrian boxes and fishing rod boxes from the image to be detected, and before the extraction the method further comprises:

obtaining preprocessed sample data;

inputting the preprocessed sample data into a preset initial object detection model to obtain an output result;

adjusting the sample type weights in the initial object detection model according to a preset focal loss function and the output result, to obtain a parameter-adjusted object detection model;

training the parameter-adjusted object detection model with a mini-batch stochastic gradient descent algorithm to obtain the pre-trained object detection model.

Preferably, in the above fishing detection method, obtaining the preprocessed sample data comprises:

obtaining a first original image and a second original image from the sample data;

blending the first original image and the second original image according to a mixing weight to obtain augmented sample data.

Preferably, in the above fishing detection method, before performing fishing behavior recognition on the target detection box with the fishing behavior detection model and outputting the first result whose recognition result is "fishing behavior", the method further comprises:

extracting person-rod boxes from the preprocessed sample data, wherein each person-rod box contains a sample pedestrian and a sample fishing rod;

training a preset classification model on the person-rod boxes with the mini-batch stochastic gradient descent algorithm;

when the training result reaches a preset training threshold, using the classification model trained at that point as the fishing behavior detection model.

Preferably, in the above fishing detection method, before performing fishing behavior recognition on the target detection box with the human-body and fishing-rod key point detection model and outputting the second result whose recognition result is "fishing behavior", the method further comprises:

annotating the person-rod boxes with key point features to obtain human body key points for the sample pedestrian and fishing rod key points for the sample fishing rod;

training a preset segmentation model on the human body key points and the fishing rod key points with the mini-batch stochastic gradient descent algorithm, and using the trained segmentation model as the human-body and fishing-rod key point detection model.

Preferably, in the above fishing detection method, voting on the first result and the second result to output the final recognition result comprises:

obtaining the mean of the first result and the confidence of the second result, respectively;

performing a voting calculation on the mean and the confidence to output the final recognition result.

A fishing detection device, comprising:

an RGB-IR image acquisition module, configured to acquire images to be detected in the monitored water scene;

an object detection module, configured to extract pedestrian boxes and fishing rod boxes from the image to be detected and to generate, according to the IoU threshold, a target detection box containing both a pedestrian and a fishing rod;

a fishing behavior detection module, configured to perform fishing behavior recognition on the target detection box and to output a first result whose recognition result is "fishing behavior";

a human-body and fishing-rod key point detection module, configured to perform fishing behavior recognition on the target detection box and to output a second result whose recognition result is "fishing behavior";

a voting module, configured to perform a voting calculation on the mean of the first result and the confidence of the second result and to output the final recognition result.

A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the above fishing detection method when executing the computer program.

A computer-readable storage medium storing a computer program, wherein the computer program implements the above fishing detection method when executed by a processor.

The beneficial effects of the present invention are as follows. The proposed fishing detection method can detect and manage illegal fishing 24 hours a day, including in night-time scenes, and it outputs the final recognition result by voting on the mean of the first result and the second result, which improves detection accuracy. Automatic round-the-clock detection greatly reduces labor cost, so that illegal fishing in large water areas can be identified 24 hours a day at low cost and with high accuracy, improving the management of waters where fishing is prohibited.

Description of Drawings

To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below obviously show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

FIG. 1 is a flowchart of the fishing detection method of the present invention;

FIG. 2 is a flowchart of acquiring the target detection box;

FIG. 3 is a flowchart of pre-training the object detection model of the present invention;

FIG. 4 is a flowchart of training the fishing behavior detection model of the present invention;

FIG. 5 is a flowchart of training the human-body and fishing-rod key point detection model of the present invention;

FIG. 6 is a schematic structural diagram of the fishing detection device of the present invention;

FIG. 7 is a schematic diagram of the internal structure of one embodiment of the computer device of the present invention;

FIG. 8 is a schematic diagram of the internal structure of another embodiment of the computer device of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are obviously only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.

Referring to FIG. 1, an embodiment of the present invention proposes a fishing detection method, which comprises:

Step S100: acquiring a target detection box, wherein the target detection box contains a pedestrian and a fishing rod.

Specifically, the images to be detected can be captured in real time by an image sensor installed in the monitored water scene. They may be time-sequenced frames, including time-sequenced frames from a streaming video. Frames taken from a streaming video may be consecutive, or may be interval frames selected from a set of time-sequenced frames by an interval sampling rule. For example, if the streaming video contains M frames, at least one image to be detected is taken every N frames. Note that the frame rate of streaming video is generally 25 frames per second or higher; running detection on every frame would increase the computational load and make fishing behavior recognition less responsive. In this embodiment, sampling frames at intervals from the streaming video reduces the image-processing load and speeds up recognition, so that detection results are fed back quickly in real time and illegal fishing can be countered effectively.

The images captured on site in real time can be transmitted to a monitoring-center server over a wired or wireless link. The server then detects and recognizes the pedestrians and fishing rods in each image to be detected, marking each pedestrian with a pedestrian box and each fishing rod with a fishing rod box. The image is scanned with an automatic sliding window when the pedestrians and fishing rods are boxed.

Specifically, the pedestrian boxes in each image to be detected are traversed, the pedestrian box associated with a fishing rod box in that image is determined according to the preset IoU threshold, and a target detection box containing both the pedestrian and the fishing rod is generated.

Specifically, the pedestrian boxes and fishing rod boxes detected in an image in step S100 may be unrelated to each other, or they may be associated. Associated here means that the pedestrian box and the fishing rod box together form the feature set of one illegal fisher: the pedestrian box captures the body features of that person and the fishing rod box captures the features of that person's rod, which indicates that an illegal fisher is present. Associating pedestrian boxes with fishing rod boxes and generating target detection boxes that contain both avoids identifying unrelated passers-by as illegal fishers and improves recognition accuracy.

Step S200: performing fishing behavior recognition on the target detection box with the fishing behavior detection model, and outputting a first result whose recognition result is "fishing behavior".

Specifically, the target detection box is a marked box that contains both a pedestrian and a fishing rod, i.e. the two associated features of one illegal fisher, the pedestrian feature and the fishing rod feature, that are detected and recognized in step S200. Recognizing the target detection boxes allows one or more illegal fishers in the image to be detected to be found, each target detection box representing one illegal fisher. An image to be detected may contain no target detection box, one, or several. If it contains none, the image does not enter the fishing behavior recognition of step S200. If it contains one or more, the image enters step S200 for fishing behavior recognition, and the first result, "fishing behavior", is output.

Step S300: performing fishing behavior recognition on the target detection box with the human-body and fishing-rod key point detection model, and outputting a second result whose recognition result is "fishing behavior".

Specifically, when the human body and fishing rod key points are detected, the target detection boxes from multiple frames in the same time sequence as the frame that produced the first result are loaded together, and fishing behavior recognition is performed on the human body and fishing rod key point features inside the target detection boxes of those frames. Note that this recognition differs from that of step S200: here the key point features of the illegal fisher's body and of the fishing rod are recognized, rather than the target detection box itself. Key point feature data is obtained for multiple frames, and an ST-GCN (spatial-temporal graph convolutional network) model then extracts key point feature information for fishing behavior recognition, which gives a more accurate recognition result.

Step S400: voting on the first result and the second result to output the final recognition result.

Specifically, the first results of the multiple frames in one time sequence are averaged, and this mean is then voted together with the second result, which is output for the same frames on the basis of the human body and fishing rod key point features, to produce the final result. This further improves the accuracy of fishing behavior recognition.
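The text does not spell out the voting rule itself, so the following Python sketch is only one possible reading. It assumes that the first results are per-frame scores in [0, 1], that the second result is a single confidence in [0, 1], and that the vote is a simple average compared against a threshold of 0.5. The function name `vote` and the threshold are hypothetical.

```python
from statistics import fmean
from typing import Sequence


def vote(first_scores: Sequence[float], second_confidence: float,
         threshold: float = 0.5) -> bool:
    """Combine per-frame first results with the key point confidence.

    Assumption: the vote is an unweighted average of the two branches,
    compared against a hypothetical threshold.
    """
    if not first_scores:
        raise ValueError("at least one first result is required")
    # Average the first results over the frames of one time sequence.
    mean_first = fmean(first_scores)
    # Average the two branches and compare against the threshold.
    combined = (mean_first + second_confidence) / 2.0
    return combined >= threshold


is_fishing = vote([0.9, 0.8, 0.7], second_confidence=0.6)
not_fishing = vote([0.2, 0.3, 0.1], second_confidence=0.4)
```

A weighted average or a majority vote over binary decisions would fit the description equally well; the averaging step over the first results is the only part taken directly from the text.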

In some embodiments of the present invention, the image to be detected is acquired by an RGB-IR image sensor installed in the monitored water scene. An RGB-IR image sensor is sensitive to visible light and infrared light at the same time. The infrared signal provides scene brightness information in low-light environments, and the infrared feature map makes the subjects of interest visible at night under poor lighting. Here the subjects of interest are human targets and fishing rod targets. People fishing at night usually use a light, and a fishing rod lit by it shows infrared features, while illegal fishers and pedestrians show infrared features because the human body itself emits infrared radiation. Under normal illumination, the R, G and B visible-light components alone give good imaging in the sensor; this applies mainly in the daytime with good lighting.

Further, in some embodiments of the present invention, the images to be detected are multiple real-time still frames captured in a pulsed manner by an RGB-IR image sensor installed in the monitored water scene. Specifically, the RGB-IR image sensor captures the current scene according to a pulse signal, which can be configured as needed. At the peaks of the pulse signal, the sensor captures the current scene as an image to be detected; at the troughs it captures nothing, i.e. it sleeps intermittently. This is another form of interval frame, the difference being that the frames are not taken from a streaming video; in time sequence they are consecutive still frames. Capturing the images in this pulsed manner reduces the volume of image data to process, saves energy, improves computational efficiency and avoids spending unnecessary computing power.

Further, in some embodiments of the present invention, the images to be detected may also be multiple real-time still frames, within one unit of time sequence, taken by an interval frame-sampling rule from a streaming video recorded by the RGB-IR image sensor installed in the monitored water scene. As another example, the interval rule may be 1+(n-1), where n is the sampling period; that is, only 1 frame is taken as an image to be detected within each period of n frames. Under the 1+(n-1) rule, detection takes 1/n of the time needed to detect every frame. For example, with n = 5, the images to be detected make up 20% of the video frames, which further increases the video processing speed and the number of video channels that can be connected. Sampling frames at intervals from the streaming video speeds up the processing of the images to be detected and makes fishing behavior recognition more responsive, reducing detection latency.
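As an illustration of the 1+(n-1) rule, the sketch below keeps the first frame of every period of n frames and skips the remaining n-1. It is a minimal sketch, not the patented implementation: the function name `sample_every_n` is hypothetical, and a plain range of integers stands in for decoded video frames.

```python
from typing import Iterable, Iterator, Tuple, TypeVar

Frame = TypeVar("Frame")


def sample_every_n(frames: Iterable[Frame], n: int) -> Iterator[Tuple[int, Frame]]:
    """Yield (index, frame) for the first frame of every n-frame period."""
    if n < 1:
        raise ValueError("n must be >= 1")
    for index, frame in enumerate(frames):
        # Keep 1 frame, then skip the next n-1: the "1+(n-1)" rule.
        if index % n == 0:
            yield index, frame


# 25 integers stand in for one second of video at 25 frames per second.
kept = [index for index, _ in sample_every_n(range(25), n=5)]
ratio = len(kept) / 25
```

With n = 5, five of the 25 frames are kept, which matches the 20% figure given in the text.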

Further, in some embodiments of the present invention, as shown in FIG. 2, acquiring the target detection box comprises:

Step S110: extracting pedestrian boxes and fishing rod boxes from the image to be detected;

Step S120: matching each fishing rod box against the pedestrian boxes in the same image to be detected;

Step S130: if the IoU obtained by matching is greater than the preset IoU threshold, determining that the pedestrian in that pedestrian box is associated with the fishing rod in that fishing rod box;

Step S140: generating the target detection box from the pedestrian and the fishing rod.

Specifically, in embodiments of the present invention, the IoU is calculated as follows:

The top-left and bottom-right coordinates of pedestrian box a are denoted bbox_a = [(x_a1, y_a1), (x_a2, y_a2)], where (x_a1, y_a1) is the top-left corner of pedestrian box a and (x_a2, y_a2) is its bottom-right corner;

The top-left and bottom-right coordinates of fishing rod box b are denoted bbox_b = [(x_b1, y_b1), (x_b2, y_b2)], where (x_b1, y_b1) is the top-left corner of fishing rod box b and (x_b2, y_b2) is its bottom-right corner;

The IoU, i.e. the intersection over union of pedestrian box a and fishing rod box b, can be expressed as:

IoU = area(bbox_a ∩ bbox_b) / area(bbox_a ∪ bbox_b)

Further, in embodiments of the present invention the preset IoU threshold is set to between 0.1 and 0.3, and in a preferred embodiment it is set to 0.2. Testing showed that when the IoU of a fishing rod box and a pedestrian box is greater than 0.2, the regions covered by the two boxes can be regarded as a person and the fishing rod associated with that person; when the IoU is less than 0.2, the fishing rod and the pedestrian are regarded as unrelated. A threshold of 0.2 is therefore appropriate, and it gives high recognition accuracy for fishing behavior without increasing the computational load.
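Steps S110 to S140 and the IoU rule can be sketched in Python as follows. This is a minimal sketch under stated assumptions: boxes are (x1, y1, x2, y2) tuples with the top-left corner first, every pedestrian and rod pair above the threshold produces a target box, and the target box is taken to be the smallest box enclosing both. The enclosing-box choice and the function names are assumptions, since the text only says the target box is generated from the pedestrian and the rod.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2): top-left, bottom-right


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp to zero so that disjoint boxes have an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def build_target_boxes(pedestrians: Sequence[Box], rods: Sequence[Box],
                       threshold: float = 0.2) -> List[Box]:
    """Pair each rod with the pedestrians whose IoU exceeds the threshold."""
    targets: List[Box] = []
    for rod in rods:
        for person in pedestrians:
            if iou(person, rod) > threshold:
                # Assumption: the target box encloses both matched boxes.
                targets.append((min(person[0], rod[0]), min(person[1], rod[1]),
                                max(person[2], rod[2]), max(person[3], rod[3])))
    return targets


angler, passerby = (0, 0, 10, 20), (100, 0, 110, 20)
rod = (4, 0, 14, 10)
targets = build_target_boxes([angler, passerby], [rod])
```

In this toy scene the rod overlaps the first pedestrian with an IoU of 0.25 and does not overlap the passer-by at all, so a single target box is produced and the passer-by is ignored.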

In a preferred embodiment of the present invention, the pedestrian frames and fishing rod frames in the image to be detected are extracted by a pre-trained target detection model. Specifically, as shown in FIG. 3, the training process of the target detection model includes:

Step S1101: obtaining preprocessed sample data;

Step S1102: inputting the preprocessed sample data into a preset initial target detection model to obtain an output result;

Step S1103: adjusting the sample type weights in the initial target detection model according to a preset focal loss function and the output result, to obtain a parameter-adjusted target detection model;

Step S1104: training the parameter-adjusted target detection model with a batch stochastic gradient descent algorithm to obtain the pre-trained target detection model.

Specifically, in a preferred embodiment of the present invention, the initial target detection model is a YOLOv3 model, and the sample types include positive samples and negative samples.

The obtaining of the preprocessed sample data specifically includes:

Step S11011: acquiring a first original image and a second original image from the sample data;

Step S11012: performing mixed enhancement processing on the first original image and the second original image according to a mixing weight to obtain enhanced sample data.

Specifically, the data enhancement processing can be expressed as:

image_mix = lambda * image_a + (1 - lambda) * image_b;

bbox_mix ∈ bboxes_mix;

bbox_a ∈ bboxes_a;

bbox_b ∈ bboxes_b;

bboxes_mix = bboxes_a ∪ bboxes_b;

where image_mix denotes the mixed enhanced image, image_a the first original image, image_b the second original image, bboxes_mix the feature annotation frames of the mixed enhanced image, bboxes_a the feature annotation frames of the first original image, bboxes_b the feature annotation frames of the second original image, and lambda the mixing weight. In a preferred embodiment of the present invention, lambda is set to 0.5, so that the first and second original images carry equal weight in the mixed enhanced image. This data enhancement method greatly enriches the training samples and improves the efficiency and accuracy with which the pedestrian/fishing rod detection model detects pedestrian and fishing rod features, so that the corresponding pedestrian frames and fishing rod frames are generated quickly. Specifically, a pedestrian frame contains pedestrian features, a fishing rod frame contains fishing rod features, and a target detection frame contains both, which is the feature marker of a fishing behavior as defined in the present invention.
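The mixing expressions above can be illustrated with a short sketch. It assumes, for simplicity, that both images are same-sized grayscale images stored as nested lists and that boxes are (x1, y1, x2, y2) tuples; the function name `mix_samples` is hypothetical, and a real pipeline would more likely operate on multi-channel arrays.

```python
from typing import List, Tuple

Image = List[List[float]]                # H x W grayscale image (assumed layout)
Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def mix_samples(
    image_a: Image,
    bboxes_a: List[Box],
    image_b: Image,
    bboxes_b: List[Box],
    lam: float = 0.5,  # mixing weight; 0.5 is the preferred value in the text
) -> Tuple[Image, List[Box]]:
    """Blend two images pixel by pixel and take the union of their annotation boxes."""
    same_shape = len(image_a) == len(image_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(image_a, image_b)
    )
    if not same_shape:
        raise ValueError("images must have the same shape")
    # image_mix = lambda * image_a + (1 - lambda) * image_b
    image_mix = [
        [lam * pa + (1.0 - lam) * pb for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]
    # bboxes_mix = bboxes_a ∪ bboxes_b (duplicates are kept only once)
    bboxes_mix = list(bboxes_a) + [b for b in bboxes_b if b not in bboxes_a]
    return image_mix, bboxes_mix
```

With lam = 0.5, each output pixel is the plain average of the two inputs, and every annotation from either source image remains a valid label on the mixed image.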

In embodiments of the present invention, the sample data are derived from visible light images and infrared images of existing illegal fishing behavior in monitored water area scenes, specifically visible light images or infrared images annotated with pedestrian frames and fishing rod frames, which serve as the sample data for model training.

In embodiments of the present invention, to improve the accuracy of model training, 1000 frames of visible light images and infrared images of illegal fishing behavior may be used for repeated and iterative training. To improve image reading speed and recognition accuracy, the visible light image and the infrared image also need to be aligned before pedestrians or fishing rods are annotated on the infrared image. Specifically, the image alignment includes feature-based image alignment and data-based alignment. Feature-based image alignment finds a spatial transformation that maps a floating image onto a reference image, so that points corresponding to the same spatial position in the two images are placed in one-to-one correspondence, thereby achieving information fusion and facilitating feature localization, extraction, detection and recognition. In embodiments of the present invention, preferably, the visible light image is used as the reference image and the infrared image as the floating image, and the infrared image is mapped onto the visible light image through a spatial transformation, so that at a given time node the pedestrian features and fishing rod features in the infrared image are aligned with the corresponding pedestrian features and fishing rod features in the visible light image. Specifically, feature-based image alignment may use a homography algorithm, a mesh warp algorithm or an optical flow algorithm from the prior art.

Data-based alignment arranges the infrared image data and the visible light image data in storage according to certain rules, rather than simply placing them one after another. Hardware platforms differ greatly in how they handle storage space, and some platforms can only access certain types of data starting from certain addresses. If stored data are not aligned as the platform requires, access efficiency suffers. For example, some platforms always start a read at an even address: an int (assuming a 32-bit system) stored starting at an even address can be read in one read cycle, whereas one stored starting at an odd address may need two read cycles, with the high and low bytes of the two reads pieced together to obtain the int. Reading efficiency clearly drops considerably. Data-based alignment of the infrared image data and the visible light image data therefore strikes a trade-off between space and time, improving data reading efficiency, computational throughput and response timeliness. Data alignment is a conventional technique in data access processing, four-byte alignment is usually adopted, and it is not described further here.
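As an illustration of the homography option for feature-based alignment, the sketch below maps a point and a box from the infrared (floating) image into the visible light (reference) image. It assumes a 3x3 homography matrix `H` has already been estimated, for instance from matched feature points; the matrix values and the function names are hypothetical and are not taken from the text.

```python
from typing import List, Tuple

Matrix3 = List[List[float]]              # 3x3 homography, row-major
Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def apply_homography(H: Matrix3, x: float, y: float) -> Tuple[float, float]:
    """Map pixel (x, y) through H using homogeneous coordinates."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    return u / w, v / w


def map_box(H: Matrix3, box: Box) -> Box:
    """Map the four corners of a box and return the axis-aligned box enclosing them."""
    x1, y1, x2, y2 = box
    corners = [apply_homography(H, x, y) for x, y in ((x1, y1), (x2, y1), (x1, y2), (x2, y2))]
    xs = [c[0] for c in corners]
    ys = [c[1] for c in corners]
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical transform: scale by 2, then shift by (10, 5).
H_EXAMPLE: Matrix3 = [[2.0, 0.0, 10.0], [0.0, 2.0, 5.0], [0.0, 0.0, 1.0]]
```

Mapping annotation boxes rather than whole images is only one possible use; warping the full infrared frame onto the visible frame would apply the same transform to every pixel.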

Specifically, in embodiments of the present invention, the mixed enhancement processing of the sample data in step S11012 uses the focal loss function to weight the hard samples among the positive and negative samples. The weighting improves sample quality, reduces the loss, and improves the accuracy with which hard samples are judged during model training, thereby mitigating the imbalance between sample classes and the imbalance in sample classification difficulty. Specifically, in embodiments of the present invention, the data enhancement processing is performed on the input data on the basis of the YOLOv3 model.
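The weighting effect of focal loss can be seen in a small numeric sketch. The version below is the commonly published binary form, FL = -alpha_t * (1 - p_t)^gamma * log(p_t); the text does not give the exact form or hyperparameters used, so alpha = 0.25 and gamma = 2.0 are assumptions, and the function name is hypothetical.

```python
import math


def focal_loss(p: float, y: int, alpha: float = 0.25, gamma: float = 2.0) -> float:
    """Binary focal loss for a single prediction.

    p: predicted probability of the positive class.
    y: ground-truth label, 1 for a positive sample and 0 for a negative sample.
    alpha, gamma: assumed hyperparameters, not specified in the text.
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)            # avoid log(0)
    p_t = p if y == 1 else 1.0 - p             # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    # (1 - p_t) ** gamma shrinks the loss of easy samples (p_t close to 1),
    # so hard samples dominate the gradient.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

A well-classified positive (p = 0.9) contributes far less loss than a misclassified one (p = 0.1), which is how hard samples receive more weight during training. Setting gamma = 0 and alpha = 1 for a positive sample reduces the expression to ordinary cross-entropy.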

Further, in some embodiments of the present invention, as shown in FIG. 4, the training process of the fishing behavior detection model includes:

Step S210: extracting person-rod frames from the preprocessed sample data, wherein each person-rod frame contains a sample pedestrian and a sample fishing rod;

Step S220: training a preset classification model on the person-rod frames with the batch stochastic gradient descent algorithm;

Step S230: when the training result reaches a preset training threshold, taking the classification model trained at that point as the fishing behavior detection model.

Specifically, in a preferred embodiment of the present invention, the classification model is a MobileNetV1 model (a lightweight neural network).

Further, in some embodiments of the present invention, as shown in FIG. 5, the training process of the human fishing rod key point detection model includes:

Step S310: marking key point features on the person-rod frames to obtain human body key points corresponding to the sample pedestrian and fishing rod key points corresponding to the sample fishing rod;

Step S320: training a preset segmentation model according to the human body key points, the fishing rod key points and the batch stochastic gradient descent algorithm, and taking the trained segmentation model as the human fishing rod key point detection model.

Specifically, in a preferred embodiment of the present invention, the segmentation model is an improved Unet model.

Specifically, in a preferred embodiment of the present invention, the human fishing rod key point feature information includes 17 human body key point features corresponding to human joints and 1 fishing rod key point feature. The 17 human body key points are: nose, left eye, right eye, left ear, right ear, left shoulder, right shoulder, left elbow, right elbow, left wrist, right wrist, left hip, right hip, left knee, right knee, left ankle and right ankle. The fishing rod key point is the end of the rod to which the fishing line is attached.

In step S320, because a fishing rod is thin and hard to recognize in an image, the improved Unet (segmentation network) model of the present invention replaces the ordinary convolution layers of the existing Unet model with dilated (atrous) convolutions. This enlarges the receptive field of the convolution and strengthens the capture of context information around the fishing rod to characterize its presence, thereby improving the accuracy of fishing rod key point detection. Specifically, Unet is a well-known network from the field of medical image segmentation. Because its fusion of low-level and high-level features preserves fine image detail well, embodiments of the present invention use the Unet model as the base network for key point detection; to further raise the detection rate of small-detail key points, namely the fishing rod key point, its convolution layers are replaced with dilated convolution layers to enlarge the receptive field. In step S320, multiple frames of human fishing rod key point data can be obtained, and STGCN (a spatio-temporal graph convolutional network model) is then used to extract key point feature information for fishing behavior recognition, so that the final result is obtained more accurately.
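The effect of dilation on the receptive field can be checked with a little arithmetic. The sketch below assumes stride-1 convolutions with a 3x3 kernel and illustrative dilation rates of 1, 2 and 4; the text does not state the rates or layer counts used in the improved Unet, so these values and the function name are assumptions.

```python
from typing import Iterable


def receptive_field(kernel_size: int, dilations: Iterable[int]) -> int:
    """Receptive field along one axis of a stack of stride-1 convolutions.

    Each layer with kernel size k and dilation d widens the field by (k - 1) * d.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


# Three ordinary 3x3 convolutions versus three dilated ones (assumed rates).
ordinary = receptive_field(3, [1, 1, 1])
dilated = receptive_field(3, [1, 2, 4])
```

Under these assumptions the ordinary stack sees a 7-pixel window and the dilated stack a 15-pixel window with the same number of weights, which is why dilation helps gather context around a thin object such as a rod.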

Further, in some embodiments of the present invention, voting on the first result and the second result to output the final recognition result specifically includes:

Step S410: obtaining the mean of the first result and the confidence of the second result, respectively;

Step S420: performing a voting calculation according to the mean and the confidence to output the final recognition result.

Specifically, the mean of the first result is the mean of the first results over the multiple frames of a time sequence, and the confidence of the second result is the final fishing behavior recognition confidence obtained by extracting human fishing rod key point feature information from the multiple frames of the time sequence;

The voting rule can be expressed as:

where score_STGCN is the final fishing behavior recognition confidence obtained by STGCN (Spatial Temporal Graph Convolutional Networks) from multi-frame time-series data; score_imagemodel_i is the result for the i-th frame obtained in step S300; and lambda is the weight parameter.

In a preferred embodiment of the present invention, the voting weight n of the mean of the first result satisfies 0.4 < n < 0.6, and the voting weight m of the fishing behavior recognition confidence satisfies 0.4 < m < 0.6, where m + n = 1.

Specifically, as a preferred embodiment of the present invention, the weight parameter lambda is set to 0.5, meaning that the detection and recognition results of the fishing behavior detection model and the human fishing rod key point detection model are weighted equally. In the voting rule, n denotes the number of frames used (not the voting weight n of the preceding paragraph) and is set to 5 here; 5 frames achieve a good recognition rate while also performing well in speed.
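The voting formula itself is not reproduced in this text. One reading consistent with the definitions above is a weighted sum, score = lambda * score_STGCN + (1 - lambda) * mean(score_imagemodel_i over n frames); which term carries lambda is an assumption, although with lambda = 0.5 the choice makes no difference. The decision threshold of 0.5 and the function name are also hypothetical.

```python
from typing import Sequence, Tuple


def fuse_scores(
    image_scores: Sequence[float],  # per-frame scores from the fishing behavior detection model
    stgcn_score: float,             # sequence-level confidence from STGCN
    lam: float = 0.5,               # weight parameter; 0.5 is the preferred value in the text
    threshold: float = 0.5,         # hypothetical decision threshold, not given in the text
) -> Tuple[float, bool]:
    """Combine the mean per-frame score with the STGCN confidence."""
    if not image_scores:
        raise ValueError("at least one frame score is required")
    mean_image = sum(image_scores) / len(image_scores)
    # Assumed form of the voting rule; see the hedging in the paragraph above.
    final = lam * stgcn_score + (1.0 - lam) * mean_image
    return final, final >= threshold
```

For five frame scores of 0.8, 0.6, 0.7, 0.9 and 0.5 (mean 0.7) and an STGCN confidence of 0.9, the fused score is about 0.8, which this sketch would report as fishing behavior.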

Further, in embodiments of the present invention, the image to be detected is acquired by an RGB-IR image sensor installed in the monitored water area scene, which includes:

receiving a human body sensing signal and, once sensing is triggered, waking the dormant RGB-IR image sensor to capture the real-time picture in the current field of view;

sending the captured real-time picture back to the monitoring center server.

Waking the dormant RGB-IR image sensor with the human body sensing signal to capture the real-time picture in the current field of view greatly saves data storage space and reduces the power consumption of the terminal RGB-IR image sensor.

In another aspect, another embodiment of the present invention provides a fishing detection device that corresponds one-to-one to the fishing detection method of the above embodiments. Specifically, as shown in FIG. 6, the fishing detection device includes an RGB-IR image acquisition module 10, a target detection module 20, a fishing behavior detection module 30, a human fishing rod key point detection module 40 and a voting module 50.

The RGB-IR image acquisition module 10 is used to acquire the image to be detected in the monitored water area scene. The target detection module 20 is used to extract the pedestrian frames and fishing rod frames in the image to be detected, and then to generate, according to the IoU threshold, a target detection frame that contains both a pedestrian and a fishing rod. The fishing behavior detection module 30 is used to perform fishing behavior recognition on the target detection frame and to output a first result whose recognition result is "fishing behavior". The human fishing rod key point detection module 40 is used to perform fishing behavior recognition on the target detection frame and to output a second result whose recognition result is "fishing behavior". The voting module 50 is used to perform a voting calculation according to the mean of the first result and the confidence of the second result, and to output the final recognition result.

Specifically, the target detection module 20 works as follows: first, it detects pedestrians and fishing rods in the loaded image to be detected and generates the corresponding pedestrian frames and fishing rod frames; next, it traverses the pedestrian frames in each frame of the image to be detected, matches them against the fishing rod frames in the same frame, and determines, according to the preset IoU threshold, which pedestrian frames are associated with which fishing rod frames; finally, according to the association result, it generates target detection frames that contain both a pedestrian and a fishing rod.
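The matching workflow of the target detection module can be sketched as below. Two details are assumptions rather than statements from the text: each pedestrian is paired with the single rod of highest IoU, and the target detection frame is taken as the smallest box enclosing both. All names are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def enclosing_box(a: Box, b: Box) -> Box:
    """Smallest axis-aligned box containing both inputs (assumed target frame)."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def build_target_boxes(
    pedestrian_boxes: List[Box], rod_boxes: List[Box], threshold: float = 0.2
) -> List[Box]:
    """Traverse pedestrian frames, match each to its best rod, keep pairs above the threshold."""
    targets: List[Box] = []
    for ped in pedestrian_boxes:
        best_rod = max(rod_boxes, key=lambda rod: iou(ped, rod), default=None)
        if best_rod is not None and iou(ped, best_rod) > threshold:
            targets.append(enclosing_box(ped, best_rod))
    return targets
```

Pedestrians with no sufficiently overlapping rod produce no target detection frame, so passers-by without a rod are not forwarded to the two downstream models.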

For the specific working principle of the fishing detection device based on RGB-IR image data, reference may be made to the workflow of the fishing detection method above, which is not repeated here. Each module of the fishing detection device may be implemented wholly or partly in software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can call them and perform the operations corresponding to each module.

In another aspect, in one embodiment, the present invention further provides a computer device, which may be a server and whose internal structure is shown schematically in FIG. 7. The computer device includes a data processor, a memory, a network interface and a database connected by a system bus. The computer device is provided with a plurality of data processors, which provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the device stores data involved in image processing. The network interface of the device is used to communicate with external terminals over a network connection.

The memory stores a computer program that can run on the processor, and the processor implements the fishing detection method described above when executing the computer program.

In one embodiment, the present invention further provides a computer device, which may be a terminal and whose internal structure is shown schematically in FIG. 8. The computer device includes a data processor, a memory, a network interface, a display screen and an input device connected by a system bus. The computer device is provided with a plurality of data processors, which provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface of the computer device is used to communicate with external terminals over a network connection.

The memory stores a computer program that can run on the processor, and the computer program implements the fishing detection method described above when executed by the processor.

Specifically, the display screen of the computer device may be a liquid crystal display or an electronic ink display, and the input device of the computer device may be a touch layer covering the display screen, a button, trackball or touchpad provided on the housing of the computer device, or an external keyboard, touchpad or mouse.

Those skilled in the art will understand that the structures shown in FIG. 7 and FIG. 8 are merely block diagrams of the parts relevant to the solution of the present application and do not limit the computer device to which the solution is applied; a specific device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.

In another aspect, the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the fishing detection method described above.

It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution; the order in which the processes are executed should be determined by their functions and internal logic, and should not limit the implementation of the embodiments of the present invention in any way.

By way of example, the computer program may be divided into one or more modules/units, which are stored in the memory and executed by the processor to carry out the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, the instruction segments describing the execution of the computer program in the computer device.

The computer device may be a computing device such as a desktop computer, a notebook, a palmtop computer or a cloud server. The computer device may include, but is not limited to, a processor and a memory. Those skilled in the art will understand that FIG. 7 and FIG. 8 are merely examples of a computer device and do not limit it; the computer device may include more or fewer components than shown, combine certain components, or use different components. For example, the computer device may also include input/output devices, network access devices, buses and the like.

The processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor.

The memory may be an internal storage unit of the computer device, such as a hard disk or internal memory of the computer device. The memory may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card fitted to the computer device. Further, the memory may include both an internal storage unit of the computer device and an external storage device. The memory is used to store the computer program and other programs and data required by the terminal device. The memory may also be used to temporarily store data that has been output or is about to be output.

Those skilled in the art will clearly understand that, for convenience and brevity of description, the division into the functional units and modules above is given only as an example. In practice, the functions may be assigned to different functional units and modules as required; that is, the internal structure of the device may be divided into different functional units or modules to perform all or part of the functions described above.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, may each exist physically on their own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.

If the integrated module/unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. On this understanding, all or part of the processes of the method embodiments above may also be carried out by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the method embodiments above. The computer program includes computer program code, which may be in source code form, object code form, an executable file or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be expanded or restricted as required by legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.

The embodiments described above are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in those embodiments may still be modified, or some of their technical features replaced with equivalents; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all fall within the protection scope of the present invention.

Claims

1. A fishing detection method, comprising:

obtaining a target detection frame, wherein the target detection frame comprises a pedestrian and a fishing rod;

performing fishing behavior recognition on the target detection frame through a fishing behavior detection model, and outputting a first result whose recognition result is 'fishing behavior';

performing fishing behavior recognition on the target detection frame through a human fishing rod key point detection model, and outputting a second result whose recognition result is 'fishing behavior';

and voting on the first result and the second result to output a final recognition result.

2. A fishing detection method according to claim 1, wherein said obtaining a target detection frame comprises:

extracting a pedestrian frame and a fishing rod frame in an image to be detected;

matching each fishing rod frame with the pedestrian frames in the same image to be detected;

if the intersection over union (IoU) obtained by matching is greater than a preset IoU threshold, determining that the pedestrian in the pedestrian frame is associated with the fishing rod in the fishing rod frame;

and generating a target detection frame according to the pedestrian and the fishing rod.
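
The matching in claim 2 can be sketched in Python as follows. This is a minimal illustration, not the claimed implementation: it reads the "intersection ratio" as the usual intersection over union (IoU), and the box format `(x1, y1, x2, y2)`, the function names, the default threshold of 0.1, and the choice of the enclosing box of the matched pair as the target detection frame are all assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_rods_to_pedestrians(pedestrian_boxes, rod_boxes, iou_threshold=0.1):
    """Pair each rod box with every pedestrian box it overlaps enough.

    Returns one target box per associated pair. The enclosing box of the
    pair is used here as the target detection frame (an assumption).
    """
    targets = []
    for rod in rod_boxes:
        for ped in pedestrian_boxes:
            if iou(ped, rod) > iou_threshold:
                targets.append((
                    min(ped[0], rod[0]), min(ped[1], rod[1]),
                    max(ped[2], rod[2]), max(ped[3], rod[3]),
                ))
    return targets
```

A rod box that overlaps no pedestrian box above the threshold produces no target frame, so isolated rods and pedestrians without rods are filtered out before the two recognition models run.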

3. A fishing detection method according to claim 2, wherein a pedestrian frame and a fishing rod frame in the image to be detected are extracted using a pre-trained target detection model, and before the extraction of the pedestrian frame and the fishing rod frame in the image to be detected, the method further comprises:

acquiring preprocessed sample data;

inputting the preprocessed sample data into a preset initial target detection model to obtain an output result;

adjusting the sample category weights in the initial target detection model according to a preset focal loss function and the output result to obtain a parameter-adjusted target detection model;

and training the parameter-adjusted target detection model through a batch stochastic gradient descent algorithm to obtain a pre-trained target detection model.
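
The loss function named in claim 3 reads as the focal loss commonly used for class-imbalanced detection. Under that assumption, a minimal per-sample binary version can be sketched in Python. The function name and the defaults `alpha=0.25` and `gamma=2.0` are illustrative assumptions and are not taken from the patent; `alpha` plays the role of a per-class sample weight.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for one sample (illustrative sketch).

    p: predicted probability of the positive class, in [0, 1]
    y: ground-truth label, 1 (positive) or 0 (negative)
    alpha: weight for the positive class (1 - alpha for the negative class)
    gamma: focusing parameter; larger values down-weight easy samples more
    """
    p_t = p if y == 1 else 1.0 - p            # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, 1e-7), 1.0)            # avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

With `gamma=0` and `alpha=0.5` this reduces to half the ordinary cross-entropy; increasing `gamma` shrinks the contribution of samples the model already classifies well.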

4. A fishing detection method according to claim 3, wherein said acquiring preprocessed sample data comprises:

acquiring a first original image and a second original image in sample data;

and performing mixed enhancement processing on the first original image and the second original image according to a mixing weight to obtain enhanced sample data.
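
The mixed enhancement of claim 4 resembles a mixup-style blend of two images. A minimal Python sketch follows, assuming equally sized grayscale images stored as lists of rows and a single scalar weight; the function name and data layout are hypothetical, and the claims do not specify how the weight is chosen or how labels are combined.

```python
def mix_images(img_a, img_b, weight):
    """Blend two equally sized images pixel by pixel.

    Each output pixel is weight * a + (1 - weight) * b.
    Images are lists of rows of numeric pixel values (an assumed layout).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(img_a) != len(img_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(img_a, img_b)
    ):
        raise ValueError("images must have the same shape")
    return [
        [weight * pa + (1.0 - weight) * pb for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(img_a, img_b)
    ]
```

For example, `mix_images([[0, 100]], [[200, 100]], 0.25)` weights the second image three times as heavily as the first.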

5. A fishing detection method according to claim 3 or 4, wherein before performing fishing behavior recognition on the target detection frame through the fishing behavior detection model and outputting the first result whose recognition result is 'fishing behavior', the method further comprises:

extracting a rod frame in the preprocessed sample data, wherein the rod frame comprises a sample pedestrian and a sample fishing rod;

training a preset classification model according to the rod frame and the batch stochastic gradient descent algorithm;

and if the training result obtained through training reaches a preset training threshold value, taking the trained classification model as a fishing behavior detection model.

6. A fishing detection method according to claim 5, wherein before the fishing behavior recognition is performed on the target detection frame by the human fishing rod key point detection model and the second result that the recognition result is "fishing behavior" is output, the method further comprises:

carrying out key point feature marking on the fishing rod frame to obtain a human body key point corresponding to a sample pedestrian and a fishing rod key point corresponding to a sample fishing rod;

and training a preset segmentation model according to the human body key points, the fishing rod key points and the batch stochastic gradient descent algorithm, and taking the trained segmentation model as the human fishing rod key point detection model.

7. A fishing detection method according to claim 1, wherein said voting on the first result and the second result to output a final recognition result comprises:

respectively acquiring the mean value of the first result and the confidence coefficient of the second result;

and performing a voting calculation according to the mean value and the confidence coefficient to output the final recognition result.
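
One possible reading of the voting step in claim 7 is sketched below in Python. The claims do not define the voting formula, so everything here is an assumption: the first result is taken to be a list of classifier scores in [0, 1], the second result a single confidence in [0, 1], the fusion a plain average of the two, and the decision a comparison against a hypothetical threshold of 0.5.

```python
def vote(first_scores, second_confidence, threshold=0.5):
    """Fuse the two recognition results and return (label, fused_score).

    first_scores: scores from the fishing behavior detection model
    second_confidence: confidence from the key point detection model
    threshold: decision threshold on the fused score (assumed value)
    """
    if not first_scores:
        raise ValueError("first_scores must not be empty")
    first_mean = sum(first_scores) / len(first_scores)
    fused = (first_mean + second_confidence) / 2.0
    label = "fishing behavior" if fused >= threshold else "no fishing behavior"
    return label, fused
```

Averaging lets a confident result from one model compensate for a borderline result from the other, which is one way the combination could reduce false alarms relative to either model alone.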

8. A fishing detection device, comprising:

an RGB-IR image acquisition module, used for acquiring an image to be detected in a monitored water area scene;

a target detection module, used for extracting a pedestrian frame and a fishing rod frame in the image to be detected and generating, according to an intersection over union (IoU) threshold, a target detection frame comprising both the pedestrian and the fishing rod;

a fishing behavior detection module, used for performing fishing behavior recognition on the target detection frame and outputting a first result whose recognition result is 'fishing behavior';

a human fishing rod key point detection module, used for performing fishing behavior recognition on the target detection frame and outputting a second result whose recognition result is 'fishing behavior';

and a voting module, used for voting according to the mean value of the first result and the confidence coefficient of the second result and outputting a final recognition result.

9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the fishing detection method of any one of claims 1 to 7 when executing the computer program.

10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the fishing detection method according to any one of claims 1 to 7.