CN115100732A - Fishing detection method and device, computer equipment and storage medium - Google Patents
- Publication number
- CN115100732A (application number CN202110250476.0A)
- Authority
- CN
- China
- Prior art keywords
- fishing
- frame
- result
- fishing rod
- behavior
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a fishing detection method, a fishing detection device, computer equipment and a storage medium. The method comprises the following steps: acquiring a target detection frame comprising a pedestrian and a fishing rod; performing fishing behavior recognition on the target detection frame through a fishing behavior detection model, and outputting a first result of which the recognition result is 'fishing behavior'; performing fishing behavior recognition on the target detection frame through a human fishing rod key point detection model, and outputting a second result of which the recognition result is 'fishing behavior'; and voting on the first result and the second result to output a final recognition result. The method can detect and manage illicit fishing around the clock, in both daytime and nighttime scenes. This greatly reduces labor cost, allows illegal fishing in large water areas to be identified 24 hours a day at low cost, and improves the management of water areas where fishing is forbidden. Because the final recognition result is output by voting on the mean value of the first result and the second result, detection accuracy is improved.
Description
Technical Field
The invention relates to the technical field of computer vision recognition, in particular to a fishing detection method, a fishing detection device, computer equipment and a storage medium.
Background
Fishing is an outdoor hobby with a large following, and anglers can often be seen beside rivers, lakes and reservoirs. However, fishing is prohibited in many places for management reasons or commercial interests, such as lakes in scenic spots and privately contracted reservoirs. Illicit fishing nevertheless occurs from time to time. It creates hidden dangers for the safety management of the water areas concerned, and it can also cause personal injury to the angler, such as accidentally falling into the water, drowning, or contact with high-voltage electricity. Waste generated during fishing is also likely to pollute the water.
The existing means of preventing illegal fishing in controlled water areas are mainly to set up warning signs prohibiting fishing in the relevant area, or to assign dedicated management personnel to patrol. In practice, however, warning signs do not effectively deter illegal anglers, and dedicated patrols consume considerable manpower and cannot provide 24-hour, all-weather monitoring.
In view of this, some prior art provides means for detecting, identifying and managing illegal anglers based on computer vision. Target detection is one of the basic tasks in the field of computer vision, and the academic community has studied it for nearly two decades. With the rapid development of deep learning in recent years, target detection algorithms have shifted from traditional algorithms based on hand-crafted features to detection techniques based on deep neural networks.
However, existing computer-based illegal fishing identification is mainly based on the traditional method of comparing pixels between consecutive frames. In this method, images of the area to be detected are acquired in real time, and consecutive frames are compared. Specifically, the difference between the image pixels at time k and time k+1 is calculated to detect how the image changed between the two moments. The probability distribution of the difference pixel values is then calculated and compared with an existing fishing image database to judge whether fishing behavior exists. Because the fishing image data that can be collected is limited, the reference data distributions available for comparison are limited, whereas the computed pixel difference distributions are random and varied. The method therefore has two problems: (1) pedestrians are often mistaken for anglers; (2) fishing activity goes undetected. In addition, the image data acquired by existing technology is mainly visible light imagery, which is sensitive to illumination. Many fishing enthusiasts like to fish at night, so under cover of darkness illegal anglers can still escape monitoring, and illegal fishing cannot be monitored in real time 24 hours a day.
Disclosure of Invention
In order to solve the above technical problems, embodiments of the present invention provide a fishing detection method, apparatus, computer device and storage medium, which can detect and manage illicit fishing around the clock in both daytime and nighttime scenes, with high detection accuracy.
A fishing detection method comprising:
obtaining a target detection frame, wherein the target detection frame comprises a pedestrian and a fishing rod;
performing fishing behavior recognition on the target detection frame through a fishing behavior detection model, and outputting a first result of which the recognition result is 'fishing behavior';
carrying out fishing behavior recognition on the target detection frame through a human fishing rod key point detection model, and outputting a second result of which the recognition result is 'fishing behavior';
and voting the first result and the second result to output a final recognition result.
Preferably, in the above fishing detection method, obtaining the target detection frame includes:
extracting a pedestrian frame and a fishing rod frame in an image to be detected;
matching each fishing rod frame with the pedestrian frames in the same image to be detected;
if the intersection ratio (intersection over union, IoU) obtained by matching is larger than a preset intersection ratio threshold value, determining that the pedestrian in the pedestrian frame is associated with the fishing rod in the fishing rod frame;
and generating a target detection frame according to the pedestrian and the fishing rod.
Preferably, in the above fishing detection method, a pre-trained target detection model is used to extract a pedestrian frame and a fishing rod frame in an image to be detected, and before the pedestrian frame and the fishing rod frame in the image to be detected are extracted, the method further includes:
acquiring preprocessed sample data;
inputting the preprocessed sample data into a preset initial target detection model to obtain an output result;
adjusting the sample type weights in the initial target detection model according to a preset focal loss function and the output result to obtain a parameter-adjusted target detection model;
and training the parameter-adjusted target detection model through a batch stochastic gradient descent algorithm to obtain a pre-trained target detection model.
Preferably, in the above fishing detection method, the acquiring pre-processed sample data includes:
acquiring a first original image and a second original image in sample data;
and performing mixed enhancement processing on the first original image and the second original image according to the mixed weight to obtain enhanced sample data.
Preferably, in the above fishing detection method, before the fishing behavior recognition is performed on the target detection frame by the fishing behavior detection model and a first result that the recognition result is "fishing behavior" is output, the method further includes:
extracting a rod frame in the preprocessed sample data, wherein the rod frame comprises a sample pedestrian and a sample fishing rod;
training a preset classification model according to the rod frame and the batch stochastic gradient descent algorithm;
and if the training result obtained through training reaches a preset training threshold value, taking the trained classification model as a fishing behavior detection model.
Preferably, in the above fishing detection method, before the fishing behavior recognition is performed on the target detection frame by using the human fishing rod key point detection model and the second result that the recognition result is "fishing behavior" is output, the method further includes:
carrying out key point feature marking on the fishing rod frame to obtain a human body key point corresponding to a sample pedestrian and a fishing rod key point corresponding to a sample fishing rod;
and training a preset segmentation model according to the human body key points, the fishing rod key points and the batch stochastic gradient descent algorithm, and taking the trained segmentation model as the human body fishing rod key point detection model.
Preferably, in the above fishing detection method, the voting the first result and the second result to output a final recognition result includes:
respectively acquiring the mean value of the first result and the confidence coefficient of the second result;
and voting calculation is carried out according to the mean value and the confidence coefficient so as to output a final recognition result.
A fishing detection device comprising:
the RGB-IR image acquisition module is used for acquiring an image to be detected in a scene of a monitoring water area;
the target detection module is used for extracting a pedestrian frame and a fishing rod frame in the image to be detected and generating a target detection frame simultaneously comprising a pedestrian and a fishing rod according to the intersection ratio threshold;
the fishing behavior detection module is used for carrying out fishing behavior identification on the target detection frame and outputting a first result of which the identification result is 'fishing behavior';
the human fishing rod key point detection module is used for carrying out fishing behavior recognition on the target detection frame and outputting a second result of which the recognition result is 'fishing behavior';
and the voting module is used for voting according to the mean value of the first result and the confidence coefficient of the second result and outputting a final recognition result.
A computer device comprising a memory, a processor and a computer program stored in said memory and executable on said processor, said processor implementing said fishing detection method when executing said computer program.
A computer-readable storage medium storing a computer program which, when executed by a processor, implements the fishing detection method.
The invention has the following beneficial effects: the fishing detection method can detect and manage illicit fishing 24 hours a day, including in nighttime scenes, and the final recognition result is output by voting on the mean value of the first result and the second result, which improves detection accuracy. Automatic 24-hour detection greatly reduces labor cost, allows illegal fishing in large water areas to be identified around the clock at low cost and with high accuracy, and improves the management of water areas where fishing is forbidden.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of a fishing detection method according to the present invention;
FIG. 2 is a flow chart of obtaining the target detection frame;
FIG. 3 is a flow chart of pre-training of the target detection model of the present invention;
FIG. 4 is a flow chart of the training of the fishing behavior detection model of the present invention;
FIG. 5 is a flow chart of the training of the human fishing rod key point detection model of the present invention;
FIG. 6 is a schematic view of the fishing detection device according to the present invention;
FIG. 7 is a schematic diagram of an internal structure of an embodiment of a computer apparatus according to the present invention;
FIG. 8 is a schematic diagram of an internal structure of another embodiment of the computer apparatus according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to FIG. 1, an embodiment of the invention provides a fishing detection method, which includes:
Step S100, obtaining a target detection frame, wherein the target detection frame comprises a pedestrian and a fishing rod.
Specifically, the image to be detected can be acquired in real time by an image sensor arranged in the monitored water area scene, and can be a time-sequenced still frame image or a time-sequenced frame image from a streaming media video. The frame images extracted from the streaming media video can be consecutive frames, or interval frames obtained by applying an interval frame-taking rule to the set of time-sequenced frames. For example, if the streaming media video includes M frames of images to be detected, at least one frame is taken from every N frames of the M frames. It should be noted that the frame rate of a streaming video is generally more than 25 frames per second; detecting every frame would increase the amount of computation and slow the response of fishing behavior detection. In this embodiment, frames to be detected are taken from the streaming media video at intervals, which reduces the computation required for image processing, speeds up fishing behavior detection, returns detection results quickly in real time, and effectively deters illegal anglers.
The image to be detected acquired on site in real time can be transmitted to a monitoring center server in a wired or wireless manner. The server then detects and identifies pedestrians and fishing rods appearing in the image, marking each pedestrian with a pedestrian frame and each fishing rod with a fishing rod frame. When marking pedestrians and fishing rods, the image to be detected is scanned with an automatic sliding window.
Specifically, the pedestrian frames on each frame of the image to be detected are traversed, and the pedestrian frame that is associated with a fishing rod frame is determined according to a preset intersection ratio threshold value, so that a target detection frame containing both the pedestrian and the fishing rod is generated.
Specifically, a pedestrian frame and a fishing rod frame on the image to be detected in step S100 may be unrelated and unmatched, or associated and matched. "Associated and matched" here means that the pedestrian frame and the fishing rod frame together form the feature set of one angler: the pedestrian frame captures the human body features of the angler, and the fishing rod frame captures the features of the angler's fishing rod, which indicates that an angler is present. By associating and matching pedestrian frames with fishing rod frames and generating a target detection frame that contains both a pedestrian and a fishing rod, unrelated passers-by are not identified as illicit anglers, which improves the accuracy of detection.
Step S200, performing fishing behavior recognition on the target detection frame through a fishing behavior detection model, and outputting a first result of which the recognition result is 'fishing behavior'.
Specifically, the target detection frame is a marking frame containing both a pedestrian and a fishing rod, that is, the two associated and matched features of an angler identified in step S100: the pedestrian feature and the fishing rod feature. One target detection frame represents one angler, so one or more anglers in the image to be detected can be detected by identifying the target detection frames. An image to be detected may contain no target detection frame, or one or more target detection frames. If there is no target detection frame, the image does not proceed to the fishing behavior recognition of step S200. If there are one or more target detection frames, the image proceeds to step S200 for fishing behavior recognition, and a first result of which the recognition result is 'fishing behavior' is output.
Step S300, performing fishing behavior recognition on the target detection frame through the human fishing rod key point detection model, and outputting a second result of which the recognition result is 'fishing behavior'.
Specifically, when human fishing rod key point detection is performed, the target detection frames of the multiple frames that belong to the same time sequence as the frame that output the first result are loaded together, and fishing behavior recognition is then performed on the human fishing rod key point features in the target detection frames of those frames. It should be noted that this recognition differs from that in step S200: here the human body key point features and fishing rod key point features of the angler are recognized, rather than the target detection frame as a whole. The key point feature data of the human body and fishing rod over multiple frames is obtained, and key point feature information is then extracted with an ST-GCN (spatial-temporal graph convolutional network) model to recognize the fishing behavior, so that a more accurate detection result is obtained.
Step S400, voting on the first result and the second result to output a final recognition result.
Specifically, the mean value of the first results of the multiple frames in a time sequence is taken, and voting is then performed together with the second result, which is output for the same multiple frames based on recognition of the human fishing rod key point feature information. The final result is output, further improving the accuracy of fishing behavior detection.
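The voting in step S400 can be illustrated with a minimal sketch. The text states only that the mean of the first results and the confidence of the second result are used; the equal-weight combination, the 0.5 decision threshold, and the names `vote`, `first_scores` and `second_confidence` below are assumptions made for illustration, not the rule of the invention.

```python
from statistics import mean
from typing import Sequence


def vote(first_scores: Sequence[float], second_confidence: float,
         threshold: float = 0.5) -> bool:
    """Return True if the combined evidence indicates fishing behavior.

    first_scores: per-frame scores from the fishing behavior detection model
                  for the frames of one time sequence (hypothetical format).
    second_confidence: confidence from the key point detection model.
    threshold: assumed decision threshold, not taken from the source.
    """
    if not first_scores:
        raise ValueError("at least one per-frame score is required")
    first_mean = mean(first_scores)          # mean value of the first result
    # Assumed combination rule: equal-weight average of the two models.
    combined = (first_mean + second_confidence) / 2.0
    return combined >= threshold


is_fishing = vote([0.9, 0.7, 0.8], second_confidence=0.6)
not_fishing = vote([0.2, 0.3, 0.1], second_confidence=0.4)
print(is_fishing, not_fishing)
```

Averaging over the frames of a sequence damps single-frame misclassifications before the two models are combined.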
In some embodiments of the present invention, the image to be detected is obtained by an RGB-IR image sensor arranged in the monitored water area scene. An RGB-IR image sensor can sense visible light and infrared light signals simultaneously. The infrared signal provides scene brightness information in a low-illumination environment, so that objects of interest can be presented through the infrared feature map even under poor illumination at night. Here, the objects of interest include human targets and fishing rod targets. Night fishing usually involves a light, and the fishing rod shows infrared features under that light, while anglers and pedestrians show infrared features because the human body itself emits infrared radiation. Under normal illumination, the R, G and B visible light components give a better imaging result in the image sensor, so they are mainly used under good daytime lighting.
Further, in some embodiments of the invention, the image to be detected is a set of real-time still frames captured in pulsed mode by an RGB-IR image sensor arranged in the monitored water area scene. Specifically, the RGB-IR image sensor captures the current scene according to a pulse signal, which can be set according to the actual situation. At a peak of the pulse signal, the sensor captures a picture of the current scene as the image to be detected; at a trough, the sensor does not capture, that is, it works in an intermittent sleep state. This is another form of interval frame, except that the frames are not taken from a streaming video; in time sequence, they are successive still frame images. Capturing images in pulsed mode reduces the amount of image data to be processed, saves energy, and avoids consuming unnecessary computing power.
Further, in some embodiments of the present invention, the image to be detected may also be a set of real-time still frames per unit time sequence, obtained by applying an interval frame-taking rule to a streaming media video recorded by an RGB-IR image sensor arranged in the monitored water area scene. As another example, the interval frame-taking rule may be 1+(n-1), where n is the frame-taking period; that is, only 1 frame to be detected is taken in each period of n frames. In 1+(n-1) mode, the detection time is 1/n of that of frame-by-frame detection. If n is 5, the frames to be detected make up 20% of the video frames, which further increases the processing speed of the video and the number of video channels that can be handled. Taking frames from the streaming media video at intervals speeds up processing of the images to be detected and reduces the delay of fishing behavior detection.
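The 1+(n-1) interval frame-taking rule can be sketched as follows. This is an illustrative sketch only: the function name `sample_frames` and the list of frame labels standing in for decoded video frames are hypothetical, and which frame of each period is kept (here, the first) is an assumption.

```python
from typing import Iterable, Iterator, Tuple, TypeVar

Frame = TypeVar("Frame")


def sample_frames(frames: Iterable[Frame], n: int) -> Iterator[Tuple[int, Frame]]:
    """Yield (index, frame) for one frame out of every period of n frames.

    Follows the 1 + (n - 1) rule: 1 frame is kept, the next n - 1 are skipped.
    Keeping the first frame of each period is an assumption.
    """
    if n < 1:
        raise ValueError("frame-taking period n must be at least 1")
    for index, frame in enumerate(frames):
        if index % n == 0:
            yield index, frame


# Hypothetical stand-in for one second of 25 fps video: labels instead of images.
video = [f"frame_{i}" for i in range(25)]
kept = list(sample_frames(video, n=5))
kept_indices = [i for i, _ in kept]
ratio = len(kept) / len(video)
print(kept_indices, ratio)
```

With n = 5, one frame in five is passed to detection, which matches the 20% proportion stated above.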
Further, in some embodiments of the present invention, as shown in fig. 2, the step of obtaining the target detection frame includes:
Step S110, extracting a pedestrian frame and a fishing rod frame in the image to be detected;
Step S120, respectively matching the fishing rod frames with pedestrian frames in the same image to be detected;
Step S130, if the intersection ratio obtained by matching is larger than a preset intersection ratio threshold value, determining that the pedestrian in the pedestrian frame is associated with the fishing rod in the fishing rod frame;
Step S140, generating a target detection frame according to the pedestrian and the fishing rod.
Specifically, in the embodiment of the present invention, the calculation rule of the intersection ratio is specifically as follows:
The coordinates of the upper left corner and the lower right corner of the pedestrian frame a are denoted $bbox_a = [(x_{a1}, y_{a1}), (x_{a2}, y_{a2})]$, where $(x_{a1}, y_{a1})$ is the upper left corner and $(x_{a2}, y_{a2})$ is the lower right corner of the pedestrian frame a;
The coordinates of the upper left corner and the lower right corner of the fishing rod frame b are denoted $bbox_b = [(x_{b1}, y_{b1}), (x_{b2}, y_{b2})]$, where $(x_{b1}, y_{b1})$ is the upper left corner and $(x_{b2}, y_{b2})$ is the lower right corner of the fishing rod frame b;
The intersection ratio IoU (intersection over union) of the pedestrian frame a and the fishing rod frame b can be expressed as:
$$\mathrm{IoU} = \frac{\mathrm{area}(bbox_a \cap bbox_b)}{\mathrm{area}(bbox_a \cup bbox_b)}$$
Further, the preset intersection ratio threshold value in the embodiment of the invention is set to 0.1 to 0.3. As a preferred embodiment of the present invention, the preset intersection ratio threshold is set to 0.2. Actual testing found that when the intersection ratio of a fishing rod frame and a pedestrian frame is more than 0.2, the two frames can be regarded as a person and the fishing rod related to that person; when the intersection ratio is less than 0.2, the fishing rod and the pedestrian of the two frames are considered unrelated. It is therefore appropriate to set the intersection ratio threshold to 0.2, which ensures high accuracy of fishing behavior detection without increasing the amount of computation.
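The matching of steps S110 to S140 can be sketched in a few lines. This is a sketch under stated assumptions: boxes are given as (x1, y1, x2, y2) corner tuples, the helper names `iou` and `match_rods_to_pedestrians` are hypothetical, and building the target detection frame as the smallest box enclosing the matched pair is an assumption, since the text does not specify how the frame is generated.

```python
from typing import List, Sequence, Tuple

# (x1, y1, x2, y2): upper left and lower right corners, as in bbox_a and bbox_b.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_rods_to_pedestrians(pedestrians: Sequence[Box], rods: Sequence[Box],
                              threshold: float = 0.2) -> List[Box]:
    """Return one target detection frame per associated pedestrian/rod pair."""
    targets = []
    for rod in rods:
        for ped in pedestrians:
            if iou(ped, rod) > threshold:
                # Assumed construction: smallest box enclosing both frames.
                targets.append((min(ped[0], rod[0]), min(ped[1], rod[1]),
                                max(ped[2], rod[2]), max(ped[3], rod[3])))
    return targets


pedestrians = [(100, 100, 200, 300), (400, 100, 500, 300)]
rods = [(120, 150, 270, 250)]
targets = match_rods_to_pedestrians(pedestrians, rods)
print(iou(pedestrians[0], rods[0]), targets)
```

In this example only the first pedestrian overlaps the rod enough to exceed the 0.2 threshold, so the passer-by in the second frame produces no target detection frame.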
In a preferred embodiment of the invention, the pedestrian frame and the fishing rod frame on the image to be detected are extracted by a pre-trained target detection model. Specifically, as shown in fig. 3, the training process of the target detection model includes:
Step S1101, obtaining preprocessed sample data;
Step S1102, inputting the preprocessed sample data into a preset initial target detection model to obtain an output result;
Step S1103, adjusting the sample type weights in the initial target detection model according to a preset focal loss function and the output result to obtain a parameter-adjusted target detection model;
Step S1104, training the parameter-adjusted target detection model through a batch stochastic gradient descent algorithm to obtain a pre-trained target detection model.
Specifically, in a preferred embodiment of the present invention, the initial target detection model is a YOLOv3 model, and the sample types include positive samples and negative samples.
The obtaining of the preprocessed sample data specifically includes:
Step S11011, acquiring a first original image and a second original image in sample data;
Step S11012, performing hybrid enhancement processing on the first original image and the second original image according to the hybrid weight to obtain enhanced sample data.
Specifically, the method of data enhancement processing can be expressed by an expression as follows:
$$image_{mix} = \lambda \cdot image_a + (1 - \lambda) \cdot image_b$$
$$bbox_{mix} \in bboxes_{mix}$$
$$bbox_a \in bboxes_a$$
$$bbox_b \in bboxes_b$$
$$bboxes_{mix} = bboxes_a \cup bboxes_b$$
where $image_{mix}$ represents the mixed enhanced image, $image_a$ the first original image, $image_b$ the second original image, $bboxes_{mix}$ the feature annotation boxes of the mixed enhanced image, $bboxes_a$ the feature annotation boxes of the first original image, $bboxes_b$ the feature annotation boxes of the second original image, and $\lambda$ (lambda) the mixing weight. In the preferred embodiment of the invention, the mixing weight $\lambda$ is 0.5, so that the first and second original images carry the same weight in the mixed enhanced image. This data enhancement method greatly enriches the training samples, improves the efficiency and accuracy with which the pedestrian/fishing rod detection model detects pedestrian and fishing rod features, and allows the corresponding pedestrian frames and fishing rod frames to be generated quickly. Specifically, the pedestrian frame contains the pedestrian feature, the fishing rod frame contains the fishing rod feature, and the target detection frame contains both, that is, it is the feature mark of one fishing behavior identified in the invention.
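The mixing expressions above can be illustrated with a small sketch. It assumes two same-sized single-channel images stored as nested lists of pixel values, and it represents the union of annotation boxes as a simple concatenation; the function name `mix_samples` and the toy 2x2 images are hypothetical.

```python
from typing import List, Sequence, Tuple

Image = List[List[float]]          # single-channel image as rows of pixel values
Box = Tuple[int, int, int, int]    # (x1, y1, x2, y2) annotation box


def mix_samples(image_a: Image, boxes_a: Sequence[Box],
                image_b: Image, boxes_b: Sequence[Box],
                lam: float = 0.5) -> Tuple[Image, List[Box]]:
    """Blend two images pixel-wise and keep the annotation boxes of both."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("mixing weight must lie in [0, 1]")
    if len(image_a) != len(image_b) or any(
            len(ra) != len(rb) for ra, rb in zip(image_a, image_b)):
        raise ValueError("images must have the same size")
    # image_mix = lam * image_a + (1 - lam) * image_b
    mixed = [[lam * pa + (1.0 - lam) * pb for pa, pb in zip(ra, rb)]
             for ra, rb in zip(image_a, image_b)]
    # bboxes_mix = bboxes_a U bboxes_b (kept here as a concatenated list)
    return mixed, list(boxes_a) + list(boxes_b)


image_a = [[0, 100], [200, 50]]
image_b = [[100, 100], [0, 150]]
mixed, boxes = mix_samples(image_a, [(0, 0, 1, 1)], image_b, [(1, 0, 2, 2)])
print(mixed, boxes)
```

With a mixing weight of 0.5, each output pixel is the plain average of the two inputs, and the mixed sample carries the pedestrian and fishing rod boxes of both source images.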
In the embodiment of the invention, the sample data is derived from multiple visible light images and infrared images of illicit fishing that has occurred in the monitored water area scene. Specifically, visible light images or infrared images annotated with pedestrian frames and fishing rod frames are used as the sample data for model training.
In the embodiment of the invention, in order to improve the precision of model training, 1000 frames of visible light images and infrared images of illicit fishing can be used for repeated, iterative training. To improve image reading speed and recognition precision, the visible light image and the infrared image need to be aligned before pedestrians or fishing rods are marked on the infrared image. Image alignment includes feature-based image alignment and data-based alignment. Feature-based image alignment finds a spatial transformation that maps the floating image onto the reference image, so that points corresponding to the same spatial position in the two images are in one-to-one correspondence; this achieves information fusion and facilitates the localization, extraction, detection and identification of features. In the embodiment of the present invention, preferably, the visible light image is used as the reference image and the infrared image as the floating image, and the infrared image is mapped onto the visible light image through spatial transformation, so that at a given time node the pedestrian features and fishing rod features on the infrared image are aligned with the corresponding features on the visible light image. Specifically, feature-based image alignment may use a prior-art homography algorithm, mesh warping algorithm, or optical flow algorithm. Data-based alignment arranges the infrared image data and the visible light image data in storage space according to certain rules, rather than simply placing them one after another in sequence.
Hardware platforms differ considerably in how they handle memory, and some platforms can only access certain types of data at certain specific addresses. Access efficiency is lost if stored data is not aligned to suit the platform's requirements. For example, some platforms always start reading from an even address. If an int (assuming a 32-bit system) is stored starting at an even address, it can be read in one read cycle; if it is stored starting at an odd address, 2 read cycles may be required, and the high and low bytes of the two reads must be pieced together to obtain the int value, which clearly reduces read efficiency. Therefore, the infrared image data and the visible light image data are aligned at the data level, trading storage space for access time, which improves data reading efficiency, reduces the amount of computation, and shortens response time. Specifically, data alignment is common prior art in data access processing and usually adopts four-byte alignment, which is not described herein again.
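As a rough illustration of the alignment cost described above, the following Python sketch uses the standard `struct` module to compare a packed layout with a natively aligned one, and rounds offsets up to a four-byte boundary. The helper name `align_up` and the sample record layout (one byte followed by one int) are assumptions made for illustration and do not come from the embodiment.

```python
import struct


def align_up(offset: int, alignment: int = 4) -> int:
    """Round offset up to the next multiple of alignment (a power of two)."""
    return (offset + alignment - 1) & ~(alignment - 1)


# A record holding one 1-byte field followed by one 4-byte int.
packed_size = struct.calcsize("=bi")  # standard sizes, no padding inserted
native_size = struct.calcsize("@bi")  # native alignment, padding may be inserted

print("packed:", packed_size, "native:", native_size)
print([align_up(n) for n in (0, 1, 4, 5, 13)])
```

On many platforms the natively aligned record is larger than the packed one, because padding is inserted so that the int starts on a four-byte boundary.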
Specifically, in step S11012 of the embodiment of the present invention, the hybrid enhancement processing of the sample data uses the focal loss function to weight the hard samples among the positive and negative samples. This weighting improves sample quality, reduces the loss, and improves the accuracy with which hard samples are judged during model training, thereby mitigating both class imbalance and the imbalance between easy and hard samples. Specifically, in the embodiment of the present invention, the data enhancement processing is performed on the input data based on the YOLOv3 model.
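The weighting effect of the focal loss can be sketched for a single binary prediction as follows. This is a minimal illustration of the commonly used binary form of the loss; the function name and the values `alpha=0.25` and `gamma=2.0` are assumptions, since the embodiment does not state its parameters.

```python
import math


def focal_loss(p: float, y: int, alpha: float = 0.25, gamma: float = 2.0) -> float:
    """Binary focal loss for predicted probability p and label y (0 or 1).

    alpha and gamma are assumed illustrative values, not taken from the patent.
    """
    p = min(max(p, 1e-7), 1.0 - 1e-7)           # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    # (1 - p_t) ** gamma shrinks the loss of easy, well-classified samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.95, 1)  # confident and correct: heavily down-weighted
hard = focal_loss(0.30, 1)  # poorly classified positive: keeps most of its loss
print(f"easy={easy:.6f} hard={hard:.6f}")
```

With `gamma=0` and `alpha=1` for the positive class, the expression reduces to ordinary cross-entropy, which makes the role of the modulating factor easy to check.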
Further, in some embodiments of the present invention, as shown in fig. 4, the training process of the fishing behavior detection model includes:
step S210, extracting a rod frame in the preprocessed sample data, wherein the rod frame comprises a sample pedestrian and a sample fishing rod;
step S220, training a preset classification model according to the rod frame and a batch stochastic gradient descent algorithm;
and step S230, if the training result obtained through training reaches a preset training threshold value, taking the trained classification model as a fishing behavior detection model.
Specifically, in the preferred embodiment of the present invention, the classification model is the MobileNetV1 model (a lightweight neural network).
Further, in some embodiments of the present invention, as shown in fig. 5, the training process of the human fishing rod key point detection model includes:
step S310, carrying out key point feature marking on the rod frame to obtain a human body key point corresponding to a sample pedestrian and a fishing rod key point corresponding to a sample fishing rod;
and step S320, training a preset segmentation model according to the human body key points, the fishing rod key points and the batch stochastic gradient descent algorithm, and taking the trained segmentation model as the human body fishing rod key point detection model.
In particular, in a preferred embodiment of the invention, the segmentation model is a modified Unet model.
Specifically, in a preferred embodiment of the present invention, the human fishing rod key point feature information includes 17 human body key point features corresponding to human joints and 1 fishing rod key point feature. The 17 human body key point features are: nose, left eye, right eye, left ear, right ear, left shoulder, right shoulder, left elbow, right elbow, left wrist, right wrist, left hip, right hip, left knee, right knee, left ankle, right ankle. The 1 fishing rod key point feature is the end of the fishing rod that is connected to the fishing line.
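The key point layout listed above can be written down as a small lookup structure. The identifier names below, including `rod_tip` for the end of the rod connected to the fishing line, and the index order are assumptions chosen for illustration.

```python
# 17 human body key points, in the order listed in the description.
HUMAN_KEYPOINTS = [
    "nose",
    "left_eye", "right_eye",
    "left_ear", "right_ear",
    "left_shoulder", "right_shoulder",
    "left_elbow", "right_elbow",
    "left_wrist", "right_wrist",
    "left_hip", "right_hip",
    "left_knee", "right_knee",
    "left_ankle", "right_ankle",
]

# 1 fishing rod key point: the end of the rod connected to the fishing line.
ROD_KEYPOINTS = ["rod_tip"]

ALL_KEYPOINTS = HUMAN_KEYPOINTS + ROD_KEYPOINTS

# Map each key point name to its channel index in a model output (assumed layout).
KEYPOINT_INDEX = {name: i for i, name in enumerate(ALL_KEYPOINTS)}

print(len(ALL_KEYPOINTS), KEYPOINT_INDEX["rod_tip"])
```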
In step S320, because the fishing rod is thin and difficult to identify in the image, the improved Unet (segmentation network) model of the invention replaces the ordinary convolution layers of the existing Unet model with dilated (atrous) convolution, which enlarges the receptive field of the convolution and strengthens the capture of context information around the fishing rod to indicate its presence, thereby improving the accuracy of fishing rod key point detection. Specifically, the Unet model is a well-known network in the field of medical image segmentation. Because its fusion of low-level and high-level features preserves the detailed feature information of the image well, the Unet model is used as the base (backbone) network for key point detection in the embodiment of the invention; to further improve the detection rate of small-detail key points, namely fishing rod key points, the convolutional layers in the network are replaced with dilated convolutional layers, enlarging the receptive field of the convolution. In step S320, the human fishing rod key point data of multiple frames may also be obtained, and the STGCN (spatio-temporal graph convolutional network model) is then used to extract the key point feature information for fishing behavior recognition, so that the final result is obtained more accurately.
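To make the receptive-field argument concrete, the sketch below implements a one-dimensional dilated convolution in plain Python and compares the receptive field of three stacked 3-tap layers with and without dilation. It is a simplified illustration, not the improved Unet itself; the function names and the dilation schedule (1, 2, 4) are assumptions.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid 1-D cross-correlation with `dilation - 1` gaps between kernel taps."""
    span = (len(kernel) - 1) * dilation + 1  # input samples covered by one output
    return [
        sum(kernel[k] * signal[i + k * dilation] for k in range(len(kernel)))
        for i in range(len(signal) - span + 1)
    ]


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 layers with the given dilations."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


x = [1, 2, 3, 4, 5]
print(dilated_conv1d(x, [1, 1, 1], dilation=1))  # each output sees 3 neighbours
print(dilated_conv1d(x, [1, 1, 1], dilation=2))  # each output spans 5 samples
print(receptive_field(3, [1, 1, 1]), receptive_field(3, [1, 2, 4]))
```

The number of kernel weights is the same in both cases; only the spacing of the taps changes, which is why dilation widens the context seen around a thin object without adding parameters.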
Further, in some embodiments of the present invention, the voting the first result and the second result to output a final recognition result specifically includes:
step S410, respectively obtaining the mean value of the first result and the confidence coefficient of the second result;
and step S420, voting calculation is carried out according to the mean value and the confidence coefficient so as to output a final recognition result.
Specifically, the mean value of the first result is the mean of the first results over multiple frames in a time sequence, and the confidence of the second result is the final fishing behavior recognition confidence obtained by extracting the human fishing rod key point feature information from the multiple frames in the time sequence;
the voting rule can be expressed as:
wherein score_STGCN is the final fishing behavior recognition confidence obtained by the STGCN (Spatial Temporal Graph Convolutional Networks, a spatio-temporal graph convolutional network model) from multi-frame time-series data; score_imagemodel_i is the result of the i-th frame obtained in step S300; and λ is a weight parameter.
In a preferred embodiment of the present invention, the voting weight n of the mean value of the first result satisfies 0.4 < n < 0.6, and the voting weight m of the fishing behavior recognition confidence result satisfies 0.4 < m < 0.6, where m + n = 1.
Specifically, as a preferred embodiment of the present invention, the weight parameter λ is set to 0.5, which means that the detection and recognition results of the fishing behavior detection model and the human fishing rod key point detection model are weighted equally. The number of frames used in the multi-frame sequence is set to 5; 5 frames achieve a good recognition rate while also performing well in speed.
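One way to read the voting rule, given the symbols defined above, is a weighted sum of the STGCN confidence and the mean of the per-frame scores. The sketch below follows that reading. Which term carries λ and which carries 1 - λ is an assumption (the two coincide at the preferred value 0.5), and the function name and the decision threshold of 0.5 are also assumptions that the description does not state.

```python
def vote(frame_scores, stgcn_score, lam=0.5, threshold=0.5):
    """Combine per-frame classifier scores with the STGCN confidence.

    frame_scores: first results for each frame in the sequence (5 frames preferred).
    stgcn_score:  second result, the multi-frame key point confidence.
    lam:          weight parameter; assumed here to weight the STGCN term.
    threshold:    assumed decision threshold for reporting fishing behavior.
    """
    if not frame_scores:
        raise ValueError("at least one frame score is required")
    mean_score = sum(frame_scores) / len(frame_scores)
    final = lam * stgcn_score + (1.0 - lam) * mean_score
    return final, final >= threshold


score, is_fishing = vote([0.8, 0.6, 0.7, 0.9, 0.5], stgcn_score=0.9)
print(round(score, 3), is_fishing)
```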
Further, in an embodiment of the present invention, the acquiring of the image to be detected by the RGB-IR image sensor disposed in the scene of the monitored water area includes:
receiving a human body sensing signal, waking the dormant RGB-IR image sensor once the sensing is triggered, and capturing a real-time picture of the current field of view;
and returning the acquired real-time pictures to the monitoring center server.
The dormant RGB-IR image sensor is woken by the human body sensing signal to capture a real-time picture of the current field of view, which greatly saves data storage space and reduces the power consumption of the terminal RGB-IR image sensor.
On the other hand, another embodiment of the invention further provides a fishing detection device, and the fishing detection device corresponds to the fishing detection method in the above embodiment one to one. Specifically, as shown in fig. 6, the fishing detection apparatus includes: the system comprises an RGB-IR image acquisition module 10, a target detection module 20, a fishing behavior detection module 30, a human fishing rod key point detection module 40 and a voting module 50.
The RGB-IR image acquisition module 10 is configured to acquire an image to be detected in a monitored water area scene. The target detection module 20 is configured to extract a pedestrian frame and a fishing rod frame in the image to be detected, and then generate a target detection frame containing both a pedestrian and a fishing rod according to the intersection-over-union (IoU) threshold. The fishing behavior detection module 30 is configured to perform fishing behavior recognition on the target detection frame, and output a first result whose recognition result is "fishing behavior". The human fishing rod key point detection module 40 is configured to perform fishing behavior recognition on the target detection frame, and output a second result whose recognition result is "fishing behavior". The voting module 50 is configured to perform voting calculation according to the mean value of the first result and the confidence of the second result, and output a final recognition result.
Specifically, the target detection module 20 performs the following operations: first, detecting and recognizing pedestrians and fishing rods in the loaded image to be detected, and generating pedestrian frames and fishing rod frames respectively; then traversing the pedestrian frames on each frame of the image to be detected, matching each pedestrian frame against the fishing rod frames on the same frame, and determining the pedestrian frames associated with a fishing rod frame according to a preset intersection-over-union (IoU) threshold; and finally, generating a target detection frame containing both the pedestrian and the fishing rod according to the association result.
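The matching performed by the target detection module can be sketched as follows. Boxes are assumed to be `(x1, y1, x2, y2)` tuples. Pairing each pedestrian with its best-overlapping rod, building the target frame as the smallest box enclosing both, and the threshold value 0.1 are all assumptions for illustration; the description only specifies that association is decided by a preset IoU threshold.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def enclosing_box(a, b):
    """Smallest box containing both a and b (assumed form of the target frame)."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def build_target_frames(pedestrians, rods, iou_threshold=0.1):
    """Pair each pedestrian frame with its best-overlapping rod frame."""
    targets = []
    for ped in pedestrians:
        best = max(rods, key=lambda rod: iou(ped, rod), default=None)
        if best is not None and iou(ped, best) > iou_threshold:
            targets.append(enclosing_box(ped, best))
    return targets


peds = [(0, 0, 10, 20), (100, 100, 110, 120)]
rods = [(5, 5, 15, 15)]
print(build_target_frames(peds, rods))
```

Only the first pedestrian overlaps the rod, so a single target frame is produced; the second pedestrian is left unmatched and is not passed on to the behavior models.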
For the specific working principle of the fishing detection device based on RGB-IR image data, reference may be made to the working flow of the fishing detection method described above, and details thereof are not repeated here. The modules in the fishing detection device can be wholly or partially realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, or can be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
On the other hand, in an embodiment, the present invention further provides a computer device, which may be a server, and a schematic diagram of its internal structure is shown in fig. 7. The computer device includes a data processor, a memory, a network interface, and a database connected by a system bus. The computer device is provided with a plurality of data processors for providing computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The nonvolatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the nonvolatile storage medium. The database of the computer device is used to store data involved in image processing. The network interface of the computer device is used for communicating with an external terminal through a network connection.
The memory stores a computer program capable of running on the processor, and the processor executes the computer program to implement the fishing detection method described above.
In one embodiment, the present invention further provides a computer device, which may be a terminal, and the schematic internal structure of the computer device is shown in fig. 8. The computer device comprises a data processor, a memory, a network interface, a display screen and an input device which are connected through a system bus. Wherein the computer device is provided with a plurality of data processors for providing computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operating system and the computer program to run in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection.
Wherein the memory has stored therein a computer program operable on the processor, which computer program, when executed by the processor, implements the fishing detection method described above.
Specifically, the display screen of the computer device may be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer device may be a touch layer covered on the display screen, a key, a trackball or a touch pad arranged on a casing of the computer device, or an external keyboard, a touch pad or a mouse.
It will be appreciated by those skilled in the art that the configurations shown in fig. 7 and 8 are only block diagrams of some of the configurations relevant to the present application, and do not constitute a limitation on the computer apparatus to which the present application is applied, and a particular apparatus may include more or less components than those shown in the figures, or may combine certain components, or have a different arrangement of components.
In another aspect, the present invention further provides a computer-readable storage medium, which stores a computer program, and the computer program is executed by a processor to implement the fishing detection method.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
Illustratively, a computer program may be partitioned into one or more modules/units, which are stored in a memory and executed by a processor to implement the present invention. One or more modules/units may be a series of computer program instruction segments capable of performing certain functions, the instruction segments being used to describe the execution of a computer program in a computer device.
The computer device can be a desktop computer, a notebook, a palm computer, a cloud server and other computing devices. The computer device may include, but is not limited to, a processor, a memory. Those skilled in the art will appreciate that fig. 7 and 8 are merely examples of a computing device and are not intended to limit the computing device and that a computing device may include more or fewer components than those shown, or some of the components may be combined, or different components, e.g., the computing device may also include input output devices, network access devices, buses, etc.
The processor may be a Central Processing Unit (CPU), another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, etc. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory may be an internal storage unit of the computer device, such as a hard disk or internal memory of the computer device. The memory may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash memory card provided on the computer device. Further, the memory may include both internal and external storage units of the computer device. The memory is used for storing the computer program and other programs and data required by the computer device. The memory may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium; when the computer program is executed by a processor, the steps of the method embodiments are implemented. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be appropriately increased or decreased as required by legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, in accordance with legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunications signals.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.
Claims (10)
1. A fishing detection method, comprising:
obtaining a target detection frame, wherein the target detection frame comprises a pedestrian and a fishing rod;
performing fishing behavior recognition on the target detection frame through a fishing behavior detection model, and outputting a first result of which the recognition result is 'fishing behavior';
carrying out fishing behavior recognition on the target detection frame through a human fishing rod key point detection model, and outputting a second result of which the recognition result is 'fishing behavior';
and voting the first result and the second result to output a final recognition result.
2. A fishing detection method according to claim 1, wherein said acquiring a target detection frame includes:
extracting a pedestrian frame and a fishing rod frame in an image to be detected;
respectively matching the fishing rod frames with pedestrian frames in the same image to be detected;
if the intersection ratio obtained by matching is larger than a preset intersection ratio threshold value, determining that the pedestrian in the pedestrian frame is associated with the fishing rod in the fishing rod frame;
and generating a target detection frame according to the pedestrian and the fishing rod.
3. A fishing detection method according to claim 2, wherein a pedestrian frame and a fishing rod frame in the image to be detected are extracted using a pre-trained target detection model, and before the extraction of the pedestrian frame and the fishing rod frame in the image to be detected, the method further comprises:
acquiring preprocessed sample data;
inputting the preprocessed sample data into a preset initial target detection model to obtain an output result;
adjusting sample class weights in the initial target detection model according to a preset focal loss function and the output result to obtain a parameter-adjusted target detection model;
and training the parameter-adjusted target detection model through a batch stochastic gradient descent algorithm to obtain a pre-trained target detection model.
4. A fishing detection method according to claim 3, wherein said acquiring preprocessed sample data comprises:
acquiring a first original image and a second original image in sample data;
and performing mixed enhancement processing on the first original image and the second original image according to the mixed weight to obtain enhanced sample data.
5. A fishing detection method according to claim 3 or 4, wherein before performing fishing behavior recognition on the target detection frame by the fishing behavior detection model and outputting a first result that a recognition result is "fishing behavior", the method further comprises:
extracting a rod frame in the preprocessed sample data, wherein the rod frame comprises a sample pedestrian and a sample fishing rod;
training a preset classification model according to the rod frame and the batch stochastic gradient descent algorithm;
and if the training result obtained through training reaches a preset training threshold value, taking the trained classification model as a fishing behavior detection model.
6. A fishing detection method according to claim 5, wherein before the fishing behavior recognition is performed on the target detection frame by the human fishing rod key point detection model and the second result that the recognition result is "fishing behavior" is output, the method further comprises:
carrying out key point feature marking on the fishing rod frame to obtain a human body key point corresponding to a sample pedestrian and a fishing rod key point corresponding to a sample fishing rod;
and training a preset segmentation model according to the human body key points, the fishing rod key points and the batch stochastic gradient descent algorithm, and taking the trained segmentation model as the human body fishing rod key point detection model.
7. A fishing detection method as claimed in claim 1, wherein said voting on said first and second results to output a final recognition result comprises:
respectively acquiring the mean value of the first result and the confidence coefficient of the second result;
and voting calculation is carried out according to the mean value and the confidence coefficient so as to output a final recognition result.
8. A fishing detection device, comprising:
the RGB-IR image acquisition module is used for acquiring an image to be detected in a scene of a monitoring water area;
the target detection module is used for extracting a pedestrian frame and a fishing rod frame in the image to be detected and generating a target detection frame simultaneously comprising the pedestrian and the fishing rod according to the intersection ratio threshold;
the fishing behavior detection module is used for carrying out fishing behavior identification on the target detection frame and outputting a first result of which the identification result is 'fishing behavior';
the human fishing rod key point detection module is used for carrying out fishing behavior recognition on the target detection frame and outputting a second result of which the recognition result is 'fishing behavior';
and the voting module is used for voting according to the mean value of the first result and the confidence coefficient of the second result and outputting a final recognition result.
9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the fishing detection method of any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements a fishing detection method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110250476.0A CN115100732A (en) | 2021-03-08 | 2021-03-08 | Fishing detection method and device, computer equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110250476.0A CN115100732A (en) | 2021-03-08 | 2021-03-08 | Fishing detection method and device, computer equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115100732A (en) | 2022-09-23 |
Family
ID=83287972
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110250476.0A Pending CN115100732A (en) | 2021-03-08 | 2021-03-08 | Fishing detection method and device, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115100732A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115497030A (en) * | 2022-10-27 | 2022-12-20 | 中国水利水电科学研究院 | Fishing behavior identification method based on deep learning |
CN115410280A (en) * | 2022-11-03 | 2022-11-29 | 合肥中科类脑智能技术有限公司 | Fishing behavior detection method based on human body orientation judgment |
CN115497172A (en) * | 2022-11-18 | 2022-12-20 | 合肥中科类脑智能技术有限公司 | Fishing behavior detection method and device, edge processing equipment and storage medium |
CN115497172B (en) * | 2022-11-18 | 2023-02-17 | 合肥中科类脑智能技术有限公司 | Fishing behavior detection method and device, edge processing equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111178183B (en) | Face detection method and related device | |
CN115100732A (en) | Fishing detection method and device, computer equipment and storage medium | |
CN114424253A (en) | Model training method and device, storage medium and electronic equipment | |
US11676390B2 (en) | Machine-learning model, methods and systems for removal of unwanted people from photographs | |
TW202038191A (en) | Method, device and electronic equipment for living detection and storage medium thereof | |
CN107832677A (en) | Face identification method and system based on In vivo detection | |
CN111354024B (en) | Behavior prediction method of key target, AI server and storage medium | |
CN109800682B (en) | Driver attribute identification method and related product | |
CN109299658B (en) | Face detection method, face image rendering device and storage medium | |
CN109714526B (en) | Intelligent camera and control system | |
CN112418195B (en) | Face key point detection method and device, electronic equipment and storage medium | |
CN111899470B (en) | Human body falling detection method, device, equipment and storage medium | |
CN108647671A (en) | A kind of optical indicia visual identity method and the self-service cabinet based on this method | |
CN109815813A (en) | Image processing method and Related product | |
CN112836625A (en) | Face living body detection method and device and electronic equipment | |
CN112347526A (en) | Information security protection method and device based on anti-shooting screen, electronic equipment and medium | |
CN105469054A (en) | Model construction method of normal behaviors and detection method of abnormal behaviors | |
CN114359618A (en) | Training method of neural network model, electronic equipment and computer program product | |
CN110443179B (en) | Off-post detection method and device and storage medium | |
CN116863286A (en) | Double-flow target detection method and model building method thereof | |
CN114387496A (en) | Target detection method and electronic equipment | |
CN113869115A (en) | Method and system for processing face image | |
CN117576616A (en) | Deep learning-based fish swimming behavior early warning method, system and device | |
WO2024093296A1 (en) | Wake-up method and apparatus | |
CN113570615A (en) | Image processing method based on deep learning, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||