CN116052064A - Method and device for identifying feeding strength of fish shoal, electronic equipment and bait casting machine - Google Patents
- Publication number
- CN116052064A (application CN202310343704.8A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K61/00—Culture of aquatic animals
- A01K61/80—Feeding devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/24—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A40/00—Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
- Y02A40/80—Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in fisheries management
- Y02A40/81—Aquaculture, e.g. of fish
Abstract
The invention provides a method and device for identifying the feeding intensity of a fish school, an electronic device, and a bait casting machine, belonging to the technical field of fishery. The method comprises: determining a spectrogram of the sound signal generated while the fish school feeds; and inputting the spectrogram into a fish-school feeding intensity recognition model to obtain the feeding intensity it outputs. The model determines the feeding intensity based on acoustic features extracted from the spectrogram, and is trained on spectrogram samples of fish-school feeding sound signals together with the corresponding feeding intensity labels. The invention effectively improves the accuracy of feeding intensity recognition and achieves high-accuracy recognition even in complex underwater scenes such as weak light and turbid water.
Description
Technical Field
The invention relates to the technical field of fishery, in particular to a method and a device for identifying the feeding strength of a fish school, electronic equipment and a bait casting machine.
Background
In aquaculture, real-time detection and monitoring of changes in the feeding intensity of the fish school in the culture water is an important basis for formulating a scientific feeding strategy: it can effectively reduce bait waste and achieve a win-win of economic and ecological benefit. In recent years, combining deep learning algorithms with the analysis of fish-school feeding behavior has improved recognition accuracy, and the behavioral characteristics of various fish schools have been used to grade feeding intensity.
However, most current methods based on traditional machine vision depend on the quality of the captured feeding images and face limitations in practice: turbid water and poor lighting reduce image contrast and sharpness; rapid swimming of the fish school causes focus blur and ghosting; and bait causes occlusion and misjudgment. These problems make detection of feeding activity in aquaculture difficult, so feeding intensity recognition is inefficient and imprecise, and traditional machine vision methods cannot be applied at all in turbid-water scenes.
Disclosure of Invention
The invention provides a method, a device, an electronic device and a bait casting machine for identifying the feeding intensity of a fish school, which address the defects of the prior art that feeding intensity recognition is inefficient and imprecise, and in particular that traditional machine vision methods cannot be applied in turbid-water scenes.
The invention provides a method for identifying the feeding intensity of a fish school, comprising the following steps:
determining a spectrogram of the sound signal generated while the fish school feeds;
inputting the spectrogram into a fish-school feeding intensity recognition model and obtaining the feeding intensity output by the model;
wherein the model determines the feeding intensity based on acoustic features extracted from the spectrogram, and is trained on spectrogram samples of fish-school feeding sound signals and the corresponding feeding intensity labels.
According to the method provided by the invention, the fish-school feeding intensity recognition model comprises a first convolution layer, three feature processing layers, a second convolution layer and a global pooling layer connected in sequence;
the first convolution layer encodes the spectrogram to obtain a first feature map;
the three feature processing layers perform feature extraction on the first feature map to obtain a second feature map;
the second convolution layer reduces the dimension of the second feature map to obtain a third feature map;
and the global pooling layer performs global average pooling on the third feature map and outputs the feeding intensity of the fish school.
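To make the data flow concrete, the layer stack above can be sketched as a shape calculation: each downsampling stage halves the spatial dimensions of the spectrogram, the second convolution layer reduces the channel count to the number of intensity classes, and global average pooling collapses the spatial dimensions into one score per class. The stage counts and input size below are illustrative assumptions, not figures taken from the patent:

```python
import numpy as np

def downsample_shape(shape, stages):
    """Each stride-2 downsampling stage halves the spatial dimensions."""
    h, w = shape
    for _ in range(stages):
        h, w = h // 2, w // 2
    return h, w

# Illustrative input: a 128 x 256 spectrogram (frequency bins x time frames).
h, w = downsample_shape((128, 256), 2)   # first convolution layer: two conv-downsampling stages
h, w = downsample_shape((h, w), 3)       # three feature processing layers, one downsample each
print((h, w))                            # spatial size of the third feature map

# The second convolution layer reduces channels to the 4 intensity classes,
# then global average pooling yields one score per class.
third_feature_map = np.random.rand(4, h, w)
scores = third_feature_map.mean(axis=(1, 2))
print(scores.shape)
```

With these assumed sizes, five halvings shrink the 128×256 map to 4×8 before pooling; the key point is only that the per-class scores come out of averaging each channel of the final feature map.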
According to the method provided by the invention, the first convolution layer comprises at least one convolution downsampling layer connected in sequence; each convolution downsampling layer convolves its input data and downsamples the convolved result, and the first feature map is the data output by the last convolution downsampling layer;
each of the three feature processing layers comprises a mobile network layer and a mobile vision Transformer (MobileViT) module connected in sequence;
the mobile network layer downsamples its input data; the mobile vision Transformer module extracts local features and then global features from the output of the mobile network layer; and the second feature map is the data output by the mobile vision Transformer module in the last feature processing layer.
According to the method provided by the invention, the three feature processing layers comprise at least one improved mobile vision Transformer module;
the improved mobile vision Transformer module comprises a local feature extraction module, a global feature extraction module and an output module connected in sequence;
the local feature extraction module adopts a shifted patch tokenization module or a one-dimensional convolution module, and performs dimension reduction and local feature extraction on the fourth feature map to obtain a fifth feature map;
the global feature extraction module performs global feature extraction on the fifth feature map to obtain a sixth feature map;
the output module adopts the shifted patch tokenization module or the one-dimensional convolution module to raise the dimension of the sixth feature map, obtaining a seventh feature map; the seventh feature map comprises the second feature map and has the same dimensions as the fourth feature map;
the global feature extraction module comprises at least one improved Transformer module, obtained by adding a local self-attention module to a Transformer module and/or adding an enhanced residual connection, so as to strengthen attention to the audio features in the input and/or increase the diversity of the input features.
According to the method provided by the invention, the mobile network layer comprises two MV2 (MobileNetV2 inverted residual) modules and a downsampling layer connected in sequence; the first convolution layer comprises two convolution downsampling layers connected in sequence, the convolution layers therein being three-dimensional convolution layers, and the second convolution layer being a one-dimensional convolution layer.
According to the method provided by the invention, after the spectrogram is input into the model and the feeding intensity of the fish school is obtained, the method further comprises:
acquiring the feeding intensity recognition results at the moments immediately before and after the moment corresponding to the current feeding intensity;
comparing the current feeding intensity pairwise with those neighbouring recognition results;
and, when the current feeding intensity differs from the recognition results at both neighbouring moments while those two results are the same as each other, correcting the current feeding intensity to the neighbouring result.
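The correction rule above amounts to a single-step temporal majority filter over the sequence of per-window predictions; a minimal sketch (function and variable names are mine, not from the patent):

```python
def correct_intensity(prev, cur, nxt):
    """If the predictions just before and just after agree with each other
    but disagree with the current one, treat the current prediction as an
    outlier and replace it with the neighbours' value."""
    if prev == nxt and cur != prev:
        return prev
    return cur

def smooth(sequence):
    """Apply the correction rule across a whole sequence of predictions."""
    out = list(sequence)
    for i in range(1, len(sequence) - 1):
        out[i] = correct_intensity(sequence[i - 1], sequence[i], sequence[i + 1])
    return out

# An isolated "strong" between two "weak" windows is corrected to "weak".
print(smooth(["weak", "strong", "weak", "medium"]))
```

Note that a prediction is only changed when both neighbours agree; a genuine transition (e.g. weak → medium) is left untouched.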
According to the method provided by the invention, before the spectrogram is input into the model, the method further comprises the following training steps:
taking each spectrogram sample of a fish-school feeding sound signal together with its corresponding feeding intensity label as one training sample, obtaining multiple groups of training samples;
for any group of training samples, inputting the sample into the recognition model and outputting the corresponding prediction probability;
calculating a loss value from the prediction probability and the corresponding feeding intensity label using a preset loss function;
adjusting the model parameters based on the loss value until the loss value is smaller than a preset threshold or the number of training iterations reaches a preset number;
and taking the model parameters obtained at that point as the parameters of the trained fish-school feeding intensity recognition model.
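The stopping criterion described above can be sketched as follows. The loss here is the usual cross-entropy between the predicted class probabilities and the label, and the model itself is stubbed out; every name, threshold and probability value is illustrative, not the patent's actual network or settings:

```python
import numpy as np

def cross_entropy(pred_probs, label):
    """Negative log-likelihood of the true class."""
    return -np.log(pred_probs[label] + 1e-12)

def train(samples, labels, predict, update, threshold=0.05, max_iters=100):
    """Iterate until the mean loss falls below the threshold or the
    iteration budget is exhausted, then return the final loss."""
    loss = float("inf")
    for _ in range(max_iters):
        losses = [cross_entropy(predict(x), y) for x, y in zip(samples, labels)]
        loss = float(np.mean(losses))
        if loss < threshold:
            break
        update(loss)  # a gradient step in a real implementation
    return loss

# Stub "model": always assigns probability 0.99 to the true class (index 2),
# so training stops on the first iteration.
loss = train([0, 1], [2, 2],
             predict=lambda x: np.array([0.005, 0.005, 0.99, 0.0]),
             update=lambda l: None)
print(round(loss, 4))
```

The two exit conditions mirror the patent's wording: either the loss drops below the preset threshold or the preset number of iterations is reached.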
The invention also provides a device for identifying the feeding intensity of a fish school, comprising:
a processing module for determining a spectrogram of the sound signal generated while the fish school feeds;
and a recognition module for inputting the spectrogram into a fish-school feeding intensity recognition model and obtaining the feeding intensity output by the model;
wherein the model determines the feeding intensity based on acoustic features extracted from the spectrogram, and is trained on spectrogram samples of fish-school feeding sound signals and the corresponding feeding intensity labels.
The invention also provides an electronic device comprising a memory, a communication interface, a processor, and a computer program stored in the memory and runnable on the processor; when executing the program, the processor implements the method for identifying the feeding intensity of a fish school described above.
The invention also provides a bait casting machine, comprising:
the bait box, the suction device, the bait blowing device, the bait channel and the power frequency converter;
the bait box is used for containing bait;
the bottom discharge hole of the bait box is connected with the material sucking device, the material sucking device is connected with one end of the bait channel, and the bait blowing device is connected with the other end of the bait channel;
the sucking device is used for sucking the baits in the bait box into the bait channel; the bait blowing device is used for blowing out the bait in the bait channel for feeding;
one end of the power frequency converter is respectively connected with the material sucking device and the bait blowing device, and the other end of the power frequency converter is connected with the communication interface in the electronic equipment and is used for receiving the identification result of the intake intensity of the shoal of fish output by the electronic equipment, controlling the start and stop of the material sucking device and the material sucking speed according to the identification result of the intake intensity of the shoal of fish and controlling the start and stop of the bait blowing device and the bait blowing speed.
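The control relationship can be sketched as a mapping from the recognized intensity grade to suction and blowing commands; the specific rates below are hypothetical placeholders, since the patent does not state numeric values:

```python
# Hypothetical feed rates (fraction of maximum converter speed) per grade.
FEED_RATE = {"none": 0.0, "weak": 0.3, "medium": 0.6, "strong": 1.0}

def converter_command(intensity):
    """Translate a recognized feeding intensity into start/stop and speed
    commands for the suction device and the bait blowing device."""
    rate = FEED_RATE[intensity]
    running = rate > 0.0
    return {"suction_on": running, "blowing_on": running,
            "suction_speed": rate, "blowing_speed": rate}

print(converter_command("strong"))
print(converter_command("none"))
```

The design choice matches the text: both devices start and stop together (bait must be drawn in before it can be blown out), with speed scaled by the recognized intensity.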
The invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which when executed by a processor implements a method of identifying the ingestion intensity of a fish population as described in any of the above.
The invention also provides a computer program product comprising a computer program which when executed by a processor implements a method of identifying the ingestion intensity of a fish population as described in any of the above.
The method, device, electronic device and bait casting machine provided by the invention exploit the association between the acoustic signals generated while a fish school feeds and its feeding intensity. Audio of the feeding fish school is collected, and a neural network model is trained on spectrogram samples of the feeding sound signals and the corresponding feeding intensity labels to obtain the recognition model. Recognizing feeding intensity from acoustic features extracted from the spectrogram effectively improves the accuracy of recognition, and achieves high-precision recognition even in complex underwater scenes such as weak light and turbid water.
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of the method for identifying the feeding intensity of a fish school provided by the invention;
FIG. 2 is a first schematic diagram of the model structure in the method provided by the invention;
FIG. 3 is a second schematic diagram of the model structure in the method provided by the invention;
FIG. 4 is a third schematic diagram of the model structure in the method provided by the invention;
FIG. 5 is a fourth schematic diagram of the model structure in the method provided by the invention;
FIG. 6 is a fifth schematic diagram of the model structure in the method provided by the invention;
FIG. 7 is a sixth schematic diagram of the model structure in the method provided by the invention;
FIG. 8 is a schematic structural diagram of the fish-school feeding intensity recognition device provided by the invention;
FIG. 9 is a schematic diagram of the physical structure of an electronic device provided by the invention;
FIG. 10 is a schematic structural diagram of the bait casting machine provided by the invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In the description of the invention, it should be noted that, unless explicitly stated and limited otherwise, the terms "mounted," "connected," and "connected" are to be construed broadly, and may be, for example, fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; can be directly connected or indirectly connected through an intermediate medium, and can be communication between two elements. The specific meaning of the above terms in the present invention can be understood by those of ordinary skill in the art according to the specific circumstances.
The method, the device, the electronic equipment and the bait casting machine for identifying the feeding strength of the fish school according to the invention are described below with reference to fig. 1 to 10.
Fig. 1 is a schematic flow chart of a method for identifying the feeding strength of a fish school, which is provided by the invention, and as shown in fig. 1, the method comprises the following steps: step 110 and step 120.
Although traditional machine-vision methods have quantified fish-school feeding behavior, in practice turbid water and poor light reduce image contrast and sharpness, rapid swimming of the fish causes focus blur and ghosting, and bait causes occlusion and misjudgment, so their recognition performance is poor. The acoustic characteristics of feeding, by contrast, are not affected by these problems. Acoustically, different fish behaviors such as feeding and reproduction are accompanied by different sound-producing behaviors: during feeding, actions such as rapid tail flapping, collision, biting and swallowing change the acoustic signal and thereby carry information about the feeding intensity of the fish school.
Specifically, the spectrogram described in the embodiments of the invention is obtained by converting the sound signal collected while the fish school feeds.
In an embodiment of the invention, an underwater acoustic transducer (hydrophone) can be arranged in a recirculating culture pond to collect the sound signals generated while the fish feed. The recorded underwater sound passes through a preamplifier, a filter and an A/D converter, is converted to WAV format, and is stored in a data recording unit. The stored raw sound data can then be framed, windowed and short-time Fourier transformed to extract the corresponding time-frequency feature map, and a frequency-domain transformation of that map yields the corresponding spectrogram.
In some embodiments, to accurately reflect the temporal continuity of the feeding behavior, the collected feeding sound signal can be segmented with a sliding window, for example a window of 3 seconds advanced in steps of 0.1 seconds, yielding many audio files (for example 24000). For each audio file, the Librosa speech-processing toolkit extracts the time-frequency features through framing, Hamming windowing and short-time Fourier transform, and converts them into the corresponding spectrogram.
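The sliding-window segmentation amounts to the arithmetic below; the sample rate is an illustrative assumption (the patent does not state one), and it is this window/step geometry that produces the many overlapping clips described above:

```python
import numpy as np

def segment(signal, sr, win_s=3.0, hop_s=0.1):
    """Cut a 1-D sound signal into overlapping windows of win_s seconds,
    advancing by hop_s seconds each time."""
    win, hop = int(win_s * sr), int(hop_s * sr)
    return [signal[i:i + win] for i in range(0, len(signal) - win + 1, hop)]

sr = 8000                       # illustrative sample rate, samples per second
signal = np.zeros(10 * sr)      # 10 seconds of (silent) placeholder audio
clips = segment(signal, sr)
print(len(clips))               # (10 - 3) / 0.1 + 1 = 71 overlapping clips
print(len(clips[0]) / sr)       # each clip is 3.0 seconds long
```

Each clip would then be passed through a short-time Fourier transform (e.g. `librosa.stft` with a Hamming window) to obtain its time-frequency representation, as the embodiment describes.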
the fish swarm feeding intensity recognition model is used for determining the fish swarm feeding intensity based on acoustic features obtained by feature extraction of the spectrogram; the fish swarm feeding intensity recognition model is obtained through training according to a spectrogram sample of a fish swarm feeding sound signal and a corresponding fish swarm feeding intensity label.
Specifically, the feeding intensity of the fish shoal described in the embodiment of the invention can be subdivided into four levels, which in turn comprise the categories "none", "weak", "medium" and "strong". Wherein, a feeding intensity of "none" means that the fish shoal moves slowly and the frequency spectrum changes are relatively stable; "weak" means that a small number of fish feed and the spectral energy is concentrated in low-frequency fluctuations; "medium" means that the fish swim back and forth to feed and there are small interruptions in the energy spectrum; "strong" means that the fish shoal feeds fiercely, the spectral energy is high, and high-frequency changes are obvious.
In the embodiment of the invention, the fish-school feeding intensity is determined according to the spectrogram of the input fish-school feeding sound signal, and different spectrograms can correspond to different fish-school feeding intensities.
The fish swarm feeding intensity recognition model is obtained through training according to the spectrogram sample of the fish swarm feeding sound signal and the corresponding fish swarm feeding intensity label and is used for learning internal relations between different fish swarm feeding intensities and acoustic signals generated in the fish swarm feeding process, and recognition of the fish swarm feeding intensity is carried out on acoustic features obtained by extracting the spectrogram of the fish swarm feeding sound signal, so that a high-precision fish swarm feeding intensity recognition result is output.
In the embodiment of the invention, the fish swarm feeding intensity recognition model can be constructed based on a deep neural network. The deep neural network may specifically be a Mobile Vision Transformer (Mobile ViT) model, another ViT model, or another deep neural network for identifying the feeding intensity of a fish shoal, which is not specifically limited in the present invention.
In the embodiment of the invention, the model training samples are composed of a plurality of groups of fish swarm ingestion sound signal spectrogram samples carrying fish swarm ingestion intensity labels.
In the embodiment of the invention, the fish school feeding intensity label is predetermined according to the fish school feeding sound signal spectrogram sample and corresponds to the fish school feeding sound signal spectrogram sample one by one. That is, each fish group ingestion sound signal spectrogram sample in the training samples is preset to carry a corresponding fish group ingestion intensity label.
It will be appreciated that the fish school feeding strength tag may be subdivided into four levels of tags, including in order "none", "weak", "medium" and "strong".
Further, training is performed by using the spectrogram sample of the fish-swarm feeding sound signal and the corresponding fish-swarm feeding intensity label to obtain a fish-swarm feeding intensity recognition model, and after inputting the spectrogram of the fish-swarm feeding sound signal obtained in step 110 into the fish-swarm feeding intensity recognition model, the fish-swarm feeding intensity corresponding to the spectrogram can be obtained.
According to the method for identifying the feeding intensity of a fish shoal provided by the embodiment of the invention, the association between the acoustic signals generated during feeding and the feeding intensity is considered. The feeding audio information of the fish shoal is obtained, and a neural network model is trained with spectrogram samples of the fish-shoal feeding sound signals and the corresponding feeding intensity labels to obtain the fish-shoal feeding intensity recognition model. The model then identifies the feeding intensity from the acoustic features extracted from the spectrogram of the feeding audio information. This can effectively improve the accuracy and effect of feeding intensity recognition, and achieves high-precision recognition even in complex water scenes such as weak light and turbid water.
Fig. 2 is a schematic diagram of a model structure in the method for identifying the feeding strength of a fish school, as shown in fig. 2, in an embodiment of the present invention, a model for identifying the feeding strength of a fish school may be constructed based on a deep neural network model for identifying sound signals, and includes a first convolution layer 1, three feature processing layers 2, a second convolution layer 3, and a global pooling layer 4, which are sequentially connected. The first convolution layer 1 is used for encoding a spectrogram of a fish swarm feeding sound signal to obtain a first characteristic diagram; the three-layer feature processing layer 2 is used for carrying out feature extraction on the first feature map to obtain a second feature map; the second convolution layer 3 is used for reducing the dimension of the second feature map to obtain a third feature map; the global pooling layer 4 is used for carrying out global average pooling treatment on the third feature map and outputting the feeding intensity of the fish school.
Fig. 3 is a second schematic diagram of a model structure in the method for identifying the feeding intensity of a fish shoal. As shown in Fig. 3 (a), based on the content of the foregoing embodiment, the fish-shoal feeding intensity recognition model may specifically be constructed based on a Mobile Vision Transformer V2 (Mobile ViT V2) model, where the first convolution layer 1 includes at least one sequentially connected convolution downsampling layer 11, configured to convolve input data and downsample the convolution result; the first feature map is the data output by the last convolution downsampling layer.
Specifically, the convolution downsampling layer 11 includes a convolution layer 111 and a downsampling layer 112 connected in sequence; the second convolution layer and the first convolution layer are convolution operations of different dimensions. Wherein, the convolution layer 111 is used for carrying out convolution operation on the input feature diagram; the downsampling layer 112 is used for downsampling the feature map input thereto.
With continued reference to fig. 3, as shown in fig. 3 (b), each of the three feature processing layers 2 includes a Mobile network layer 21 and a Mobile ViT module 22 that are sequentially connected, where the Mobile network layer 21 is configured to perform downsampling processing on input data; the Mobile ViT module 22 is configured to sequentially extract local features and global features from output data of the Mobile network layer; the second feature map is data output by the Mobile ViT module in the last feature processing layer.
The mobile network layer 21 includes at least one MV2 module 211 (i.e., the number L of MV2 modules satisfies L ≥ 1) and a downsampling layer 212, which are sequentially connected. The downsampling layer 112 and the downsampling layer 212 are each used to downsample the feature map input to them.
Specifically, in the embodiment of the invention, the fish feeding intensity recognition model can be constructed based on the network structure of the Mobile ViT V2 model. The Mobile ViT V2 model has a very light structure and low latency, is suitable for mobile deployment in fish feeding intensity recognition scenes, and combines the advantages of a convolutional neural network (Convolutional Neural Network, CNN) and a Vision Transformer (ViT).
Wherein, the CNN part is embodied in the MV2 (MobileNetV2) module in the mobile network layer. The MV2 module is an inverted residual structure: the dimension is first increased through a 1×1 convolution (Conv1×1) to expand the data, feature extraction is then performed through a 3×3 depthwise convolution (Conv3×3), and the dimension is finally reduced through another 1×1 convolution to compress the data. In this way, the information remains more complete when the features pass through the activation function.
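The expand–depthwise–project channel flow just described can be traced with a small sketch (the spatial size, channel counts and the expand ratio of 4 are illustrative assumptions, not values from the patent):

```python
def mv2_channel_flow(h, w, c_in, c_out, expand_ratio=4):
    """Trace tensor shapes through an MV2 inverted-residual block:
    a 1x1 conv expands channels, a 3x3 depthwise conv keeps them
    unchanged, and a final 1x1 conv projects back down (the linear
    bottleneck that preserves information through the activation)."""
    c_mid = c_in * expand_ratio
    return [
        ("input", (h, w, c_in)),
        ("conv1x1_expand", (h, w, c_mid)),
        ("dwconv3x3", (h, w, c_mid)),      # depthwise: one filter per channel
        ("conv1x1_project", (h, w, c_out)),
    ]

flow = mv2_channel_flow(32, 32, 16, 24)
```

The "inverted" naming reflects that channels are wide in the middle of the block and narrow at both ends, the opposite of a classic residual bottleneck.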
The core of the Mobile ViT module is the Transformer module, which allows the model to attend to global spatial information. The Transformer module is formed by stacking encoder blocks, each of which uses a Multi-Head Attention (MHA) mechanism; analogous to the action of multiple convolution kernels in a CNN, the multi-head attention mechanism of the Transformer helps the model obtain richer features.
The core of the Transformer module is the attention mechanism, which can be expressed by the following formula:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$

wherein Q represents the query (Query); K represents the key (Key); V represents the information extracted from the feature (Value); $K^{T}$ represents the transpose of K; and $d_k$ represents the length of the feature vector. The dot product of Q and K is calculated as the similarity between them: the higher the similarity, the larger the weight given to the associated V. As the depth of the network grows, the number of patches decreases and the perceptive range of each patch expands.
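The attention computation above can be sketched in a few lines of NumPy (the matrix shapes are illustrative):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V: dot products measure query-key
    similarity; a higher similarity gives the matching row of V a
    larger weight in the output."""
    d_k = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))
    return weights @ V, weights

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(5, 8)) for _ in range(3))
out, w = attention(Q, K, V)
```

Each row of `w` is a probability distribution over the keys, which is what lets the model mix information globally across patches.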
According to the method provided by the embodiment of the invention, the fish feeding intensity recognition model is constructed by adopting the structure of the Mobile ViT V2 model. By utilizing the advantage of the Mobile ViT module in extracting global spatial information, accurate recognition in the fish feeding behavior intensity classification task can be realized; the overall effect is better than that of a pure convolutional network, and the recognition accuracy is high.
Further, in the embodiment of the invention, after a spectrogram of a fish-swarm ingestion sound signal is obtained, the spectrogram is input into a fish-swarm ingestion intensity recognition model constructed based on a deep neural network, the spectrogram sequentially passes through a first convolution layer, a three-layer feature processing layer, a second convolution layer and a global pooling layer, acoustic features of the spectrogram are extracted, different features are learned, and finally a prediction result is output to obtain corresponding fish-swarm ingestion intensity, so that recognition of the fish-swarm ingestion intensity is completed.
According to the method provided by the embodiment of the invention, the association of the acoustic signals and the ingestion intensity generated in the fish swarm ingestion process is considered, the spectrogram of the fish swarm ingestion acoustic signals is encoded, the acoustic characteristics are extracted and identified by using the deep neural network, the effective identification of the ingestion behavior intensity of the fish under complex water scenes such as weak light, turbid water quality and the like can be realized, and the identification precision is high.
Based on the foregoing embodiments, as an alternative embodiment, the three feature processing layers include at least one improved mobile vision Transformer module;

the improved mobile vision Transformer module comprises a local feature extraction module, a global feature extraction module and an output module which are sequentially connected;

the local feature extraction module adopts a shifted patch tokenization module or a one-dimensional convolution module and is used for performing dimension reduction and local feature extraction on the fourth feature map to obtain a fifth feature map;

the global feature extraction module is used for performing global feature extraction on the fifth feature map to obtain a sixth feature map;

the output module adopts a shifted patch tokenization module or a one-dimensional convolution module and is used for raising the dimension of the sixth feature map to obtain a seventh feature map; the seventh feature map comprises the second feature map, and the seventh feature map has the same dimension as the fourth feature map;

the global feature extraction module comprises at least one improved Transformer module, wherein the improved Transformer module is obtained by adding a locality self-attention module and/or an enhanced residual connection to the Transformer module, and is used for enhancing the attention to audio features in the input features and/or increasing the diversity of the input features.
Specifically, based on the foregoing embodiments, each feature processing layer includes a Mobile ViT module. Optionally, in this embodiment, the Mobile ViT modules in the three feature processing layers may be improved so that the three feature processing layers include at least one improved Mobile ViT module; that is, only one of the feature processing layers adopts the improved Mobile ViT module, or two of them adopt the improved Mobile ViT module, or all three adopt the improved Mobile ViT module.
It should be noted that, to use the attention mechanism of the Transformer in the global feature extraction module, the high-dimensional feature vector must be converted into a low-dimensional feature vector. Therefore, the local feature extraction module needs to convert the input feature vector into a low-dimensional vector.
In an embodiment of the present invention, the improved Mobile ViT module includes a local feature extraction module, a global feature extraction module, and an output module connected in sequence.
The local feature extraction module may adopt a Shifted Patch Tokenization (SPT) module or a one-dimensional convolution module, which is used for reducing the dimension of the input features and extracting a local feature representation.
It can be understood that each of the three feature processing layers includes a Mobile network layer and an improved Mobile ViT module that are sequentially connected, and the fourth feature map may be output data of the Mobile network layer in the first feature processing layer, output data of the Mobile network layer in the second feature processing layer, or output data of the Mobile network layer in the third feature processing layer. When the fourth feature map is output data of the mobile network layer in the third feature processing layer, the seventh feature map output by the output module in the third feature processing layer is the second feature map.
In embodiments of the invention, fish feeding is a series of sequential actions, and the acoustics of different periods exhibit different feeding intensities. Thus, a sliding window of length 3 s with a step size of 0.1 s, for example, may be used to intercept the acoustic signal and convert it into a corresponding spectrogram. Since the Vision Transformer has some drawbacks in handling small data sets, SPT can be introduced. The SPT module partitions the input into patches of equal size, and each patch is then mapped linearly to a one-dimensional vector. By using the SPT module, spatial relationships between adjacent pixels can be incorporated into each visual token more effectively.
Fig. 4 is a third schematic diagram of a model structure in the method for identifying the feeding intensity of a fish shoal according to the present invention; as shown in Fig. 4, it describes the processing flow of the SPT module. When the acoustic feature map obtained by feature extraction from the spectrogram of the fish-shoal feeding sound signal is input to the SPT module, the feature map is spatially shifted along its diagonals to obtain feature maps with different spatial displacements, which are then concatenated with the original acoustic feature map, after which patch partition (Patch Partition) is performed. Finally, token merging is carried out, in which the partitioned patches are flattened, normalized and linearly projected. The entire process can be expressed as follows:
$$\mathbf{t} = \mathrm{LN}\big(\mathrm{Flatten}\big(\mathrm{Partition}\big([\,\mathbf{x};\ \mathbf{s}_1;\ \dots;\ \mathbf{s}_{N_s}\,]\big)\big)\big)\,\mathbf{E}$$

wherein $\mathbf{s}_i$ represents the i-th shifted feature map obtained by spatial shifting, $\mathbf{E}$ represents a learnable linear projection, $N_s$ represents the number of spatially shifted feature maps, LN represents the layer normalization operation, and $\mathrm{Flatten}(\cdot)$ represents the vector obtained by expanding each patch.
In addition to efficient spatial modeling of the input acoustic feature map, the SPT module tokenizes the spatially shifted acoustic feature maps. It can provide the model with a wider perceptive field than ordinary tokenization and improves the locality inductive bias by embedding additional spatial information into each visual token.
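The diagonal-shift-and-concatenate step of SPT can be sketched as follows. Note that `np.roll` wraps values around the borders, whereas the original SPT crops and zero-pads the shifted maps, so this is a simplification under that assumption:

```python
import numpy as np

def spt_shift_stack(x, shift=1):
    """Stack a feature map with its four diagonally shifted copies along
    the channel axis, SPT-style, so each later patch token also sees its
    spatial neighbours. Wrapping at the edges is a simplification of the
    crop-and-pad used by the original SPT."""
    shifts = [(-shift, -shift), (-shift, shift), (shift, -shift), (shift, shift)]
    maps = [x] + [np.roll(x, s, axis=(0, 1)) for s in shifts]
    return np.concatenate(maps, axis=-1)

x = np.ones((8, 8, 3))          # toy acoustic feature map H x W x C
stacked = spt_shift_stack(x)    # channels grow 5x before tokenization
```

Patch partition, flattening, layer normalization and the linear projection of the formula above would then operate on `stacked`.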
In this embodiment, the global feature extraction module specifically includes an unfolding (Unfold) layer, at least one improved Transformer module, and a folding (Fold) layer, which are sequentially connected, and is configured to perform global feature modeling on the features output by the local feature extraction module through the Unfold-Transformer-Fold structure and extract a global feature representation.
Wherein the improved Transformer module is obtained by adding a Locality Self-Attention (LSA) module and/or an enhanced residual connection to the Transformer module, and is used for enhancing the attention to audio features in the input features and/or increasing the diversity of the input features.
Although the acoustic feature map is reduced in dimension by the local feature extraction module, so that the features input to the Transformer module of the global feature extraction module are low-dimensional vectors, the reduced feature dimension increases the number of input tokens, which smooths the attention distribution over the tokens of the acoustic feature map. For example, the acoustic information of fish feeding occupies only a small part of the time-frequency features; after the features are cut into tokens, a higher proportion of tokens may correspond to environmental interference. When calculating the similarity between tokens, attention then cannot be focused on the audio features of fish feeding, which may reduce recognition accuracy. Thus, the LSA module can be utilized to let the model employ locality self-attention, enhancing the attention to audio features in the input features.
Meanwhile, due to the global nature of the attention calculation, the greater the depth, the less the diversity of features. This effect can be mitigated by the residual connection, but the residual connection alone does not provide more diversified features. Thus, multiple parallel enhanced residual connections can be introduced to obtain more diversified features and avoid feature collapse.
Fig. 5 is a fourth schematic diagram of a model structure in the fish feeding intensity recognition method provided by the present invention. As shown in Fig. 5 (a), which illustrates the processing flow of the improved Transformer module, the Transformer module mainly comprises a Multi-head Self-Attention (MSA) module and a Multi-Layer Perceptron (MLP) module. The Transformer module is modified to combine the MSA module with the LSA module (for convenience of description, hereinafter referred to as the MSA-LSA module), and at the same time enhanced residual connections parallel to the initial residual connection are introduced, each residual connection having its own learnable parameters for performing different transformations on the input features.
As shown in Fig. 5 (b), by combining the MSA module with the LSA module, a learnable temperature scaling and diagonal masking can be added to the computation of Q, K and V in the combined MSA-LSA module, where WQ, WK and WV represent the Q, K and V weights of the Transformer module, respectively. The Softmax function is given a learnable temperature parameter; temperature scaling sharpens the non-uniformity of the output distribution. Diagonal masking assigns negative infinity to each token's similarity with itself without affecting all other tokens; it suppresses the diagonal components of the similarity matrix computed from Q and K, thereby improving the attention scores between tokens and making the attention score distribution clearer. The whole process can be expressed by the following formula:
$$R = \frac{QK^{T}}{\tau}, \qquad R^{M}_{ij} = \begin{cases} -\infty, & i = j \\ R_{ij}, & i \neq j \end{cases}, \qquad \mathrm{LSA}(\mathbf{x}) = \mathrm{softmax}\big(R^{M}\big)\,V$$

wherein $R^{M}$ represents the similarity matrix with masked components, $\tau$ represents the learnable temperature parameter, $\mathrm{LSA}(\mathbf{x})$ represents the locality self-attention, and $V$ is obtained from a learnable linear projection of the value (Value).
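The temperature scaling and diagonal masking described above can be sketched as follows (here `tau` is a fixed constant for illustration; in the model it is a learnable parameter):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def locality_self_attention(Q, K, V, tau=0.5):
    """Divide the Q-K similarities by a temperature tau (sharpening the
    softmax) and set the diagonal to -inf so that no token attends to
    itself, forcing attention onto the other tokens."""
    n = Q.shape[0]
    R = (Q @ K.T) / tau
    R[np.eye(n, dtype=bool)] = -np.inf   # diagonal masking
    A = softmax(R)
    return A @ V, A

rng = np.random.default_rng(1)
Q, K, V = (rng.normal(size=(4, 6)) for _ in range(3))
out, A = locality_self_attention(Q, K, V)
```

After the mask, each row of `A` distributes its full attention budget over the other tokens only, which is exactly the effect the text attributes to suppressing the diagonal of the similarity matrix.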
In this embodiment, after introducing the enhanced residual connection, the MSA-LSA module equipped with T enhanced residual connections can be expressed by the following formula:
$$\mathbf{z}_{t+1} = \mathrm{MSA\text{-}LSA}(\mathbf{z}_{t}) + \sum_{i=1}^{T} h_{i}^{t}(\mathbf{z}_{t}), \qquad h_{i}^{t}(\mathbf{z}_{t}) = \sigma\big(\mathbf{z}_{t} W_{i}^{t}\big)$$

wherein $\mathbf{z}_{t}$ is the input feature of the MSA-LSA module in layer t, $h_{i}^{t}$ is the i-th enhanced residual connection of layer t, $W_{i}^{t}$ represents a weight matrix, and $\sigma$ is a nonlinear activation function.
Considering that there is also a residual connection in the MLP module of the Transformer module, the enhanced residual connection can also be embedded in the MLP module, namely:
$$\mathbf{z}'_{t+1} = \mathrm{MLP}(\mathbf{z}'_{t}) + \sum_{i=1}^{T} h_{i}^{t}(\mathbf{z}'_{t})$$

wherein $\mathbf{z}'_{t}$ represents the input feature of the MLP module in the t-th layer, and $h_{i}^{t}$ is the i-th enhanced residual connection of the layer.
In the embodiment of the invention, the enhanced residual connection does not change the parameters of the model obviously, can be inserted into the model flexibly, and improves the accuracy of classification and identification of the fish intake intensity of the model.
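A sketch of the parallel enhanced shortcuts (the main branch is stubbed as an identity, and the weight shapes and `tanh` activation are illustrative assumptions):

```python
import numpy as np

def enhanced_residual(z, branch_fn, Ws, act=np.tanh):
    """y = branch(z) + sum_i act(z @ W_i): each shortcut carries its own
    learnable weight matrix, so the residual paths add diverse feature
    transformations rather than a single identity copy."""
    out = branch_fn(z)
    for W in Ws:
        out = out + act(z @ W)
    return out

rng = np.random.default_rng(2)
z = rng.normal(size=(10, 16))
Ws = [rng.normal(scale=0.1, size=(16, 16)) for _ in range(2)]  # T = 2 shortcuts
y = enhanced_residual(z, branch_fn=lambda t: t, Ws=Ws)
```

Because each shortcut only adds a term to the output, such connections can be dropped into an existing block without changing its interface, matching the "flexibly inserted" claim above.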
Fig. 6 is a fifth schematic diagram of a model structure in the fish feeding intensity recognition method provided by the present invention. As shown in Fig. 6, in the embodiment of the present invention, an SPT module is adopted as the local feature extraction module 61 in the improved Mobile ViT module; the output module 63 also employs an SPT module, and the global feature extraction module 62 includes an Unfold layer 621, an improved Transformer module 622, and a Fold layer 623, which are connected in sequence.
Wherein, for a given input acoustic feature map $\mathbf{X} \in \mathbb{R}^{H \times W \times C}$, the SPT module converts the input acoustic feature map into individual tokens: each token is flattened into a one-dimensional vector, and the vectors are then integrated into a higher-dimensional representation by learning a linear combination of the input channels, projecting the tensor into a high-dimensional space to yield $\mathbf{X}_{L} \in \mathbb{R}^{H \times W \times d}$. In the Unfold layer 621, to enable the Mobile ViT module to learn a global representation with a spatial inductive bias, the feature $\mathbf{X}_{L}$ is expanded into N non-overlapping flattened patches $\mathbf{X}_{U} \in \mathbb{R}^{P \times N \times d}$, wherein N is the number of patches and $N = HW / P$. The relations between patches are encoded by the improved Transformer module 622 to obtain $\mathbf{X}_{G}$. The Mobile ViT module loses neither the patch order nor the spatial order of the pixels within each patch; therefore, $\mathbf{X}_{G}$ can be folded through the Fold layer 623 to obtain $\mathbf{X}_{F} \in \mathbb{R}^{H \times W \times d}$. Finally, the projection is returned to the low-dimensional space through the SPT module to obtain the output feature map. Through this model structure, the recognition accuracy of the fish-shoal feeding intensity recognition model is further improved.
According to the method provided by the embodiment of the invention, the improved Mobile ViT V2 model is adopted to construct the fish swarm feeding intensity identification model, the improved Mobile ViT module is utilized, the small sample processing capacity of the model is improved based on the introduced SPT module and the LSA module, and the internal characteristic diversity of the model is increased by utilizing the enhanced residual connection, so that the accuracy of fish swarm feeding intensity identification can be further improved under the condition of less sample scale.
Preferably, Fig. 7 is a sixth schematic diagram of a model structure in the method for identifying the feeding intensity of a fish shoal according to the present invention. As shown in Fig. 7, in an embodiment of the present invention, the mobile network layer 21 in the fish-shoal feeding intensity recognition model may include two MV2 modules 211 and a downsampling layer 212 connected in sequence; the first convolution layer 1 comprises two sequentially connected convolution downsampling layers 11, wherein the convolution layers in the convolution downsampling layers 11 are 3×3 convolution layers, and the second convolution layer 3 is a 1×1 convolution layer. At least one of the three Mobile ViT modules 22 may be an improved Mobile ViT module.
Specifically, in the embodiment of the present invention, after obtaining a spectrogram of a fish-group feeding sound signal, the spectrogram is input to a fish-group feeding intensity recognition model, and the spectrogram sequentially passes through the first convolution layer 1, the three-layer feature processing layer 2, the second convolution layer 3 and the global pooling layer 4, that is, sequentially passes through the 3×3 convolution layer 111, the downsampling layer 112, the two MV2 modules 211, the downsampling layer 212, the Mobile ViT module 22, the 1×1 convolution layer 3 and the global pooling layer 4, and then performs feature extraction and recognition on the input spectrogram of the fish-group feeding sound signal, and outputs the corresponding fish-group feeding intensity.
According to the method provided by the embodiment of the invention, through the designed model structure, the high-precision identification effect of the fish swarm feeding intensity is realized, and meanwhile, the high-efficiency identification efficiency of the fish swarm feeding intensity is also realized.
Based on the foregoing embodiment, as an optional embodiment, after inputting the spectrogram to the fish-school feeding intensity recognition model and obtaining the fish-school feeding intensity output by the fish-school feeding intensity recognition model, the method further includes:
acquiring a fish swarm feeding intensity identification result before and after the corresponding moment of the fish swarm feeding intensity;
comparing the fish swarm feeding intensity with the fish swarm feeding intensity identification results at the front and rear moments in pairs;
when it is determined that the fish intake intensity is different from the fish intake intensity recognition results at the front and rear times and the fish intake intensity recognition results at the front and rear times are the same, the fish intake intensity is corrected to the fish intake intensity recognition results at the front and rear times.
Specifically, when the feeding state of some fish is inconsistent with that of the whole fish shoal, the result may be misjudged. Because the sound signals of fish feeding are more unpredictable and discontinuous than images, the feeding intensity of a group of fish is easily misjudged according to the sound produced by a single fish during feeding, making the final feeding intensity recognition result discontinuous.
Therefore, in the embodiment of the invention, the recognition result output by the fish swarm feeding intensity recognition model can be further corrected and optimized by setting the fish swarm feeding intensity recognition result optimization module, so that the recognition accuracy of the fish swarm feeding intensity is improved.
In the embodiment of the invention, after the recognition result output by the fish-shoal feeding intensity recognition model is obtained, the recognition result is flattened and input into the classification head (Head) of the model. Assuming that the model currently outputs the feeding intensity recognition result at time t, the feeding intensity recognition results at the moments before and after it, i.e., at times t-1 and t+1, can be obtained respectively.
Further, after the Head layer outputs, the characteristics of the front moment and the rear moment are compared, namely the fish-swarm feeding intensity recognition result at the t moment and the fish-swarm feeding intensity recognition results at the t-1 moment and the t+1 moment are compared in pairs, and when the fish-swarm feeding intensity recognition results at the t moment and the t-1 moment and the t+1 moment are different, and the fish-swarm feeding intensity recognition results at the t-1 moment and the t+1 moment are the same, the fish-swarm feeding intensity at the current t moment is corrected to be the fish-swarm feeding intensity recognition results at the t-1 moment and the t+1 moment, so that the discontinuous model recognition results are endowed with continuous results, and the accuracy of fish-swarm feeding intensity classification is further improved.
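The correction rule above (overwrite an isolated prediction when its two neighbours agree) can be sketched as:

```python
def smooth_intensities(preds):
    """If preds[t] differs from both neighbours while preds[t-1] equals
    preds[t+1], treat preds[t] as a misjudgment caused by a single fish
    and overwrite it with the neighbouring value."""
    out = list(preds)
    for t in range(1, len(preds) - 1):
        if preds[t - 1] == preds[t + 1] and preds[t] != preds[t - 1]:
            out[t] = preds[t - 1]
    return out

corrected = smooth_intensities(["weak", "strong", "weak", "medium", "medium"])
```

Comparisons use the original sequence, so one correction cannot cascade into the next window; genuine transitions such as "weak" to "medium" are left untouched.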
According to the method provided by the embodiment of the invention, the fish swarm ingestion intensity recognition result optimization module is arranged by considering the unpredictable and discontinuous states possibly existing in the fish swarm ingestion sound signals, and the recognition result output by the fish swarm ingestion intensity recognition model is corrected and optimized, so that the accuracy of fish swarm ingestion intensity recognition can be further effectively improved.
Based on the foregoing embodiment, as an optional embodiment, before inputting the spectrogram into the fish-school feeding intensity recognition model, and obtaining the fish-school feeding intensity output by the fish-school feeding intensity recognition model, the method further includes:
taking a spectrogram sample of the fish school feeding sound signal and a corresponding fish school feeding intensity label as a group of training samples to obtain a plurality of groups of training samples;
for any group of training samples, inputting the training samples into a fish group feeding intensity recognition model, and outputting a prediction probability corresponding to the training samples;
calculating a loss value according to the prediction probability corresponding to the training sample and the fish group ingestion intensity label corresponding to the training sample by using a preset loss function;
based on the loss value, adjusting model parameters of the fish intake intensity recognition model until the loss value is smaller than a preset threshold value or the training times reach preset times;
And taking the model parameters obtained when the loss value is smaller than a preset threshold value or the training times reach the preset times as the model parameters of the trained fish swarm ingestion intensity identification model.
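The training loop described in the steps above can be sketched with a toy softmax classifier standing in for the recognition model (the data, learning rate, threshold and epoch budget are all illustrative assumptions):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def train(X, Y, lr=0.3, loss_threshold=0.05, max_epochs=3000):
    """Repeat: predict class probabilities, compute the cross-entropy
    loss against the intensity labels, adjust the parameters; stop once
    the loss falls below a preset threshold or the epoch budget is spent."""
    rng = np.random.default_rng(0)
    W = rng.normal(0.0, 0.01, (X.shape[1], Y.shape[1]))
    loss = np.inf
    for _ in range(max_epochs):
        P = softmax(X @ W)
        loss = -np.mean(np.sum(Y * np.log(P + 1e-12), axis=1))
        if loss < loss_threshold:
            break
        W -= lr * X.T @ (P - Y) / len(X)   # gradient step
    return W, loss

# Toy 4-class data standing in for spectrogram features / intensity labels
rng = np.random.default_rng(3)
centers = np.array([[0, 0], [2, 0], [0, 2], [2, 2]], dtype=float)
X = np.vstack([c + 0.2 * rng.normal(size=(20, 2)) for c in centers])
X = np.hstack([X, np.ones((80, 1))])       # bias column
Y = np.repeat(np.eye(4), 20, axis=0)       # one-hot "none/weak/medium/strong"
W, final_loss = train(X, Y)
```

The real model would of course be the Mobile ViT network trained with backpropagation; only the stop-when-loss-is-low-or-budget-spent control flow is the point here.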
Specifically, in the embodiment of the invention, before inputting the spectrogram of the fish intake sound signal into the fish intake intensity recognition model, the fish intake intensity recognition model is further trained to obtain a trained fish intake intensity recognition model.
In the embodiment of the invention, after the original fish-school feeding sound signal data are acquired, a plurality of continuous audio clips with a fixed number of frames can be obtained by sampling the original data with a sliding window. The audio clip data are then subjected to pre-emphasis, windowing and short-time Fourier transform to extract time-frequency features and convert them into spectrograms, thereby obtaining spectrogram samples of the fish-school feeding sound signal. A four-level label data set is produced from the obtained spectrogram samples on the basis of expert experience in quantifying fish-school feeding, and the data set is divided into a training set and a validation set; for example, 20% of the total data may be taken as the validation set and the remaining 80% as the training set.
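The preprocessing chain described above (sliding-window segmentation, pre-emphasis, windowing, short-time Fourier transform) can be sketched as follows. This is a minimal NumPy illustration; the frame length, hop size and pre-emphasis coefficient are illustrative assumptions, not values fixed by the patent:

```python
import numpy as np

def audio_to_spectrogram(signal, frame_len=512, hop=256, pre_emph=0.97):
    """Convert a 1-D audio signal into a log-magnitude spectrogram."""
    # Pre-emphasis: s'[n] = s[n] - a * s[n-1], boosting high frequencies
    emphasized = np.append(signal[0], signal[1:] - pre_emph * signal[:-1])
    # Sliding-window framing with a Hamming window
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    window = np.hamming(frame_len)
    frames = np.stack([emphasized[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Short-time Fourier transform -> log-magnitude spectrogram
    spectrum = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(spectrum).T  # shape: (freq_bins, time_frames)
```

Each resulting (freq_bins, time_frames) array would serve as one spectrogram sample, to be paired with a feeding-intensity label when building the data set.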
In the embodiment of the invention, the training set data is utilized to train the fish swarm feeding strength identification model, and the specific training process is as follows:
taking the spectrogram sample of the fish swarm feeding sound signal and the corresponding fish swarm feeding intensity label as a group of training samples, namely taking the spectrogram sample of each fish swarm feeding sound signal with the real fish swarm feeding intensity label as a group of training samples, thereby obtaining a plurality of groups of training samples.
In the embodiment of the invention, the spectrogram samples of the fish school feeding sound signals and the fish school feeding intensity labels carried by the fish school feeding sound signals are in one-to-one correspondence.
Then, after the plurality of groups of training samples are obtained, the groups of training samples are input in sequence into the fish-school feeding intensity recognition model, and the model is trained with the plurality of groups of training samples, namely:
the spectrogram sample of the fish-school feeding sound signal in each group of training samples and the feeding-intensity label it carries are input together into the fish-school feeding intensity recognition model; according to each output result of the model, a loss function value is calculated and the model parameters are adjusted, so that the whole training process is finally completed when a preset training termination condition is met, and the trained fish-school feeding intensity recognition model is obtained.
In the embodiment of the invention, the frequency spectrogram sample of the fish group ingestion sound signal and the corresponding fish group ingestion intensity label are used as a group of training samples, and the fish group ingestion intensity recognition model is trained by utilizing a plurality of groups of training samples, so that the model precision of the trained fish group ingestion intensity recognition model is improved.
The preset loss function described in the embodiment of the invention refers to a loss function preset in a fish intake intensity recognition model and is used for model evaluation; the preset threshold refers to a threshold preset by the model, and is used for obtaining a minimum loss value and completing model training; the preset times refer to the preset maximum times of model iterative training.
After a plurality of groups of training samples are obtained, for any group of training samples, a spectrogram sample of a fish-swarm feeding sound signal in each group of training samples and a fish-swarm feeding intensity label carried by the spectrogram sample are simultaneously input into a fish-swarm feeding intensity recognition model, and a prediction probability corresponding to the training samples is output.
On the basis, a preset loss function is utilized, and a loss value is calculated according to the prediction probability corresponding to the training sample and the fish swarm ingestion intensity label corresponding to the training sample.
Further, after the loss value is calculated, the training process is not yet complete. The model parameters of the fish-school feeding intensity recognition model are adjusted based on the loss value using the back-propagation (BP) algorithm, so that the weight parameters of each layer of the model are updated; the next round of training is then carried out, and model training proceeds by repeated iteration.
During training, if the training result of a certain group of training samples meets the preset training termination condition (for example, the corresponding loss value is smaller than the preset threshold, or the current number of iterations reaches the preset number), the loss value of the model can be considered to lie within the convergence range, and model training ends. The model parameters obtained at this point are taken as the model parameters of the trained fish-school feeding intensity recognition model, and the trained model is thereby obtained.
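The training loop described above (forward pass, loss computed by a preset loss function, back-propagation to adjust parameters, termination once the loss falls below a preset threshold or the iteration cap is reached) can be sketched with a dependency-free stand-in classifier. The patent's actual model is a MobileViT-style network trained in PyTorch; the tiny softmax classifier, learning rate and thresholds below are illustrative assumptions:

```python
import numpy as np

def train(X, y, n_classes=4, lr=0.1, loss_threshold=1e-3, max_epochs=600):
    """Minimal gradient-descent loop mirroring the patented procedure:
    forward pass -> cross-entropy loss -> back-propagation -> stop when
    the loss drops below a preset threshold or the epoch cap is hit."""
    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.01, size=(X.shape[1], n_classes))
    loss = np.inf
    for epoch in range(max_epochs):
        # Forward pass: logits -> softmax probabilities
        logits = X @ W
        logits -= logits.max(axis=1, keepdims=True)
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        # Cross-entropy loss against the true intensity labels
        loss = -np.log(probs[np.arange(len(y)), y] + 1e-12).mean()
        if loss < loss_threshold:
            break
        # Back-propagation: gradient of the loss w.r.t. the weights
        grad = probs.copy()
        grad[np.arange(len(y)), y] -= 1.0
        W -= lr * (X.T @ grad) / len(y)
    return W, loss
```

The stopping rule matches the description: training ends as soon as either the loss threshold or the preset number of iterations is reached, and the parameters at that point are kept.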
According to the method provided by the embodiment of the invention, the multi-group training samples are utilized to carry out repeated iterative training on the fish swarm feeding intensity recognition model, so that the loss value of the fish swarm feeding intensity recognition model is controlled within the convergence range, the accuracy of the fish swarm feeding intensity recognition result output by the model is improved, and the accuracy of the fish swarm feeding intensity recognition is improved.
In a specific embodiment, on a 64-bit Windows 10 operating system platform, the fish-school feeding-intensity quantification and recognition model is built with the Python language on the basis of the PyTorch deep learning framework, and training of the model is completed on an NVIDIA RTX 2080 Ti GPU. The model training parameters may be set as follows: batch size 32, number of iterations 600, and learning rate 0.001.
In addition, the invention also provides a sound monitoring device based on the fish-feeding quantification algorithm, which comprises an underwater acoustic transducer, a pre-amplifier, a filter, an A/D converter, a data recording unit and an operation processor, wherein the operation processor is connected with the data recording unit.
The underwater acoustic transducer acquires, in real time under the control of the operation processor, the sound signals generated during fish-school feeding. The acquired signals pass in sequence through the pre-amplifier, the filter, the A/D converter and the data recording unit, and are finally transmitted to the operation processor; the operation processor judges the fish-school feeding-intensity level according to the pre-stored trained fish-school feeding intensity recognition model and outputs the corresponding feeding-intensity label.
The fish school feeding intensity recognition device provided by the invention is described below, and the fish school feeding intensity recognition device described below and the fish school feeding intensity recognition method described above can be correspondingly referred to each other.
Fig. 8 is a schematic structural diagram of a fish school feeding strength identifying device provided by the invention, as shown in fig. 8, including:
a processing module 810 for determining a spectrogram of the sound signal generated during the intake of the fish school;
the recognition module 820 is used for inputting the spectrogram into the fish-school feeding intensity recognition model to obtain the fish-school feeding intensity output by the fish-school feeding intensity recognition model;
the fish swarm feeding intensity recognition model is used for determining the fish swarm feeding intensity based on acoustic features obtained by feature extraction of the spectrogram; the fish swarm feeding intensity recognition model is obtained through training according to a spectrogram sample of a fish swarm feeding sound signal and a corresponding fish swarm feeding intensity label.
The device for identifying the feeding intensity of the fish school according to the embodiment may be used for executing the embodiment of the method for identifying the feeding intensity of the fish school, and the principle and the technical effect thereof are similar, and are not repeated here.
According to the fish-school feeding intensity recognition device provided by the embodiment of the invention, the association between the acoustic signals generated during fish-school feeding and the feeding intensity is taken into account: fish-school feeding audio information is obtained, and a neural network model is trained with spectrogram samples of fish-school feeding sound signals and the corresponding feeding-intensity labels to obtain the fish-school feeding intensity recognition model. The model then recognizes feeding intensity from the acoustic features extracted from the spectrogram of the feeding audio. This can effectively improve the accuracy and effect of fish-school feeding-intensity recognition, and achieves high-precision recognition in complex water scenes such as weak light and turbid water.
Fig. 9 is a schematic diagram of the physical structure of an electronic device according to the present invention. As shown in Fig. 9, the electronic device may include: a processor 910, a communication interface 920, a memory 930, and a communication bus 940, wherein the processor 910, the communication interface 920, and the memory 930 communicate with each other via the communication bus 940. The processor 910 may invoke logic instructions in the memory 930 to perform the method for identifying the feeding intensity of a fish school provided by the methods described above, the method comprising: determining a spectrogram of a sound signal generated in the process of feeding of the fish school; inputting the spectrogram into a fish-school feeding intensity recognition model, and obtaining the fish-school feeding intensity output by the model; wherein the fish-school feeding intensity recognition model determines the feeding intensity based on acoustic features obtained by feature extraction from the spectrogram, and is obtained by training on spectrogram samples of fish-school feeding sound signals and the corresponding feeding-intensity labels.
Further, the logic instructions in the memory 930 described above may be implemented in the form of software functional units and may be stored in a computer-readable storage medium when sold or used as a stand-alone product. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
Fig. 10 is a schematic structural view of a bait casting machine provided by the invention, as shown in fig. 10, comprising:
the bait box 101 is used for containing bait;
the bottom discharge hole of the bait box 101 is connected with a suction device 102, the suction device 102 is connected with one end of a bait channel 104, and the bait blowing device 103 is connected with the other end of the bait channel 104;
the sucking device 102 is used for sucking the baits in the bait box 101 into the bait channel 104; the bait blowing device 103 is used for blowing out the bait in the bait channel 104 for feeding;
one end of the power frequency converter 105 is connected with the material sucking device 102 and the bait blowing device 103 respectively, and the other end of the power frequency converter 105 is connected with the communication interface 920 in the electronic equipment, so as to receive the identification result of the intake intensity of the fish shoal output by the electronic equipment, control the start and stop of the material sucking device 102 and the material sucking speed according to the identification result of the intake intensity of the fish shoal, and control the start and stop of the bait blowing device 103 and the bait blowing speed.
In an embodiment of the invention, the suction device comprises a suction motor for feeding the bait into the bait channel, and the bait blowing device comprises a blowing motor for blowing the bait out of the bait channel. Further, for each level of fish-school feeding intensity, the start/stop state and suction speed v of the suction device, and the start/stop state and blowing speed V of the bait blowing device, can be set in advance; the suction speed v and the blowing speed V may be set to be the same.
For example, when the fish-school feeding intensity is "none", both the suction device and the bait blowing device stop operating; when the feeding intensity is "weak", both devices are started, with the suction speed set to v1 and the blowing speed set to V1; when the feeding intensity is "medium", both devices are started, with suction speed v2 and blowing speed V2; when the feeding intensity is "strong", both devices are started, with suction speed v3 and blowing speed V3, where v3 > v2 > v1 and V3 > V2 > V1.
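The gear mapping above can be sketched as a small lookup table (the concrete speed values are hypothetical; the patent only requires v3 > v2 > v1 and V3 > V2 > V1, and allows v and V to be equal):

```python
# Hypothetical gear table: one entry per feeding-intensity level.
FEED_GEARS = {
    "none":   {"suction_v": 0.0, "blow_V": 0.0},  # both devices stopped
    "weak":   {"suction_v": 1.0, "blow_V": 1.0},  # v1, V1
    "medium": {"suction_v": 2.0, "blow_V": 2.0},  # v2, V2
    "strong": {"suction_v": 3.0, "blow_V": 3.0},  # v3, V3
}

def control_feeder(intensity):
    """Map a recognized feeding-intensity level to start/stop state and
    motor speeds for the suction and bait-blowing devices."""
    gear = FEED_GEARS[intensity]
    running = intensity != "none"
    return {"run": running, **gear}
```

In the actual machine these values would be sent over the power frequency converter; here the function simply returns the commanded state.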
According to the embodiment of the invention, the above operation enables start/stop and multi-gear speed control of the suction device and the bait blowing device based on each level of fish-school feeding intensity, achieving intelligent and accurate bait feeding in fishery cultivation; this can greatly improve the utilization efficiency of aquaculture bait and save aquaculture cost.
In another aspect, the present invention also provides a computer program product comprising a computer program, the computer program being storable on a non-transitory computer readable storage medium, the computer program, when executed by a processor, being capable of performing the method for identifying the feeding intensity of a fish farm provided by the above methods, the method comprising: determining a spectrogram of a sound signal generated in the process of feeding the fish school; inputting the spectrogram to a fish-swarm feeding intensity recognition model, and obtaining the fish-swarm feeding intensity output by the fish-swarm feeding intensity recognition model; the fish swarm feeding intensity recognition model is used for determining the fish swarm feeding intensity based on acoustic features obtained by feature extraction of the spectrogram; the fish swarm feeding intensity recognition model is obtained through training according to a spectrogram sample of a fish swarm feeding sound signal and a corresponding fish swarm feeding intensity label.
In yet another aspect, the present invention provides a non-transitory computer readable storage medium having stored thereon a computer program which when executed by a processor is implemented to perform the method of identifying the feeding intensity of a fish school provided by the above methods, the method comprising: determining a spectrogram of a sound signal generated in the process of feeding the fish school; inputting the spectrogram to a fish-swarm feeding intensity recognition model, and obtaining the fish-swarm feeding intensity output by the fish-swarm feeding intensity recognition model; the fish swarm feeding intensity recognition model is used for determining the fish swarm feeding intensity based on acoustic features obtained by feature extraction of the spectrogram; the fish swarm feeding intensity recognition model is obtained through training according to a spectrogram sample of a fish swarm feeding sound signal and a corresponding fish swarm feeding intensity label.
The apparatus embodiments described above are merely illustrative, wherein the elements illustrated as separate elements may or may not be physically separate, and the elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (10)
1. A method for identifying the ingestion intensity of a fish school, comprising the steps of:
determining a spectrogram of a sound signal generated in the process of feeding the fish school;
inputting the spectrogram to a fish-swarm feeding intensity recognition model, and obtaining the fish-swarm feeding intensity output by the fish-swarm feeding intensity recognition model;
the fish swarm feeding intensity recognition model is used for determining the fish swarm feeding intensity based on acoustic features obtained by feature extraction of the spectrogram; the fish swarm feeding intensity recognition model is obtained through training according to a spectrogram sample of a fish swarm feeding sound signal and a corresponding fish swarm feeding intensity label.
2. The method for identifying the feeding strength of a fish school according to claim 1, wherein the model for identifying the feeding strength of the fish school comprises a first convolution layer, a three-layer feature processing layer, a second convolution layer and a global pooling layer which are connected in sequence;
The first convolution layer is used for encoding the spectrogram to obtain a first feature map;
the three feature processing layers are used for carrying out feature extraction on the first feature map to obtain a second feature map;
the second convolution layer is used for reducing the dimension of the second feature map to obtain a third feature map;
and the global pooling layer is used for carrying out global average pooling treatment on the third characteristic map and outputting the feeding intensity of the fish school.
3. The fish school feeding intensity recognition method according to claim 2, wherein the first convolution layer comprises at least one convolution downsampling layer connected in sequence, the convolution downsampling layer is used for convolving input data and downsampling the convolved input data; the first feature map is data output by the convolution downsampling layer of the last layer;
each of the three feature processing layers comprises a mobile network layer and a mobile vision Transformer module which are sequentially connected;
the mobile network layer is used for downsampling the input data; the mobile vision Transformer module is used for sequentially extracting local features and global features from the output data of the mobile network layer; the second feature map is the data output by the mobile vision Transformer module in the last feature processing layer.
4. The method for identifying the feeding intensity of a fish school according to claim 3, wherein the three feature processing layers include at least one improved mobile vision Transformer module;
the improved mobile vision Transformer module comprises a local feature extraction module, a global feature extraction module and an output module which are sequentially connected;
the local feature extraction module adopts a shifted patch tokenization module or a one-dimensional convolution module, and is used for performing dimension reduction and local feature extraction on a fourth feature map to obtain a fifth feature map;
the global feature extraction module is used for performing global feature extraction on the fifth feature map to obtain a sixth feature map;
the output module adopts the shifted patch tokenization module or the one-dimensional convolution module to raise the dimension of the sixth feature map to obtain a seventh feature map; the seventh feature map comprises the second feature map, and the seventh feature map has the same dimensions as the fourth feature map;
the global feature extraction module comprises at least one improved Transformer module, obtained by adding a local self-attention module to the Transformer module and/or adding an enhanced residual connection, for enhancing attention to audio features in the input features and/or increasing the diversity of the input features.
5. The method for identifying the feeding strength of a fish school according to claim 3, wherein the mobile network layer comprises two MV2 modules and a downsampling layer which are connected in sequence; the first convolution layer comprises two convolution downsampling layers which are sequentially connected, the convolution layers in the convolution downsampling layers are three-dimensional convolution layers, and the second convolution layer is one-dimensional convolution layer.
6. The method according to any one of claims 1 to 5, wherein after the spectrogram is input into the fish-school feeding intensity recognition model and the fish-school feeding intensity output by the fish-school feeding intensity recognition model is obtained, the method further comprises:
acquiring a fish swarm feeding intensity identification result before and after the moment corresponding to the fish swarm feeding intensity;
comparing the fish swarm feeding intensity with the fish swarm feeding intensity identification results at the front and rear moments in pairs;
and correcting the fish swarm feeding intensity to be the fish swarm feeding intensity identification result at the front and rear moments when the fish swarm feeding intensity is different from the fish swarm feeding intensity identification result at the front and rear moments and the fish swarm feeding intensity identification result at the front and rear moments is the same.
7. The method according to any one of claims 1 to 5, wherein before the spectrogram is input into the fish-school feeding intensity recognition model to obtain the fish-school feeding intensity output by the fish-school feeding intensity recognition model, the method further comprises:
taking the spectrogram sample of the fish swarm feeding sound signal and the corresponding fish swarm feeding intensity label as a group of training samples to obtain a plurality of groups of training samples;
for any group of training samples, inputting the training samples into the fish group ingestion intensity recognition model, and outputting the prediction probability corresponding to the training samples;
calculating a loss value according to the prediction probability corresponding to the training sample and the fish group ingestion intensity label corresponding to the training sample by using a preset loss function;
based on the loss value, adjusting the model parameters of the fish-school feeding intensity recognition model until the loss value is smaller than a preset threshold or the number of training iterations reaches a preset number;
and taking the model parameters obtained when the loss value is smaller than the preset threshold or the number of training iterations reaches the preset number as the model parameters of the trained fish-school feeding intensity recognition model.
8. A fish school feeding strength identification device, comprising:
the processing module is used for determining a spectrogram of a sound signal generated in the feeding process of the fish school;
the identification module is used for inputting the spectrogram to a fish-swarm feeding intensity identification model and obtaining the fish-swarm feeding intensity output by the fish-swarm feeding intensity identification model;
the fish swarm feeding intensity recognition model is used for determining the fish swarm feeding intensity based on acoustic features obtained by feature extraction of the spectrogram; the fish swarm feeding intensity recognition model is obtained through training according to a spectrogram sample of a fish swarm feeding sound signal and a corresponding fish swarm feeding intensity label.
9. An electronic device comprising a memory, a communication interface, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method for identifying the feeding strength of a fish population according to any one of claims 1 to 7 when executing the program.
10. A bait casting machine, comprising:
the bait box, the suction device, the bait blowing device, the bait channel and the power frequency converter;
the bait box is used for containing bait;
The bottom discharge hole of the bait box is connected with the material sucking device, the material sucking device is connected with one end of the bait channel, and the bait blowing device is connected with the other end of the bait channel;
the sucking device is used for sucking the baits in the bait box into the bait channel; the bait blowing device is used for blowing out the bait in the bait channel for feeding;
one end of the power frequency converter is respectively connected with the material sucking device and the bait blowing device, and the other end of the power frequency converter is connected with the communication interface in the electronic equipment according to claim 9, and is used for receiving the identification result of the intake intensity of the shoal of fish output by the electronic equipment, controlling the start and stop of the material sucking device and the material sucking speed according to the identification result of the intake intensity of the shoal of fish, and controlling the start and stop of the bait blowing device and the bait blowing speed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310343704.8A CN116052064B (en) | 2023-04-03 | 2023-04-03 | Method and device for identifying feeding strength of fish shoal, electronic equipment and bait casting machine |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116052064A true CN116052064A (en) | 2023-05-02 |
CN116052064B CN116052064B (en) | 2023-06-27 |
Family
ID=86131678
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310343704.8A Active CN116052064B (en) | 2023-04-03 | 2023-04-03 | Method and device for identifying feeding strength of fish shoal, electronic equipment and bait casting machine |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116052064B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116311001A (en) * | 2023-05-18 | 2023-06-23 | 北京市农林科学院信息技术研究中心 | Method, device, system, equipment and medium for identifying fish swarm behavior |
CN116665701A (en) * | 2023-06-06 | 2023-08-29 | 中国农业大学 | Method, system and equipment for classifying fish swarm ingestion intensity |
CN116994602A (en) * | 2023-08-14 | 2023-11-03 | 大连海洋大学 | Fish behavior identification method based on Mel spectrogram and improved SERENet |
CN117934426A (en) * | 2024-01-26 | 2024-04-26 | 浙江省交通运输科学研究院 | Bridge structure health monitoring data anomaly identification method based on image classification |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2006133304A2 (en) * | 2005-06-08 | 2006-12-14 | Massachusetts Institute Of Technology | Continuous, continental-shelf-scale monitoring of fish populations and behavior |
US20220036194A1 (en) * | 2021-10-18 | 2022-02-03 | Intel Corporation | Deep neural network optimization system for machine learning model scaling |
CN115170942A (en) * | 2022-07-25 | 2022-10-11 | 大连海洋大学 | Fish behavior identification method with multilevel fusion of sound and vision |
CN115578678A (en) * | 2022-11-08 | 2023-01-06 | 中国农业大学 | Fish feeding intensity classification method and system |
CN115601562A (en) * | 2022-11-03 | 2023-01-13 | 沈阳工业大学(Cn) | Fancy carp detection and identification method using multi-scale feature extraction |
CN115762536A (en) * | 2022-11-25 | 2023-03-07 | 南京信息工程大学 | Small sample optimization bird sound recognition method based on bridge transform |
US20230089330A1 (en) * | 2020-12-18 | 2023-03-23 | Strong Force Vcn Portfolio 2019, Llc | Demand-Responsive Robot Fleet Management for Value Chain Networks |
CN115861906A (en) * | 2023-03-01 | 2023-03-28 | 北京市农林科学院信息技术研究中心 | Fish school feeding intensity identification method, device and system and feeding machine |
- 2023-04-03: Application CN202310343704.8A filed in China (CN); granted as CN116052064B, status Active
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2006133304A2 (en) * | 2005-06-08 | 2006-12-14 | Massachusetts Institute Of Technology | Continuous, continental-shelf-scale monitoring of fish populations and behavior |
US20230089330A1 (en) * | 2020-12-18 | 2023-03-23 | Strong Force Vcn Portfolio 2019, Llc | Demand-Responsive Robot Fleet Management for Value Chain Networks |
US20220036194A1 (en) * | 2021-10-18 | 2022-02-03 | Intel Corporation | Deep neural network optimization system for machine learning model scaling |
CN115170942A (en) * | 2022-07-25 | 2022-10-11 | Dalian Ocean University | Fish behavior identification method with multilevel fusion of sound and vision |
CN115601562A (en) * | 2022-11-03 | 2023-01-13 | Shenyang University of Technology | Fancy carp detection and identification method using multi-scale feature extraction |
CN115578678A (en) * | 2022-11-08 | 2023-01-06 | China Agricultural University | Fish feeding intensity classification method and system |
CN115762536A (en) * | 2022-11-25 | 2023-03-07 | Nanjing University of Information Science and Technology | Small-sample-optimized bird sound recognition method based on a bridge Transformer |
CN115861906A (en) * | 2023-03-01 | 2023-03-28 | Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences | Fish school feeding intensity identification method, device and system and feeding machine |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116311001A (en) * | 2023-05-18 | 2023-06-23 | Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences | Method, device, system, equipment and medium for identifying fish swarm behavior |
CN116311001B (en) * | 2023-05-18 | 2023-09-12 | Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences | Method, device, system, equipment and medium for identifying fish swarm behavior |
CN116665701A (en) * | 2023-06-06 | 2023-08-29 | China Agricultural University | Method, system and equipment for classifying fish swarm ingestion intensity |
CN116994602A (en) * | 2023-08-14 | 2023-11-03 | Dalian Ocean University | Fish behavior identification method based on Mel spectrogram and improved SERENet |
CN117934426A (en) * | 2024-01-26 | 2024-04-26 | Zhejiang Scientific Research Institute of Transport | Bridge structure health monitoring data anomaly identification method based on image classification |
Also Published As
Publication number | Publication date |
---|---|
CN116052064B (en) | 2023-06-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN116052064B (en) | Method and device for identifying feeding strength of fish shoal, electronic equipment and bait casting machine | |
CN108256482B (en) | Face age estimation method for distributed learning based on convolutional neural network | |
CN110462680A (en) | System and method for improving image texture | |
CN110781921A (en) | Toxic mushroom image identification method and device based on a deep residual network and transfer learning
CN115861906B (en) | Method, device and system for identifying feeding strength of fish shoal and bait casting machine | |
CN111401226B (en) | Rapid identification method for radiation source | |
CN110136162B (en) | Unmanned aerial vehicle visual angle remote sensing target tracking method and device | |
CN108171318A (en) | Convolutional neural network ensemble method based on simulated annealing and a Gaussian function
CN110728629A (en) | Image set enhancement method for resisting adversarial attacks
CN113343771B (en) | Face anti-counterfeiting method based on adaptive meta-learning | |
CN114663685B (en) | Pedestrian re-recognition model training method, device and equipment | |
CN107945210A (en) | Target tracking algorithm based on deep learning and environmental adaptation
CN115546622A (en) | Fish shoal detection method and system, electronic device and storage medium | |
CN115170942A (en) | Fish behavior identification method with multilevel fusion of sound and vision | |
CN113436101A (en) | Image rain removal method using a Runge-Kutta module based on an efficient channel attention mechanism
CN116030078A (en) | Attention-combined lung lobe segmentation method and system under multitask learning framework | |
CN114283301A (en) | Self-adaptive medical image classification method and system based on Transformer | |
CN114298224A (en) | Image classification method, device and computer readable storage medium | |
CN113299298A (en) | Residual error unit, network and target identification method, system, device and medium | |
CN117115180A (en) | Semi-supervised medical image segmentation method based on domain self-adaption | |
CN115049692B (en) | Intelligent illumination adjustment method and system for marine aquaculture simulating natural ecology
CN114821239B (en) | Method for detecting plant diseases and insect pests in foggy environment | |
CN115472182A (en) | Speech emotion recognition method and device based on attention feature fusion with a multi-channel autoencoder
CN114139655A (en) | Distillation-based competitive learning target classification system and method
CN116978099B (en) | Sheep-face-based lightweight sheep identity recognition model construction method and recognition model
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right |
Effective date of registration: 2023-07-21
Address after: 1107, Block A, Nongke Building, No. 11 Shuguang Garden Middle Road, Haidian District, Beijing 100097
Patentee after: Information Technology Research Center, Beijing Academy of Agricultural and Forestry Sciences
Address before: 1107, Block A, Nongke Building, No. 11 Shuguang Garden Middle Road, Haidian District, Beijing 100097
Patentee before: Intelligent Equipment Technology Research Center, Beijing Academy of Agricultural and Forestry Sciences