CN112215107A - Pig behavior identification method and device, electronic equipment and storage medium - Google Patents

Pig behavior identification method and device, electronic equipment and storage medium Download PDF

Info

Publication number
CN112215107A
CN112215107A
Authority
CN
China
Prior art keywords
video
pig
detected
feature
segmentation network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011052538.9A
Other languages
Chinese (zh)
Inventor
孙龙清
孙美娜
孙希蓓
吴雨寒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Agricultural University
Original Assignee
China Agricultural University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Agricultural University
Priority to CN202011052538.9A
Publication of CN112215107A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 - Movements or behaviour, e.g. gesture recognition
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/25 - Fusion techniques
    • G06F18/253 - Fusion techniques of extracted features
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 - Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/56 - Extraction of image or video features relating to colour
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/40 - Scenes; Scene-specific elements in video content
    • G06V20/46 - Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames

Abstract

The embodiment of the invention provides a pig behavior identification method and device, electronic equipment and a storage medium, wherein the method comprises the following steps: acquiring a video to be detected, the video to be detected comprising image information of the pig; and inputting the video to be detected into a feature fusion time sequence segmentation network to obtain and output a behavior recognition result of the pig; the feature fusion time sequence segmentation network is obtained by training on a sample image set obtained from a sample video. The embodiment of the invention can realize non-contact, low-cost identification of pig behaviors, solves the problems of low efficiency and subjective analysis results caused by recording pig behaviors through manual observation, and thereby provides a technical basis for monitoring and analyzing large-scale pig breeding.

Description

Pig behavior identification method and device, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of livestock breeding, in particular to a pig behavior identification method and device, electronic equipment and a storage medium.
Background
With the development of society, people's living standards have improved markedly, and the demand for meat products has grown from merely pursuing sufficient quantity to also pursuing quality.
In live pig breeding, modernized, intensive farming improves production efficiency, effectively reduces production costs and increases economic benefits. However, an excessively intensive feeding mode strongly affects the life of the pigs and can cause abnormal behaviors and epidemic diseases, which easily spread to other pigs in the same pen and may even endanger the entire pig farm.
In the traditional approach, sick pigs in pig farms are generally observed and recorded manually by workers. On the one hand, diseases risk being transmitted to the workers. On the other hand, China is one of the world's largest pig-breeding countries; farms are large in scale but observers are few, and information is often forgotten or misrecorded owing to fatigue and negligence.
Manually monitoring the behavior of live pigs is therefore time-consuming and subjective. Monitoring live pig behavior with sensors also has defects: most sensors are attached to the pig's body surface, which easily triggers a stress response and alters the pig's normal behavior.
Therefore, how to provide a pig behavior identification method and device, electronic equipment and a storage medium that can identify pig behaviors in a non-contact, low-cost manner, solve the problems of low efficiency and subjective analysis results caused by manual observation and recording of pig behaviors, and thereby provide an effective solution for monitoring and analyzing large-scale pig breeding, has become a problem to be solved urgently.
Disclosure of Invention
The embodiment of the invention provides a pig behavior identification method and device, electronic equipment and a storage medium, which are used for overcoming the defects of low efficiency and subjective analysis results caused by manual observation in the prior art and achieving non-contact, low-cost identification of pig behaviors.
In a first aspect, an embodiment of the present invention provides a pig behavior identification method, including:
acquiring a video to be detected; the video to be detected comprises image information of the pig;
inputting the video to be detected into a feature fusion time sequence segmentation network to obtain and output a behavior recognition result of the pig;
the feature fusion time sequence segmentation network is obtained by training on a sample image set obtained from a sample video.
Optionally, in the behavior recognition method for pigs, the feature fusion time sequence segmentation network includes: a video processing layer, a time stream feature extraction layer, a space stream feature extraction layer, a first fusion layer, a second fusion layer and an identification classification layer;
the video processing layer is used for processing the video to be detected to obtain an image set to be detected; the image set to be detected comprises: a color space feature map and an optical flow feature map; the color space feature map is an image representing spatial information; the optical flow feature map is an image representing temporal information;
the time flow feature extraction layer is used for obtaining the time flow features of the video to be detected according to the optical flow feature diagram;
the spatial stream feature extraction layer is used for obtaining the spatial stream features of the video to be detected according to the color spatial feature map;
the first fusion layer is used for fusing the time flow characteristics and the space flow characteristics to obtain fused characteristics;
the second fusion layer is used for fusing the fused features and the spatial stream features to obtain target features;
and the identification classification layer is used for obtaining a behavior identification result of the pig according to the target characteristics.
Optionally, in the pig behavior identification method, the feature fusion time sequence segmentation network further includes: a data storage layer;
the data storage layer is used for storing the image set to be detected output by the video processing layer, selecting a color space feature map according to a preset rule and inputting the color space feature map into the space stream feature extraction layer, and selecting an optical flow feature map and inputting the optical flow feature map into the time stream feature extraction layer respectively.
Optionally, in the method for identifying pig behavior, before the step of inputting the video to be detected into the feature fusion time sequence segmentation network and obtaining and outputting the behavior recognition result, the method further includes: training the feature fusion time sequence segmentation network;
the training of the feature fusion time sequence segmentation network specifically comprises:
obtaining a sample image set from a sample video; the sample image set includes: a color space feature map and an optical flow feature map;
training the feature fusion time sequence segmentation network by using the sample image set;
taking a cross entropy function as a cost function, and obtaining a gradient for a network parameter of the feature fusion time sequence segmentation network by using a back propagation algorithm;
updating the network parameters of the feature fusion time sequence segmentation network based on the gradient, and performing iterative training on the feature fusion time sequence segmentation network based on the updated network parameters until the feature fusion time sequence segmentation network converges.
Optionally, in the method for identifying pig behavior, the obtaining a sample image set from a sample video specifically includes:
dividing the sample video into a plurality of video segments;
converting each video segment into a succession of video frames;
selecting a frame from the continuous video frames as a color space feature map;
computing optical flow from the successive video frames to obtain an optical flow feature map set;
and obtaining a sample image set according to the color space characteristic diagram and the optical flow characteristic diagram set obtained from each video segment in the sample video.
Optionally, in the behavior recognition method for pigs, the recognition classification layer is specifically configured to:
and classifying the target characteristics to obtain and output a classification score of each behavior in the video to be detected.
Optionally, in the pig behavior recognition method, the color space feature map is an RGB map, and the optical flow feature map includes: an optical flow map and a warped optical flow map.
In a second aspect, an embodiment of the present invention provides a pig behavior identification device, including:
the acquisition module is used for acquiring a video to be detected; the video to be detected comprises image information of the pig;
the identification module is used for inputting the video to be detected into the feature fusion time sequence segmentation network to obtain and output a behavior identification result of the pig;
the feature fusion time sequence segmentation network is obtained by training on a sample image set obtained from a sample video.
In a third aspect, an embodiment of the present invention provides an electronic device, including a memory and a processor, where the processor and the memory complete communication with each other through a bus; the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform the steps of the pig behavior recognition method as described above.
In a fourth aspect, embodiments of the present invention provide a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the above pig behavior recognition method.
According to the pig behavior recognition method and device, the electronic equipment and the storage medium, the time-space information is extracted from the pig behavior video to recognize the pig behavior, so that the pig behavior can be recognized accurately in a non-contact and low-cost mode, and the problems of low efficiency, non-objective analysis result and the like caused by the fact that the pig behavior is recorded by means of manual observation are solved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
Fig. 1 is a flowchart of a pig behavior identification method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a feature fusion time sequence segmentation network according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a pig behavior recognition device according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The non-contact, low-cost, simple and effective computer vision technology is widely applied in the animal detection process and plays an important role in animal behavior evaluation.
Nasirahmadi et al. performed pig behavior recognition by least-squares ellipse fitting: mounting (climbing) behavior is judged when the major axis of the fitted ellipse is 1.3-2 times the length of the normal ellipse's major axis and the minor axis is 1.3-1.8 times the length of the normal minor axis; the accuracy of this method is 92.7%.
Lao et al. obtained feature values for identifying sow behaviors from depth image data: when the head of a live pig moves up and down at the feeder, feeding behavior is judged; the accuracy of this method is 97.4%.
Extraction of such behavior features depends mainly on manual observation, hand-crafted design and high-precision image segmentation, so these methods place high demands on the pigsty environment and shooting conditions; deep learning can address these problems. Schlemia chamomile et al. used a modified Faster R-CNN to identify five sow postures (standing, sitting, prone, ventral lying and lateral lying) with an average accuracy of over 93%. Nasirahmadi et al. applied Faster R-CNN, SSD and R-FCN to recognize the standing and lying postures of pigs with an average recognition rate of 98.99%.
However, existing deep learning methods identify live pig behavior from static image frames that contain only spatial information, and cannot effectively capture the temporal coherence information of the target.
Fig. 1 is a flowchart of a pig behavior identification method according to an embodiment of the present invention, and as shown in fig. 1, the method includes:
step S1, acquiring a video to be detected; the video to be detected comprises image information of the pig;
specifically, a live pig breeding environment is recorded in real time, and a video to be detected including image information of a pig is acquired and stored.
Step S2, inputting the video to be detected into the feature fusion time sequence segmentation network to obtain and output a behavior recognition result of the pig;
the feature fusion time sequence segmentation network is obtained by training on a sample image set obtained from a sample video.
Specifically, in step S2, the video to be detected obtained in step S1 is input into the pre-trained feature fusion time sequence segmentation network, and the behavior recognition result of the pig is obtained and output.
The feature fusion time sequence segmentation network is obtained by training on a sample image set obtained from a sample video. The time sequence segmentation network consists of a spatial stream network and a temporal stream network and can extract the spatial and temporal information of a video. The feature fusion time sequence segmentation network fuses the spatial stream and the temporal stream on top of the time sequence segmentation network, which can effectively improve the accuracy of pig behavior identification.
It should be noted that, besides obtaining the video to be detected by recording the video in real time in the pig breeding environment, other methods may be used to obtain the video to be detected, which is not limited in this embodiment.
It can be understood that, in the embodiment of the present invention, both the video to be detected and the sample video are videos containing image information of a pig, so that the feature fusion time sequence segmentation network obtained by training the sample image set obtained by the sample video is used for identifying the behavior of the pig.
The embodiment of the invention provides a pig behavior identification method which, by extracting spatio-temporal information from pig behavior videos, can accurately identify pig behaviors in a non-contact, low-cost manner and solves the problems of low efficiency and subjective analysis results caused by recording pig behaviors through manual observation.
Based on the foregoing embodiment, optionally, fig. 2 is a schematic structural diagram of the feature fusion time sequence segmentation network provided by an embodiment of the present invention. As shown in fig. 2, in the pig behavior identification method, the feature fusion time sequence segmentation network includes: a video processing layer, a time stream feature extraction layer, a space stream feature extraction layer, a first fusion layer, a second fusion layer and an identification classification layer;
the video processing layer is used for processing the video to be detected to obtain an image set to be detected; the image set to be detected comprises: a color space feature map and an optical flow feature map; the color space feature map is an image representing spatial information; the optical flow feature map is an image representing temporal information;
the time flow feature extraction layer is used for obtaining the time flow features of the video to be detected according to the optical flow feature diagram;
the spatial stream feature extraction layer is used for obtaining the spatial stream features of the video to be detected according to the color spatial feature map;
the first fusion layer is used for fusing the time flow characteristics and the space flow characteristics to obtain fused characteristics;
the second fusion layer is used for fusing the fused features and the spatial stream features to obtain target features;
and the identification classification layer is used for obtaining a behavior identification result of the pig according to the target characteristics.
Specifically, the feature fusion time sequence segmentation network includes: a video processing layer, a time stream feature extraction layer, a space stream feature extraction layer, a first fusion layer, a second fusion layer and an identification classification layer.
After the video to be detected is input into the network, the video processing layer divides the video V to be detected into K video segments T1, T2, ..., TK, converts each segment into consecutive video frames, and selects one frame from each segment to obtain a color space feature map, based on a color space, for representing spatial information. An optical flow feature map is calculated from the consecutive video frames for representing temporal information. The color space feature maps and optical flow feature maps of the K segments of the video to be detected together form the image set to be detected.
It should be noted that the selection principle for selecting a frame of video in each video segment may be random extraction, or may select a fixed frame number of video frames, or further may extract a video key frame based on a method such as an inter-frame difference method, which is not limited in this embodiment.
Secondly, the color space includes an RGB color space, an HSI color space, an HVS color space, and the like, which can be selected according to the actual situation, and this embodiment does not limit this.
In addition, in order to further enrich the time flow characteristics obtained by the network and improve the accuracy of behavior recognition, one color space characteristic diagram in each part of the video may correspond to multiple optical flow characteristic diagrams, and a specific corresponding proportional relationship may be selected according to an actual situation, which is not limited in this embodiment.
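For illustration only, the processing performed by the video processing layer may be sketched in Python as follows. The function name video_processing_layer, the random-frame selection rule and OpenCV's Farneback optical flow algorithm are illustrative assumptions rather than limitations of this embodiment:

import random

import cv2
import numpy as np

def video_processing_layer(video_path, k=3, flows_per_segment=5):
    # Split the video V into K segments T1..TK; for each segment pick one
    # frame as the color space feature map (spatial information) and
    # compute optical flow maps from consecutive frames (temporal information).
    cap = cv2.VideoCapture(video_path)
    frames = []
    ok, frame = cap.read()
    while ok:
        frames.append(frame)
        ok, frame = cap.read()
    cap.release()

    seg_len = len(frames) // k
    segments = [frames[i * seg_len:(i + 1) * seg_len] for i in range(k)]

    image_set = []
    for seg in segments:
        rgb = seg[random.randrange(len(seg))]  # random extraction rule
        grays = [cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) for f in seg]
        flows = []
        for prev, nxt in zip(grays, grays[1:]):
            # Dense optical flow between consecutive frames; H x W x 2.
            flows.append(cv2.calcOpticalFlowFarneback(
                prev, nxt, None, 0.5, 3, 15, 3, 5, 1.2, 0))
            if len(flows) == flows_per_segment:
                break
        image_set.append({"rgb": rgb, "flows": np.stack(flows)})
    return image_set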
The feature extraction layers of the feature fusion time sequence segmentation network are divided into a temporal stream feature extraction layer and a spatial stream feature extraction layer. The color space feature maps in the image set to be detected are input into the spatial stream feature extraction layer, which outputs the spatial stream features of the video to be detected. The optical flow feature maps in the image set to be detected are input into the temporal stream feature extraction layer, which outputs the temporal stream features of the video to be detected.
The feature fusion layers of the feature fusion time sequence segmentation network are divided into a first fusion layer and a second fusion layer. The acquired temporal stream features and spatial stream features are input into the first fusion layer to obtain the fused features. The fused features, together with the previously obtained spatial stream features, are input into the second fusion layer to obtain the target features.
The temporal stream features and the spatial stream features may be fused in the first fusion layer by connection (concatenation) fusion, in which the feature maps are stacked as different channels to form the fused feature map. Taking two feature maps as an example, the fusion takes the form:

y_cat = f_cat(x_a, x_b)

where x_a and x_b are feature maps with d channels each, and y_cat is the feature map with 2d channels obtained by connection fusion.

A convolution kernel is then used to reduce the dimension of the connected-and-fused feature map along the channel dimension:

y_conv = y_cat * f + b

where y_cat is the feature map obtained by connection fusion, f is the convolution kernel, b is the bias, and y_conv is the output fused feature map.
In the second fusion layer, the fused features and the spatial stream features are fused by averaging to obtain the target features.
The above is one specific example of the fusion process; other fusion mechanisms may also be used for feature fusion, which is not limited in this embodiment.
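For illustration only, the two fusion steps may be sketched in PyTorch as follows. The 1x1 kernel size, the channel width of 256 and the tensor shapes are assumptions, not limitations of this embodiment:

import torch
import torch.nn as nn

class TwoStageFusion(nn.Module):
    # First fusion: stack the temporal and spatial feature maps as
    # channels (d + d = 2d) and reduce back to d channels with a
    # convolution, i.e. y_conv = y_cat * f + b. Second fusion: average
    # the fused features with the spatial stream features.
    def __init__(self, channels):
        super().__init__()
        self.reduce = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, temporal, spatial):
        y_cat = torch.cat([temporal, spatial], dim=1)  # B x 2d x H x W
        y_conv = self.reduce(y_cat)                    # B x d x H x W
        return (y_conv + spatial) / 2                  # averaging fusion

fusion = TwoStageFusion(channels=256)
t = torch.randn(4, 256, 7, 7)  # temporal stream features
s = torch.randn(4, 256, 7, 7)  # spatial stream features
print(fusion(t, s).shape)      # torch.Size([4, 256, 7, 7])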
The obtained target features are input into the recognition classification layer to obtain the behavior recognition result of the pig.
It should be noted that the final behavior recognition result may output only the behavior with the highest probability, or may output several behaviors with higher probabilities, which is not limited in this embodiment.
It should also be noted that, when the video processing layer processes the video to obtain video frames, image processing (for example, image enhancement or denoising) may further be applied to the frames to strengthen the image's ability to represent the target and thereby further improve recognition accuracy; the specific processing method is not limited in this embodiment.
On the basis of the above embodiment, because consecutive frames in a video change very little and dense sampling is relatively costly, the feature fusion time sequence segmentation network in the embodiment of the invention adopts a sparse sampling strategy: by first segmenting the whole video and then sampling, the network can process longer videos, so the temporal features it obtains are richer, laying a foundation for subsequent video-based live pig behavior identification work.
By fusing the temporal stream features with the spatial stream features and then performing a second fusion with the earlier spatial stream features, the network structure is more end-to-end and requires fewer manual data design and processing steps, and the correspondence between the feature maps of the temporal network and the spatial network at each spatial position of the picture can be obtained effectively. The second fusion retains more spatial stream information and further deepens the network while reducing the risk of overfitting, thereby improving the accuracy of live pig behavior identification.
Based on the above embodiment, optionally, in the pig behavior identification method, the feature fusion time-series segmentation network further includes: a data storage layer;
the data storage layer is used for storing the image set to be detected output by the video processing layer, selecting a color space feature map according to a preset rule and inputting the color space feature map into the space stream feature extraction layer, and selecting an optical flow feature map and inputting the optical flow feature map into the time stream feature extraction layer respectively.
Specifically, one color space feature map in each segment of the video can correspond to multiple optical flow feature maps. The data storage layer in the feature fusion time sequence segmentation network is used for storing the image set to be detected output by the video processing layer, extracting color space feature maps and optical flow feature maps from the image set to be detected in a preset corresponding proportion, and inputting them into the spatial stream feature extraction layer and the temporal stream feature extraction layer respectively.
It should be noted that the specific corresponding proportional relationship may be selected according to actual situations, and this embodiment does not limit this.
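For illustration only, the selection performed by the data storage layer may be sketched as follows, continuing the dictionary layout of the sketch above; the 1:5 ratio of color space feature maps to optical flow feature maps is an assumed preset rule, not a limitation of this embodiment:

def select_for_streams(image_set, flows_per_rgb=5):
    # Preset 1:N rule: one color space feature map is handed to the
    # spatial stream, N optical flow maps to the temporal stream.
    spatial_inputs, temporal_inputs = [], []
    for entry in image_set:  # entries as produced by video_processing_layer
        spatial_inputs.append(entry["rgb"])
        temporal_inputs.append(entry["flows"][:flows_per_rgb])
    return spatial_inputs, temporal_inputs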
On the basis of the above embodiments, the method for extracting the temporal flow and spatial flow features by using one color-space feature map corresponding to a plurality of optical flow feature maps as input in the embodiments of the present invention can better capture the motion information of the target in the video segment, and effectively improve the accuracy of behavior recognition.
Based on the above embodiment, optionally, in the method for identifying pig behavior, before the step of inputting the video to be detected into the feature fusion time sequence segmentation network and obtaining and outputting the behavior recognition result, the method further includes: training the feature fusion time sequence segmentation network;
the training of the feature fusion time sequence segmentation network specifically comprises:
obtaining a sample image set from a sample video; the sample image set includes: a color space feature map and an optical flow feature map;
training the feature fusion time sequence segmentation network by using the sample image set;
taking a cross entropy function as a cost function, and obtaining a gradient for a network parameter of the feature fusion time sequence segmentation network by using a back propagation algorithm;
updating the network parameters of the feature fusion time sequence segmentation network based on the gradient, and performing iterative training on the feature fusion time sequence segmentation network based on the updated network parameters until the feature fusion time sequence segmentation network converges.
Specifically, before the feature fusion time sequence segmentation network is used, it needs to be trained. A certain number of sample videos are selected and processed to obtain a sample image set comprising the color space feature maps and optical flow feature maps of the sample videos, which serves as the training samples of the feature fusion time sequence segmentation network.
The feature fusion time sequence segmentation network can be expressed by the formula:

FTS(V) = H(G(F(T1, T1rgb, T1flows; W), ..., F(TK, TKrgb, TKflows; W)))

where V denotes the input video data, TK denotes the K-th video segment, TKrgb denotes the color space feature map corresponding to the K-th video segment, and TKflows denotes the optical flow feature maps corresponding to the K-th video segment. W denotes the network parameters, and F(TK, TKrgb, TKflows; W) returns the scores of the respective behavior classes for segment TK. The consensus function G integrates the scores of the individual segments to derive a classification score for the entire video. The Softmax function H then gives the probability of each behavior class of the input video data.

In the training process, the Softmax function is taken as the function H to obtain the probability of each behavior class of the input video data:

H_i = exp(G_i) / Σ_{j=1}^{C} exp(G_j)

G_i infers the score of class i from the scores of the same class in all segments using the aggregation function g:

G_i = g(F_i(T1), F_i(T2), ..., F_i(TK)), i = 1, 2, ..., C

where the aggregation function g adopts a uniform average method.

Following the standard categorical cross entropy, the final loss function of the segmental consensus is:

L(y, G) = -Σ_{i=1}^{C} y_i (G_i - log Σ_{j=1}^{C} exp(G_j))

In the back propagation process, the gradient of the loss value L with respect to the model network parameters W is:

∂L(y, G)/∂W = (∂L/∂G) Σ_{k=1}^{K} (∂G/∂F(Tk)) (∂F(Tk)/∂W)

where K is the number of segments into which the video is divided and C is the number of behavior classes; y_i is the class label of class i, and G_j is the score of class j inferred from the scores of the same class in all segments using the aggregation function g.
Updating the network parameters of the feature fusion time sequence segmentation network based on the calculated gradient, and performing iterative training on the feature fusion time sequence segmentation network based on the updated network parameters until the feature fusion time sequence segmentation network converges.
Further, new video data can be collected as test videos and input into the trained feature fusion time sequence segmentation network; the test error is calculated from the network's output, and if the test error is within the tolerable range, the feature fusion time sequence segmentation network has been trained successfully.
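For illustration only, the segmental consensus and the cross entropy cost function may be sketched in PyTorch as follows; the per-segment scores are stubbed with random values, and the batch, segment and class sizes are assumptions:

import torch
import torch.nn.functional as F

def consensus_loss(segment_scores, labels):
    # segment_scores: B x K x C per-segment class scores F(Tk);
    # labels: B ground-truth class indices. G aggregates the segment
    # scores by uniform averaging, and cross entropy on G equals
    # L(y, G) = -sum_i y_i (G_i - log sum_j exp(G_j)).
    g = segment_scores.mean(dim=1)     # G: B x C, averaged over K segments
    return F.cross_entropy(g, labels)  # cross entropy cost function

# Gradients flow through G into every segment's scores, matching the
# gradient formula above, so all K segments update the shared parameters W.
scores = torch.randn(8, 3, 5, requires_grad=True)  # B=8, K=3, C=5
labels = torch.randint(0, 5, (8,))
consensus_loss(scores, labels).backward()
print(scores.grad.shape)  # torch.Size([8, 3, 5])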
Based on the foregoing embodiment, optionally, in the method for identifying pig behavior, the obtaining a sample image set from a sample video specifically includes:
dividing the sample video into a plurality of video segments;
converting each video segment into a succession of video frames;
selecting a frame from the continuous video frames as a color space feature map;
computing optical flow from the successive video frames to obtain an optical flow feature atlas;
and obtaining a sample image set according to the color space characteristic diagram and the optical flow characteristic diagram set obtained from each video segment in the sample video.
Specifically, the obtained sample video is divided into a plurality of video segments, each video segment is converted into consecutive video frames, and one frame is selected from each video segment to obtain a color space feature map, based on a color space, for representing spatial information. An optical flow feature map set is calculated from the consecutive video frames for representing temporal information.
And obtaining a sample image set according to the color space feature map and the optical flow feature map set acquired by each video segment in the sample video, wherein the sample image set is used as a training set of the feature fusion time sequence segmentation network.
It should be noted that the selection principle for selecting a frame of video in each video segment may be random extraction, or may select a fixed frame number of video frames, or further may extract a video key frame based on a method such as an inter-frame difference method, which is not limited in this embodiment.
Secondly, the color space includes an RGB color space, an HSI color space, an HVS color space, and the like, which can be selected according to the actual situation, and this embodiment does not limit this.
In addition, in order to further enrich the time flow characteristics obtained by the network and improve the accuracy of behavior recognition, one color space characteristic diagram in each part of the video may correspond to multiple optical flow characteristic diagrams, and a specific corresponding proportional relationship may be selected according to an actual situation, which is not limited in this embodiment.
On the basis of the above embodiments, the embodiment of the present invention uses a sample image set obtained by processing sample videos as the training set of the feature fusion time sequence segmentation network. This allows features to be learned effectively from whole long videos, improves network training efficiency, and reduces computation cost and time.
Based on the above embodiment, optionally, in the pig behavior identification method, the identification classification layer is specifically configured to:
and classifying the target characteristics to obtain and output a classification score of each behavior in the video to be detected.
Specifically, the Softmax function is taken as the function H to obtain the probability of each behavior class of the input video data:

H_i = exp(G_i) / Σ_{j=1}^{C} exp(G_j)

G_i infers the score of class i from the scores of the same class in all segments using the aggregation function g:

G_i = g(F_i(T1), F_i(T2), ..., F_i(TK)), i = 1, 2, ..., C

where the aggregation function g adopts a uniform average method.
Based on the foregoing embodiment, optionally, in the pig behavior identification method, the color space feature map is an RGB map, and the optical flow feature map includes: an optical flow map and a warped optical flow map.
Specifically, when the video is processed, the color space is determined to be the RGB color space and an RGB map is obtained; the RGB map represents static information at a specific point in time.
An optical flow map and a warped optical flow map are computed from the RGB maps of successive frames. The optical flow map captures motion information, while the warped optical flow map effectively suppresses background motion so that attention concentrates on the moving objects in the video.
On this basis, besides the optical flow map and the warped optical flow map, RGB differences can also be introduced as temporal information: the RGB difference of two consecutive frames represents the change of motion and corresponds well to motion-salient regions.
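For illustration only, the RGB difference of two consecutive frames may be computed as follows; rescaling the signed difference into the [0, 255] range is an assumption about how it would be fed to the network:

import numpy as np

def rgb_difference(prev_frame, next_frame):
    # Per-pixel RGB difference of consecutive frames: large values
    # concentrate on motion-salient regions and can serve as temporal
    # information alongside the optical flow maps.
    diff = next_frame.astype(np.int16) - prev_frame.astype(np.int16)
    return ((diff + 255) // 2).astype(np.uint8)  # shift into [0, 255]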
Fig. 3 is a schematic structural diagram of a pig behavior recognition device according to an embodiment of the present invention. As shown in fig. 3, the pig behavior recognition device includes:
an obtaining module 310, configured to obtain a video to be tested; the video to be detected comprises image information of the pig;
the identification module 320 is used for inputting the video to be detected into the feature fusion time sequence segmentation network to obtain and output a behavior identification result of the pig;
the feature fusion time sequence segmentation network is obtained by training on a sample image set obtained from a sample video.
Specifically, the obtaining module 310 is configured to record the live pig breeding environment in real time and to obtain and store a video to be detected comprising image information of the pig.
The recognition module 320 is configured to input the video to be detected acquired by the obtaining module 310 into the pre-trained feature fusion time sequence segmentation network and to obtain and output a behavior recognition result of the pig.
The feature fusion time sequence segmentation network is obtained by training on a sample image set obtained from a sample video. The time sequence segmentation network consists of a spatial stream network and a temporal stream network and can extract the spatial and temporal information of a video. The feature fusion time sequence segmentation network fuses the spatial stream and the temporal stream on top of the time sequence segmentation network, which can effectively improve the accuracy of pig behavior identification.
It should be noted that, besides obtaining the video to be detected by recording the video in real time in the pig breeding environment, other methods may be used to obtain the video to be detected, which is not limited in this embodiment.
It can be understood that, in the embodiment of the present invention, both the video to be detected and the sample video are videos containing image information of a pig, so that the feature fusion time sequence segmentation network obtained by training the sample image set obtained by the sample video is used for identifying the behavior of the pig.
The embodiment of the invention provides a pig behavior recognition device which, by extracting spatio-temporal information from pig behavior videos, can accurately identify pig behaviors in a non-contact, low-cost manner and solves the problems of low efficiency and subjective analysis results caused by recording pig behaviors through manual observation.
It should be noted that the pig behavior recognition device provided in the embodiment of the present invention is used for executing the above pig behavior recognition method; its specific implementation is consistent with the implementation of the method and is not described again here.
Fig. 4 is a schematic structural diagram of the entity of an electronic device according to an embodiment of the present invention. As shown in fig. 4, the electronic device may include: a processor 410, a communication interface 420, a memory 430 and a communication bus 440, wherein the processor 410, the communication interface 420 and the memory 430 communicate with each other via the communication bus 440. The processor 410 may call logic instructions in the memory 430 to perform the above pig behavior identification method, including: acquiring a video to be detected, the video to be detected comprising image information of the pig; inputting the video to be detected into the feature fusion time sequence segmentation network to obtain and output a behavior recognition result of the pig; the feature fusion time sequence segmentation network being obtained by training on a sample image set obtained from a sample video.
In addition, the logic instructions in the memory 430 may be implemented in the form of software functional units and stored in a computer readable storage medium when the software functional units are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and the like.
In another aspect, an embodiment of the present invention further provides a computer program product, where the computer program product includes a computer program stored on a non-transitory computer-readable storage medium, the computer program includes program instructions, and when the program instructions are executed by a computer, the computer is capable of executing the behavior recognition method for pigs provided by the above-mentioned embodiments of the method, including: acquiring a video to be detected; the video to be detected comprises image information of the pig; inputting the video to be detected into a characteristic fusion time sequence segmentation network to obtain and output a behavior recognition result of the pig; the feature fusion time sequence segmentation network is obtained by training based on a sample image set obtained by a sample video.
In still another aspect, an embodiment of the present invention further provides a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program is implemented by a processor to execute the method for identifying pig behavior provided in the foregoing embodiments, and the method includes: acquiring a video to be detected; the video to be detected comprises image information of the pig; inputting the video to be detected into a characteristic fusion time sequence segmentation network to obtain and output a behavior recognition result of the pig; the feature fusion time sequence segmentation network is obtained by training based on a sample image set obtained by a sample video.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A behavior recognition method for pigs is characterized by comprising the following steps:
acquiring a video to be detected; the video to be detected comprises image information of the pig;
inputting the video to be detected into a feature fusion time sequence segmentation network to obtain and output a behavior recognition result of the pig;
the feature fusion time sequence segmentation network is obtained by training on a sample image set obtained from a sample video.
2. The method for identifying pig behavior according to claim 1,
the feature fusion time sequence segmentation network comprises: a video processing layer, a time stream feature extraction layer, a space stream feature extraction layer, a first fusion layer, a second fusion layer and an identification classification layer;
the video processing layer is used for processing the video to be detected to obtain an image set to be detected; the image set to be detected comprises: a color space feature map and an optical flow feature map; the color space feature map is an image representing spatial information; the optical flow feature map is an image representing temporal information;
the time flow feature extraction layer is used for obtaining the time flow features of the video to be detected according to the optical flow feature diagram;
the spatial stream feature extraction layer is used for obtaining the spatial stream features of the video to be detected according to the color spatial feature map;
the first fusion layer is used for fusing the time flow characteristics and the space flow characteristics to obtain fused characteristics;
the second fusion layer is used for fusing the fused features and the spatial stream features to obtain target features;
and the identification classification layer is used for obtaining a behavior identification result of the pig according to the target characteristics.
3. The method for identifying pig behavior according to claim 2,
the feature fusion temporal segmentation network further comprises: a data storage layer;
the data storage layer is used for storing the image set to be detected output by the video processing layer, selecting a color space feature map according to a preset rule and inputting the color space feature map into the space stream feature extraction layer, and selecting an optical flow feature map and inputting the optical flow feature map into the time stream feature extraction layer respectively.
4. The method for identifying pig behavior according to claim 2,
before the step of inputting the video to be detected into the feature fusion time sequence segmentation network and obtaining and outputting the behavior recognition result, the method further comprises: training the feature fusion time sequence segmentation network;
the training of the feature fusion time sequence segmentation network specifically comprises:
obtaining a sample image set from a sample video; the sample image set includes: a color space feature map and an optical flow feature map;
training the feature fusion time sequence segmentation network by using the sample image set;
taking a cross entropy function as a cost function, and obtaining a gradient for a network parameter of the feature fusion time sequence segmentation network by using a back propagation algorithm;
updating the network parameters of the feature fusion time sequence segmentation network based on the gradient, and performing iterative training on the feature fusion time sequence segmentation network based on the updated network parameters until the feature fusion time sequence segmentation network converges.
5. The method for identifying pig behavior according to claim 4,
the obtaining of the sample image set from the sample video specifically includes:
dividing the sample video into a plurality of video segments;
converting each video segment into a succession of video frames;
selecting a frame from the continuous video frames as a color space feature map;
computing optical flow from the successive video frames to obtain an optical flow feature map set;
and obtaining a sample image set according to the color space characteristic diagram and the optical flow characteristic diagram set obtained from each video segment in the sample video.
6. The method for identifying pig behavior according to claim 2,
the identification classification layer is specifically configured to:
and classifying the target characteristics to obtain and output a classification score of each behavior in the video to be detected.
7. The pig behavior recognition method according to any one of claims 2-6, wherein the color space feature map is an RGB map, and the optical flow feature map comprises: an optical flow map and a warped optical flow map.
8. A behavior recognition device for pigs, comprising:
the acquisition module is used for acquiring a video to be detected; the video to be detected comprises image information of the pig;
the identification module is used for inputting the video to be detected into the feature fusion time sequence segmentation network to obtain and output a behavior identification result of the pig;
the feature fusion time sequence segmentation network is obtained by training on a sample image set obtained from a sample video.
9. An electronic device, comprising a memory and a processor, wherein the processor and the memory communicate with each other via a bus; the memory stores program instructions executable by the processor, the processor calling the program instructions to perform the method of behavioral recognition of pigs according to any one of claims 1 to 7.
10. A non-transitory computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the method of behavioral recognition in a pig according to any one of claims 1 to 7.
CN202011052538.9A 2020-09-29 2020-09-29 Pig behavior identification method and device, electronic equipment and storage medium Pending CN112215107A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011052538.9A CN112215107A (en) 2020-09-29 2020-09-29 Pig behavior identification method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011052538.9A CN112215107A (en) 2020-09-29 2020-09-29 Pig behavior identification method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112215107A true CN112215107A (en) 2021-01-12

Family

ID=74052174

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011052538.9A Pending CN112215107A (en) 2020-09-29 2020-09-29 Pig behavior identification method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112215107A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112801061A (en) * 2021-04-07 2021-05-14 南京百伦斯智能科技有限公司 Posture recognition method and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110032942A (en) * 2019-03-15 2019-07-19 中山大学 Action identification method based on Time Domain Piecewise and signature differential
CN111046821A (en) * 2019-12-19 2020-04-21 东北师范大学人文学院 Video behavior identification method and system and electronic equipment
US10672383B1 (en) * 2018-12-04 2020-06-02 Sorenson Ip Holdings, Llc Training speech recognition systems using word sequences
CN111709351A (en) * 2020-06-11 2020-09-25 江南大学 Three-branch network behavior identification method based on multipath space-time characteristic reinforcement fusion

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10672383B1 (en) * 2018-12-04 2020-06-02 Sorenson Ip Holdings, Llc Training speech recognition systems using word sequences
CN110032942A (en) * 2019-03-15 2019-07-19 中山大学 Action identification method based on Time Domain Piecewise and signature differential
CN111046821A (en) * 2019-12-19 2020-04-21 东北师范大学人文学院 Video behavior identification method and system and electronic equipment
CN111709351A (en) * 2020-06-11 2020-09-25 江南大学 Three-branch network behavior identification method based on multipath space-time characteristic reinforcement fusion

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李洪均, 丁宇鹏, 李超波, 张士兵: "Research on behavior recognition based on a feature fusion temporal segmentation network" (基于特征融合时序分割网络的行为识别研究), Journal of Computer Research and Development (《计算机研究与发展》), vol. 57, no. 01, p. 145

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112801061A (en) * 2021-04-07 2021-05-14 南京百伦斯智能科技有限公司 Posture recognition method and system

Similar Documents

Publication Publication Date Title
CN107977671B (en) Tongue picture classification method based on multitask convolutional neural network
CN111178197B Mask R-CNN and Soft-NMS fusion based group-fed adherent pig instance segmentation method
US20230281265A1 (en) Method for estimating body size and weight of pig based on deep learning
WO2020125057A1 (en) Livestock quantity identification method and apparatus
CN107330403B (en) Yak counting method based on video data
CN112613428B (en) Resnet-3D convolution cattle video target detection method based on balance loss
Hasan et al. Fish diseases detection using convolutional neural network (CNN)
CN108009567A (en) A kind of automatic discriminating conduct of the fecal character of combination color of image and HOG and SVM
Kanjalkar et al. Detection and classification of plant leaf diseases using ANN
CN114898405B (en) Portable broiler chicken anomaly monitoring system based on edge calculation
CN111968081A (en) Fish shoal automatic counting method and device, electronic equipment and storage medium
CN110163103B (en) Live pig behavior identification method and device based on video image
CN116778309A (en) Residual bait monitoring method, device, system and storage medium
CN115471871A (en) Sheldrake gender classification and identification method based on target detection and classification network
Isa et al. CNN transfer learning of shrimp detection for underwater vision system
CN112215107A (en) Pig behavior identification method and device, electronic equipment and storage medium
Muñoz-Benavent et al. Impact evaluation of deep learning on image segmentation for automatic bluefin tuna sizing
Pauzi et al. A review on image processing for fish disease detection
CN113470076A (en) Multi-target tracking method for yellow-feather chickens in flat-breeding henhouse
CN116912674A (en) Target detection method and system based on improved YOLOv5s network model under complex water environment
Gabriel et al. Wildlife Detection and Recognition in Digital Images Using YOLOv3
CN115578423A (en) Fish key point detection, individual tracking and biomass estimation method and system based on deep learning
Woodward-Greene et al. PreciseEdge raster RGB image segmentation algorithm reduces user input for livestock digital body measurements highly correlated to real-world measurements
CN112465821A (en) Multi-scale pest image detection method based on boundary key point perception
Yu et al. An automatic detection and counting method for fish lateral line scales of underwater fish based on improved YOLOv5

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination