CN112215107A - Pig behavior identification method and device, electronic equipment and storage medium - Google Patents

Pig behavior identification method and device, electronic equipment and storage medium Download PDF

Info

Publication number
CN112215107A
CN112215107A
Authority
CN
China
Prior art keywords
video
pig
detected
feature
segmentation network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011052538.9A
Other languages
Chinese (zh)
Inventor
孙龙清
孙美娜
孙希蓓
吴雨寒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Agricultural University
Original Assignee
China Agricultural University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Agricultural University
Priority to CN202011052538.9A
Publication of CN112215107A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 - Movements or behaviour, e.g. gesture recognition
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/25 - Fusion techniques
    • G06F18/253 - Fusion techniques of extracted features
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 - Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/56 - Extraction of image or video features relating to colour
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/40 - Scenes; Scene-specific elements in video content
    • G06V20/46 - Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames

Abstract

The embodiment of the invention provides a pig behavior identification method and device, electronic equipment and a storage medium, wherein the method comprises the following steps: acquiring a video to be detected, the video to be detected comprising image information of the pig; and inputting the video to be detected into a feature fusion time sequence segmentation network to obtain and output a behavior recognition result of the pig; the feature fusion time sequence segmentation network is obtained by training on a sample image set obtained from a sample video. The embodiment of the invention can realize non-contact, low-cost identification of pig behaviors, solves the problems of low efficiency and subjective analysis results caused by recording pig behaviors through manual observation, and thereby provides a technical basis for monitoring and analyzing large-scale pig breeding.

Description

Pig behavior identification method and device, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of livestock breeding, in particular to a pig behavior identification method and device, electronic equipment and a storage medium.
Background
With the development of society, people's living standards have improved markedly, and the demand for meat products has grown from merely pursuing sufficient quantity to also pursuing quality.
In live pig breeding, modernized, intensive farming improves production efficiency, effectively reduces production costs and increases economic benefits. However, an excessively intensive feeding mode strongly affects the life of the pigs and can cause abnormal behaviors and epidemic diseases, which easily spread to other pigs in the same pen and may even endanger the entire pig farm.
In the traditional approach, sick pigs in pig farms are generally observed and recorded manually by workers. On the one hand, diseases risk being transmitted to the workers. On the other hand, China is one of the world's largest pig-breeding countries; farms are large in scale but observers are few, and information is often forgotten or misrecorded owing to fatigue and negligence.
Manually monitoring the behavior of live pigs is therefore time-consuming and subjective. Monitoring live pig behavior with sensors also has defects: most sensors are attached to the pig's body surface, which easily triggers a stress response and alters the pig's normal behavior.
Therefore, how to provide a pig behavior identification method and device, electronic equipment and a storage medium that can identify pig behaviors in a non-contact, low-cost manner, solve the problems of low efficiency and subjective analysis results caused by manual observation and recording of pig behaviors, and thereby provide an effective solution for monitoring and analyzing large-scale pig breeding, has become a problem to be solved urgently.
Disclosure of Invention
The embodiment of the invention provides a pig behavior identification method and device, electronic equipment and a storage medium, which are used for overcoming the defects of low efficiency and subjective analysis results caused by manual observation in the prior art and achieving non-contact, low-cost identification of pig behaviors.
In a first aspect, an embodiment of the present invention provides a pig behavior identification method, including:
acquiring a video to be detected; the video to be detected comprises image information of the pig;
inputting the video to be detected into a feature fusion time sequence segmentation network to obtain and output a behavior recognition result of the pig;
the feature fusion time sequence segmentation network is obtained by training on a sample image set obtained from a sample video.
Optionally, in the behavior recognition method for pigs, the feature fusion time sequence segmentation network includes: a video processing layer, a time stream feature extraction layer, a space stream feature extraction layer, a first fusion layer, a second fusion layer and an identification classification layer;
the video processing layer is used for processing the video to be detected to obtain an image set to be detected; the image set to be detected comprises: a color space feature map and an optical flow feature map; the color space feature map is an image representing spatial information; the optical flow feature map is an image representing temporal information;
the time flow feature extraction layer is used for obtaining the time flow features of the video to be detected according to the optical flow feature diagram;
the spatial stream feature extraction layer is used for obtaining the spatial stream features of the video to be detected according to the color spatial feature map;
the first fusion layer is used for fusing the time flow characteristics and the space flow characteristics to obtain fused characteristics;
the second fusion layer is used for fusing the fused features and the spatial stream features to obtain target features;
and the identification classification layer is used for obtaining a behavior identification result of the pig according to the target characteristics.
Optionally, in the pig behavior identification method, the feature fusion time sequence segmentation network further includes: a data storage layer;
the data storage layer is used for storing the image set to be detected output by the video processing layer, selecting a color space feature map according to a preset rule and inputting the color space feature map into the space stream feature extraction layer, and selecting an optical flow feature map and inputting the optical flow feature map into the time stream feature extraction layer respectively.
Optionally, in the method for identifying pig behavior, before the step of inputting the video to be detected into the feature fusion time sequence segmentation network and obtaining and outputting the behavior recognition result, the method further includes: training the feature fusion time sequence segmentation network;
the training of the feature fusion time sequence segmentation network specifically comprises:
obtaining a sample image set from a sample video; the sample image set includes: a color space feature map and an optical flow feature map;
training the feature fusion time sequence segmentation network by using the sample image set;
taking a cross entropy function as a cost function, and obtaining a gradient for a network parameter of the feature fusion time sequence segmentation network by using a back propagation algorithm;
updating the network parameters of the feature fusion time sequence segmentation network based on the gradient, and performing iterative training on the feature fusion time sequence segmentation network based on the updated network parameters until the feature fusion time sequence segmentation network converges.
Optionally, in the method for identifying pig behavior, the obtaining a sample image set from a sample video specifically includes:
dividing the sample video into a plurality of video segments;
converting each video segment into a succession of video frames;
selecting a frame from the continuous video frames as a color space feature map;
computing optical flow from the successive video frames to obtain an optical flow feature map set;
and obtaining a sample image set according to the color space characteristic diagram and the optical flow characteristic diagram set obtained from each video segment in the sample video.
Optionally, in the behavior recognition method for pigs, the recognition classification layer is specifically configured to:
and classifying the target characteristics to obtain and output a classification score of each behavior in the video to be detected.
Optionally, in the pig behavior recognition method, the color space feature map is an RGB map, and the optical flow feature map includes: an optical flow map and a warped optical flow map.
In a second aspect, an embodiment of the present invention provides a pig behavior identification device, including:
the acquisition module is used for acquiring a video to be detected; the video to be detected comprises image information of the pig;
the identification module is used for inputting the video to be detected into the feature fusion time sequence segmentation network to obtain and output a behavior identification result of the pig;
the feature fusion time sequence segmentation network is obtained by training on a sample image set obtained from a sample video.
In a third aspect, an embodiment of the present invention provides an electronic device, including a memory and a processor, where the processor and the memory complete communication with each other through a bus; the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform the steps of the pig behavior recognition method as described above.
In a fourth aspect, embodiments of the present invention provide a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the above pig behavior recognition method.
According to the pig behavior recognition method and device, the electronic equipment and the storage medium, the time-space information is extracted from the pig behavior video to recognize the pig behavior, so that the pig behavior can be recognized accurately in a non-contact and low-cost mode, and the problems of low efficiency, non-objective analysis result and the like caused by the fact that the pig behavior is recorded by means of manual observation are solved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
Fig. 1 is a flowchart of a pig behavior identification method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a feature fusion time sequence segmentation network according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a pig behavior recognition device according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The non-contact, low-cost, simple and effective computer vision technology is widely applied in the animal detection process and plays an important role in animal behavior evaluation.
Nasirahmadi et al. performed pig behavior recognition by least-squares ellipse fitting: mounting (climbing) behavior is judged when the major axis of the fitted ellipse is 1.3-2 times the length of the normal ellipse's major axis and the minor axis is 1.3-1.8 times the length of the normal minor axis; the accuracy of this method is 92.7%.
Lao et al. obtained feature values for identifying sow behaviors from depth image data: when the head of a live pig moves up and down at the feeder, feeding behavior is judged; the accuracy of this method is 97.4%.
Extraction of such behavior features depends mainly on manual observation, hand-crafted design and high-precision image segmentation, so these methods place high demands on the pigsty environment and shooting conditions; deep learning can address these problems. Schlemia chamomile et al. used a modified Faster R-CNN to identify five sow postures (standing, sitting, prone, ventral lying and lateral lying) with an average accuracy of over 93%. Nasirahmadi et al. applied Faster R-CNN, SSD and R-FCN to recognize the standing and lying postures of pigs with an average recognition rate of 98.99%.
However, existing deep learning methods identify live pig behavior from static image frames that contain only spatial information, and cannot effectively capture the temporal coherence information of the target.
Fig. 1 is a flowchart of a pig behavior identification method according to an embodiment of the present invention, and as shown in fig. 1, the method includes:
step S1, acquiring a video to be detected; the video to be detected comprises image information of the pig;
specifically, a live pig breeding environment is recorded in real time, and a video to be detected including image information of a pig is acquired and stored.
Step S2, inputting the video to be detected into the feature fusion time sequence segmentation network to obtain and output a behavior recognition result of the pig;
the feature fusion time sequence segmentation network is obtained by training on a sample image set obtained from a sample video.
Specifically, in step S2, the video to be detected obtained in step S1 is input into the pre-trained feature fusion time sequence segmentation network, and the behavior recognition result of the pig is obtained and output.
The feature fusion time sequence segmentation network is obtained by training on a sample image set obtained from a sample video. The time sequence segmentation network consists of a spatial stream network and a temporal stream network and can extract the spatial and temporal information of a video. The feature fusion time sequence segmentation network fuses the spatial stream and the temporal stream on top of the time sequence segmentation network, which can effectively improve the accuracy of pig behavior identification.
It should be noted that, besides obtaining the video to be detected by recording the video in real time in the pig breeding environment, other methods may be used to obtain the video to be detected, which is not limited in this embodiment.
It can be understood that, in the embodiment of the present invention, both the video to be detected and the sample video are videos containing image information of a pig, so that the feature fusion time sequence segmentation network obtained by training the sample image set obtained by the sample video is used for identifying the behavior of the pig.
The embodiment of the invention provides a pig behavior identification method which, by extracting spatio-temporal information from pig behavior videos, can accurately identify pig behaviors in a non-contact, low-cost manner and solves the problems of low efficiency and subjective analysis results caused by recording pig behaviors through manual observation.
Based on the foregoing embodiment, optionally, fig. 2 is a schematic structural diagram of the feature fusion time sequence segmentation network provided by an embodiment of the present invention. As shown in fig. 2, in the pig behavior identification method, the feature fusion time sequence segmentation network includes: a video processing layer, a time stream feature extraction layer, a space stream feature extraction layer, a first fusion layer, a second fusion layer and an identification classification layer;
the video processing layer is used for processing the video to be detected to obtain an image set to be detected; the image set to be detected comprises: a color space feature map and an optical flow feature map; the color space feature map is an image representing spatial information; the optical flow feature map is an image representing temporal information;
the time flow feature extraction layer is used for obtaining the time flow features of the video to be detected according to the optical flow feature diagram;
the spatial stream feature extraction layer is used for obtaining the spatial stream features of the video to be detected according to the color spatial feature map;
the first fusion layer is used for fusing the time flow characteristics and the space flow characteristics to obtain fused characteristics;
the second fusion layer is used for fusing the fused features and the spatial stream features to obtain target features;
and the identification classification layer is used for obtaining a behavior identification result of the pig according to the target characteristics.
Specifically, the feature fusion time sequence segmentation network includes: a video processing layer, a time stream feature extraction layer, a space stream feature extraction layer, a first fusion layer, a second fusion layer and an identification classification layer.
After the video to be detected is input into the network, the video processing layer divides the video V to be detected into K video segments T1, T2, ..., TK, converts each segment into consecutive video frames, and selects one frame from each segment to obtain a color space feature map, based on a color space, for representing spatial information. An optical flow feature map is calculated from the consecutive video frames for representing temporal information. The color space feature maps and optical flow feature maps of the K segments of the video to be detected together form the image set to be detected.
It should be noted that the selection principle for selecting a frame of video in each video segment may be random extraction, or may select a fixed frame number of video frames, or further may extract a video key frame based on a method such as an inter-frame difference method, which is not limited in this embodiment.
Secondly, the color space includes an RGB color space, an HSI color space, an HVS color space, and the like, which can be selected according to the actual situation, and this embodiment does not limit this.
In addition, in order to further enrich the time flow characteristics obtained by the network and improve the accuracy of behavior recognition, one color space characteristic diagram in each part of the video may correspond to multiple optical flow characteristic diagrams, and a specific corresponding proportional relationship may be selected according to an actual situation, which is not limited in this embodiment.
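For illustration only, the processing performed by the video processing layer may be sketched in Python as follows. The function name video_processing_layer, the random-frame selection rule and OpenCV's Farneback optical flow algorithm are illustrative assumptions rather than limitations of this embodiment:

import random

import cv2
import numpy as np

def video_processing_layer(video_path, k=3, flows_per_segment=5):
    # Split the video V into K segments T1..TK; for each segment pick one
    # frame as the color space feature map (spatial information) and
    # compute optical flow maps from consecutive frames (temporal information).
    cap = cv2.VideoCapture(video_path)
    frames = []
    ok, frame = cap.read()
    while ok:
        frames.append(frame)
        ok, frame = cap.read()
    cap.release()

    seg_len = len(frames) // k
    segments = [frames[i * seg_len:(i + 1) * seg_len] for i in range(k)]

    image_set = []
    for seg in segments:
        rgb = seg[random.randrange(len(seg))]  # random extraction rule
        grays = [cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) for f in seg]
        flows = []
        for prev, nxt in zip(grays, grays[1:]):
            # Dense optical flow between consecutive frames; H x W x 2.
            flows.append(cv2.calcOpticalFlowFarneback(
                prev, nxt, None, 0.5, 3, 15, 3, 5, 1.2, 0))
            if len(flows) == flows_per_segment:
                break
        image_set.append({"rgb": rgb, "flows": np.stack(flows)})
    return image_set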
The feature extraction layers of the feature fusion time sequence segmentation network are divided into a temporal stream feature extraction layer and a spatial stream feature extraction layer. The color space feature maps in the image set to be detected are input into the spatial stream feature extraction layer, which outputs the spatial stream features of the video to be detected. The optical flow feature maps in the image set to be detected are input into the temporal stream feature extraction layer, which outputs the temporal stream features of the video to be detected.
The feature fusion layers of the feature fusion time sequence segmentation network are divided into a first fusion layer and a second fusion layer. The acquired temporal stream features and spatial stream features are input into the first fusion layer to obtain the fused features. The fused features, together with the previously obtained spatial stream features, are input into the second fusion layer to obtain the target features.
The temporal stream features and the spatial stream features may be fused in the first fusion layer by connection (concatenation) fusion, in which the feature maps are stacked as different channels to form the fused feature map. Taking two feature maps as an example, the fusion takes the form:

y_cat = f_cat(x_a, x_b)

where x_a and x_b are feature maps with d channels each, and y_cat is the feature map with 2d channels obtained by connection fusion.

A convolution kernel is then used to reduce the dimension of the connected-and-fused feature map along the channel dimension:

y_conv = y_cat * f + b

where y_cat is the feature map obtained by connection fusion, f is the convolution kernel, b is the bias, and y_conv is the output fused feature map.
In the second fusion layer, the fused features and the spatial stream features are fused by averaging to obtain the target features.
The above is one specific example of the fusion process; other fusion mechanisms may also be used for feature fusion, which is not limited in this embodiment.
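For illustration only, the two fusion steps may be sketched in PyTorch as follows. The 1x1 kernel size, the channel width of 256 and the tensor shapes are assumptions, not limitations of this embodiment:

import torch
import torch.nn as nn

class TwoStageFusion(nn.Module):
    # First fusion: stack the temporal and spatial feature maps as
    # channels (d + d = 2d) and reduce back to d channels with a
    # convolution, i.e. y_conv = y_cat * f + b. Second fusion: average
    # the fused features with the spatial stream features.
    def __init__(self, channels):
        super().__init__()
        self.reduce = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, temporal, spatial):
        y_cat = torch.cat([temporal, spatial], dim=1)  # B x 2d x H x W
        y_conv = self.reduce(y_cat)                    # B x d x H x W
        return (y_conv + spatial) / 2                  # averaging fusion

fusion = TwoStageFusion(channels=256)
t = torch.randn(4, 256, 7, 7)  # temporal stream features
s = torch.randn(4, 256, 7, 7)  # spatial stream features
print(fusion(t, s).shape)      # torch.Size([4, 256, 7, 7])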
The obtained target features are input into the recognition classification layer to obtain the behavior recognition result of the pig.
It should be noted that the final behavior recognition result may output only the behavior with the highest probability, or may output several behaviors with higher probabilities, which is not limited in this embodiment.
It should also be noted that, when the video processing layer processes the video to obtain video frames, image processing (for example, image enhancement or denoising) may further be applied to the frames to strengthen the image's ability to represent the target and thereby further improve recognition accuracy; the specific processing method is not limited in this embodiment.
On the basis of the above embodiment, because consecutive frames in a video change very little and dense sampling is relatively costly, the feature fusion time sequence segmentation network in the embodiment of the invention adopts a sparse sampling strategy: by first segmenting the whole video and then sampling, the network can process longer videos, so the temporal features it obtains are richer, laying a foundation for subsequent video-based live pig behavior identification work.
By fusing the temporal stream features with the spatial stream features and then performing a second fusion with the earlier spatial stream features, the network structure is more end-to-end and requires fewer manual data design and processing steps, and the correspondence between the feature maps of the temporal network and the spatial network at each spatial position of the picture can be obtained effectively. The second fusion retains more spatial stream information and further deepens the network while reducing the risk of overfitting, thereby improving the accuracy of live pig behavior identification.
Based on the above embodiment, optionally, in the pig behavior identification method, the feature fusion time-series segmentation network further includes: a data storage layer;
the data storage layer is used for storing the image set to be detected output by the video processing layer, selecting a color space feature map according to a preset rule and inputting the color space feature map into the space stream feature extraction layer, and selecting an optical flow feature map and inputting the optical flow feature map into the time stream feature extraction layer respectively.
Specifically, one color space feature map in each segment of the video can correspond to multiple optical flow feature maps. The data storage layer in the feature fusion time sequence segmentation network is used for storing the image set to be detected output by the video processing layer, extracting color space feature maps and optical flow feature maps from the image set to be detected in a preset corresponding proportion, and inputting them into the spatial stream feature extraction layer and the temporal stream feature extraction layer respectively.
It should be noted that the specific corresponding proportional relationship may be selected according to actual situations, and this embodiment does not limit this.
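For illustration only, the selection performed by the data storage layer may be sketched as follows, continuing the dictionary layout of the sketch above; the 1:5 ratio of color space feature maps to optical flow feature maps is an assumed preset rule, not a limitation of this embodiment:

def select_for_streams(image_set, flows_per_rgb=5):
    # Preset 1:N rule: one color space feature map is handed to the
    # spatial stream, N optical flow maps to the temporal stream.
    spatial_inputs, temporal_inputs = [], []
    for entry in image_set:  # entries as produced by video_processing_layer
        spatial_inputs.append(entry["rgb"])
        temporal_inputs.append(entry["flows"][:flows_per_rgb])
    return spatial_inputs, temporal_inputs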
On the basis of the above embodiments, the method for extracting the temporal flow and spatial flow features by using one color-space feature map corresponding to a plurality of optical flow feature maps as input in the embodiments of the present invention can better capture the motion information of the target in the video segment, and effectively improve the accuracy of behavior recognition.
Based on the above embodiment, optionally, in the method for identifying pig behavior, before the step of inputting the video to be detected into the feature fusion time sequence segmentation network and obtaining and outputting the behavior recognition result, the method further includes: training the feature fusion time sequence segmentation network;
the training of the feature fusion time sequence segmentation network specifically comprises:
obtaining a sample image set from a sample video; the sample image set includes: a color space feature map and an optical flow feature map;
training the feature fusion time sequence segmentation network by using the sample image set;
taking a cross entropy function as a cost function, and obtaining a gradient for a network parameter of the feature fusion time sequence segmentation network by using a back propagation algorithm;
updating the network parameters of the feature fusion time sequence segmentation network based on the gradient, and performing iterative training on the feature fusion time sequence segmentation network based on the updated network parameters until the feature fusion time sequence segmentation network converges.
Specifically, before the feature fusion time sequence segmentation network is used, it needs to be trained. A certain number of sample videos are selected and processed to obtain a sample image set comprising the color space feature maps and optical flow feature maps of the sample videos, which serves as the training samples of the feature fusion time sequence segmentation network.
The feature fusion time sequence segmentation network can be expressed by the formula:

FTS(V) = H(G(F(T1, T1rgb, T1flows; W), ..., F(TK, TKrgb, TKflows; W)))

where V denotes the input video data, TK denotes the K-th video segment, TKrgb denotes the color space feature map corresponding to the K-th video segment, and TKflows denotes the optical flow feature maps corresponding to the K-th video segment. W denotes the network parameters, and F(TK, TKrgb, TKflows; W) returns the scores of the respective behavior classes for segment TK. The consensus function G integrates the scores of the individual segments to derive a classification score for the entire video. The Softmax function H then gives the probability of each behavior class of the input video data.

In the training process, the Softmax function is taken as the function H to obtain the probability of each behavior class of the input video data:

H_i = exp(G_i) / Σ_{j=1}^{C} exp(G_j)

G_i infers the score of class i from the scores of the same class in all segments using the aggregation function g:

G_i = g(F_i(T1), F_i(T2), ..., F_i(TK)), i = 1, 2, ..., C

where the aggregation function g adopts a uniform average method.

Following the standard categorical cross entropy, the final loss function of the segmental consensus is:

L(y, G) = -Σ_{i=1}^{C} y_i (G_i - log Σ_{j=1}^{C} exp(G_j))

In the back propagation process, the gradient of the loss value L with respect to the model network parameters W is:

∂L(y, G)/∂W = (∂L/∂G) Σ_{k=1}^{K} (∂G/∂F(Tk)) (∂F(Tk)/∂W)

where K is the number of segments into which the video is divided and C is the number of behavior classes; y_i is the class label of class i, and G_j is the score of class j inferred from the scores of the same class in all segments using the aggregation function g.
Updating the network parameters of the feature fusion time sequence segmentation network based on the calculated gradient, and performing iterative training on the feature fusion time sequence segmentation network based on the updated network parameters until the feature fusion time sequence segmentation network converges.
Further, new video data can be collected as test videos and input into the trained feature fusion time sequence segmentation network; the test error is calculated from the network's output, and if the test error is within the tolerable range, the feature fusion time sequence segmentation network has been trained successfully.
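For illustration only, the segmental consensus and the cross entropy cost function may be sketched in PyTorch as follows; the per-segment scores are stubbed with random values, and the batch, segment and class sizes are assumptions:

import torch
import torch.nn.functional as F

def consensus_loss(segment_scores, labels):
    # segment_scores: B x K x C per-segment class scores F(Tk);
    # labels: B ground-truth class indices. G aggregates the segment
    # scores by uniform averaging, and cross entropy on G equals
    # L(y, G) = -sum_i y_i (G_i - log sum_j exp(G_j)).
    g = segment_scores.mean(dim=1)     # G: B x C, averaged over K segments
    return F.cross_entropy(g, labels)  # cross entropy cost function

# Gradients flow through G into every segment's scores, matching the
# gradient formula above, so all K segments update the shared parameters W.
scores = torch.randn(8, 3, 5, requires_grad=True)  # B=8, K=3, C=5
labels = torch.randint(0, 5, (8,))
consensus_loss(scores, labels).backward()
print(scores.grad.shape)  # torch.Size([8, 3, 5])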
Based on the foregoing embodiment, optionally, in the method for identifying pig behavior, the obtaining a sample image set from a sample video specifically includes:
dividing the sample video into a plurality of video segments;
converting each video segment into a succession of video frames;
selecting a frame from the continuous video frames as a color space feature map;
computing optical flow from the successive video frames to obtain an optical flow feature atlas;
and obtaining a sample image set according to the color space characteristic diagram and the optical flow characteristic diagram set obtained from each video segment in the sample video.
Specifically, the obtained sample video is divided into a plurality of video segments, each video segment is converted into consecutive video frames, and one frame is selected from each video segment to obtain a color space feature map, based on a color space, for representing spatial information. An optical flow feature map set is calculated from the consecutive video frames for representing temporal information.
And obtaining a sample image set according to the color space feature map and the optical flow feature map set acquired by each video segment in the sample video, wherein the sample image set is used as a training set of the feature fusion time sequence segmentation network.
It should be noted that the selection principle for selecting a frame of video in each video segment may be random extraction, or may select a fixed frame number of video frames, or further may extract a video key frame based on a method such as an inter-frame difference method, which is not limited in this embodiment.
Secondly, the color space includes an RGB color space, an HSI color space, an HVS color space, and the like, which can be selected according to the actual situation, and this embodiment does not limit this.
In addition, in order to further enrich the time flow characteristics obtained by the network and improve the accuracy of behavior recognition, one color space characteristic diagram in each part of the video may correspond to multiple optical flow characteristic diagrams, and a specific corresponding proportional relationship may be selected according to an actual situation, which is not limited in this embodiment.
On the basis of the above embodiments, the embodiment of the present invention uses a sample image set obtained by processing sample videos as the training set of the feature fusion time sequence segmentation network. This allows features to be learned effectively from whole long videos, improves network training efficiency, and reduces computation cost and time.
Based on the above embodiment, optionally, in the pig behavior identification method, the identification classification layer is specifically configured to:
and classifying the target characteristics to obtain and output a classification score of each behavior in the video to be detected.
Specifically, the Softmax function is taken as the function H to obtain the probability of each behavior class of the input video data:

H_i = exp(G_i) / Σ_{j=1}^{C} exp(G_j)

G_i infers the score of class i from the scores of the same class in all segments using the aggregation function g:

G_i = g(F_i(T1), F_i(T2), ..., F_i(TK)), i = 1, 2, ..., C

where the aggregation function g adopts a uniform average method.
Based on the foregoing embodiment, optionally, in the pig behavior identification method, the color space feature map is an RGB map, and the optical flow feature map includes: an optical flow map and a warped optical flow map.
Specifically, when the video is processed, the color space is determined to be the RGB color space and an RGB map is obtained; the RGB map represents static information at a specific point in time.
An optical flow map and a warped optical flow map are computed from the RGB maps of successive frames. The optical flow map captures motion information, while the warped optical flow map effectively suppresses background motion so that attention concentrates on the moving objects in the video.
On this basis, besides the optical flow map and the warped optical flow map, RGB differences can also be introduced as temporal information: the RGB difference of two consecutive frames represents the change of motion and corresponds well to motion-salient regions.
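For illustration only, the RGB difference of two consecutive frames may be computed as follows; rescaling the signed difference into the [0, 255] range is an assumption about how it would be fed to the network:

import numpy as np

def rgb_difference(prev_frame, next_frame):
    # Per-pixel RGB difference of consecutive frames: large values
    # concentrate on motion-salient regions and can serve as temporal
    # information alongside the optical flow maps.
    diff = next_frame.astype(np.int16) - prev_frame.astype(np.int16)
    return ((diff + 255) // 2).astype(np.uint8)  # shift into [0, 255]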
Fig. 3 is a schematic structural diagram of a pig behavior recognition device according to an embodiment of the present invention. As shown in fig. 3, the pig behavior recognition device includes:
an obtaining module 310, configured to obtain a video to be tested; the video to be detected comprises image information of the pig;
the identification module 320 is used for inputting the video to be detected into the feature fusion time sequence segmentation network to obtain and output a behavior identification result of the pig;
the feature fusion time sequence segmentation network is obtained by training on a sample image set obtained from a sample video.
Specifically, the obtaining module 310 is configured to record the live pig breeding environment in real time and to obtain and store a video to be detected comprising image information of the pig.
The recognition module 320 is configured to input the video to be detected acquired by the obtaining module 310 into the pre-trained feature fusion time sequence segmentation network and to obtain and output a behavior recognition result of the pig.
The feature fusion time sequence segmentation network is obtained by training on a sample image set obtained from a sample video. The time sequence segmentation network consists of a spatial stream network and a temporal stream network and can extract the spatial and temporal information of a video. The feature fusion time sequence segmentation network fuses the spatial stream and the temporal stream on top of the time sequence segmentation network, which can effectively improve the accuracy of pig behavior identification.
It should be noted that, besides obtaining the video to be detected by recording the video in real time in the pig breeding environment, other methods may be used to obtain the video to be detected, which is not limited in this embodiment.
It can be understood that, in the embodiment of the present invention, both the video to be detected and the sample video are videos containing image information of a pig, so that the feature fusion time sequence segmentation network obtained by training the sample image set obtained by the sample video is used for identifying the behavior of the pig.
The embodiment of the invention provides a pig behavior recognition device which, by extracting spatio-temporal information from pig behavior videos, can accurately identify pig behaviors in a non-contact, low-cost manner and solves the problems of low efficiency and subjective analysis results caused by recording pig behaviors through manual observation.
It should be noted that the pig behavior recognition device provided in the embodiment of the present invention is used for executing the above pig behavior recognition method; its specific implementation is consistent with the implementation of the method and is not described again here.
Fig. 4 is a schematic structural diagram of the entity of an electronic device according to an embodiment of the present invention. As shown in fig. 4, the electronic device may include: a processor 410, a communication interface 420, a memory 430 and a communication bus 440, wherein the processor 410, the communication interface 420 and the memory 430 communicate with each other via the communication bus 440. The processor 410 may call logic instructions in the memory 430 to perform the above pig behavior identification method, including: acquiring a video to be detected, the video to be detected comprising image information of the pig; inputting the video to be detected into the feature fusion time sequence segmentation network to obtain and output a behavior recognition result of the pig; the feature fusion time sequence segmentation network being obtained by training on a sample image set obtained from a sample video.
In addition, the logic instructions in the memory 430 may be implemented in the form of software functional units and stored in a computer readable storage medium when the software functional units are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and the like.
In another aspect, an embodiment of the present invention further provides a computer program product, where the computer program product includes a computer program stored on a non-transitory computer-readable storage medium, the computer program includes program instructions, and when the program instructions are executed by a computer, the computer is capable of executing the behavior recognition method for pigs provided by the above-mentioned embodiments of the method, including: acquiring a video to be detected; the video to be detected comprises image information of the pig; inputting the video to be detected into a characteristic fusion time sequence segmentation network to obtain and output a behavior recognition result of the pig; the feature fusion time sequence segmentation network is obtained by training based on a sample image set obtained by a sample video.
In still another aspect, an embodiment of the present invention further provides a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program is implemented by a processor to execute the method for identifying pig behavior provided in the foregoing embodiments, and the method includes: acquiring a video to be detected; the video to be detected comprises image information of the pig; inputting the video to be detected into a characteristic fusion time sequence segmentation network to obtain and output a behavior recognition result of the pig; the feature fusion time sequence segmentation network is obtained by training based on a sample image set obtained by a sample video.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A behavior recognition method for pigs is characterized by comprising the following steps:
acquiring a video to be detected; the video to be detected comprises image information of the pig;
inputting the video to be detected into a feature fusion time sequence segmentation network to obtain and output a behavior recognition result of the pig;
the feature fusion time sequence segmentation network is obtained by training on a sample image set obtained from a sample video.
2. The method for identifying pig behavior according to claim 1,
the feature fusion time sequence segmentation network comprises: a video processing layer, a time stream feature extraction layer, a space stream feature extraction layer, a first fusion layer, a second fusion layer and an identification classification layer;
the video processing layer is used for processing the video to be detected to obtain an image set to be detected; the image set to be detected comprises: a color space feature map and an optical flow feature map; the color space feature map is an image representing spatial information; the optical flow feature map is an image representing temporal information;
the time flow feature extraction layer is used for obtaining the time flow features of the video to be detected according to the optical flow feature diagram;
the spatial stream feature extraction layer is used for obtaining the spatial stream features of the video to be detected according to the color spatial feature map;
the first fusion layer is used for fusing the time flow characteristics and the space flow characteristics to obtain fused characteristics;
the second fusion layer is used for fusing the fused features and the spatial stream features to obtain target features;
and the identification classification layer is used for obtaining a behavior identification result of the pig according to the target characteristics.
3. The method for identifying pig behavior according to claim 2,
the feature fusion temporal segmentation network further comprises: a data storage layer;
the data storage layer is used for storing the image set to be detected output by the video processing layer, selecting a color space feature map according to a preset rule and inputting the color space feature map into the space stream feature extraction layer, and selecting an optical flow feature map and inputting the optical flow feature map into the time stream feature extraction layer respectively.
4. The method for identifying pig behavior according to claim 2,
before the step of inputting the video to be detected into the feature fusion time sequence segmentation network and obtaining and outputting the behavior recognition result, the method further comprises: training the feature fusion time sequence segmentation network;
the training of the feature fusion time sequence segmentation network specifically comprises:
obtaining a sample image set from a sample video; the sample image set includes: a color space feature map and an optical flow feature map;
training the feature fusion time sequence segmentation network by using the sample image set;
taking a cross entropy function as a cost function, and obtaining a gradient for a network parameter of the feature fusion time sequence segmentation network by using a back propagation algorithm;
updating the network parameters of the feature fusion time sequence segmentation network based on the gradient, and performing iterative training on the feature fusion time sequence segmentation network based on the updated network parameters until the feature fusion time sequence segmentation network converges.
5. The method for identifying pig behavior according to claim 4,
the obtaining of the sample image set from the sample video specifically includes:
dividing the sample video into a plurality of video segments;
converting each video segment into a succession of video frames;
selecting a frame from the continuous video frames as a color space feature map;
computing optical flow from the successive video frames to obtain an optical flow feature map set;
and obtaining a sample image set according to the color space characteristic diagram and the optical flow characteristic diagram set obtained from each video segment in the sample video.
6. The method for identifying pig behavior according to claim 2,
the identification classification layer is specifically configured to:
and classifying the target characteristics to obtain and output a classification score of each behavior in the video to be detected.
7. The pig behavior recognition method according to any one of claims 2-6, wherein the color space feature map is an RGB map, and the optical flow feature map comprises: an optical flow map and a warped optical flow map.
8. A behavior recognition device for pigs, comprising:
the acquisition module is used for acquiring a video to be detected; the video to be detected comprises image information of the pig;
the identification module is used for inputting the video to be detected into the feature fusion time sequence segmentation network to obtain and output a behavior identification result of the pig;
the feature fusion time sequence segmentation network is obtained by training on a sample image set obtained from a sample video.
9. An electronic device, comprising a memory and a processor, wherein the processor and the memory communicate with each other via a bus; the memory stores program instructions executable by the processor, the processor calling the program instructions to perform the method of behavioral recognition of pigs according to any one of claims 1 to 7.
10. A non-transitory computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the method of behavioral recognition in a pig according to any one of claims 1 to 7.
CN202011052538.9A 2020-09-29 2020-09-29 Pig behavior identification method and device, electronic equipment and storage medium Pending CN112215107A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011052538.9A CN112215107A (en) 2020-09-29 2020-09-29 Pig behavior identification method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011052538.9A CN112215107A (en) 2020-09-29 2020-09-29 Pig behavior identification method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112215107A true CN112215107A (en) 2021-01-12

Family

ID=74052174

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011052538.9A Pending CN112215107A (en) 2020-09-29 2020-09-29 Pig behavior identification method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112215107A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112801061A (en) * 2021-04-07 2021-05-14 南京百伦斯智能科技有限公司 Posture recognition method and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110032942A (en) * 2019-03-15 2019-07-19 中山大学 Action identification method based on Time Domain Piecewise and signature differential
CN111046821A (en) * 2019-12-19 2020-04-21 东北师范大学人文学院 Video behavior identification method and system and electronic equipment
US10672383B1 (en) * 2018-12-04 2020-06-02 Sorenson Ip Holdings, Llc Training speech recognition systems using word sequences
CN111709351A (en) * 2020-06-11 2020-09-25 江南大学 Three-branch network behavior identification method based on multipath space-time characteristic reinforcement fusion

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10672383B1 (en) * 2018-12-04 2020-06-02 Sorenson Ip Holdings, Llc Training speech recognition systems using word sequences
CN110032942A (en) * 2019-03-15 2019-07-19 中山大学 Action identification method based on Time Domain Piecewise and signature differential
CN111046821A (en) * 2019-12-19 2020-04-21 东北师范大学人文学院 Video behavior identification method and system and electronic equipment
CN111709351A (en) * 2020-06-11 2020-09-25 江南大学 Three-branch network behavior identification method based on multipath space-time characteristic reinforcement fusion

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李洪均, 丁宇鹏, 李超波, 张士兵: "Research on behavior recognition based on a feature fusion temporal segmentation network" (基于特征融合时序分割网络的行为识别研究), Journal of Computer Research and Development (《计算机研究与发展》), vol. 57, no. 01, p. 145

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112801061A (en) * 2021-04-07 2021-05-14 南京百伦斯智能科技有限公司 Posture recognition method and system

Similar Documents

Publication Publication Date Title
CN107977671B (en) Tongue picture classification method based on multitask convolutional neural network
CN111178197B Mask R-CNN and Soft-NMS fusion based group-fed adherent pig instance segmentation method
US20230281265A1 (en) Method for estimating body size and weight of pig based on deep learning
WO2020125057A1 (en) Livestock quantity identification method and apparatus
CN107330403B (en) Yak counting method based on video data
CN112613428B (en) Resnet-3D convolution cattle video target detection method based on balance loss
Hasan et al. Fish diseases detection using convolutional neural network (CNN)
CN108009567A (en) A kind of automatic discriminating conduct of the fecal character of combination color of image and HOG and SVM
Kanjalkar et al. Detection and classification of plant leaf diseases using ANN
CN114898405B (en) Portable broiler chicken anomaly monitoring system based on edge calculation
CN111968081A (en) Fish shoal automatic counting method and device, electronic equipment and storage medium
CN110163103B (en) Live pig behavior identification method and device based on video image
CN116778309A (en) Residual bait monitoring method, device, system and storage medium
CN115471871A (en) Sheldrake gender classification and identification method based on target detection and classification network
Isa et al. CNN transfer learning of shrimp detection for underwater vision system
CN112215107A (en) Pig behavior identification method and device, electronic equipment and storage medium
Muñoz-Benavent et al. Impact evaluation of deep learning on image segmentation for automatic bluefin tuna sizing
Pauzi et al. A review on image processing for fish disease detection
CN113470076A (en) Multi-target tracking method for yellow-feather chickens in flat-breeding henhouse
CN116912674A (en) Target detection method and system based on improved YOLOv5s network model under complex water environment
Gabriel et al. Wildlife Detection and Recognition in Digital Images Using YOLOv3
CN115578423A (en) Fish key point detection, individual tracking and biomass estimation method and system based on deep learning
Woodward-Greene et al. PreciseEdge raster RGB image segmentation algorithm reduces user input for livestock digital body measurements highly correlated to real-world measurements
CN112465821A (en) Multi-scale pest image detection method based on boundary key point perception
Yu et al. An automatic detection and counting method for fish lateral line scales of underwater fish based on improved YOLOv5

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination