CN113177564A - Computer vision pig key point identification method - Google Patents

Computer vision pig key point identification method

Info

Publication number
CN113177564A
Authority
CN
China
Prior art keywords
pig
target
coordinates
picture
pigs
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110531027.3A
Other languages
Chinese (zh)
Other versions
CN113177564B (en)
Inventor
Zhang Yuliang
Li Panpeng
Huang Yu
You Yuan
Liu Xingyu
Huang Xiaohui
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Henan Muyuan Intelligent Technology Co Ltd
Original Assignee
Henan Muyuan Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Henan Muyuan Intelligent Technology Co Ltd filed Critical Henan Muyuan Intelligent Technology Co Ltd
Priority to CN202110531027.3A priority Critical patent/CN113177564B/en
Publication of CN113177564A publication Critical patent/CN113177564A/en
Application granted granted Critical
Publication of CN113177564B publication Critical patent/CN113177564B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/22Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a computer vision method for identifying pig key points, belonging to the technical field of machine vision. The method comprises the following steps: acquiring image information from collected pig image data and removing abnormal pictures from the image information to obtain target images; processing the target images with an open-source object detection model to detect the bounding box coordinates of all target pigs within the field of view in a dense scene; cropping each target pig's bounding box into a single-pig picture one by one, computing feature maps for each single-pig picture with a deep neural network, and obtaining the position coordinates of the pig key points from the feature maps. The method works in dense pig scenes, automatically detects pig body-part key points in densely stocked pens, and, through accurate identification of key point states, provides technical support for developing subsequent methods that identify specific sick pigs.

Description

Computer vision pig key point identification method
Technical Field
The invention relates to the technical field of machine vision, and in particular to a computer vision method for identifying pig key points.
Background
In machine vision research on pigs, key points of the characteristic regions of the pig body are generally extracted first; different pig key points can, to a certain extent, reflect the health condition of the pigs.
Computer vision provides a way to identify sick pigs automatically without watching video; it is the most basic and critical of the existing computer vision techniques for sick-pig identification, and can greatly reduce the later technical difficulty of identifying sick pigs (with external injuries, bitten ears, and the like). Technologies currently on the market, or described in the related technical literature under development, only monitor limited, simple environments such as a single-pig pen or pens holding at most 4 pigs, and cannot handle complex scenes (pens generally holding more than 5 pigs). Moreover, the pig key points they identify are few, generally the head, ears, nose, legs and tail; these characteristic regions are large and insufficiently detailed, so pig body-part key points cannot be identified accurately.
A computer vision pig key point identification algorithm for dense scenes can automatically detect pig body-part key points in densely stocked pens and reduce the technical difficulty of developing subsequent methods for identifying sick pigs.
Disclosure of Invention
In view of the above, the invention provides a computer vision method for identifying pig key points that works in dense pig scenes, automatically detects pig body-part key points in densely stocked pens, and, through accurate identification of key point states, provides technical support for developing subsequent methods that identify specific sick pigs.
To achieve this purpose, the invention provides the following technical scheme:
the invention provides a computer vision pig key point identification method, which comprises the following steps:
step 1, collecting image data of pigs;
step 2, eliminating abnormal pictures in the image data to obtain a target image;
step 3, processing the target images with an open-source object detection model to obtain the bounding box coordinates of all target pigs within the field of view in a dense scene;
step 4, cropping the bounding boxes of all target pigs to form single-pig pictures one by one;
and step 5, processing each single-pig picture with a deep neural network to obtain feature maps, and obtaining the position coordinates of the pig key points from the feature maps.
Further, the abnormal pictures comprise pictures with abnormal illumination, blurred pictures, foggy pictures and pictures with abnormal angles.
Further, the step 2 includes:
1) processing the pictures with OpenCV using the image's HSV values to eliminate pictures with abnormal illumination;
2) converting the pictures to grayscale, dividing the grayscale image evenly into 4 regions, and computing a Laplacian-based sharpness value for each region to remove blurred pictures;
3) processing the pictures with minimum-value filtering to remove foggy pictures;
4) and processing the pictures with FastLineDetector line detection to remove pictures with abnormal angles.
Further, the step 3 includes:
1) matching the target image data to the yolov4 object detection model;
2) processing the target images with the yolov4 object detection model to obtain inference results, and deriving the bounding box coordinates and confidence probabilities of all pigs from the inference results;
3) and filtering the pigs' bounding box coordinates by confidence probability to obtain the bounding box coordinates of the target pigs.
Further, when processing the confidence probabilities, bounding boxes with a confidence probability of less than 0.3 are filtered out.
Further, the step 4 includes:
back-calculating the bounding box coordinates of the target pigs into the matrix of the target image, and extracting pixel values indexed by the target box positions;
and indexing pixels in the target image with the coordinates computed by yolov4 to obtain target regions, so that each target pig in any picture forms a single-pig picture from its bounding box coordinates.
Further, the step 5 includes:
resizing the single-pig picture to a square size with a resize picture conversion algorithm;
and inputting the single-pig picture into an open-source resnet50 deep neural network for computation, obtaining feature maps of the pig key points, and obtaining the position coordinates of the pig key points from the feature maps.
Furthermore, a 224x224 single-pig picture is input into a trained resnet50 deep neural network, which, through a series of convolution and pooling operations, computes the single-pig picture into a 20x24x48 feature map; the 20 in 20x24x48 corresponds to 20 key points, and each 24x48 sub-array corresponds to one key point's feature map.
A peak is found in each feature map; a feature map whose peak exceeds the threshold of 0.5 yields a target key point, and the target key point coordinates are back-calculated to the target-image size to give the position coordinates of the pig key point.
Further, back-calculating the target key point coordinates to the target-image size comprises: the target key point coordinates are (x, y), the upper-left corner coordinates of the corresponding pig detection box are (x0, y0), and the position coordinates of the pig key point in the target image are (x0+x, y0+y).
Furthermore, the pig key points comprise the pig's left ear, right ear, back-of-head center, anterior dorsal point, dorsal midpoint, dorsal tail point, left posterior hip joint point, left posterior knee joint point, left posterior ankle joint point, left anterior knee joint point, left anterior hip joint point, right posterior ankle joint point, right posterior knee joint point, right posterior hip joint point, right anterior knee joint point, right anterior ankle joint point and abdominal center.
Compared with the prior art, the invention has the following beneficial effects. First, image information is acquired from the collected pig image data and abnormal pictures are removed to obtain target images; the target images are then processed with the open-source object detection model to obtain the bounding boxes of all target pigs, which are cropped into single-pig pictures; finally, deep neural network computation on each single-pig picture yields feature maps, from which the position coordinates of the pig key points, and thus the pig key point information, are obtained. The computer vision pig key point identification method provided by the invention automatically detects pig body-part key points and, through these key points, reduces the technical difficulty of developing subsequent methods for identifying specific sick pigs. For example: detecting ear biting in a dense scene requires locating the pigs' ears, and the key points already include the left and right ears; detecting belly biting in a dense scene requires locating the pigs' bellies, and the key points already include the abdominal center; and the 'ear' semantics recognized by key point detection can be used directly to detect ear trauma and the like.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings in the following description are only some embodiments of the present invention; for those skilled in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a schematic view of key points on one side of a pig according to the present invention;
FIG. 2 is a schematic view of the key points on the other side of the pig according to the present invention;
FIG. 3 is a flowchart of a method for identifying key points of a computer-vision pig according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In combination with the technical scheme of the invention, the terms involved are explained as follows:
opencv: an open source cross-platform computer vision and machine learning software library;
HSV: HSV (Hue, Saturation, Value) is a color space created according to the intuitive characteristics of color, and the parameters of color in this model are: hue (H), saturation (S), lightness (V);
gray scale map: the range from white to black is divided into levels according to a logarithmic relationship, called gray levels; with the gray scale divided into 256 levels, an image represented by gray levels is a grayscale image;
laplace operator: used to compute the second derivative and find regions of rapidly changing pixel values in a picture. A normal picture has clear boundaries, so its variance is larger; a blurred picture contains less boundary information, so its variance is smaller. That is: Gaussian blur -> grayscale -> Laplacian -> absolute value (convertScaleAbs) -> compute the variance of the output image, and judge the degree of blur from the variance;
minimum value filtering: the target pixel is replaced with the minimum value of the target pixel and its surrounding pixels;
LineSegmentDetector: a method of detecting a straight line;
a neural network: a computational model that simulates the behavioral characteristics of animal neural networks and performs distributed parallel information processing. Such a network processes information by adjusting the interconnections among a large number of internal nodes, depending on the complexity of the system;
a feed-forward neural network: the simplest neural network, in which neurons are arranged in layers; each neuron is connected only to neurons of the previous layer, receives the previous layer's output, and outputs to the next layer, with no feedback between layers;
convolution: a mathematical operator that generates a third function from two functions f and g, representing the integral of the product of f and a flipped, translated g over their overlap;
a convolutional neural network: a class of feedforward neural networks that includes convolution calculations and has a depth structure;
full connection layer: each node of a fully connected layer is connected to all nodes of the previous layer and integrates the extracted features;
masking: for example, if a picture contains a round object, a circle of the same size as the object is cut out of a piece of paper and the paper is placed over the picture; only the round object can then be seen, and the paper is the mask;
image classification: given a fixed set of classification labels, a classification label is found from the set for an input image and finally assigned to the input image;
resize picture transformation algorithm: assuming the original image size is HxWx3 (height, width, number of channels) and the converted size is H1xW1x3, the conversion algorithm traverses each channel and fills the pixels of the original picture into a picture matrix of the new size; where a position in the new matrix has no exact corresponding pixel in the original picture, the average of the neighboring pixels is used for filling;
target detection and confidence probability: first the target to be detected is defined (in this patent, a pig), and then detection is performed. Detection means developing a computer algorithm so that the computer extracts the target in a picture as a pig box (i.e., the upper-left and lower-right corner position coordinates); when extracting the box, the computer also gives a confidence probability for the rectangular box (value range 0 to 1), which in this patent is computed by the yolo network head from the feature maps;
Yolov4 open source project and inference architecture: the whole yolov4 model is divided into three sub-models: the darknet backbone network; the SPP (spatial pyramid pooling) + PANet (path aggregation network) neck; and the yolo network head with non-maximum suppression post-processing. During inference, yolov4 first converts a picture (of size 1920x1080) to 608x608x3; the darknet backbone computes the 608x608x3 picture into a 19x19x1024 floating point array; the SPP + PANet neck then computes a 76x76x256 floating point array; and finally the yolo network head and non-maximum suppression post-processing produce the target detection boxes and target names;
characteristic mapping: a feature map is a special multidimensional array stored in a computer with floating point data type; each value in the array is a confidence probability in the range 0 to 1.
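In connection with the laplace operator entry above, the blur check can be illustrated with a minimal Python/OpenCV sketch; this is illustrative only, and the rejection threshold and the all-four-regions decision rule are assumptions the patent does not state:

```python
import cv2

def blur_score(region):
    """Gaussian blur -> grayscale -> Laplacian -> absolute value
    (convertScaleAbs) -> variance, the pipeline described above."""
    blurred = cv2.GaussianBlur(region, (3, 3), 0)
    gray = cv2.cvtColor(blurred, cv2.COLOR_BGR2GRAY)
    lap = cv2.convertScaleAbs(cv2.Laplacian(gray, cv2.CV_64F))
    return lap.var()

def is_blurry(img, threshold=100.0):  # the threshold is a hypothetical value
    h, w = img.shape[:2]
    # Divide the picture evenly into 4 regions, as the patent describes.
    regions = [img[:h // 2, :w // 2], img[:h // 2, w // 2:],
               img[h // 2:, :w // 2], img[h // 2:, w // 2:]]
    # Low Laplacian variance in every region indicates a blurred picture.
    return all(blur_score(r) < threshold for r in regions)
```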
Fig. 1 and fig. 2 schematically show the pig key points to be obtained by the present invention. Pigs with fever can be identified through three key points: the left ear 1, the right ear 2 and the back-of-head center 3; these three points locate the approximate position of the ear root, so that the ear root temperature can be extracted. The weight of the pig can be estimated through three key points: the anterior dorsal point 4, the dorsal midpoint 6 and the dorsal tail point 5. Abnormal behaviors such as lameness and paralysis can be identified through six key points: the left rear hip joint point 11, the left rear knee joint point 12, the left rear ankle joint point 13, the left front ankle joint point 7, the left front knee joint point 8 and the left front hip joint point 9; these can also assist in estimating the pig's weight. Seven key points, namely the right rear ankle joint point 13, the right rear knee joint point 12, the right rear hip joint point 11, the right front hip joint point, the right front knee joint point, the right front ankle joint point and the abdomen center 10, can be used to detect whether there is a pustule on a pig's leg or whether a leg turns when the pig walks.
In accordance with the present invention, a computer vision pig key point identification method is provided. It is noted that the steps illustrated in the flowchart of the drawings may be performed in a computer system, such as one executing a set of computer-executable instructions, and that, although a logical ordering is shown in the flowchart, in some cases the steps illustrated or described may be performed in an order different from that described herein.
Fig. 3 is a flowchart of a method for identifying key points of a computer-vision pig according to an embodiment of the present invention; as shown in fig. 3, the method includes the following steps:
step 1, collecting image data of pigs;
step 2, eliminating abnormal pictures in the image data to obtain a target image;
step 3, processing the target images with an open-source object detection model to obtain the bounding box coordinates of all target pigs within the field of view in a dense scene;
step 4, cropping the bounding boxes of all target pigs to form single-pig pictures one by one;
and step 5, processing each single-pig picture with a deep neural network to obtain feature maps, and obtaining the position coordinates of the pig key points from the feature maps.
More specifically, according to step 2, the abnormal pictures comprise pictures with abnormal illumination, blurred pictures, foggy pictures and pictures with abnormal angles. Pictures with abnormal illumination are rejected through OpenCV processing of the image's HSV values; for blur, the pictures are converted to grayscale, the grayscale image is divided evenly into 4 regions, and the edge blur degree value of the Laplacian operator is computed for each region to remove blurred pictures; foggy pictures are removed through minimum-value filtering; and pictures with abnormal angles are removed through FastLineDetector line detection. Processing the abnormal pictures in the image data in this way yields the target images.
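For illustration, the other three checks might look like the following sketch (Python with OpenCV; all thresholds and kernel sizes are assumed values not given in the patent, `is_blurry` is the blur check sketched earlier, and FastLineDetector lives in the opencv-contrib `ximgproc` module):

```python
import cv2
import numpy as np

def abnormal_illumination(img, v_low=40, v_high=220):  # hypothetical bounds
    # Judge lighting from the mean of the V (lightness) channel in HSV space.
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    v_mean = hsv[:, :, 2].mean()
    return v_mean < v_low or v_mean > v_high

def foggy(img, dark_threshold=120):  # hypothetical threshold
    # Minimum-value filtering: each pixel becomes the minimum of itself and
    # its neighbours; fog keeps this minimum high across the whole picture.
    min_channel = img.min(axis=2).astype(np.uint8)
    filtered = cv2.erode(min_channel, np.ones((15, 15), np.uint8))
    return filtered.mean() > dark_threshold

def abnormal_angle(img, expected_deg=0.0, tolerance_deg=15.0):  # hypothetical
    # Detect pen rails with FastLineDetector and check their dominant angle.
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    fld = cv2.ximgproc.createFastLineDetector()
    lines = fld.detect(gray)
    if lines is None:
        return True  # no usable lines: treat the view as abnormal
    angles = [np.degrees(np.arctan2(y2 - y1, x2 - x1))
              for x1, y1, x2, y2 in lines[:, 0]]
    return abs(np.median(angles) - expected_deg) > tolerance_deg
```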
More specifically, according to step 3, training uses a pig target detection data set labeled inside the company; the pig target detection data comprise picture data and manual labels. During training, the picture data and manual labels are fed into the yolov4 open source project, the training program is run, and a model is obtained: a yolov4 target detection model capable of detecting pigs. This yolov4 target detection model performs inference on the target pictures; the bounding box coordinates and confidence probabilities of all pigs are obtained from the inference results; and confidence filtering yields the bounding box coordinates of the target pigs, where boxes with a confidence probability of less than 0.3 are filtered out.
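A minimal sketch of the confidence filtering follows; the tuple format of the detections is an assumption, since how yolov4 outputs are decoded varies by deployment:

```python
# Each detection is assumed to be (x0, y0, x1, y1, confidence), with the box
# corners already back-calculated to original-picture pixel coordinates.
def filter_pig_boxes(detections, min_conf=0.3):
    """Drop boxes whose confidence probability is below 0.3, as the
    patent specifies, and keep only the box coordinates."""
    return [(x0, y0, x1, y1)
            for x0, y0, x1, y1, conf in detections if conf >= min_conf]
```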
More specifically, according to step 4, the bounding box coordinates of the target pigs are back-calculated into the original picture matrix (namely the target image after abnormal pictures have been removed), and pixel values are extracted indexed by the target box positions; the coordinates computed by yolov4 are used to index pixels in the target image and obtain target regions, so that each target pig in any picture forms a single-pig picture from its bounding box coordinates.
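The pixel indexing amounts to slicing the target-image matrix with each box's coordinates; a sketch, assuming integer pixel coordinates:

```python
def crop_single_pigs(target_image, boxes):
    """Take out pixel values indexed by each target box position, forming
    one single-pig picture per detected pig."""
    crops = []
    for x0, y0, x1, y1 in boxes:
        # NumPy images are indexed [row, column], i.e. [y, x].
        crops.append(target_image[int(y0):int(y1), int(x0):int(x1)].copy())
    return crops
```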
More specifically, according to step 5, each single-pig picture obtained is resized to a square with the resize picture conversion algorithm, then input into the trained open-source resnet50 deep neural network for computation to obtain feature maps of the pig key points, from which the position coordinates of the pig key points are obtained.
In this embodiment, computing the feature maps of the pig key points with the trained open-source resnet50 deep neural network proceeds as follows: a 224x224 single-pig picture is input into the trained resnet50 deep neural network, which, through a series of convolution and pooling operations, computes the single-pig picture into a 20x24x48 feature map, where the leading 20 corresponds to 20 key points and each 24x48 sub-array corresponds to one key point's feature map. Each key point's feature map is first resized with the picture resize algorithm back to the size of the single-pig picture, i.e., a 224x224 array; a peak is found in each feature probability map, and if it exceeds a certain threshold (0.5) it marks the corresponding key point; the key point coordinates are then back-calculated to the target-image size to give the key point coordinates in the target image.
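A sketch of the peak search and coordinate back-calculation (NumPy/OpenCV; `heatmaps` is assumed to be the 20x24x48 output of the trained resnet50 network for one 224x224 single-pig picture, and undoing the square resize back to crop size is an inferred step the patent only implies):

```python
import cv2
import numpy as np

def keypoints_from_heatmaps(heatmaps, crop_w, crop_h, box_x0, box_y0,
                            threshold=0.5):
    """heatmaps: float array of shape (20, 24, 48); each 24x48 sub-array is
    one key point's feature map of confidence probabilities in [0, 1]."""
    keypoints = []
    for hm in heatmaps:
        # Resize the feature map back up to the 224x224 crop size, as the
        # patent describes, before locating the peak.
        hm224 = cv2.resize(hm.astype(np.float32), (224, 224))
        if hm224.max() <= threshold:
            keypoints.append(None)  # peak too weak: key point not detected
            continue
        y, x = np.unravel_index(hm224.argmax(), hm224.shape)
        # Undo the square resize to recover coordinates in the original crop,
        x_crop, y_crop = x * crop_w / 224.0, y * crop_h / 224.0
        # then back-calculate to the target image using the box's upper-left
        # corner (x0, y0): position = (x0 + x, y0 + y).
        keypoints.append((box_x0 + x_crop, box_y0 + y_crop))
    return keypoints
```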
In connection with the foregoing embodiment, back-calculating the key point coordinates to the target-image size comprises: assuming the key point coordinates are (x, y) and the upper-left corner coordinates of the corresponding pig detection box are (x0, y0), the position coordinates of the pig key point in the target image are (x0+x, y0+y).
The working process of the invention comprises the following steps: first the picture is read, and then the pig key points in the picture are detected:
step 1: collecting image data of pigs;
the specific steps of step 2 are as follows:
step 2.1: compute the HSV values of the picture and reject pictures with abnormal illumination;
step 2.2: compute the edge blur degree value of the picture's Laplacian operator and remove blurred pictures;
step 2.3: remove foggy pictures based on minimum-value filtering;
step 2.4: remove pictures with abnormal pen angles based on the FastLineDetector line detector;
the specific steps of step 3 are as follows:
step 3.1: based on the yolov4 open source project, train with the company's internally labeled pig target detection data set to obtain a yolov4 target detection model capable of detecting pigs;
step 3.2: perform inference on the target images with the yolov4 target detection model;
step 3.3: obtain each pig's bounding box coordinates and confidence probability from the yolov4 inference results;
step 3.4: filter out targets with low confidence probability to obtain the bounding box coordinates of the successfully detected target pigs;
the specific steps of step 4 are as follows:
step 4.1: back-calculate the target box coordinates obtained in the previous step to original-picture-size coordinates;
step 4.2: index pixels in the original picture with the back-calculated coordinates to obtain the target region;
step 4.3: repeat for every target box obtained from any one picture, so that each target pig in the target image forms a single-pig picture;
the specific steps of step 5 are as follows:
step 5.1: resize each obtained single-pig picture to a square size, such as 224x224, with the picture conversion algorithm;
step 5.2: input the picture into the trained open-source resnet50 deep neural network for computation to obtain feature maps of the pig body key points, one feature probability map for each of the 20 key points; the resnet50 deep neural network that computes the key point maps uses an open source project;
step 5.3: find the peak in each feature probability map; if it exceeds a certain threshold (0.5), it marks the corresponding key point;
step 5.4: back-calculate the key point coordinates to the original image size to give the key point coordinates in the original image.
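Putting the working process together, the overall flow can be sketched as follows (the helper functions are the hypothetical sketches above; `detect_pigs` stands in for yolov4 inference and `predict_heatmaps` for the trained resnet50 network, neither of which is reproduced here):

```python
import cv2

def identify_pig_keypoints(picture):
    # Step 2: reject abnormal pictures (illumination, blur, fog, angle).
    if (abnormal_illumination(picture) or is_blurry(picture)
            or foggy(picture) or abnormal_angle(picture)):
        return []
    # Step 3: yolov4 detection plus confidence filtering (threshold 0.3).
    boxes = filter_pig_boxes(detect_pigs(picture))
    # Step 4: crop one single-pig picture per bounding box.
    crops = crop_single_pigs(picture, boxes)
    results = []
    # Step 5: resize to 224x224, run resnet50, decode the 20 key points.
    for (x0, y0, x1, y1), crop in zip(boxes, crops):
        square = cv2.resize(crop, (224, 224))
        heatmaps = predict_heatmaps(square)  # 20x24x48 feature maps
        results.append(keypoints_from_heatmaps(
            heatmaps, crop.shape[1], crop.shape[0], x0, y0))
    return results
```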
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A computer vision pig key point identification method is characterized by comprising the following steps:
step 1, collecting image data of pigs;
step 2, eliminating abnormal pictures in the image data to obtain a target image;
step 3, processing the target images with an open-source object detection model to obtain the bounding box coordinates of all target pigs within the field of view in a dense scene;
step 4, cropping the bounding boxes of all target pigs to form single-pig pictures one by one;
and step 5, processing each single-pig picture with a deep neural network to obtain feature maps, and obtaining the position coordinates of the pig key points from the feature maps.
2. The computer vision pig key point identification method according to claim 1, wherein the abnormal pictures comprise pictures with abnormal illumination, blurred pictures, foggy pictures and pictures with abnormal angles.
3. The method of claim 2, wherein the step 2 comprises:
1) processing the pictures with OpenCV using the image's HSV values to eliminate pictures with abnormal illumination;
2) converting the pictures to grayscale, dividing the grayscale image evenly into 4 regions, and computing a Laplacian-based sharpness value for each region to remove blurred pictures;
3) processing the pictures with minimum-value filtering to remove foggy pictures;
4) and processing the pictures with FastLineDetector line detection to remove pictures with abnormal angles.
4. The method of claim 1, wherein the step 3 comprises:
1) matching the target image data to the yolov4 object detection model;
2) processing the target images with the yolov4 object detection model to obtain inference results, and deriving the bounding box coordinates and confidence probabilities of all pigs from the inference results;
3) and filtering the pigs' bounding box coordinates by confidence probability to obtain the bounding box coordinates of the target pigs.
5. The computer vision pig key point identification method of claim 4, wherein, when processing the confidence probabilities, bounding boxes with a confidence probability of less than 0.3 are filtered out.
6. The method of claim 1, wherein the step 4 comprises:
back-calculating the bounding box coordinates of the target pigs into the matrix of the target image, and extracting pixel values indexed by the target box positions;
and indexing pixels in the target image with the coordinates computed by yolov4 to obtain target regions, so that each target pig in any picture forms a single-pig picture from its bounding box coordinates.
7. The method of claim 1, wherein the step 5 comprises:
resizing the single-pig picture to a square size with a resize picture conversion algorithm;
and inputting the single-pig picture into an open-source resnet50 deep neural network for computation, obtaining feature maps of the pig key points, and obtaining the position coordinates of the pig key points from the feature maps.
8. The computer vision pig key point identification method of claim 7, wherein a 224x224 single-pig picture is input into a trained resnet50 deep neural network, which, through a series of convolution and pooling operations, computes the single-pig picture into a 20x24x48 feature map, where the 20 in 20x24x48 corresponds to 20 key points and each 24x48 sub-array corresponds to one key point's feature map;
and a peak is found in each feature map; a feature map whose peak exceeds the threshold of 0.5 yields a target key point, and the target key point coordinates are back-calculated to the target-image size to give the position coordinates of the pig key point.
9. The computer vision pig key point identification method of claim 8, wherein back-calculating the target key point coordinates to the target-image size comprises: the target key point coordinates are (x, y), the upper-left corner coordinates of the corresponding pig detection box are (x0, y0), and the position coordinates of the pig key point in the target image are (x0+x, y0+y).
10. The computer vision pig key point identification method of claim 1, wherein the pig key points comprise a left ear, a right ear, a back-of-head center, an anterior dorsal point, a dorsal midpoint, a dorsal tail point, a left posterior hip joint point, a left posterior knee joint point, a left posterior ankle joint point, a left anterior knee joint point, a left anterior hip joint point, a right posterior ankle joint point, a right posterior knee joint point, a right posterior hip joint point, a right anterior knee joint point, a right anterior ankle joint point and an abdominal center.
CN202110531027.3A 2021-05-16 2021-05-16 Computer vision pig key point identification method Active CN113177564B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110531027.3A CN113177564B (en) 2021-05-16 2021-05-16 Computer vision pig key point identification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110531027.3A CN113177564B (en) 2021-05-16 2021-05-16 Computer vision pig key point identification method

Publications (2)

Publication Number Publication Date
CN113177564A true CN113177564A (en) 2021-07-27
CN113177564B CN113177564B (en) 2023-07-25

Family

ID=76929181

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110531027.3A Active CN113177564B (en) 2021-05-16 2021-05-16 Computer vision pig key point identification method

Country Status (1)

Country Link
CN (1) CN113177564B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113763429A (en) * 2021-09-08 2021-12-07 广州市健坤网络科技发展有限公司 Pig behavior recognition system and method based on video
CN114543674A (en) * 2022-02-22 2022-05-27 成都睿畜电子科技有限公司 Detection method and system based on image recognition

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108961330A (en) * 2018-06-22 2018-12-07 深源恒际科技有限公司 Image-based pig body length measurement method and system
CN109141248A (en) * 2018-07-26 2019-01-04 深源恒际科技有限公司 Pig weight measuring method and system based on image
CN110598658A (en) * 2019-09-18 2019-12-20 华南农业大学 Convolutional network identification method for sow lactation behaviors
CN111709287A (en) * 2020-05-15 2020-09-25 南京农业大学 Weaned piglet target tracking method based on deep learning
CN111814860A (en) * 2020-07-01 2020-10-23 浙江工业大学 Multi-target detection method for garbage classification
US20200342221A1 (en) * 2019-04-02 2020-10-29 Wilco Labz Inc. Automated document intake and processing system

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108961330A (en) * 2018-06-22 2018-12-07 深源恒际科技有限公司 Image-based pig body length measurement method and system
CN109141248A (en) * 2018-07-26 2019-01-04 深源恒际科技有限公司 Pig weight measuring method and system based on image
US20200342221A1 (en) * 2019-04-02 2020-10-29 Wilco Labz Inc. Automated document intake and processing system
CN110598658A (en) * 2019-09-18 2019-12-20 华南农业大学 Convolutional network identification method for sow lactation behaviors
CN111709287A (en) * 2020-05-15 2020-09-25 南京农业大学 Weaned piglet target tracking method based on deep learning
CN111814860A (en) * 2020-07-01 2020-10-23 浙江工业大学 Multi-target detection method for garbage classification

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113763429A (en) * 2021-09-08 2021-12-07 广州市健坤网络科技发展有限公司 Pig behavior recognition system and method based on video
CN114543674A (en) * 2022-02-22 2022-05-27 成都睿畜电子科技有限公司 Detection method and system based on image recognition
CN114543674B (en) * 2022-02-22 2023-02-07 成都睿畜电子科技有限公司 Detection method and system based on image recognition

Also Published As

Publication number Publication date
CN113177564B (en) 2023-07-25

Similar Documents

Publication Publication Date Title
US11574187B2 (en) Pedestrian attribute identification and positioning method and convolutional neural network system
Kang et al. Identification of butterfly based on their shapes when viewed from different angles using an artificial neural network
Li et al. Deep cascaded convolutional models for cattle pose estimation
US11055824B2 (en) Hybrid machine learning systems
CN113177564A (en) Computer vision pig key point identification method
Yu et al. An object-based visual attention model for robotic applications
CN107886086A (en) A kind of target animal detection method and device based on image/video
WO2020000096A1 (en) Human pose analysis system and method
CN114937232B (en) Wearing detection method, system and equipment for medical waste treatment personnel protective appliance
CN108428224B (en) Animal body surface temperature detection method and device based on convolutional neural network
Yumang et al. Determination of Shelled Corn Damages using Colored Image Edge Detection with Convolutional Neural Network
CN111696196A (en) Three-dimensional face model reconstruction method and device
Chen et al. MFCNET: End-to-end approach for change detection in images
CN110363103B (en) Insect pest identification method and device, computer equipment and storage medium
CN114241542A (en) Face recognition method based on image stitching
CN113229807A (en) Human body rehabilitation evaluation device, method, electronic device and storage medium
CN110532854B (en) Live pig crawling and crossing behavior detection method and system
Liu et al. Adaptive recognition method for VR image of Wushu decomposition based on feature extraction
WO2021054217A1 (en) Image processing device, image processing method and program
CN113221704A (en) Animal posture recognition method and system based on deep learning and storage medium
Rakhmatulin Artificial Intelligence in Weed Recognition Tasks
JP2021051375A (en) Image processing apparatus, image processing method, and program
JP2021051376A (en) Image processing apparatus, image processing method, and program
Ferreira et al. Human detection and tracking using a Kinect camera for an autonomous service robot
Ma et al. Utilization of color-depth combination features and multi-level refinement CNN for upper-limb posture recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant