CN108388830B - Animal body shape detection method and device based on convolutional neural network


Info

Publication number
CN108388830B
Authority
CN
China
Prior art keywords
image
feature map
value
animal
image feature
Prior art date
Legal status
Expired - Fee Related
Application number
CN201810019554.4A
Other languages
Chinese (zh)
Other versions
CN108388830A (en)
Inventor
高万林
仲贞
王敏娟
于丽娜
陈治昌
Current Assignee
China Agricultural University
Original Assignee
China Agricultural University
Priority date
Filing date
Publication date
Application filed by China Agricultural University filed Critical China Agricultural University
Priority to CN201810019554.4A
Publication of CN108388830A
Application granted
Publication of CN108388830B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V 10/443 Local feature extraction by matching or filtering
    • G06V 10/446 Local feature extraction by matching or filtering using Haar-like filters, e.g. using integral image techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/50 Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/30 Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T 7/33 Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/255 Detecting or recognising potential candidate objects based on visual cues, e.g. shapes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands

Abstract

The invention provides an animal body shape detection method and device based on a convolutional neural network. The method comprises the following steps: performing feature extraction on an original image of an animal with a Gabor sequencing feature descriptor having illumination invariance to obtain a first image feature map; registering the depth image of the animal with the first image feature map to obtain a second image feature map; superposing the first image feature map and the second image feature map, and obtaining a candidate target region from the superposed feature fusion map; and inputting the candidate target region into a convolutional neural network to obtain a detection result of the animal body shape. The invention can improve the accuracy and real-time performance of animal body shape detection.

Description

Animal body shape detection method and device based on convolutional neural network
Technical Field
The invention relates to the technical field of image detection, in particular to a method and a device for detecting animal body shapes based on a convolutional neural network.
Background
Some animals share many similarities with human beings in the physiological and pathological processes of life activities, so that findings in one species can inform the other; pigs in particular, as large animals, are closely related to humans in many respects. Studying the health status of animals, especially pigs, is therefore of great value for the understanding of human life.
In recent years, artificial intelligence driven by machine learning has developed rapidly and achieved major breakthroughs in many research fields. Image detection is an important and challenging research topic in the field of artificial intelligence. Body shape is one of an animal's phenotypic characteristics and can reflect its health condition, so animal body shape detection is an important research direction.
Because animals take varied postures, occlude one another, and appear in visible-light images that are easily affected by illumination and complex breeding environments, animal body shape detection has long been a difficult problem in the field of image detection. Research on robust algorithms for animal body shape detection is therefore very important.
Disclosure of Invention
The present invention provides a convolutional neural network based animal body shape detection method and apparatus that overcome, or at least partially solve, the above problems.
According to one aspect of the present invention, there is provided a method for detecting the shape of an animal, comprising:
carrying out feature extraction on an original image of an animal by using a Gabor sequencing feature descriptor with illumination invariance to obtain a first image feature map;
registering the depth image of the animal with the first image feature map to obtain a second image feature map;
superposing the first image feature map and the second image feature map, and obtaining a candidate target region according to the superposed feature fusion map;
and inputting the candidate target area into a convolutional neural network to obtain a detection result of the animal body shape.
Further, the extracting the features of the original image of the animal by using a Gabor sequencing feature descriptor with illumination invariance to obtain a first image feature map specifically includes:
converting the original image into a first gray image, and performing Gabor filtering on the first gray image with a Gabor filter whose scale parameter σ is 5, to obtain amplitude feature images of the animal body shape in multiple directions;
utilizing a sequencing filter to encode the amplitude characteristic images of the animal body shapes in the multiple directions to obtain GOM characteristic encoded images of the animal body shapes in the multiple directions;
and obtaining the GOM feature coded image of the animal body shape according to the GOM feature coded images of the animal body shape in the multiple directions, wherein the GOM feature coded image of the animal body shape is used as the first image feature map.
Further, the expression of the Gabor filter is:

$$G(x, y; \theta_k, \sigma) = \exp\!\left(-\left(\frac{x_{\theta_k}^2}{\sigma_x^2} + \frac{y_{\theta_k}^2}{\sigma_y^2}\right)\right)\exp\!\left(j\,2\pi f_k x_{\theta_k}\right)$$

where σ represents the scale parameter of the Gabor filter; θ_k is the angle value of the k-th direction; σ_x is the scale parameter in the x direction of the Gabor filter; σ_y is the scale parameter in the y direction of the Gabor filter; x_{θ_k} = x cos θ_k + y sin θ_k is the value of x in the k-th angular direction; y_{θ_k} = −x sin θ_k + y cos θ_k is the value of y in the k-th angular direction; and f_k is the center frequency in the k-th direction.

The expression of the sequencing filter is:

$$OF = C_p \sum_{i=1}^{N_p} \frac{1}{\sqrt{2\pi}\,\delta_{pi}} \exp\!\left(-\frac{(X - \omega_{pi})^2}{2\delta_{pi}^2}\right) - C_n \sum_{j=1}^{N_n} \frac{1}{\sqrt{2\pi}\,\delta_{nj}} \exp\!\left(-\frac{(X - \omega_{nj})^2}{2\delta_{nj}^2}\right)$$

where ω and δ respectively represent the center position and scale of the lobes of the sequencing filter; N_p is the number of positive lobes; N_n is the number of negative lobes; C_p and C_n are coefficients that keep the positive and negative lobes balanced, with N_p C_p = N_n C_n; δ_pi is the scale parameter of the i-th positive lobe; δ_nj is the scale parameter of the j-th negative lobe; ω_pi is the center position of the i-th positive lobe; and ω_nj is the center position of the j-th negative lobe.

The plurality of directions, specifically 8 directions, includes 0°, 22.5°, 45°, 67.5°, 90°, 112.5°, 135° and 157.5°.
Further, registering the depth image of the animal with the first image feature map to obtain a second image feature map, specifically including:
registering the depth image and the first image feature map in an affine transformation mode to obtain an image feature map of the depth image, wherein the affine transformation formula is as follows:
$$\begin{bmatrix} x_2 \\ y_2 \end{bmatrix} = s \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x_1 \\ y_1 \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \end{bmatrix}$$

where x_1 is the value of the first feature image in the x direction, y_1 is the value of the first feature image in the y direction, x_2 is the value of the second feature image in the x direction, y_2 is the value of the second feature image in the y direction, t_x is the translation value in the x direction, t_y is the translation value in the y direction, s is the scaling factor, and θ is the counterclockwise rotation angle with (x, y) as the axis.
Further, the superimposing the first image feature map and the second image feature map, and obtaining a candidate target region according to the superimposed feature fusion map specifically include:
compressing the gray values of the pixel points in the second image feature map to a gray value range consistent with that of the first image feature map;
for each pair of corresponding pixel points in the compressed second image feature map and the first image feature map, selecting the larger of the two gray values as the gray value of the pixel point at the corresponding position of the fused image;
and obtaining the feature fusion map from the gray values of the pixel points at all corresponding positions of the fused image, thereby obtaining the candidate target region.
Further, compressing the gray values of the pixel points in the second image feature map to a gray value range consistent with that of the first image feature map further comprises: if the gray value of the compressed second image feature map is non-integer, performing approximate calculation through the following rounding formula to obtain an approximate gray value I_a(x, y):

$$I_a(x, y) = \lfloor I(x, y) + 0.5 \rfloor$$

where I(x, y) represents the gray value of the pixel point of the compressed second image feature map.
Further, the convolutional neural network comprises three convolutional layers, three pooling layers and three full-connection layers; inputting the candidate target area into a convolutional neural network to obtain a detection result of the animal body shape, which specifically comprises:
the first layer of convolutional layer is filtered by using 96 convolutional filters with the convolutional kernel size of 11 multiplied by 11 and the step size of 4; the second convolution layer uses 256 convolution filters with convolution kernel size of 5 multiplied by 5 and step length of 1 to filter; the third layer of convolution layer uses 384 convolution filters with convolution kernel size of 3 multiplied by 3 and step length of 1 to filter;
sending the filtering results of the first layer of convolutional layer, the second layer of convolutional layer and the third layer of convolutional layer into a maximum pooling layer, wherein the maximum pooling layer is set with a pooling window of 3 x 3 and a step length of 2;
and obtaining the detection result of the animal body shape by passing the output result of the maximum pooling layer through three full-connection layers.
According to another aspect of the present invention, there is also provided an animal body shape detecting apparatus including:
the first image feature map module is used for extracting features of an original image of an animal by using a Gabor sequencing feature descriptor with illumination invariance to obtain a first image feature map;
the second image feature map module is used for registering the depth image of the animal with the first image feature map to obtain a second image feature map;
the candidate target area module is used for superposing the first image characteristic diagram and the second image characteristic diagram and obtaining a candidate target area according to the superposed characteristic fusion diagram; and
and the body shape detection result module is used for inputting the candidate target area into the convolutional neural network to obtain the detection result of the animal body shape.
According to another aspect of the present invention, there is also provided a non-transitory computer readable storage medium storing computer instructions that cause a computer to execute the convolutional neural network based animal body shape detection method of the present invention and any optional embodiment thereof.
The invention provides an animal body shape detection method based on a convolutional neural network. Features are extracted from an original image of an animal and registered with the animal's depth image to obtain a feature map of the depth image; the two feature maps are then fused to obtain a candidate region, and the candidate region is input into a convolutional neural network discrimination model, which outputs the animal body shape detection result. The method can improve the accuracy and real-time performance of animal body shape detection.
Drawings
FIG. 1 is a schematic flow chart of a method for detecting an animal body shape based on a convolutional neural network according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an animal body shape detection device based on a convolutional neural network according to an embodiment of the present invention;
fig. 3 is a schematic diagram of a frame of an electronic device according to an embodiment of the invention.
Detailed Description
The following detailed description of embodiments of the present invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
Fig. 1 is a schematic flow chart of an animal body shape detection method based on a convolutional neural network according to an embodiment of the present invention, where the animal body shape detection method based on the convolutional neural network shown in fig. 1 includes:
s100, extracting the features of an original image of an animal by using a Gabor sequencing feature descriptor with illumination invariance to obtain a first image feature map;
the Gabor sequencing feature of the embodiment of the invention is a feature which can be used for describing image texture information, and the frequency and the direction of a Gabor filter are similar to the human visual system, so that the Gabor sequencing feature is particularly suitable for texture representation and discrimination.
S200, registering the depth image of the animal with the first image feature map to obtain a second image feature map;
the depth image (depth image) according to the embodiment of the present invention is also referred to as a range image (range image), and refers to an image in which the distance (depth) from an image collector to each point in a scene is used as a pixel value, and directly reflects the geometric shape of a visible surface of an object in the image.
In the embodiment of the present invention, the original image in step S100 and the depth image in step S200 are images of the same animal. The second image feature map is obtained by registering the depth image of the animal with the image feature map of the original image through affine transformation.
S300, overlapping the first image feature map and the second image feature map, and obtaining a candidate target area according to the overlapped feature fusion map;
and S400, inputting the candidate target area into a convolutional neural network to obtain a detection result of the animal body shape.
The invention provides an animal body shape detection method based on a convolutional neural network. Features are extracted from an original image of an animal and registered with the animal's depth image to obtain a feature map of the depth image; the two feature maps are then fused to obtain a candidate region, and the candidate region is input into a convolutional neural network discrimination model, which outputs the animal body shape detection result. The method can improve the accuracy and real-time performance of animal body shape detection.
In an optional embodiment, in step S100, the extracting features of the original image of the animal by using a Gabor sequencing feature descriptor with illumination invariance to obtain a first image feature map specifically includes:
s100.1, converting the original image into a first gray image, and carrying out Gabor filtering on the first gray image by adopting a Gabor filter with a scale parameter sigma of 5 to obtain amplitude characteristic images of animal body shapes in multiple directions;
s100.2, coding the amplitude characteristic images of the animal body shapes in the multiple directions by utilizing a sequencing filter to obtain GOM characteristic coded images of the animal body shapes in the multiple directions;
and S100.3, obtaining the GOM feature coded image of the animal body shape according to the GOM feature coded images of the animal body shape in the multiple directions, and using the GOM feature coded image of the animal body shape as the first image feature map.
Specifically, the expression of the Gabor filter is as follows:
$$G(x, y; \theta_k, \sigma) = \exp\!\left(-\left(\frac{x_{\theta_k}^2}{\sigma_x^2} + \frac{y_{\theta_k}^2}{\sigma_y^2}\right)\right)\exp\!\left(j\,2\pi f_k x_{\theta_k}\right) \tag{1}$$

where σ represents the scale parameter of the Gabor filter; θ_k is the angle value of the k-th direction; σ_x is the scale parameter in the x direction of the Gabor filter; σ_y is the scale parameter in the y direction of the Gabor filter; x_{θ_k} = x cos θ_k + y sin θ_k is the value of x in the k-th angular direction; y_{θ_k} = −x sin θ_k + y cos θ_k is the value of y in the k-th angular direction; and f_k is the center frequency in the k-th direction.
specifically, the expression of the sequence filter is:
$$OF = C_p \sum_{i=1}^{N_p} \frac{1}{\sqrt{2\pi}\,\delta_{pi}} \exp\!\left(-\frac{(X - \omega_{pi})^2}{2\delta_{pi}^2}\right) - C_n \sum_{j=1}^{N_n} \frac{1}{\sqrt{2\pi}\,\delta_{nj}} \exp\!\left(-\frac{(X - \omega_{nj})^2}{2\delta_{nj}^2}\right) \tag{2}$$

where ω and δ respectively represent the center position and scale of the lobes of the sequencing filter; N_p is the number of positive lobes; N_n is the number of negative lobes; C_p and C_n are coefficients that keep the positive and negative lobes balanced, with N_p C_p = N_n C_n; δ_pi is the scale parameter of the i-th positive lobe; δ_nj is the scale parameter of the j-th negative lobe; ω_pi is the center position of the i-th positive lobe; and ω_nj is the center position of the j-th negative lobe.
Specifically, since the tri-lobe sequencing filter is robust, this embodiment sets C_p = 1, N_p = 2, C_n = 2 and N_n = 1.
Specifically, the plurality of directions are 8 directions: 0°, 22.5°, 45°, 67.5°, 90°, 112.5°, 135° and 157.5°.
Specifically, in this embodiment, amplitude feature images of the animal body shape in 8 directions are first obtained with the Gabor filter of formula (1); the amplitude feature images in the 8 directions are then encoded with the sequencing filter of formula (2) to obtain GOM feature coded images of the animal body shape in the 8 directions; finally, the GOM feature coded images in the 8 directions are merged into one GOM feature coded image of the animal body shape, which serves as the first image feature map:

$$\mathrm{GOM} = \sum_{k=1}^{\text{orientations}} 2^{\,k-1}\,\mathrm{Ordinal}(k) \tag{3}$$

where k denotes the k-th direction, orientations is 8 in this embodiment, and Ordinal(k) denotes the result of filtering the amplitude feature image of the animal body shape in the k-th direction with the sequencing filter.
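To make these three steps concrete, the following is a minimal Python sketch (OpenCV and NumPy) of the 8-direction Gabor filtering, a tri-lobe ordinal comparison, and the directional merge. Only the 8 orientations, the scale parameter σ = 5 and the C_p = 1, N_p = 2, C_n = 2, N_n = 1 lobe weighting come from the text above; the kernel size, wavelength, lobe offset and the horizontal lobe layout are illustrative assumptions, not the patent's exact sequencing filter.

```python
# Illustrative sketch, not the patent's reference implementation.
import cv2
import numpy as np

def gabor_amplitude_stack(gray, sigma=5.0, n_orient=8, ksize=31):
    """Amplitude feature images of the input in n_orient directions."""
    stack = []
    for k in range(n_orient):
        theta = k * np.pi / n_orient  # 0, 22.5, ..., 157.5 degrees
        kern = cv2.getGaborKernel((ksize, ksize), sigma, theta,
                                  lambd=10.0, gamma=0.5, psi=0)
        stack.append(np.abs(cv2.filter2D(gray.astype(np.float32), -1, kern)))
    return stack

def ordinal_code(amplitude, offset=6):
    """Tri-lobe ordinal comparison: two positive lobes (C_p=1, N_p=2) against
    one double-weighted negative lobe (C_n=2, N_n=1), binarized at zero.
    np.roll wraps at the image border; a real implementation would pad."""
    left = np.roll(amplitude, offset, axis=1)    # positive lobe 1
    right = np.roll(amplitude, -offset, axis=1)  # positive lobe 2
    resp = left + right - 2.0 * amplitude        # N_p*C_p = N_n*C_n = 2
    return (resp > 0).astype(np.uint8)

def gom_feature_map(gray):
    """Merge the 8 directional ordinal codes into one coded image."""
    codes = [ordinal_code(a) for a in gabor_amplitude_stack(gray)]
    return sum(code << k for k, code in enumerate(codes)).astype(np.uint8)
```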
In an optional embodiment, in step S200, the registering the depth image of the animal with the first image feature map to obtain a second image feature map specifically includes:
registering the depth image and the first image feature map in an affine transformation mode to obtain an image feature map of the depth image, wherein the affine transformation formula is as follows:
$$\begin{bmatrix} x_2 \\ y_2 \end{bmatrix} = s \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x_1 \\ y_1 \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \end{bmatrix} \tag{4}$$

where x_1 is the value of the first feature image in the x direction, y_1 is the value of the first feature image in the y direction, x_2 is the value of the second feature image in the x direction, y_2 is the value of the second feature image in the y direction, t_x is the translation value in the x direction, t_y is the translation value in the y direction, s is the scaling factor, and θ is the counterclockwise rotation angle with (x, y) as the axis.
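As an illustration of how this registration could be realized, here is a sketch built on OpenCV's partial affine estimator, whose model is exactly the scale, rotation and translation transform of formula (4). The ORB matching front end, its parameters and the RANSAC estimator are assumptions; the patent only fixes the transform model. Both inputs are assumed to be single-channel 8-bit images.

```python
# Illustrative sketch; the feature matcher and its parameters are assumed.
import cv2
import numpy as np

def register_depth_to_feature_map(depth_img, first_feature_map):
    """Warp the depth image onto the first image feature map with a
    similarity transform (uniform scale s, rotation theta, translation t)."""
    orb = cv2.ORB_create(500)
    kp1, des1 = orb.detectAndCompute(first_feature_map, None)
    kp2, des2 = orb.detectAndCompute(depth_img, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des2, des1)          # depth -> feature map
    src = np.float32([kp2[m.queryIdx].pt for m in matches])
    dst = np.float32([kp1[m.trainIdx].pt for m in matches])
    # estimateAffinePartial2D fits [s cos(theta), -s sin(theta), t_x;
    #                               s sin(theta),  s cos(theta), t_y].
    M, _ = cv2.estimateAffinePartial2D(src, dst, method=cv2.RANSAC)
    h, w = first_feature_map.shape[:2]
    return cv2.warpAffine(depth_img, M, (w, h))  # second image feature map
```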
In an optional embodiment, in step S300, the superimposing the first image feature map and the second image feature map, and obtaining a candidate target region according to the superimposed feature fusion map specifically includes:
compressing the gray values of the pixel points in the second image feature map to a gray value range consistent with that of the first image feature map;
for each pair of corresponding pixel points in the compressed second image feature map and the first image feature map, selecting the larger of the two gray values as the gray value of the pixel point at the corresponding position of the fused image;
and obtaining the feature fusion map from the gray values of the pixel points at all corresponding positions of the fused image, thereby obtaining the candidate target region.
In this embodiment, the gray value of each pixel point of the compressed second image feature map is compared with the gray value of the corresponding pixel point in the first image feature map; that is, for each position, two corresponding pixel points are compared, one from the first image feature map and one from the compressed second image feature map, and the larger of their two gray values is selected as the gray value of the pixel point at the corresponding position of the fused image. After all pixel points of the compressed second image feature map and the first image feature map have been compared in this way, the larger gray values together construct the fused image, from which the candidate target region is obtained.
Further, compressing the gray values of the pixel points in the second image feature map to a gray value range consistent with that of the first image feature map further comprises: if the gray value of the compressed second image feature map is non-integer, performing approximate calculation through the following rounding formula to obtain an approximate gray value I_a(x, y):

$$I_a(x, y) = \lfloor I(x, y) + 0.5 \rfloor \tag{5}$$

where I(x, y) represents the gray value of the pixel point of the compressed second image feature map.
In this embodiment, the gray values of the compressed second image feature map are often not integers; non-integer values are approximated by rounding. After this approximation, all gray values of the compressed second image feature map are guaranteed to be integers, which facilitates the subsequent fusion calculation.
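A compact sketch of this fusion step follows, under the assumption that the compression is a linear rescaling of the registered depth feature map onto the gray range of the first feature map; rounding uses the ⌊I(x, y) + 0.5⌋ approximation of formula (5), and the fused map keeps the larger gray value at every pixel.

```python
import numpy as np

def fuse_feature_maps(first_map, second_map):
    """Pixel-wise maximum fusion after range compression and rounding."""
    lo, hi = float(first_map.min()), float(first_map.max())
    s = second_map.astype(np.float64)
    span = max(float(s.max() - s.min()), 1e-12)      # avoid division by zero
    compressed = lo + (s - s.min()) * (hi - lo) / span
    compressed = np.floor(compressed + 0.5)          # I_a(x,y) = floor(I + 0.5)
    compressed = compressed.astype(first_map.dtype)
    return np.maximum(first_map, compressed)         # larger gray value wins
```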
In an alternative embodiment, the convolutional neural network comprises three convolutional layers, three pooling layers and three fully-connected layers; step S400, inputting the candidate target region into a convolutional neural network to obtain a detection result of the animal body shape, which specifically includes:
the first convolutional layer performs filtering with 96 convolution kernels of size 11×11 and a stride of 4; the second convolutional layer performs filtering with 256 convolution kernels of size 5×5 and a stride of 1; the third convolutional layer performs filtering with 384 convolution kernels of size 3×3 and a stride of 1;
the filtering results of the first, second and third convolutional layers are each sent to a maximum pooling layer with a 3×3 pooling window and a stride of 2;
and the output result of the maximum pooling layer is passed through the three fully connected layers to obtain the detection result of the animal body shape.
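To make the layer specification concrete, below is a PyTorch sketch of a network with exactly these three convolutional layers, a 3×3/stride-2 max pooling layer after each of them, and three fully connected layers. The single-channel 227×227 input, the paddings, the hidden widths and the regression head are assumptions the patent does not spell out; this is a plausible instantiation, not the patented model itself.

```python
import torch
import torch.nn as nn

class BodyShapeNet(nn.Module):
    """Three conv layers (96@11x11/s4, 256@5x5/s1, 384@3x3/s1), each followed
    by 3x3/s2 max pooling, then three fully connected layers."""
    def __init__(self, num_outputs=5):  # e.g. (x, y, w, h, score); assumed head
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 96, kernel_size=11, stride=4), nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(96, 256, kernel_size=5, stride=1, padding=2), nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(256, 384, kernel_size=3, stride=1, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.head = nn.Sequential(
            nn.Linear(384 * 6 * 6, 4096), nn.ReLU(inplace=True),
            nn.Linear(4096, 4096), nn.ReLU(inplace=True),
            nn.Linear(4096, num_outputs),
        )

    def forward(self, x):  # x: (N, 1, 227, 227)
        return self.head(torch.flatten(self.features(x), 1))

# Spatial sizes: 227 -> 55 (conv1) -> 27 (pool) -> 27 (conv2) -> 13 (pool)
#                -> 13 (conv3) -> 6 (pool), so the flattened size is 384*6*6.
out = BodyShapeNet()(torch.randn(1, 1, 227, 227))  # out: (1, 5)
```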
The embodiment of the invention processes the original image to obtain its image feature map; the depth image is registered with the image feature map of the original image to obtain an image feature map of the depth image; the two image feature maps are superposed into a multi-source feature fusion map, from which a candidate target region is obtained; and the candidate target region is input into the discrimination model of the convolutional neural network, which outputs the animal body shape detection result. The invention improves the accuracy and real-time performance of animal body shape detection.
Fig. 2 is a schematic diagram of an animal body shape detection device based on a convolutional neural network according to an embodiment of the present invention, and the animal body shape detection device based on the convolutional neural network shown in fig. 2 includes:
the first image feature map module is used for extracting features of an original image of an animal by using a Gabor sequencing feature descriptor with illumination invariance to obtain a first image feature map;
the second image feature map module is used for registering the depth image of the animal with the first image feature map to obtain a second image feature map;
the candidate target area module is used for superposing the first image characteristic diagram and the second image characteristic diagram and obtaining a candidate target area according to the superposed characteristic fusion diagram; and
and the body shape detection result module is used for inputting the candidate target area into the convolutional neural network to obtain the detection result of the animal body shape.
The device of the embodiment of the invention can be used for executing the technical scheme of the embodiment of the animal body shape detection method based on the convolutional neural network shown in fig. 1, and the implementation principle and the technical effect are similar, and are not repeated here.
Fig. 3 shows a schematic diagram of a framework of an electronic device according to an embodiment of the invention.
Referring to fig. 3, the apparatus includes: a processor (processor) 601, a memory (memory) 602, and a bus 603, where the processor 601 and the memory 602 communicate with each other through the bus 603;
the processor 601 is configured to call program instructions in the memory 602 to perform the methods provided by the above-mentioned method embodiments, for example, including: carrying out feature extraction on an original image of an animal by using a Gabor sequencing feature descriptor with illumination invariance to obtain a first image feature map; registering the depth image of the animal with the first image feature map to obtain a second image feature map; superposing the first image feature map and the second image feature map, and obtaining a candidate target region according to the superposed feature fusion map; and inputting the candidate target area into a convolutional neural network to obtain a detection result of the animal body shape.
Another embodiment of the present invention discloses a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the methods provided by the above-mentioned method embodiments, for example, including: carrying out feature extraction on an original image of an animal by using a Gabor sequencing feature descriptor with illumination invariance to obtain a first image feature map; registering the depth image of the animal with the first image feature map to obtain a second image feature map; superposing the first image feature map and the second image feature map, and obtaining a candidate target region according to the superposed feature fusion map; and inputting the candidate target area into a convolutional neural network to obtain a detection result of the animal body shape.
Another embodiment of the invention provides a non-transitory computer-readable storage medium storing computer instructions that cause the computer to perform a method provided by the above method embodiments, for example, comprising: carrying out feature extraction on an original image of an animal by using a Gabor sequencing feature descriptor with illumination invariance to obtain a first image feature map; registering the depth image of the animal with the first image feature map to obtain a second image feature map; superposing the first image feature map and the second image feature map, and obtaining a candidate target region according to the superposed feature fusion map; and inputting the candidate target area into a convolutional neural network to obtain a detection result of the animal body shape.
Those of ordinary skill in the art will understand that: the implementation of the above-described apparatus embodiments or method embodiments is merely illustrative, wherein the processor and the memory may or may not be physically separate components, i.e. may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (8)

1. A method for detecting the shape of an animal, comprising:
carrying out feature extraction on an original image of an animal by using a Gabor sequencing feature descriptor with illumination invariance to obtain a first image feature map;
registering the depth image of the animal with the first image feature map to obtain a second image feature map;
superposing the first image feature map and the second image feature map, and obtaining a candidate target region according to the superposed feature fusion map;
inputting the candidate target area into a convolutional neural network to obtain a detection result of the animal body shape;
registering the depth image of the animal with the first image feature map to obtain a second image feature map, specifically comprising:
registering the depth image and the first image feature map in an affine transformation mode to obtain an image feature map of the depth image, wherein the affine transformation formula is as follows:
$$\begin{bmatrix} x_2 \\ y_2 \end{bmatrix} = s \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x_1 \\ y_1 \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \end{bmatrix}$$

where x_1 is the value of the first feature image in the x direction, y_1 is the value of the first feature image in the y direction, x_2 is the value of the second feature image in the x direction, y_2 is the value of the second feature image in the y direction, t_x is the translation value in the x direction, t_y is the translation value in the y direction, s is the scaling factor, and θ is the counterclockwise rotation angle with (x, y) as the axis.
2. The method according to claim 1, wherein the extracting features of the original image of the animal by using a Gabor sequencing feature descriptor with illumination invariance to obtain a first image feature map specifically comprises:
converting the original image into a first gray image, and performing Gabor filtering on the first gray image with a Gabor filter whose scale parameter σ is 5, to obtain amplitude feature images of the animal body shape in multiple directions;
utilizing a sequencing filter to encode the amplitude characteristic images of the animal body shapes in the multiple directions to obtain GOM characteristic encoded images of the animal body shapes in the multiple directions;
and obtaining the GOM feature coded image of the animal body shape according to the GOM feature coded images of the animal body shape in the multiple directions, wherein the GOM feature coded image of the animal body shape is used as the first image feature map.
3. The method of claim 2, wherein the Gabor filter is expressed by:

$$G(x, y; \theta_k, \sigma) = \exp\!\left(-\left(\frac{x_{\theta_k}^2}{\sigma_x^2} + \frac{y_{\theta_k}^2}{\sigma_y^2}\right)\right)\exp\!\left(j\,2\pi f_k x_{\theta_k}\right)$$

where σ represents the scale parameter of the Gabor filter; θ_k is the angle value of the k-th direction; σ_x is the scale parameter in the x direction of the Gabor filter; σ_y is the scale parameter in the y direction of the Gabor filter; x_{θ_k} = x cos θ_k + y sin θ_k is the value of x in the k-th angular direction; y_{θ_k} = −x sin θ_k + y cos θ_k is the value of y in the k-th angular direction; and f_k is the center frequency in the k-th direction;

the expression of the sequencing filter is:

$$OF = C_p \sum_{i=1}^{N_p} \frac{1}{\sqrt{2\pi}\,\delta_{pi}} \exp\!\left(-\frac{(X - \omega_{pi})^2}{2\delta_{pi}^2}\right) - C_n \sum_{j=1}^{N_n} \frac{1}{\sqrt{2\pi}\,\delta_{nj}} \exp\!\left(-\frac{(X - \omega_{nj})^2}{2\delta_{nj}^2}\right)$$

where ω and δ respectively represent the center position and scale of the lobes of the sequencing filter; N_p is the number of positive lobes; N_n is the number of negative lobes; C_p and C_n are coefficients that keep the positive and negative lobes balanced, with N_p C_p = N_n C_n; δ_pi is the scale parameter of the i-th positive lobe; δ_nj is the scale parameter of the j-th negative lobe; ω_pi is the center position of the i-th positive lobe; and ω_nj is the center position of the j-th negative lobe;

the plurality of directions, specifically 8 directions, includes 0°, 22.5°, 45°, 67.5°, 90°, 112.5°, 135° and 157.5°.
4. The method according to claim 1, wherein the superimposing the first image feature map and the second image feature map and obtaining the candidate target region according to the superimposed feature fusion map specifically include:
compressing the gray values of the pixel points in the second image feature map to a gray value range consistent with that of the first image feature map;
for each pair of corresponding pixel points in the compressed second image feature map and the first image feature map, selecting the larger of the two gray values as the gray value of the pixel point at the corresponding position of the fused image;
and obtaining the feature fusion map from the gray values of the pixel points at all corresponding positions of the fused image, thereby obtaining the candidate target region.
5. The method of claim 4, wherein the compressing the gray values of the pixel points in the second image feature map to a gray value range consistent with that of the first image feature map further comprises: if the gray value of the compressed second image feature map is non-integer, performing approximate calculation through the following rounding formula to obtain an approximate gray value I_a(x, y):

$$I_a(x, y) = \lfloor I(x, y) + 0.5 \rfloor$$

where I(x, y) represents the gray value of the pixel point of the compressed second image feature map.
6. The method of claim 1, wherein the convolutional neural network comprises three convolutional layers, three pooling layers, three fully-connected layers; inputting the candidate target area into a convolutional neural network to obtain a detection result of the animal body shape, which specifically comprises:
the first convolutional layer performs filtering with 96 convolution kernels of size 11×11 and a stride of 4; the second convolutional layer performs filtering with 256 convolution kernels of size 5×5 and a stride of 1; the third convolutional layer performs filtering with 384 convolution kernels of size 3×3 and a stride of 1;
the filtering results of the first, second and third convolutional layers are each sent to a maximum pooling layer with a 3×3 pooling window and a stride of 2;
and the output result of the maximum pooling layer is passed through the three fully connected layers to obtain the detection result of the animal body shape.
7. An animal body shape detection device, comprising:
the first image feature map module is used for extracting features of an original image of an animal by using a Gabor sequencing feature descriptor with illumination invariance to obtain a first image feature map;
the second image feature map module is used for registering the depth image of the animal with the first image feature map to obtain a second image feature map; registering the depth image of the animal with the first image feature map to obtain a second image feature map, specifically comprising:
registering the depth image and the first image feature map in an affine transformation mode to obtain an image feature map of the depth image, wherein the affine transformation formula is as follows:
$$\begin{bmatrix} x_2 \\ y_2 \end{bmatrix} = s \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x_1 \\ y_1 \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \end{bmatrix}$$

where x_1 is the value of the first feature image in the x direction, y_1 is the value of the first feature image in the y direction, x_2 is the value of the second feature image in the x direction, y_2 is the value of the second feature image in the y direction, t_x is the translation value in the x direction, t_y is the translation value in the y direction, s is the scaling factor, and θ is the counterclockwise rotation angle with (x, y) as the axis;
the candidate target area module is used for superposing the first image characteristic diagram and the second image characteristic diagram and obtaining a candidate target area according to the superposed characteristic fusion diagram; and
and the body shape detection result module is used for inputting the candidate target area into the convolutional neural network to obtain the detection result of the animal body shape.
8. A non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the method of any one of claims 1 to 6.
CN201810019554.4A 2018-01-09 2018-01-09 Animal body shape detection method and device based on convolutional neural network Expired - Fee Related CN108388830B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810019554.4A CN108388830B (en) 2018-01-09 2018-01-09 Animal body shape detection method and device based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810019554.4A CN108388830B (en) 2018-01-09 2018-01-09 Animal body shape detection method and device based on convolutional neural network

Publications (2)

Publication Number Publication Date
CN108388830A CN108388830A (en) 2018-08-10
CN108388830B true CN108388830B (en) 2020-08-14

Family

ID=63076883

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810019554.4A Expired - Fee Related CN108388830B (en) 2018-01-09 2018-01-09 Animal body shape detection method and device based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN108388830B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110245621B (en) * 2019-06-17 2023-10-17 深圳Tcl新技术有限公司 Face recognition device, image processing method, feature extraction model, and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102629320A (en) * 2012-03-27 2012-08-08 中国科学院自动化研究所 Ordinal measurement statistical description face recognition method based on feature level
CN103248703A (en) * 2013-05-16 2013-08-14 中国农业大学 Automatic monitoring system and method for live pig action
CN103745474A (en) * 2014-01-21 2014-04-23 南京理工大学 Image registration method based on inertial sensor and camera
CN105160305A (en) * 2015-08-10 2015-12-16 中国民航大学 Finger multi-mode characteristic fusion method
CN105354555A (en) * 2015-11-17 2016-02-24 南京航空航天大学 Probabilistic graphical model-based three-dimensional face recognition method
CN106408597A (en) * 2016-09-08 2017-02-15 西安电子科技大学 Neighborhood entropy and consistency detection-based SAR (synthetic aperture radar) image registration method
CN107180241A (en) * 2017-04-20 2017-09-19 华南理工大学 A kind of animal classification method of the profound neutral net based on Gabor characteristic with fractal structure

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8798374B2 (en) * 2008-08-26 2014-08-05 The Regents Of The University Of California Automated facial action coding system
JP2016134803A (en) * 2015-01-20 2016-07-25 キヤノン株式会社 Image processor and image processing method
US10121245B2 (en) * 2015-09-14 2018-11-06 University Of Notre Dame Identification of inflammation in tissue images
US10275641B2 (en) * 2015-10-01 2019-04-30 Intellivision Technologies Corp Methods and systems for extracting feature descriptors for an image

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Gabor Ordinal Measures for Face Recognition; Zhenhua Chai et al.; IEEE Transactions on Information Forensics and Security; 2014-01-31; pp. 14-26 *
Research on Robust Connotation Representation Methods for Finger Feature Granules; Zhong Zhen; China Master's Theses Full-text Database; 2017-03-15; pp. I138-5337 *

Also Published As

Publication number Publication date
CN108388830A (en) 2018-08-10

Similar Documents

Publication Publication Date Title
CN110135455B (en) Image matching method, device and computer readable storage medium
Tao et al. Spatial information inference net: Road extraction using road-specific contextual information
Cai et al. Exploiting spatial-temporal relationships for 3d pose estimation via graph convolutional networks
CN108428224B (en) Animal body surface temperature detection method and device based on convolutional neural network
Wang et al. 360sd-net: 360 stereo depth estimation with learnable cost volume
CN109522874B (en) Human body action recognition method and device, terminal equipment and storage medium
Li et al. Deep learning based imaging data completion for improved brain disease diagnosis
CN109416727B (en) Method and device for removing glasses in face image
Sigal Human pose estimation
KR20210028226A (en) Automatic determination of the normal posture of 3D objects and the superposition of 3D objects using deep learning
Zhou et al. Image classification using biomimetic pattern recognition with convolutional neural networks features
Shen et al. Accurate point cloud registration with robust optimal transport
Chen et al. Cross parallax attention network for stereo image super-resolution
CN111274916A (en) Face recognition method and face recognition device
CN111507908B (en) Image correction processing method, device, storage medium and computer equipment
CN113043267A (en) Robot control method, device, robot and computer readable storage medium
CN105335952B (en) Matching power flow computational methods and device and parallax value calculating method and equipment
CN113409384A (en) Pose estimation method and system of target object and robot
CN110070610B (en) Feature point matching method, and feature point matching method and device in three-dimensional reconstruction process
CN114219855A (en) Point cloud normal vector estimation method and device, computer equipment and storage medium
CN114219890A (en) Three-dimensional reconstruction method, device and equipment and computer storage medium
CN111582220B (en) Bone point behavior recognition system based on shift map convolution neural network and recognition method thereof
CN110942512A (en) Indoor scene reconstruction method based on meta-learning
CN111524232A (en) Three-dimensional modeling method and device and server
Fuentes-Jimenez et al. Texture-generic deep shape-from-template

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20200814
Termination date: 20210109