CN111914815A - Machine vision intelligent recognition system and method for garbage target - Google Patents

Machine vision intelligent recognition system and method for garbage target

Info

Publication number
CN111914815A
Authority
CN
China
Prior art keywords
target
data
garbage
layer
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010924563.5A
Other languages
Chinese (zh)
Inventor
黄纯根
冼海仪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Kunpeng Intelligent Machine Equipment Co ltd
Original Assignee
Guangdong Kunpeng Intelligent Machine Equipment Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Kunpeng Intelligent Machine Equipment Co ltd filed Critical Guangdong Kunpeng Intelligent Machine Equipment Co ltd
Priority to CN202010924563.5A priority Critical patent/CN111914815A/en
Publication of CN111914815A publication Critical patent/CN111914815A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01CMEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C22/00Measuring distance traversed on the ground by vehicles, persons, animals or other moving solid bodies, e.g. using odometers, using pedometers
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/86Combinations of lidar systems with systems other than lidar, radar or sonar, e.g. with direction finders
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/26Government or public services

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Tourism & Hospitality (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Biophysics (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Electromagnetism (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a machine vision intelligent recognition system for a garbage target, comprising an odometer module, a target recognition module and a mapping module. The system establishes a dynamic map of garbage target distribution, reducing the monitoring pressure on managers and improving the working efficiency of cleaning personnel. Also disclosed is a machine vision intelligent recognition method for the garbage target, comprising the following steps: calculating pose data, identifying garbage targets, establishing a point cloud map, establishing a training set and a test set, constructing an improved yolov3 model, training the network and testing the network. The method replaces part of the traditional convolutions in darknet-53 with depthwise separable convolutions, greatly reducing the computational load of the model. Based on the improved yolov3 target recognition algorithm, garbage targets can be identified more accurately, the model has fewer parameters, training and optimization time is reduced, the recognition speed of the model is improved, and the hardware requirements are lowered.

Description

Machine vision intelligent recognition system and method for garbage target
[ technical field ]
The invention relates to the field of machine vision and machine learning, in particular to a machine vision intelligent recognition system and method for a garbage target.
[ background art ]
As residents' consumption levels rise, the amount of household garbage generated in communities keeps increasing. Accumulated domestic garbage not only seriously affects community sanitation but also easily breeds mosquitoes and spreads disease, harming people's health. At present, domestic garbage in communities is cleared away periodically by cleaning workers. This approach works well for the garbage cans downstairs of residential buildings, but for domestic garbage that appears almost at random in other areas of the community it is ineffective and wastes manpower. There is therefore a need for a vision system capable of monitoring the changing distribution of domestic garbage in a community.
Existing garbage identification systems are mainly built around fixed cameras: monitoring the garbage situation of an entire community requires deploying many cameras, and such systems cannot integrate the recognition results from the separate video feeds into a single map, which places a heavy burden on the monitoring staff in the background.
In addition, current machine vision methods for garbage identification mostly adopt two-stage deep-learning target recognition algorithms (R-CNN and Faster-RCNN), which suffer from complicated training, difficult optimization, heavy computation, long per-frame processing time and high hardware-platform requirements.
The present invention has been made in view of the above problems.
[ summary of the invention ]
The invention aims to provide a machine vision intelligent recognition system and method for garbage targets that overcome the defects of the prior art.
In order to solve the technical problems, the technical scheme adopted by the invention is as follows:
a machine vision intelligent recognition system for garbage targets comprises
The odometer module is used for calculating pose data of the robot;
the target recognition module is used for searching the position of the garbage target in the image by using a target recognition algorithm and calculating the position data and the depth data of the garbage target;
and the mapping module is used for receiving and processing the pose data of the odometer module, the garbage target position data and the depth data of the target identification module to obtain the coordinates of the garbage target in a world coordinate system, and splicing and constructing to obtain a point cloud map of the garbage target.
In the machine vision intelligent recognition system for the garbage target, the odometer module comprises a wheeled odometer and a laser odometer.
In the machine vision intelligent recognition system for the garbage target, the target recognition module comprises a high-definition camera.
A machine vision intelligent identification method for a garbage target comprises the following steps:
step A, calculating pose data:
A1, wheel odometer calibration: collecting a plurality of data segments, each data segment comprising the angular velocities W_L and W_R of the two wheels; calculating the distance b between the two wheels and the radii of the two wheels by a model-based method, according to the duration of the data and the lidar matching values;
A2, removing the motion distortion of the lidar data: the industrial personal computer reads the laser data while the STM32 uploads the integrated odometer data; it is checked whether the pose data in the odometer queue is synchronized in time with the lidar data; if so, no processing is done; otherwise, quadratic interpolation is performed on the pose data and the lidar data is transformed according to the interpolated poses, yielding lidar data with the motion distortion removed;
A3, applying a front-end registration algorithm to the acquired lidar data: first search for matching points in the point cloud, then compute the rotation matrix R and the translation matrix T from the matches, apply the pose change to the point cloud and calculate the error, and iterate until the error is smaller than the tolerance.
A4, optimizing the pose graph based on a graph optimization technology;
step B, identifying a garbage target:
the target recognition module finds the position of the garbage target in the image by using a target recognition algorithm and then publishes the position data and depth data of the garbage target;
step C, establishing a point cloud map:
the mapping module processes data from the odometer module and the target identification module, and performs coordinate transformation according to the following formula:
z = d
x = (u - c_x) · z / f_x
and transforming the coordinates of the target under the camera coordinate system by using the optimized pose to obtain the coordinates of the target under the world coordinate system, and finally splicing and constructing a point cloud map.
The machine vision intelligent identification method for the garbage target further comprises the following steps:
step D, establishing a training set and a testing set:
D1, refining the definition of the garbage target: selecting a number of waste products as the definition of garbage targets;
D2, collecting photographs of the waste: acquiring corresponding sample images by on-site collection and web crawling;
D3, preprocessing the images: cropping the collected image data and transforming it into 416 × 416 standard sample data;
D4, augmentation of sample data: the sample data is expanded in the following ways:
D4-1. enlarging the width and height of the sample image by a factor of 1.5;
D4-2. reducing the width of the sample image by 1/3 and the height by 1/2, while keeping the image dimensions multiples of 32;
D4-3. increasing the brightness of the sample image;
D4-4. reducing the brightness of the sample image;
D4-5. rotating the sample image by 90 or 180 degrees;
D4-6. adding noise to the sample image;
D5, manual labeling: according to the definition in step D1, labeling the image data in the sample set with an image labeling tool to obtain the label data (x, y, w, h, c);
D6, partitioning of the sample set: dividing the sample set into a training set and a test set in a ratio of 98:2;
step E, constructing an improved yolov3 model:
respectively creating a convolutional layer, a depthwise separable convolutional layer, an upsampling layer, a residual layer, a splicing layer and a prediction layer, wherein the upsampling layer enlarges the feature map using a bilinear interpolation algorithm, the residual layer adds the output of a lower layer to the output of an upper layer, and the splicing layer superposes the semantically rich output of the upper layer onto the higher-resolution output of a lower layer; the prediction layer comprises 3 feature maps, each grid of each feature map contains 3 anchor boxes with different aspect ratios, and each anchor box is a 1 × 20 vector comprising the following prediction information: t_x, t_y, t_w, t_h, the confidence that the bounding box contains an object, and the probabilities that the object belongs to each garbage target class;
step F, training the network:
initializing the weights of the model with a random initialization strategy, then feeding the training data into the model; after forward propagation, each grid of each prediction-layer feature map yields a 3 × (4+1+15) prediction; finally, the error between the prediction and the ground truth is calculated according to the following cost function:
(The cost function formula appears only as an image in the original publication and is not reproduced here; it measures the error between the predicted values and the true values.)
the error between the predicted values and the true values is calculated through the cost function and then back-propagated; the weights and biases of the network are updated with a momentum gradient-descent strategy, and training stops when the loss value of the cost function becomes small and stable, i.e. the network has converged;
step G, network testing:
the test set data is fed into the model to obtain prediction results; a bounding-box confidence threshold Th_scores is set and low-confidence predictions are rejected; a non-maximum suppression threshold of 0.5 is set and non-maximum suppression is executed so that the same target is not detected multiple times; finally, the corrected prediction results are output.
Compared with the prior art, the invention has the beneficial effects that:
1. The invention establishes a dynamic map of garbage target distribution based on the poses from the odometer module, which reduces the monitoring pressure on managers and improves the working efficiency of cleaning personnel.
2. The method replaces part of the traditional convolutions in darknet-53 with depthwise separable convolutions, greatly reducing the computational load of the model. Based on the improved yolov3 target recognition algorithm, garbage targets can be identified more accurately; the model has fewer parameters, training and optimization time is shortened, the recognition speed of the model is increased, and the hardware requirements are lowered.
[ description of the drawings ]
FIG. 1 is a schematic structural view of the present invention;
FIG. 2 is a computing flow diagram of the odometer module;
FIG. 3 is a flow chart of the present invention for performing object recognition;
FIG. 4 is a flow chart of the present invention for establishing a point cloud map;
fig. 5 is a schematic structural diagram of the modified yolov3 model of the present invention.
[ detailed description of the invention ]
Reference will now be made in detail to the present preferred embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to like elements throughout.
As shown in fig. 1 to 5, a machine vision intelligent recognition system for a garbage target includes:
the odometer module 1 is used for calculating pose data of the robot;
the target recognition module 2 is used for searching the position of the garbage target in the image by using a target recognition algorithm and calculating the position data and the depth data of the garbage target;
and the mapping module 3 is used for receiving and processing the pose data of the odometer module 1 and the garbage target position data and depth data of the target recognition module 2, obtaining the coordinates of the garbage target in a world coordinate system, and splicing and constructing a point cloud map of the garbage target.
The odometer module 1 comprises a wheel type odometer 11 and a laser odometer 12, the laser odometer 12 comprises a 16-line laser radar, the wheel type odometer 11 comprises 2 photoelectric encoders, and the odometer module 1 estimates the pose of the robot in real time.
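For illustration, the following is a minimal sketch of how the two encoder readings can be integrated into a 2D robot pose under a standard differential-drive model. The function name and the integration scheme are assumptions, since the patent does not give the odometry equations; it uses the calibrated wheel radius r and wheel separation b from step A1:

```python
import math

def integrate_odometry(x, y, theta, w_l, w_r, r, b, dt):
    """One odometry integration step for a differential-drive robot.

    w_l, w_r: wheel angular velocities [rad/s] from the two photoelectric encoders
    r: wheel radius [m], b: distance between the two wheels [m] (from step A1)
    """
    v = r * (w_r + w_l) / 2.0        # linear velocity of the robot body
    omega = r * (w_r - w_l) / b      # angular velocity about the vertical axis
    x += v * math.cos(theta) * dt    # dead-reckoned position update
    y += v * math.sin(theta) * dt
    theta += omega * dt
    return x, y, theta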
The target recognition module 2 comprises a high-definition camera.
The invention establishes a dynamic map of garbage target distribution based on the pose of the odometer module 1, reduces the monitoring pressure of managers and improves the working efficiency of cleaning personnel.
A machine vision intelligent identification method for a garbage target comprises the following steps:
step A, calculating pose data:
A1, calibration of the wheeled odometer 11: collecting n data segments, each including the angular velocities W_L and W_R of the two wheels; calculating the distance b between the two wheels and the radii of the two wheels by a model-based method, according to the duration of the data and the lidar matching values;
A2, removing the motion distortion of the lidar data: the industrial personal computer reads the laser data while the STM32 uploads the integrated odometer data; it is checked whether the pose data in the odometer queue is synchronized in time with the lidar data; if so, no processing is done; otherwise, quadratic interpolation is performed on the pose data and the lidar data is transformed according to the interpolated poses, yielding lidar data with the motion distortion removed;
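As a rough illustration of this distortion-removal idea, the sketch below re-expresses each lidar beam in the frame of the scan-start pose using poses interpolated from the odometer queue. It is simplified to a 2D single-line scan, and pose_at is an assumed interpolation helper (the patent specifies quadratic interpolation over the odometer queue):

```python
import numpy as np

def deskew_scan(ranges, angles, stamps, pose_at):
    """Remove motion distortion from one scan.

    pose_at(t) -> (x, y, theta): pose interpolated from the odometer queue
    at a beam timestamp (assumed helper, not from the patent).
    """
    x0, y0, th0 = pose_at(stamps[0])          # reference pose at scan start
    c0, s0 = np.cos(th0), np.sin(th0)
    pts = []
    for r, a, t in zip(ranges, angles, stamps):
        x, y, th = pose_at(t)                 # pose when this beam was fired
        px = x + r * np.cos(th + a)           # beam endpoint in the odom frame
        py = y + r * np.sin(th + a)
        dx, dy = px - x0, py - y0             # move back into scan-start frame
        pts.append((c0 * dx + s0 * dy, -s0 * dx + c0 * dy))
    return np.asarray(pts)
```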
A3, applying a front-end registration algorithm to the acquired lidar data: first search for matching points in the point cloud, then compute the rotation matrix R and the translation matrix T from the matches, apply the pose change to the point cloud and calculate the error, and iterate until the error is smaller than the tolerance.
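A minimal point-to-point ICP sketch of this registration loop: match points, compute R and T from the matches (here via the SVD-based closed form), apply the pose change, and iterate until the error stops improving by more than a tolerance. This is a generic sketch, not the patent's exact registration algorithm:

```python
import numpy as np
from scipy.spatial import cKDTree

def icp_2d(src, dst, max_iter=50, tol=1e-6):
    """Align source scan to destination scan; returns rotation R and translation T."""
    R, T = np.eye(2), np.zeros(2)
    tree = cKDTree(dst)                        # for fast matching-point search
    cur = src.copy()
    prev_err = np.inf
    for _ in range(max_iter):
        dist, idx = tree.query(cur)            # 1. search matching points
        matched = dst[idx]
        mu_s, mu_d = cur.mean(0), matched.mean(0)
        H = (cur - mu_s).T @ (matched - mu_d)  # 2. R, T from the matches (SVD)
        U, _, Vt = np.linalg.svd(H)
        R_step = Vt.T @ U.T
        if np.linalg.det(R_step) < 0:          # guard against reflections
            Vt[-1] *= -1
            R_step = Vt.T @ U.T
        T_step = mu_d - R_step @ mu_s
        cur = cur @ R_step.T + T_step          # 3. apply the pose change
        R, T = R_step @ R, R_step @ T + T_step # accumulate the total transform
        err = dist.mean()                      # 4. error; stop below tolerance
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return R, T
```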
A4, optimizing the pose graph based on a graph optimization technology;
step B, identifying a garbage target:
the target recognition module 2 finds the position of the garbage target in the image by using a target recognition algorithm, and then publishes the position data and depth data of the garbage target;
step C, establishing a point cloud map:
the mapping module 3 processes the data from the odometer module 1 and the target identification module 2, and performs coordinate transformation according to the following formula:
z = d
x = (u - c_x) · z / f_x
wherein (x, z) are the coordinates of the garbage target in the camera coordinate system, u is the position of the target in the image coordinate system, d is the distance between the target and the camera, and f_x and c_x are intrinsic parameters of the camera; the coordinates of the target in the camera coordinate system are transformed with the optimized pose to obtain the coordinates of the target in the world coordinate system, the point clouds are spliced and filtered (an outlier-removal filter and a down-sampling filter), and finally the point cloud map is spliced and constructed;
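A small sketch of this coordinate chain, assuming a planar world frame and a robot pose (x_r, z_r, theta) from the optimized pose graph; the frame convention and the function name are illustrative assumptions:

```python
import numpy as np

def target_to_world(u, d, fx, cx, pose):
    """Back-project a detected garbage target and place it in the world frame.

    u: target column in the image, d: target depth [m],
    fx, cx: camera intrinsics, pose: (x_r, z_r, theta) optimized robot pose.
    """
    z = d
    x = (u - cx) * z / fx                 # pinhole model, as in the formula above
    x_r, z_r, theta = pose
    # rotate (x, z) by the robot heading and translate (one possible convention)
    xw = x_r + np.cos(theta) * x + np.sin(theta) * z
    zw = z_r - np.sin(theta) * x + np.cos(theta) * z
    return xw, zw
```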
step D, establishing a training set and a testing set:
D1, refining the definition of the garbage target: selecting a number of waste products as the definition of garbage targets; specifically, 14 kinds of waste commonly found in communities are selected, namely garbage bags, express packaging bags, take-out food boxes, cat and dog excrement, orange peel, banana peel, pomelo peel, rotten apples, cigarette boxes, cardboard, waste newspapers, plastic beverage bottles, beer bottles and pop cans;
D2, collecting photographs of the waste: acquiring corresponding sample images by on-site collection and web crawling;
D3, preprocessing the images: cropping the collected image data and transforming it into 416 × 416 standard sample data;
D4, augmentation of sample data: the sample data is expanded in the following ways (see the code sketch after this list):
D4-1. enlarging the width and height of the sample image by a factor of 1.5;
D4-2. reducing the width of the sample image by 1/3 and the height by 1/2, while keeping the image dimensions multiples of 32;
D4-3. increasing the brightness of the sample image;
D4-4. reducing the brightness of the sample image;
D4-5. rotating the sample image by 90 or 180 degrees;
D4-6. adding noise to the sample image;
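The sketch below implements the six D4 augmentations with OpenCV; the brightness offset (±40), the noise level (sigma = 10), and the rounding of D4-1 to multiples of 32 are illustrative assumptions not specified in the patent:

```python
import cv2
import numpy as np

def augment(img):
    """Produce the six augmented variants D4-1..D4-6 of one sample image."""
    h, w = img.shape[:2]
    out = []
    big = (int(w * 1.5) // 32 * 32, int(h * 1.5) // 32 * 32)
    out.append(cv2.resize(img, big))                                        # D4-1
    small = (max(32, w // 3 // 32 * 32), max(32, h // 2 // 32 * 32))
    out.append(cv2.resize(img, small))                                      # D4-2, multiples of 32
    out.append(np.clip(img.astype(np.int16) + 40, 0, 255).astype(np.uint8)) # D4-3, brighter
    out.append(np.clip(img.astype(np.int16) - 40, 0, 255).astype(np.uint8)) # D4-4, darker
    out.append(cv2.rotate(img, cv2.ROTATE_90_CLOCKWISE))                    # D4-5, rotate 90°
    noisy = img.astype(np.float32) + np.random.normal(0, 10, img.shape)     # D4-6, add noise
    out.append(np.clip(noisy, 0, 255).astype(np.uint8))
    return out
```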
D5, manual labeling: according to the definition in step D1, the image data in the sample set is labeled with the image labeling tool imageLabel to obtain the label data (x, y, w, h, c), where (x, y) are the center coordinates of the garbage target in the image, w and h are respectively the width and height of the garbage target, and c is the category of the garbage target; c = 0 denotes the background, and c = 1 to 14 denote, in order, the garbage targets defined in step D1;
D6, partitioning of the sample set: dividing the sample set into a training set and a test set in a ratio of 98:2;
step E, constructing an improved yolov3 model:
respectively creating a convolutional layer, a depthwise separable convolutional layer, an upsampling layer, a residual layer, a splicing layer and a prediction layer, the improved yolov3 model being composed of these 6 basic layer types; the depthwise separable convolutional layer replaces the 3 × 3 convolutional layer in the original yolov3 residual module, the upsampling layer enlarges the feature map using a bilinear interpolation algorithm, the residual layer adds the output of a lower layer to the output of an upper layer, and the splicing layer superposes the semantically rich output of the upper layer onto the higher-resolution output of a lower layer; the prediction layer comprises 3 feature maps, each grid of each feature map contains 3 anchor boxes with different aspect ratios, and each anchor box is a 1 × 20 vector comprising the following prediction information: t_x, t_y, t_w, t_h, the confidence that the bounding box contains an object, and the probabilities that the object belongs to each garbage target class;
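The depthwise separable replacement for a 3 × 3 convolution can be sketched in PyTorch as below: a per-channel 3 × 3 depthwise convolution followed by a 1 × 1 pointwise convolution, cutting the weight count from 9·C_in·C_out to 9·C_in + C_in·C_out. The normalization and activation choices here are assumptions in the darknet style:

```python
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Drop-in replacement for the 3x3 convolution in a residual block."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride=stride, padding=1,
                                   groups=in_ch, bias=False)      # one 3x3 filter per channel
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)  # 1x1 channel mixing
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.LeakyReLU(0.1)                              # darknet-style activation

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))
```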
step F, training the network:
initializing the weights of the model with a random initialization strategy, then feeding the training data into the model; after forward propagation, each grid of each prediction-layer feature map yields a 3 × (4+1+15) prediction; finally, the error between the prediction and the ground truth is calculated according to the following cost function:
(The cost function formula appears only as an image in the original publication and is not reproduced here; it measures the error between the predicted values and the true values.)
the error between the predicted values and the true values is calculated through the cost function and then back-propagated; the weights and biases of the network are updated with a momentum gradient-descent strategy, and training stops when the loss value of the cost function becomes small and stable, i.e. the network has converged;
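For reference, one momentum gradient-descent update has the following form; the learning rate and momentum coefficient are assumed values, since the patent does not specify them. In a framework such as PyTorch this corresponds to torch.optim.SGD with a nonzero momentum argument:

```python
def momentum_step(w, grad, velocity, lr=1e-3, mu=0.9):
    """One momentum gradient-descent update for a weight tensor (NumPy-style)."""
    velocity = mu * velocity - lr * grad   # exponentially decaying average of past gradients
    w = w + velocity                       # move the weights along the velocity
    return w, velocity
```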
step G, network testing:
the test set data is fed into the model to obtain prediction results; a bounding-box confidence threshold Th_scores is set and low-confidence predictions are rejected; a non-maximum suppression threshold of 0.5 is set and non-maximum suppression is executed so that the same target is not detected multiple times; finally, the corrected prediction results are output.
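A minimal NumPy sketch of this step-G post-processing: reject predictions below the confidence threshold Th_scores, then run non-maximum suppression with the stated IoU threshold of 0.5 so that each target is kept only once. The box format and the helper name are assumptions:

```python
import numpy as np

def filter_and_nms(boxes, scores, th_scores, iou_th=0.5):
    """boxes: (N, 4) of (x1, y1, x2, y2); returns kept boxes and scores."""
    keep_mask = scores >= th_scores            # reject low-confidence predictions
    boxes, scores = boxes[keep_mask], scores[keep_mask]
    order = scores.argsort()[::-1]             # highest confidence first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + areas - inter)
        order = rest[iou < iou_th]             # drop boxes overlapping the kept one
    return boxes[keep], scores[keep]
```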
The method replaces part of the traditional convolutions in darknet-53 with depthwise separable convolutions, greatly reducing the computational load of the model. Based on the improved yolov3 target recognition algorithm, garbage targets can be identified more accurately; the model has fewer parameters, training and optimization time is shortened, the recognition speed of the model is increased, and the hardware requirements are lowered.
The above examples are merely preferred embodiments of the present invention, and other embodiments of the present invention are possible, such as a reasonable combination of the technical solutions described in the examples. Those skilled in the art can make equivalent changes or substitutions without departing from the spirit of the present invention, and such equivalent changes or substitutions are included in the scope set forth in the claims of the present application.

Claims (5)

1. A machine vision intelligent recognition system for a garbage target, characterized in that it comprises:
The odometer module (1) is used for calculating pose data of the robot;
the target recognition module (2), which searches the position of the garbage target in the image by using a target recognition algorithm and calculates the position data and the depth data of the garbage target;
and the mapping module (3) is used for receiving and processing the pose data of the odometer module (1), the garbage target position data and the depth data of the target identification module (2) to obtain the coordinates of the garbage target under a world coordinate system, and splicing and constructing to obtain a point cloud map of the garbage target.
2. The machine vision intelligent recognition system of a garbage target according to claim 1, wherein: the odometer module (1) comprises a wheeled odometer (11) and a laser odometer (12).
3. The machine vision intelligent recognition system of a garbage target according to claim 1, wherein: the target recognition module (2) comprises a high-definition camera.
4. A machine vision intelligent identification method for a garbage target, characterized by comprising the following steps:
step A, calculating pose data:
A1, calibration of the wheeled odometer (11): collecting a plurality of data segments, each data segment comprising the angular velocities W_L and W_R of the two wheels; calculating the distance b between the two wheels and the radii of the two wheels by a model-based method, according to the duration of the data and the lidar matching values;
A2, removing the motion distortion of the lidar data: the industrial personal computer reads the laser data while the STM32 uploads the integrated odometer data; it is checked whether the pose data in the odometer queue is synchronized in time with the lidar data; if so, no processing is done; otherwise, quadratic interpolation is performed on the pose data and the lidar data is transformed according to the interpolated poses, yielding lidar data with the motion distortion removed;
A3, applying a front-end registration algorithm to the acquired lidar data: first search for matching points in the point cloud, then compute the rotation matrix R and the translation matrix T from the matches, apply the pose change to the point cloud and calculate the error, and iterate until the error is smaller than the tolerance.
A4, optimizing the pose graph based on a graph optimization technology;
step B, identifying a garbage target:
the target recognition module (2) finds the position of the garbage target in the image by using a target recognition algorithm, and then publishes the position data and depth data of the garbage target;
step C, establishing a point cloud map:
the mapping module (3) processes data from the odometer module (1) and the target identification module (2), and performs coordinate transformation according to the following formula:
z = d
x = (u - c_x) · z / f_x
and transforming the coordinates of the target under the camera coordinate system by using the optimized pose to obtain the coordinates of the target under the world coordinate system, and finally splicing and constructing a point cloud map.
5. The machine vision intelligent recognition method of the garbage target according to claim 4, further comprising the steps of:
step D, establishing a training set and a testing set:
D1, refining the definition of the garbage target: selecting a number of waste products as the definition of garbage targets;
D2, collecting photographs of the waste: acquiring corresponding sample images by on-site collection and web crawling;
D3, preprocessing the images: cropping the collected image data and transforming it into 416 × 416 standard sample data;
D4, augmentation of sample data: the sample data is expanded in the following ways:
D4-1. enlarging the width and height of the sample image by a factor of 1.5;
D4-2. reducing the width of the sample image by 1/3 and the height by 1/2, while keeping the image dimensions multiples of 32;
D4-3. increasing the brightness of the sample image;
D4-4. reducing the brightness of the sample image;
D4-5. rotating the sample image by 90 or 180 degrees;
D4-6. adding noise to the sample image;
D5, manual labeling: according to the definition in step D1, labeling the image data in the sample set with an image labeling tool to obtain the label data (x, y, w, h, c);
D6, partitioning of the sample set: dividing the sample set into a training set and a test set in a ratio of 98:2;
step E, constructing an improved yolov3 model:
respectively creating a convolutional layer, a depthwise separable convolutional layer, an upsampling layer, a residual layer, a splicing layer and a prediction layer, wherein the upsampling layer enlarges the feature map using a bilinear interpolation algorithm, the residual layer adds the output of a lower layer to the output of an upper layer, and the splicing layer superposes the semantically rich output of the upper layer onto the higher-resolution output of a lower layer; the prediction layer comprises 3 feature maps, each grid of each feature map contains 3 anchor boxes with different aspect ratios, and each anchor box is a 1 × 20 vector comprising the following prediction information: t_x, t_y, t_w, t_h, the confidence that the bounding box contains an object, and the probabilities that the object belongs to each garbage target class;
step F, training the network:
initializing the weights of the model with a random initialization strategy, then feeding the training data into the model; after forward propagation, each grid of each prediction-layer feature map yields a 3 × (4+1+15) prediction; finally, the error between the prediction and the ground truth is calculated according to the following cost function:
(The cost function formula appears only as an image in the original publication and is not reproduced here; it measures the error between the predicted values and the true values.)
the error between the predicted values and the true values is calculated through the cost function and then back-propagated; the weights and biases of the network are updated with a momentum gradient-descent strategy, and training stops when the loss value of the cost function becomes small and stable, i.e. the network has converged;
step G, network testing:
the test set data is fed into the model to obtain prediction results; a bounding-box confidence threshold Th_scores is set and low-confidence predictions are rejected; a non-maximum suppression threshold of 0.5 is set and non-maximum suppression is executed so that the same target is not detected multiple times; finally, the corrected prediction results are output.
CN202010924563.5A 2020-09-05 2020-09-05 Machine vision intelligent recognition system and method for garbage target Pending CN111914815A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010924563.5A CN111914815A (en) 2020-09-05 2020-09-05 Machine vision intelligent recognition system and method for garbage target

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010924563.5A CN111914815A (en) 2020-09-05 2020-09-05 Machine vision intelligent recognition system and method for garbage target

Publications (1)

Publication Number Publication Date
CN111914815A true CN111914815A (en) 2020-11-10

Family

ID=73267259

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010924563.5A Pending CN111914815A (en) 2020-09-05 2020-09-05 Machine vision intelligent recognition system and method for garbage target

Country Status (1)

Country Link
CN (1) CN111914815A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112560755A (en) * 2020-12-24 2021-03-26 中再云图技术有限公司 Target detection method for identifying urban exposed garbage
CN113033672A (en) * 2021-03-29 2021-06-25 西安电子科技大学 Multi-class optical image rotating target self-adaptive detection method based on feature enhancement
CN113436251A (en) * 2021-06-24 2021-09-24 东北大学 Pose estimation system and method based on improved YOLO6D algorithm
CN117421692A (en) * 2023-12-18 2024-01-19 深圳火眼智能有限公司 Garbage illegal delivery identification method, device and equipment for garbage delivery station

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110263675A (en) * 2019-06-03 2019-09-20 武汉联一合立技术有限公司 Garbage target recognition system and recognition method of community security robot
CN110796186A (en) * 2019-10-22 2020-02-14 华中科技大学无锡研究院 Dry and wet garbage identification and classification method based on improved YOLOv3 network
WO2020037960A1 (en) * 2018-08-21 2020-02-27 深圳大学 Sar target recognition method and apparatus, computer device, and storage medium
CN111360780A (en) * 2020-03-20 2020-07-03 北京工业大学 Garbage picking robot based on visual semantic SLAM

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020037960A1 (en) * 2018-08-21 2020-02-27 深圳大学 Sar target recognition method and apparatus, computer device, and storage medium
CN110263675A (en) * 2019-06-03 2019-09-20 武汉联一合立技术有限公司 Garbage target recognition system and recognition method of community security robot
CN110796186A (en) * 2019-10-22 2020-02-14 华中科技大学无锡研究院 Dry and wet garbage identification and classification method based on improved YOLOv3 network
CN111360780A (en) * 2020-03-20 2020-07-03 北京工业大学 Garbage picking robot based on visual semantic SLAM

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112560755A (en) * 2020-12-24 2021-03-26 中再云图技术有限公司 Target detection method for identifying urban exposed garbage
CN112560755B (en) * 2020-12-24 2022-08-19 中再云图技术有限公司 Target detection method for identifying urban exposed garbage
CN113033672A (en) * 2021-03-29 2021-06-25 西安电子科技大学 Multi-class optical image rotating target self-adaptive detection method based on feature enhancement
CN113033672B (en) * 2021-03-29 2023-07-28 西安电子科技大学 Multi-class optical image rotation target self-adaptive detection method based on feature enhancement
CN113436251A (en) * 2021-06-24 2021-09-24 东北大学 Pose estimation system and method based on improved YOLO6D algorithm
CN113436251B (en) * 2021-06-24 2024-01-09 东北大学 Pose estimation system and method based on improved YOLO6D algorithm
CN117421692A (en) * 2023-12-18 2024-01-19 深圳火眼智能有限公司 Garbage illegal delivery identification method, device and equipment for garbage delivery station
CN117421692B (en) * 2023-12-18 2024-04-09 深圳火眼智能有限公司 Garbage illegal delivery identification method, device and equipment for garbage delivery station

Similar Documents

Publication Publication Date Title
CN110263675B (en) Garbage target identification system and method of community security robot
CN111914815A (en) Machine vision intelligent recognition system and method for garbage target
CN111461245B (en) Wheeled robot semantic mapping method and system fusing point cloud and image
CN106845408B (en) Street garbage identification method under complex environment
CN107203781B (en) End-to-end weak supervision target detection method based on significance guidance
CN109858569A (en) Multi-tag object detecting method, system, device based on target detection network
CN102096821B (en) Number plate identification method under strong interference environment on basis of complex network theory
CN113076871B (en) Fish shoal automatic detection method based on target shielding compensation
CN113408584B (en) RGB-D multi-modal feature fusion 3D target detection method
CN112488025B (en) Double-temporal remote sensing image semantic change detection method based on multi-modal feature fusion
CN104156731A (en) License plate recognition system based on artificial neural network and method
CN110675421B (en) Depth image collaborative segmentation method based on few labeling frames
CN110032952B (en) Road boundary point detection method based on deep learning
CN112991534B (en) Indoor semantic map construction method and system based on multi-granularity object model
CN113469264A (en) Construction method of automatic garbage classification model, garbage sorting method and system
WO2022247045A1 (en) Laser radar information-based mobile robot location re-identification method
CN113065486A (en) Floater identification method, system, equipment and medium suitable for river channel
CN114022837A (en) Station left article detection method and device, electronic equipment and storage medium
CN111611956B (en) Rail detection method and system for subway visual image
CN115761674A (en) Road edge positioning detection method, equipment and medium
CN113129370B (en) Semi-supervised object pose estimation method combining generated data and label-free data
CN110853058B (en) High-resolution remote sensing image road extraction method based on visual saliency detection
CN113222025B (en) Feasible region label generation method based on laser radar
CN115359411B (en) Substation environment understanding method based on improved deep Lab V < 3+ > network
CN212624084U (en) Machine vision intelligent recognition device of rubbish target

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination