CN113658221B - AGV pedestrian following method based on monocular camera - Google Patents

AGV pedestrian following method based on monocular camera

Info

Publication number
CN113658221B
CN113658221B (application CN202110857535.0A)
Authority
CN
China
Prior art keywords
pedestrian
mobile robot
speed
angle
agv
Prior art date
Legal status
Active
Application number
CN202110857535.0A
Other languages
Chinese (zh)
Other versions
CN113658221A (en)
Inventor
刘成菊
袁家遥
陈启军
Current Assignee
Tongji University
Original Assignee
Tongji University
Priority date
Filing date
Publication date
Application filed by Tongji University filed Critical Tongji University
Priority to CN202110857535.0A priority Critical patent/CN113658221B/en
Publication of CN113658221A publication Critical patent/CN113658221A/en
Application granted granted Critical
Publication of CN113658221B publication Critical patent/CN113658221B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T 7/70 Determining position or orientation of objects or cameras
    • G06T 7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Control Of Position, Course, Altitude, Or Attitude Of Moving Bodies (AREA)

Abstract

The invention relates to an AGV pedestrian following method based on a monocular camera, comprising the following steps: 1) detecting the pedestrian target: obtaining a pedestrian target detection frame from a pedestrian detection model deployed on the upper computer; 2) calibrating a homography matrix with the monocular camera: acquiring the homography matrix H from the three-dimensional world coordinate system to the two-dimensional pixel coordinate system; 3) solving the pedestrian coordinates: calculating the coordinates of the contact point between the pedestrian and the ground in the three-dimensional world coordinate system, i.e. the world coordinates (x_w, y_w) of the pedestrian target; 4) calculating the mobile robot's speed and angle in real time: designing a linear-speed PID controller and an angle PID controller to calculate the mobile robot's linear and angular speeds in real time; 5) controlling chassis movement: the upper computer issues the linear-speed and angular-speed messages to the lower computer through the ROS system, and the lower computer resolves the speed command into the expected drive-motor speeds, so that the AGV follows the pedestrian. Compared with the prior art, the method has the advantages of simplicity, efficiency, real-time accuracy, distributed master-slave communication, and a small amount of computation.

Description

AGV pedestrian following method based on monocular camera
Technical Field
The invention relates to the field of robot target detection and tracking control, in particular to an AGV pedestrian following method based on a monocular camera in an indoor environment.
Background
The rapid development of computer vision and improvements in hardware computing speed have greatly enriched robot capabilities. Based on ROS, a mobile AGV service robot can use a monocular camera to estimate the distance and angle of a target pedestrian simply and efficiently while maintaining a safe distance, which has broad application value in future office, inspection, and reception scenarios.
Existing technologies for estimating pedestrian-target distance with a sensor mainly include the following:
First, lidar-based ranging: a common laser-ranging method is triangulation. The displacement of the light spot reflected by the measured object onto a CCD sensor is measured, and the distance and relative angle between the target and the radar are estimated from the triangle formed by the incident and reflected rays. Laser ranging is simple and fast, and its precision can reach the millimeter level; a common handheld laser rangefinder can measure up to 200 meters and still ranges well in poor lighting. However, the method places high demands on sensor cleanliness and ambient humidity, and the hardware cost is too high.
Second, binocular-camera-based ranging: the information collected by a camera is more comprehensive than a laser's and contains color. Binocular ranging is similar to human vision: after calibration, rectification, and stereo matching, the disparity of the same target between the two images is computed to range each pixel, and the mobile robot can brake for obstacles in real time as the distance changes. However, the method is computationally expensive, demands good illumination, struggles to match scenes lacking visual features, and its ranging error grows with distance.
Third, depth-camera-based ranging: a common approach for depth cameras is TOF/structured light, which computes the target distance from the time difference between infrared light emitted by the IR module and the reflected light received by the receiver, multiplying the flight time by the speed of light. Its drawbacks are that the measurement range is limited by the camera baseline, high timing accuracy is required, depth cameras rarely reach millimeter-level accuracy, and distances to black objects, transparent objects, and very close objects cannot be measured accurately.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide an AGV pedestrian following method based on a monocular camera.
The aim of the invention can be achieved by the following technical scheme:
an AGV pedestrian following method based on a monocular camera is used for realizing the AGV pedestrian following in an indoor environment, and comprises the following steps:
1) Detecting a pedestrian target: realizing real-time pedestrian target detection according to a pedestrian detection model deployed on the upper computer to obtain a pedestrian target detection frame;
2) Calibrating a homography matrix by using a monocular camera: acquiring images and corresponding multiple groups of characteristic point pairs on the ground through a monocular camera according to the priori condition that the pedestrian target is positioned on the ground plane, and calibrating to obtain a homography matrix H from a three-dimensional world coordinate system to a two-dimensional pixel coordinate system;
3) Solving pedestrian coordinates: calculating the coordinates of the contact point between the pedestrian and the ground in the three-dimensional world coordinate system, i.e. the world coordinates (x_w, y_w) of the pedestrian target, from the calibrated homography matrix H and the two-dimensional pixel coordinates of the midpoint of the bottom edge of the pedestrian target detection frame;
4) Real-time calculation of mobile robot speed and angle: acquiring distance deviation and angle deviation of a pedestrian relative to the mobile robot according to world coordinates of a pedestrian target, and respectively designing a linear speed PID controller and an angle PID controller to calculate the linear speed and the angular speed of the mobile robot in real time;
5) Controlling chassis movement: the upper computer issues linear speed and angular speed information to the lower computer through the ROS system, and the lower computer solves the speed instruction into the expected rotating speed of the driving motor according to the AGV kinematic model, so that the AGV follows the pedestrian.
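The five steps above can be sketched as one iteration of a control loop. This is a minimal skeleton, not code from the patent: all function names are illustrative, and the collaborators are injected as parameters so the sketch stays self-contained.

```python
def follow_once(frame, detect, pixel_to_world, deviations,
                lin_ctrl, ang_ctrl, publish):
    """One iteration of the monocular following loop (steps 1-5).

    Collaborators are injected so the skeleton stays self-contained:
    detect(frame) returns the bottom-midpoint pixel of the pedestrian
    detection frame or None; pixel_to_world(u, v) applies the homography
    H calibrated in step 2; deviations(xw, yw) returns the distance and
    angle errors; publish(v, w) issues the command to the lower computer.
    """
    pix = detect(frame)                      # 1) pedestrian detection
    if pix is None:
        publish(0.0, 0.0)                    # no target: stop the chassis
        return None
    u, v = pix                               # pedestrian/ground contact pixel
    xw, yw = pixel_to_world(u, v)            # 3) pixel -> world coordinates
    dist_err, ang_err = deviations(xw, yw)   # 4) distance/angle deviations
    publish(lin_ctrl(dist_err), ang_ctrl(ang_err))  # 5) v, w to the chassis
    return dist_err, ang_err
```

The injected controllers here would be the two P controllers of step 4), and publish() would wrap the ROS topic publisher of step 5).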
In step 1), an SSD single-stage target detector improved with MobileNet is adopted as the pedestrian detection model, with the following structure:
MobileNet v2 replaces VGG-16 in the original SSD model as the backbone network for feature extraction; the original SSD classification head is retained.
In step 2), four groups of feature points are selected, because the homography matrix H has only eight practical degrees of freedom.
In step 3), since the three-dimensional world coordinate system takes as its origin the intersection of the ground with the vertical through the camera fixed at the edge of the mobile robot, the mobile robot radius r_AGV must be taken into account: the world coordinate of the pedestrian target after accounting for the robot radius is (x_w + r_AGV, y_w).
The design steps of the linear speed PID controller and the angle PID controller comprise:
41 Initialization setting, specifically:
411 Setting an initial linear velocity cmdv, an initial angular velocity cmdw, and a starting rotation angle;
412) Acquiring the world coordinates (x_w + r_AGV, y_w) of the pedestrian target, accounting for the robot radius, at the current moment;
413) Calculating the distance between the mobile robot and the pedestrian after accounting for the safety distance at the current moment, d = sqrt((x_w + r_AGV)^2 + y_w^2) − d_safe, where d_safe is a set safety distance that prevents the mobile robot from tracking too close to the pedestrian;
42 A linear speed PID controller is designed, specifically:
Setting control parameters of a linear speed PID controller, taking the distance between the mobile robot and the pedestrian after the safety distance is counted at the current moment as the input of the linear speed PID controller, setting the expected value of the distance between the mobile robot and the pedestrian as 0, and controlling the linear speed v according to the distance error value;
43 The design angle PID controller is specifically:
Setting the control parameters of the angle PID controller, taking the angle between the mobile robot and the pedestrian at the current moment as the input, setting the expected angle between the mobile robot and the pedestrian to 0 degrees, and obtaining the control angular speed w from the angle error value.
The linear speed PID controller and the angle PID controller are both P controllers.
In step 413), the set safety distance d_safe is 0.5 m.
In step 411), the starting rotation angle is set to 10 degrees to reduce unnecessary frequent rotation of the robot.
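Steps 412) and 413), together with the azimuth fed to the angle controller, amount to the following computation. This is a sketch: r_AGV = 0.1815 m and d_safe = 0.5 m are the values given in the detailed description, and atan2 is an assumption for the arctan form used there.

```python
import math

R_AGV = 0.1815   # mobile robot radius (m), value given in the description
D_SAFE = 0.5     # safety distance d_safe (m)

def distance_after_safety(xw, yw):
    """Step 413): d = sqrt((x_w + r_AGV)^2 + y_w^2) - d_safe."""
    return math.hypot(xw + R_AGV, yw) - D_SAFE

def azimuth(xw, yw):
    """Bearing of the pedestrian relative to the robot's forward axis,
    arctan(y_w / (x_w + r_AGV)); atan2 keeps the sign for left/right."""
    return math.atan2(yw, xw + R_AGV)
```

A pedestrian standing exactly 1 m ahead of the robot center therefore yields a distance error of 0.5 m and an azimuth of zero.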
In step 5), the upper computer is a Jetson Nano visual computing card, the lower computer is the mobile robot controller, and the two communicate through the ROS Topic communication mode.
The Jetson Nano visual computing card completes the main-thread tasks: processing visual perception information, detecting the pedestrian target and calculating the distance, communicating the computed linear-speed and angular-speed results over ROS topics, and creating a publisher that publishes the speed message;
The mobile robot controller completes the sub-thread task: motion control according to the received linear and angular speeds.
Compared with the prior art, the invention has the following advantages:
1. The invention estimates distance with a monocular camera by selecting at least four groups of feature point pairs: coordinate points on the pixel plane and their coordinate positions relative to the robot. From these it solves the homography matrix of the pixel-plane-to-ground transformation, calculates the world coordinates of the pedestrian target in the image, and thereby realizes the robot's estimation of the pedestrian's distance and angle, achieving simple, efficient, real-time tracking of pedestrians in an indoor environment.
2. Based on ROS master-slave distributed communication, the invention shares topics and nodes between the Jetson Nano and the industrial PC. P control then regulates the linear and angular speeds from the distance and angle errors, and the Jetson Nano finally distributes the computed speed information to the AGV mobile robot's industrial PC via topic communication. This disperses the mobile robot's computational load and improves the real-time performance of target following.
Drawings
FIG. 1a is a flow chart of the method of the present invention.
FIG. 1b is a schematic illustration of homography matrix calibration.
Fig. 1c is a schematic diagram of pedestrian coordinate point selection.
Fig. 2 is a schematic diagram of a pedestrian detection running and displaying window at a terminal.
FIG. 3 is a diagram of a pixel coordinate to world coordinate transformation.
Fig. 4 is a schematic diagram of four pairs of feature point pair selection.
Fig. 5 is a speed P control block diagram.
Fig. 6 is a bottom layer control block diagram.
FIG. 7 is a partial ROS node running chart.
Fig. 8 is a pedestrian tracking result diagram.
Detailed Description
The invention will now be described in detail with reference to the drawings and specific examples. The present embodiment is implemented on the premise of the technical scheme of the present invention, and a detailed implementation manner and a specific operation process are given, but the protection scope of the present invention is not limited to the following examples.
The invention provides an AGV pedestrian following method based on a monocular camera, which realizes the following of an indoor mobile robot to a pedestrian, wherein the flow diagram of the method is shown in figures 1a-1c, and specifically comprises the following steps:
S1, detecting the pedestrian target: first, the SSD single-stage target detector is improved with MobileNet, and model inference is accelerated with TensorRT when the model is deployed on the AGV's embedded platform, realizing real-time pedestrian target detection at more than 22 FPS;
S2, calibrating the homography matrix with the monocular camera: the invention realizes distance estimation with only one monocular camera. Using the prior condition that the pedestrian target lies on the ground plane, the homography matrix H from the ground to the camera imaging plane is computed by collecting at least 4 groups of corresponding image/ground feature point pairs (P(u, v), P(x, y));
S3, solving the pedestrian coordinates: using the calibrated homography matrix H and the pixel coordinates (u, v) of the midpoint of the bottom edge of the pedestrian target detection frame, the coordinates (x, y) of the contact point between the pedestrian and the ground are calculated, giving the world coordinates of the pedestrian target, from which the distance between the pedestrian and the robot is computed;
S4, PID control makes the robot follow the pedestrian: first, the distance deviation and angle deviation of the pedestrian relative to the robot are calculated from the pedestrian coordinates (x, y); then a linear-speed PID controller is designed for the distance deviation to compute the mobile robot's linear speed v in real time, and an angle PID controller is designed for the angle deviation to compute the robot's angular speed w.
S5, controlling chassis movement with the ROS system: the mobile robot's upper computer issues the speed message (v, w) to the lower computer; the lower computer resolves the speed command into the expected drive-motor speeds using the AGV kinematic model, and then performs closed-loop control using the motor encoders' feedback of the current motor speeds, so that the AGV follows the pedestrian.
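The lower computer's resolution of (v, w) into motor speeds can be sketched for a differential-drive chassis. This drive geometry is an assumption (the patent only mentions an AGV kinematic model and two drive motors), and the wheel radius and track width below are illustrative values, not from the patent.

```python
import math

WHEEL_RADIUS = 0.05  # m, illustrative value (not given in the patent)
TRACK_WIDTH = 0.30   # m, wheel separation, illustrative

def cmd_vel_to_wheel_rpm(v, w):
    """Resolve a chassis command (v in m/s, w in rad/s) into left/right
    drive-motor speeds in rpm for a differential-drive AGV."""
    v_left = v - w * TRACK_WIDTH / 2.0   # rim speed of the left wheel
    v_right = v + w * TRACK_WIDTH / 2.0  # rim speed of the right wheel
    to_rpm = 60.0 / (2.0 * math.pi * WHEEL_RADIUS)
    return v_left * to_rpm, v_right * to_rpm
```

These two target speeds would then each be tracked by the per-motor PID loop with encoder feedback described later.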
In step S1, this example employs the pre-trained ssd_mobilenet_v2 model provided by NVIDIA, which took days to train on the 91-class COCO object dataset. The SSD single-stage detector is selected to detect the pedestrian target, and SSD is made lightweight with MobileNet v2 to meet the real-time and accuracy requirements of the mobile device, complete pedestrian tracking and identification, and remain compatible with the Jetson Nano computing card.
The pedestrian detection model deployed on the Jetson Nano computing card uses MobileNet v2 to replace VGG-16 in the original SSD model as the backbone network for feature extraction; the SSD classification head is retained. Because the backbone differs, the output feature-map sizes and the number of default boxes in the feature-extraction part necessarily change: the number of default boxes drops from 8732 to 2268 and the feature-map sizes become [19, 10, 5, 3, 2, 1], greatly reducing computation. As the COCO test results show, although detection accuracy decreases slightly, the parameter count is reduced by about 10 times and the speed improves by about 20 times, so the model is very suitable as the target-detection network on a mobile robot: accuracy is preserved while the device's compute and real-time requirements are met.
In implementing step S1, note that changes in the current Jetson Nano port number can be tracked in real time by running ll /dev/video* in a terminal on the computing card to view the current camera's port number. The single-stage SSD detection model based on MobileNet v2 (backbone VGG-16 replaced by MobileNet v2) is loaded as the pedestrian detection model, and a loop checks whether the currently detected target label is person; if so, the subsequent coordinate-transformation and speed calculations are performed. A terminal is opened in the source-code folder and the command selecting port /dev/video0 is entered; the display resolution can be set via the input-width or input-height options. Finally, the detected target class, confidence, detection-frame width, height, and area, the pixel coordinates of the bottom-edge midpoint, the world coordinates, and the distance and angle between the pedestrian target and the robot are displayed in real time. The detection speed reaches 22 FPS, and a single detection cycle takes only 47.59771 ms with CUDA acceleration. Fig. 2 is a schematic diagram of the pedestrian detection running in a terminal and its display window.
In implementing step S2, the conversion from the two-dimensional pixel coordinate system to the three-dimensional world coordinate system must be completed, as shown in Fig. 3. The displayed image is a result in the two-dimensional pixel coordinate system, with the origin at the image's upper-left corner, the positive x half-axis to the right, and the positive y half-axis downward. The intersection of the vertical dropped from the camera on the mobile robot with the grass is taken as the origin of the world coordinate system, with the positive x_w half-axis ahead of the robot, the positive y_w half-axis to the left, and the positive z_w half-axis upward. By simplifying the problem with z_w = 0 (detecting only targets on the grass ground plane, or the intersections of targets with that plane), there is a one-to-one matrix transformation between the pixel points in the projections of the same target under the two coordinate planes, and the coordinate transformation can be completed through the homography matrix.
The prior condition that the pedestrian's sole is on the ground converts the problem into ranging a ground point against the camera imaging plane: the midpoint of the bottom edge of the pedestrian target detection frame is taken as the pedestrian's contact point with the ground, and the coordinates of this point are computed. Since the target is approximately rigid, the three-dimensional world coordinates relative to the robot could be solved from the two-dimensional pixel coordinates via a rigid coordinate transformation (rotate first, then translate) with rotation matrix R and translation vector t. However, solving a changing R and t in real time is difficult and the ranging results are not ideal, so this example simplifies the problem: let z_w = 0, i.e. detect only targets on the grass ground plane (or target/ground intersections), turning the two-dimensional-pixel-to-three-dimensional transformation into a two-dimensional-to-two-dimensional one. There is then a one-to-one matrix transformation between the pixel points in the projections of the same target under the two coordinate planes, and the coordinate transformation can be completed through a homography matrix. Since the camera position is fixed, the extrinsic rotation matrix is fixed; the homography matrix H contains the extrinsic rotation matrix R and the intrinsic matrix A and is therefore also fixed.
Because a homogeneous coordinate system is used, H can be scaled arbitrarily, so the homography matrix has only eight practical degrees of freedom and at least four pairs of corresponding points are needed to compute it. OpenCV's cv2.findHomography(pts_src, pts_dst) function is called; pixel coordinates can be displayed in the terminal by mouse-clicking the image (the upper-left corner of the image is the pixel-coordinate origin), and the corresponding coordinate points in the pts_src image matrix and the measured pts_dst real-world coordinate matrix are filled in. As shown in Fig. 4, four feature points are clicked in the image: the near left and right boundary points of the checkerboard and the far left and right boundary points of the black partition board, finally yielding the homography matrix H.
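The eight-unknown linear system behind cv2.findHomography can be reproduced in pure Python for illustration: fixing the scale by h33 = 1, each point pair contributes two equations. This is a sketch; the point values used in the test are made up, not the patent's calibration data.

```python
def solve_linear(A, b):
    """Gaussian elimination with partial pivoting for a small n x n system."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def homography_from_4_points(pix, world):
    """Recover H (scale fixed by h33 = 1) from four pixel -> ground pairs.
    Each pair (u, v) -> (x, y) contributes two linear equations:
        h11*u + h12*v + h13 - h31*u*x - h32*v*x = x
        h21*u + h22*v + h23 - h31*u*y - h32*v*y = y
    """
    A, b = [], []
    for (u, v), (x, y) in zip(pix, world):
        A.append([u, v, 1, 0, 0, 0, -u * x, -v * x]); b.append(x)
        A.append([0, 0, 0, u, v, 1, -u * y, -v * y]); b.append(y)
    h = solve_linear(A, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]
```

With more than four pairs, cv2.findHomography instead solves the overdetermined system in a least-squares (or RANSAC) sense.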
In implementing step S3, the homography matrix solved in step two maps a pixel-plane position to the pedestrian's distance and angle relative to the robot on the actual ground. Writing the homogeneous coordinates of corresponding features as (u, v, 1) and (x, y, 1), the pedestrian's world coordinates are computed as (x, y, 1)^T = H·(u, v, 1)^T (up to scale). Since the world coordinate system takes as its origin the intersection of the artificial grass with the vertical through the camera fixed at the robot's edge, the robot radius of 18.15 cm must also be accounted for, so the actual world coordinate of the pedestrian target relative to the robot is (x + 0.1815, y). Finally, the current distance is computed as d = sqrt((x_w + 0.1815)^2 + y_w^2); the safety distance is set to 0.5 m to prevent hitting the pedestrian by tracking too closely; the current azimuth is computed as arctan(y_w / (x_w + 0.1815)); and the starting rotation angle is set to θ = 10° = 10·π/180 to reduce unnecessary frequent rotation of the robot.
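Applying H is a homogeneous multiplication followed by normalization by the scale component. A minimal sketch (the H values in the test are placeholders, not the calibrated matrix from Fig. 4):

```python
def apply_homography(H, u, v):
    """(x', y', s)^T = H (u, v, 1)^T, then divide by the scale s."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    s = H[2][0] * u + H[2][1] * v + H[2][2]
    return x / s, y / s

def pedestrian_pose(H, u, v, r_agv=0.1815):
    """World coordinates shifted by the robot radius, as in the text:
    (x + 0.1815, y) relative to the robot center."""
    x, y = apply_homography(H, u, v)
    return x + r_agv, y
```

The division by s is what distinguishes the projective case from a plain affine transform: for a translation-only H, s stays 1 and the shift passes through unchanged.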
In the implementation process of step S4, the P controller is used to control the speed by the distance and angle errors, and the control block diagram is shown in fig. 5, and specifically includes the following steps:
a) Initializing the setting:
1) Setting the initial linear velocity cmdv = 0.1 m/s and angular velocity cmdw = 0.1 rad/s;
2) Reading the current pedestrian world coordinates (personX, personY) from the arguments of the pedestrian-tracking callback function personCallback(); since the world coordinate system takes as its origin the intersection of the artificial grass with the vertical through the camera fixed at the robot's edge, the robot radius of 18.15 cm is accounted for, and the actual world coordinate of the pedestrian target relative to the robot is (x_w + 0.1815, y_w);
3) Finally, the current distance is computed as d = sqrt((x_w + 0.1815)^2 + y_w^2) − d_safe; the safety distance is set to 0.5 m to prevent hitting the pedestrian by tracking too closely. The current azimuth is computed as arctan(y_w / (x_w + 0.1815)), and the starting rotation angle is set to θ = 10° = 10·π/180 to reduce unnecessary frequent rotation of the robot.
B) Designing the linear-speed P controller
The current linear speed is controlled by the distance error: the larger the error, the faster the tracking. Ideally the distance between the current target and the robot, after the safety distance, is 0.0 m, i.e. the desired distance is distance_d = 0.0 m and the error is e_distance(t) = distance(t) − distance_d. From tuning experience k_p = 0.6 is selected, so when distance > 0.0 m, cmdv(t) = k_p · e_distance(t) = 0.6 · distance(t), and the linear speed is limited to the range 0.1 m/s to 0.4 m/s.
C) Designing the angular-speed P controller
Ideally the angle between the pedestrian target and the robot is angle_d = 0°. From tuning experience k_p = 2.0 is selected, so when angle > θ, cmdw(t) = k_p · e_angle(t) = 2.0 · angle(t); the angular speed is limited to the range 0.15 rad/s to 0.65 rad/s via cmdw = sign(cmdw) · min(max(|cmdw|, 0.15), 0.65).
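The two P controllers in B) and C) can be sketched directly from the stated gains and ranges. The zero output below the safety distance and inside the ±θ dead band is an assumption about behavior the text leaves implicit.

```python
import math

KP_LIN, KP_ANG = 0.6, 2.0            # gains from tuning experience (B, C)
V_MIN, V_MAX = 0.1, 0.4              # linear-speed range, m/s
W_MIN, W_MAX = 0.15, 0.65            # angular-speed range, rad/s
THETA = 10.0 * math.pi / 180.0       # starting rotation angle, rad

def linear_p(distance):
    """cmdv(t) = 0.6 * distance(t) when distance > 0, clamped to
    [V_MIN, V_MAX]; zero output at or below the safety distance
    (distance <= 0) is an assumption."""
    if distance <= 0.0:
        return 0.0
    return min(max(KP_LIN * distance, V_MIN), V_MAX)

def angular_p(angle):
    """cmdw(t) = 2.0 * angle(t) when |angle| > theta, then
    cmdw = sign(cmdw) * min(max(|cmdw|, 0.15), 0.65); the +-theta
    dead band reduces unnecessary frequent rotation."""
    if abs(angle) <= THETA:
        return 0.0
    cmdw = KP_ANG * angle
    return math.copysign(min(max(abs(cmdw), W_MIN), W_MAX), cmdw)
```

The min/max clamp keeps the command inside the actuator's useful band: the lower bound overcomes chassis friction, the upper bound caps the following speed.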
In implementing step S5, the invention selects the Topic communication mode to exchange data between the mobile robot controller and the Jetson Nano computing card. After the master and slave machines add each other's addresses and hostnames, the Jetson Nano connects to the host's WiFi and, on the same network segment, shares the ROS master node started automatically at boot by the industrial PC, so that the subsequent speed-topic publishing can be completed. A publisher cmdPub is created that publishes the /cmd_vel topic 20 times per second to update the velocity. Single-motor speed control uses a PID algorithm: the actual speed is fed back through the encoder, a PWM drive waveform of the corresponding duty cycle is sent to the motor driver board, and the motor is driven, completing closed-loop control of the two motor speeds. FIG. 6 is the bottom-layer control block diagram, FIG. 7 is a partial ROS node graph, and FIG. 8 is a pedestrian-tracking result diagram.
Compared with prior pedestrian-following methods, the indoor pedestrian-following method provided by the invention has two main innovations. First, it uses a monocular distance-estimation method, with no binocular or depth camera: at least four groups of feature point pairs (coordinate points on the pixel plane and their coordinate positions relative to the robot) are selected to solve the homography matrix of the pixel-plane-to-ground transformation, from which the world coordinates of the pedestrian target in the image are computed, realizing the robot's estimation of the pedestrian's distance and angle. Second, ROS master-slave distributed communication shares topics and nodes between the Jetson Nano and the industrial PC; P control regulates the linear and angular speeds from the distance and angle errors, and the Jetson Nano distributes the computed speed information to the AGV mobile robot's industrial PC via topic communication. These two innovations disperse the autonomous mobile robot's computational load and improve the real-time performance of target following.
The foregoing describes in detail preferred embodiments of the present invention. It should be understood that numerous modifications and variations can be made in accordance with the concepts of the invention by one of ordinary skill in the art without undue burden. Therefore, all technical solutions which can be obtained by logic analysis, reasoning or limited experiments based on the prior art by the person skilled in the art according to the inventive concept shall be within the scope of protection defined by the claims.

Claims (8)

1. The AGV pedestrian following method based on the monocular camera is characterized by comprising the following steps of:
1) Detecting a pedestrian target: realizing real-time pedestrian target detection according to a pedestrian detection model deployed on the upper computer to obtain a pedestrian target detection frame;
2) Calibrating a homography matrix by using a monocular camera: acquiring images and corresponding multiple groups of characteristic point pairs on the ground through a monocular camera according to the priori condition that the pedestrian target is positioned on the ground plane, and calibrating to obtain a homography matrix H from a three-dimensional world coordinate system to a two-dimensional pixel coordinate system;
3) Solving pedestrian coordinates: calculating the coordinates of the contact point between the pedestrian and the ground in the three-dimensional world coordinate system, i.e. the world coordinates (x_w, y_w) of the pedestrian target, from the calibrated homography matrix H and the two-dimensional pixel coordinates of the midpoint of the bottom edge of the pedestrian target detection frame;
4) Real-time calculation of mobile robot speed and angle: acquiring distance deviation and angle deviation of a pedestrian relative to the mobile robot according to world coordinates of a pedestrian target, and respectively designing a linear speed PID controller and an angle PID controller to calculate the linear speed and the angular speed of the mobile robot in real time;
5) Controlling chassis movement: the upper computer issues a linear speed and angular speed message to the lower computer through the ROS system, and the lower computer calculates a speed instruction into the expected rotating speed of the driving motor according to the AGV kinematic model, so that the AGV follows the pedestrian;
In step 3), since the three-dimensional world coordinate system takes the intersection of the camera, which is fixed at the edge of the mobile robot, with the ground as the coordinate origin, the world coordinate point of the pedestrian target after accounting for the mobile robot radius r_AGV is (x_w + r_AGV, y_w);
the design steps of the linear speed PID controller and the angle PID controller comprise:
41) Initialization, specifically:
411) setting an initial linear speed cmd_v, an initial angular speed cmd_w, and a starting rotation angle;
412) acquiring the world coordinate point (x_w + r_AGV, y_w) of the pedestrian target, accounting for the mobile robot radius, at the current moment;
413) calculating the distance between the mobile robot and the pedestrian after accounting for the safety distance at the current moment, d = sqrt((x_w + r_AGV)^2 + y_w^2) - d_safe, wherein d_safe is a set safety distance that prevents the mobile robot from tracking the pedestrian too closely;
42 A linear speed PID controller is designed, specifically:
Setting the control parameters of the linear speed PID controller, taking the distance d between the mobile robot and the pedestrian after accounting for the safety distance at the current moment as the controller input, setting the expected value of the distance to 0, and obtaining the control linear speed v from the distance error;
43 The design angle PID controller is specifically:
Setting the control parameters of the angle PID controller, taking the angle between the mobile robot and the pedestrian at the current moment as the controller input, setting the expected value of the angle to 0 degrees, and obtaining the control angular speed w from the angle error.
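Steps 41)-43) above can be sketched as a simple P control loop. The gains, the robot radius, and the exact dead-zone handling shown here are illustrative assumptions; the claims fix only the safety distance d_safe = 0.5 m and the 10-degree starting rotation angle.

```python
import math

# Illustrative gains and parameters; only D_SAFE and the 10-degree
# dead zone come from the claims, the rest are assumptions.
KP_V, KP_W = 0.8, 1.2
D_SAFE = 0.5                          # safety distance, m (claim 5)
ANGLE_DEADZONE = math.radians(10.0)   # starting rotation angle (claim 6)
R_AGV = 0.25                          # assumed mobile robot radius, m

def follow_step(x_w, y_w):
    """Compute (v, w) from the pedestrian's world coordinates (x_w, y_w)."""
    x = x_w + R_AGV                       # account for robot radius (step 3)
    dist = math.hypot(x, y_w) - D_SAFE    # distance error, desired value 0
    angle = math.atan2(y_w, x)            # angle error, desired value 0 deg
    v = KP_V * max(dist, 0.0)             # P control of linear speed
    # Rotate only outside the dead zone, to avoid frequent small turns.
    w = KP_W * angle if abs(angle) > ANGLE_DEADZONE else 0.0
    return v, w
```

A pedestrian straight ahead inside the safety distance yields v = 0 and w = 0, so the AGV stops rather than pushing into the person.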
2. The monocular-camera-based AGV pedestrian following method according to claim 1, wherein in step 1), the pedestrian detection model is a single-stage SSD object detector modified with MobileNet, with the following structure:
MobileNet v2 replaces VGG-16 in the original SSD model as the backbone network for feature extraction, while the original SSD classifier is retained.
3. The method according to claim 1, wherein in step 2), since the homography matrix H has only eight actual degrees of freedom, four feature point pairs are selected.
4. The method of claim 1, wherein the linear speed PID controller and the angular PID controller are P controllers.
5. The method according to claim 1, wherein in step 413), the set safety distance d_safe is 0.5 m.
6. The method according to claim 1, wherein in step 411), the starting rotation angle is set to 10 degrees to reduce unnecessary frequent rotation of the robot.
7. The monocular-camera-based AGV pedestrian following method according to claim 1, wherein in step 5), the upper computer is a Jetson Nano vision computing card, the lower computer is the mobile robot controller, and the two communicate through the Topic communication mode.
8. The method according to claim 7, wherein the Jetson Nano vision computing card performs the main-thread tasks, including processing the visual perception information, detecting the pedestrian target and calculating its distance, and creating a publisher through an ROS system topic to publish the calculated linear speed and angular speed as a speed message;
the mobile robot controller performs the sub-thread task, namely motion control according to the received linear speed and angular speed.
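The claims do not specify the AGV kinematic model used by the lower computer. Assuming a differential-drive chassis with hypothetical wheel parameters, the conversion from a received speed message to expected drive-motor speeds might look like:

```python
# Hypothetical wheel parameters for a differential-drive AGV chassis;
# the patent does not disclose the actual chassis geometry.
WHEEL_RADIUS = 0.05   # drive wheel radius, m
WHEEL_BASE = 0.30     # distance between the two drive wheels, m

def cmd_vel_to_wheel_speeds(v, w):
    """Convert a (linear, angular) speed command into left/right wheel
    angular speeds (rad/s) via the differential-drive kinematic model."""
    v_left = v - w * WHEEL_BASE / 2.0    # left wheel rim speed, m/s
    v_right = v + w * WHEEL_BASE / 2.0   # right wheel rim speed, m/s
    return v_left / WHEEL_RADIUS, v_right / WHEEL_RADIUS
```

A pure rotation command (v = 0) produces equal and opposite wheel speeds, which is the expected behavior of a differential drive turning in place.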
CN202110857535.0A 2021-07-28 2021-07-28 AGV pedestrian following method based on monocular camera Active CN113658221B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110857535.0A CN113658221B (en) 2021-07-28 2021-07-28 AGV pedestrian following method based on monocular camera

Publications (2)

Publication Number Publication Date
CN113658221A CN113658221A (en) 2021-11-16
CN113658221B true CN113658221B (en) 2024-04-26

Family

ID=78490756

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110857535.0A Active CN113658221B (en) 2021-07-28 2021-07-28 AGV pedestrian following method based on monocular camera

Country Status (1)

Country Link
CN (1) CN113658221B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114529579B (en) * 2022-01-24 2024-08-02 South China University of Technology Method, system and medium for mobile robot to follow pedestrians based on visual target detection
CN115344051B (en) * 2022-10-17 2023-01-24 Guangzhou Baolun Electronics Co., Ltd. Visual following method and device of intelligent following trolley

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015024407A1 * 2013-08-19 2015-02-26 State Grid Corporation of China Binocular vision navigation system and method for a power robot
CN112598709A * 2020-12-25 2021-04-02 Zhejiang Lab Pedestrian movement speed intelligent sensing method based on video stream
WO2021114888A1 * 2019-12-10 2021-06-17 Nanjing University of Aeronautics and Astronautics Dual-AGV collaborative carrying control system and method
CN113051767A * 2021-04-07 2021-06-29 Shaoxing Mindong Technology Co., Ltd. AGV sliding mode control method based on visual servo

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Research on a simple monocular vision pose measurement method; 谷凤伟, 高宏伟, 姜月秋; Electro-Optic Technology Application (04); full text *
Research on a visual AGV tracking and detection algorithm for assembly production lines; 陈琦, 吴黎明, 赵亚男; Modular Machine Tool & Automatic Manufacturing Technique (05); full text *

Also Published As

Publication number Publication date
CN113658221A (en) 2021-11-16

Similar Documents

Publication Publication Date Title
WO2021233029A1 (en) Simultaneous localization and mapping method, device, system and storage medium
Zhu et al. The multivehicle stereo event camera dataset: An event camera dataset for 3D perception
CN111448476B (en) Technique for sharing mapping data between unmanned aerial vehicle and ground vehicle
CN113658221B (en) AGV pedestrian following method based on monocular camera
CN110842940A (en) Building surveying robot multi-sensor fusion three-dimensional modeling method and system
CN111670339B (en) Techniques for collaborative mapping between unmanned aerial vehicles and ground vehicles
CN109270534A (en) A kind of intelligent vehicle laser sensor and camera online calibration method
Zuo et al. Devo: Depth-event camera visual odometry in challenging conditions
JP2018526641A (en) System and method for laser depth map sampling
CN103886107A (en) Robot locating and map building system based on ceiling image information
CN112577517A (en) Multi-element positioning sensor combined calibration method and system
CN112734765A (en) Mobile robot positioning method, system and medium based on example segmentation and multi-sensor fusion
CN106355647A (en) Augmented reality system and method
CN113888639B (en) Visual odometer positioning method and system based on event camera and depth camera
CN105116886A (en) Robot autonomous walking method
CN116222543B (en) Multi-sensor fusion map construction method and system for robot environment perception
CN113379848A (en) Target positioning method based on binocular PTZ camera
CN115239822A (en) Real-time visual identification and positioning method and system for multi-module space of split type flying vehicle
Zhang et al. An overlap-free calibration method for LiDAR-camera platforms based on environmental perception
Jin et al. Dynamic visual simultaneous localization and mapping based on semantic segmentation module
Machkour et al. Monocular based navigation system for autonomous ground robots using multiple deep learning models
CN116386003A (en) Three-dimensional target detection method based on knowledge distillation
Yang et al. A review of visual odometry in SLAM techniques
WO2023283929A1 (en) Method and apparatus for calibrating external parameters of binocular camera
CN115373404A (en) Mobile robot for indoor static article identification and autonomous mapping and working method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant