CN111382637A - Pedestrian detection tracking method, device, terminal equipment and medium

Pedestrian detection tracking method, device, terminal equipment and medium

Info

Publication number
CN111382637A
Authority
CN
China
Prior art keywords
point cloud
pedestrian
roi
head
shoulder
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811641863.1A
Other languages
Chinese (zh)
Other versions
CN111382637B (en)
Inventor
熊友军
白龙彪
刘志超
蒋晨晨
刘洪剑
庞建新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ubtech Robotics Corp
Original Assignee
Ubtech Robotics Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ubtech Robotics Corp
Priority to CN201811641863.1A
Publication of CN111382637A
Application granted
Publication of CN111382637B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/53 Recognition of crowd images, e.g. recognition of crowd congestion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A10/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE at coastal zones; at river basins
    • Y02A10/40 Controlling or monitoring, e.g. of flood or hurricane; Forecasting, e.g. risk assessment or mapping

Abstract

The invention is applicable to the technical field of mobile terminals, and provides a pedestrian detection and tracking method, terminal equipment and a medium. The method comprises the following steps: acquiring a depth map corresponding to a target area; converting the depth map into a three-dimensional point cloud map, and identifying a spatial region containing a moving object in the three-dimensional point cloud map; determining a region of interest (ROI) corresponding to the spatial region in the depth map, and detecting a head-shoulder feature in the ROI; and if the head-shoulder feature is detected to be present in the ROI, detecting and tracking a torso feature present in the ROI, and outputting pedestrian position information according to the moving track of the torso feature. The invention realizes a pedestrian detection and tracking algorithm based on multi-feature fusion, so that the robot can detect the head-shoulder feature and the torso feature of a pedestrian in a relatively accurate spatial region, and even when the sensor cannot capture the whole image of the pedestrian, the robot can continue to track the pedestrian according to the detected torso feature, thereby improving the accuracy of pedestrian detection.

Description

Pedestrian detection tracking method, device, terminal equipment and medium
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a pedestrian detection tracking method, a pedestrian detection tracking device, terminal equipment and a computer-readable storage medium.
Background
With the rapid progress of computer, microelectronics and information technologies, robot technology is developing ever faster, robots are becoming more intelligent, and their application range has expanded greatly. Service robots generally need pedestrian detection and tracking technology to improve their performance in terms of intelligence and socialization, for example, to determine whether an obstacle-avoidance operation needs to be performed by detecting whether a pedestrian is present ahead.
Existing robots are generally fitted with a sensor and realize pedestrian detection and tracking by capturing images of the area in front of the robot and recognizing human head-shoulder features or facial features. However, such a method has a large detection blind area and can impose limitations on the structure and appearance design of the robot. For example, when a pedestrian is too close to the robot, the sensor cannot capture the whole image of the pedestrian, so the head-shoulder and facial features cannot be detected and the pedestrian is wrongly judged to be absent, which causes detection and tracking errors and reduces detection accuracy. Avoiding this problem may require severely restricting where the sensor can be installed on the robot, which greatly constrains the robot's structural design.
Disclosure of Invention
In view of this, embodiments of the present invention provide a pedestrian detection and tracking method, apparatus, terminal device and computer-readable storage medium, so as to solve the technical problem of low pedestrian detection accuracy in existing robot pedestrian detection and tracking methods.
A first aspect of an embodiment of the present invention provides a pedestrian detection and tracking method, including:
acquiring a depth map corresponding to a target area;
converting the depth map into a three-dimensional point cloud map, and identifying a spatial region containing a moving object in the three-dimensional point cloud map;
determining a region of interest (ROI) corresponding to the spatial region in the depth map, and detecting a head-shoulder feature in the ROI;
if the head-shoulder feature is detected to be present in the ROI, detecting and tracking a torso feature present in the ROI, and outputting pedestrian position information according to the moving track of the torso feature.
A second aspect of an embodiment of the present invention provides a pedestrian detection tracking apparatus, including:
the acquisition unit is used for acquiring a depth map corresponding to the target area;
the conversion unit is used for converting the depth map into a three-dimensional point cloud map, and identifying a spatial region containing a moving object in the three-dimensional point cloud map;
the determining unit is used for determining a region of interest (ROI) corresponding to the spatial region in the depth map and detecting a head-shoulder feature in the ROI;
and the detection unit is used for detecting and tracking a torso feature present in the ROI if the head-shoulder feature is detected to be present in the ROI, and outputting pedestrian position information according to the moving track of the torso feature.
A third aspect of the embodiments of the present invention provides a terminal device, including a memory and a processor, where the memory stores a computer program operable on the processor, and the processor implements the steps of the pedestrian detection and tracking method when executing the computer program.
A fourth aspect of the embodiments of the present invention provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the processor implements the steps of the pedestrian detection and tracking method as described above.
In the embodiment of the invention, the depth map corresponding to the target area is acquired, the depth map is converted into a three-dimensional point cloud map, and the spatial region containing a moving object is identified in the three-dimensional point cloud map, so that the robot can detect the head-shoulder feature and the torso feature of a pedestrian in a relatively accurate spatial region, which improves the accuracy of pedestrian feature detection. By detecting the head-shoulder feature in the region of interest, then detecting the torso feature associated with the head-shoulder feature, and outputting pedestrian position information according to the moving track of the torso feature, a pedestrian detection and tracking algorithm based on multi-feature fusion is realized, so that even when the sensor cannot capture the whole image of a pedestrian, the robot can continue to track the pedestrian according to the detected torso feature, thereby improving the accuracy of pedestrian detection.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings based on them without creative effort.
Fig. 1 is a flowchart of an implementation of the pedestrian detection and tracking method provided by an embodiment of the invention;
Fig. 2 is a flowchart of a specific implementation of step S102 of the pedestrian detection and tracking method according to an embodiment of the present invention;
Fig. 3 is a flowchart of a specific implementation of step S1023 of the pedestrian detection and tracking method according to an embodiment of the invention;
Fig. 4 is a flowchart of a specific implementation of step S104 of the pedestrian detection and tracking method according to an embodiment of the present invention;
Fig. 5 is a flowchart of a specific implementation of step S104 of the pedestrian detection and tracking method according to another embodiment of the present invention;
Fig. 6 is a block diagram of a pedestrian detection and tracking device according to an embodiment of the present invention;
Fig. 7 is a schematic diagram of a terminal device according to an embodiment of the present invention.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
In order to explain the technical means of the present invention, the following description will be given by way of specific examples.
Fig. 1 shows the implementation flow of the pedestrian detection and tracking method provided by the embodiment of the invention. The method is suitable for terminal devices equipped with an RGBD sensor, including but not limited to mobile phones, tablet computers, handheld computers and other mobile terminals, as well as intelligent robots. The flow includes steps S101 to S104, and the implementation principle of each step is as follows:
S101: Acquiring a depth map corresponding to the target area.
In the embodiment of the invention, the RGBD sensor is used to capture RGB-D information in the target area. The RGB-D information includes the three color channels R, G and B as well as depth information, and the target area is the field of view detectable by the RGBD sensor. RGB-D information corresponding to the target area is captured at a preset acquisition frequency, and a depth map and a color map corresponding to the target area are generated from the acquired RGB-D information.
S102: Converting the depth map into a three-dimensional point cloud map, and identifying a spatial region containing a moving object in the three-dimensional point cloud map.
In the embodiment of the invention, a three-dimensional point cloud map is constructed from the generated depth map and color map. A three-dimensional point cloud map is a collection of point clouds used to describe the surface characteristics of a target object, and each point cloud includes three-dimensional coordinate information and RGB color information.
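By way of illustration, the depth-map-to-point-cloud conversion can be sketched under a pinhole camera model as follows; the intrinsics fx, fy, cx and cy and the depth unit scale are assumptions for this example, since the patent does not fix a particular camera model.

import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy, scale=0.001):
    # Back-project a depth map (H x W, raw sensor units) into an N x 3 point
    # cloud. fx, fy, cx, cy are assumed pinhole intrinsics of the depth
    # camera; `scale` converts raw depth units (e.g. millimetres) to metres.
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(np.float32) * scale
    valid = z > 0                         # drop pixels with no depth reading
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x[valid], y[valid], z[valid]], axis=-1)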
The height value of each point cloud in the vertical direction is extracted from its three-dimensional coordinate information, and the scene category in the three-dimensional point cloud map is identified according to the height value of each point cloud. The scene categories include, but are not limited to, wall, ground, roof and moving-object categories, which are divided according to setting instructions previously input by a developer.
Specifically, as an embodiment of the present invention, fig. 2 shows a specific implementation flow of the pedestrian detection and tracking method S102 provided by the embodiment of the present invention, which is detailed as follows:
S1021: Acquiring the height value of each point cloud in the three-dimensional point cloud map.
S1022: Determining a height interval to which the height value of the point cloud belongs, and outputting the scene category corresponding to the point cloud according to a correspondence between pre-stored scene categories and height intervals.
In the three-dimensional coordinate information (x, y, z) of the point cloud, its height value z is extracted.
In the embodiment of the invention, the correspondence between pre-stored scene categories and height intervals is loaded, the height interval to which the height value of each point cloud in the three-dimensional point cloud map belongs is judged, and after the scene category corresponding to that height interval is obtained, it is output as the scene category of the point cloud.
Exemplarily, in the vertical direction, the scene category corresponding to a point cloud whose height value is greater than a preset threshold a is determined as the wall category; the scene category corresponding to a point cloud whose height value is greater than a preset threshold b is determined as the roof category; and the scene category corresponding to a point cloud whose height value is greater than c and smaller than d is determined as the moving-object category, where a, b, c and d are integers greater than zero.
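A minimal sketch of this height-interval lookup is given below; the threshold values assigned to a, b, c and d are illustrative placeholders, since the patent leaves the concrete intervals to the developer.

def classify_by_height(z, a=2.5, b=3.0, c=0.2, d=2.2):
    # Map the height value z (metres, vertical direction) of one point cloud
    # to a scene category. The interval boundaries a, b, c, d are assumed
    # example values, not values taken from the patent.
    if z > b:
        return "roof"
    if z > a:
        return "wall"
    if c < z < d:
        return "moving_object"
    return "unknown"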
Preferably, before determining a height interval to which the height value of the point cloud belongs and outputting a scene category corresponding to the point cloud according to a correspondence between a pre-stored scene category and the height interval, the method further includes:
and constructing a ground plane equation for the ground class scene in the target area. In the embodiment of the invention, since the installation position of the RGBD sensor on the robot is known, a developer can calculate a plane equation of the ground under a camera coordinate system. And acquiring a ground plane equation recorded by a developer.
And judging whether the three-dimensional coordinate information (x, y, z) of each point cloud in the three-dimensional point cloud picture meets the ground plane equation. If the judgment result is yes, determining that the scene type corresponding to the point cloud is a ground type; if the judgment result is negative, determining the scene type corresponding to the point cloud according to the height interval to which the point cloud height value belongs.
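For example, the ground test can be sketched as a tolerance check against the plane equation A*x + B*y + C*z + D = 0; the tolerance value is an assumption, since depth noise makes an exact equality test impractical.

def is_ground(point, plane, tol=0.02):
    # `plane` is (A, B, C, D) for the ground plane A*x + B*y + C*z + D = 0,
    # derived from the known mounting pose of the RGBD sensor; `tol` is an
    # assumed tolerance in metres that absorbs sensor noise.
    a, b, c, d = plane
    x, y, z = point
    return abs(a * x + b * y + c * z + d) <= tol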
Preferably, after the height interval to which the height value of the point cloud belongs is determined, it is further judged whether the point cloud is a continuous point cloud. If it is, the scene category corresponding to the point cloud is output according to the correspondence between pre-stored scene categories and height intervals.
S1023: and screening each point cloud with the scene category as a moving object, and selecting a space area to which each screened point cloud belongs in the three-dimensional point cloud picture.
In the embodiment of the invention, each point cloud with the scene type as the moving object is screened out, and the spatial region occupied by each point cloud in the three-dimensional point cloud picture is extracted through a preset image processing tool.
As shown in fig. 3, the step S1023 specifically includes:
S10231: Taking each point cloud whose scene category is the moving-object category as a moving-object point cloud.
S10232: Performing a top-down projection and a horizontal projection on the moving-object point clouds in the three-dimensional point cloud map respectively to obtain a projection intersection space.
S10233: Cutting the projection intersection space according to the height information of each moving-object point cloud in the projection intersection space to obtain a spatial region containing a single moving object.
In the embodiment of the present invention, each point cloud whose scene category is determined as a moving object is referred to as a moving-object point cloud. In the three-dimensional point cloud map, a top-down projection is performed on the moving-object point clouds, and the top-view spatial region of each moving object is obtained by segmentation according to the height-value distribution of the point clouds. For example, if moving-object point clouds with discontinuous height values are detected, the current spatial region is determined to contain two moving objects, split at the point cloud with the minimum height value.
In the embodiment of the invention, the moving-object point clouds are also projected horizontally, so that the head-up spatial region occupied by each continuous moving object can be viewed at a head-up angle. The spatial region containing a single moving object is output according to the intersection of the top-view spatial region and the head-up spatial region of each moving object obtained by segmentation.
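As a rough sketch of the top-down projection and segmentation, the moving-object point clouds can be rasterised into a top-view occupancy grid whose connected components approximate the per-object regions; the grid resolution and minimum component size are assumptions, and (x, y) is taken as the horizontal plane with z as height, matching the coordinates used above.

import numpy as np
from scipy import ndimage

def split_moving_objects(points, cell=0.05, min_points=50):
    # `points` is an N x 3 array of moving-object point clouds with (x, y)
    # horizontal and z vertical. Rasterise (x, y) into an occupancy grid and
    # split it into connected components, one per candidate moving object.
    xy = points[:, :2]
    origin = xy.min(axis=0)
    idx = np.floor((xy - origin) / cell).astype(int)
    grid = np.zeros(idx.max(axis=0) + 1, dtype=bool)
    grid[idx[:, 0], idx[:, 1]] = True
    labels, n = ndimage.label(grid)        # 4-connected components
    regions = []
    for k in range(1, n + 1):
        mask = labels[idx[:, 0], idx[:, 1]] == k
        if mask.sum() >= min_points:       # ignore tiny, noisy components
            regions.append(points[mask])   # points of one moving object
    return regions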
S103: Determining a region of interest (ROI) corresponding to the spatial region in the depth map, and detecting the head-shoulder feature in the ROI.
In the embodiment of the invention, the extracted spatial region occupied by each moving object is mapped back into the depth map to obtain the corresponding region of interest (ROI). Matching detection is then performed on the moving object in the depth map with an adaptive-scale template matching algorithm based on preset head-shoulder image features. When the matching degree is greater than a threshold, it is determined that the head-shoulder feature of a pedestrian is present in the ROI and that the moving object in the spatial region is a pedestrian, and step S104 is executed; if the head-shoulder feature of a pedestrian is not detected in the ROI, the process returns to step S101.
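The adaptive-scale matching step can be sketched as template matching repeated over a small set of scales; the scale set, the depth-domain template and the matching threshold below are assumptions standing in for the patent's preset head-shoulder features and threshold.

import cv2
import numpy as np

def head_shoulder_present(roi_depth, template, scales=(0.8, 1.0, 1.25), thresh=0.7):
    # Run normalised cross-correlation template matching for an assumed
    # depth-domain head-shoulder template at several scales and report
    # whether the best matching degree exceeds the threshold.
    roi = roi_depth.astype(np.float32)
    best = 0.0
    for s in scales:
        t = cv2.resize(template.astype(np.float32), None, fx=s, fy=s)
        if t.shape[0] > roi.shape[0] or t.shape[1] > roi.shape[1]:
            continue                       # template larger than ROI at this scale
        res = cv2.matchTemplate(roi, t, cv2.TM_CCOEFF_NORMED)
        best = max(best, float(res.max()))
    return best >= thresh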
S104: If the head-shoulder feature is detected to be present in the ROI, detecting and tracking a torso feature present in the ROI, and outputting pedestrian position information according to the moving track of the torso feature.
After the head-shoulder feature present in the ROI is located, the torso feature corresponding to the same moving object as the head-shoulder feature is detected. The confidence of the torso feature is reinforced by the head-shoulder feature obtained through real-time detection, and torso features whose confidence is higher than a threshold are then tracked.
Pedestrian position information is output according to the moving track of the detected torso feature in the depth map, that is, the position information corresponding to the pedestrian at each moment is output. For example, pedestrian A is at position point M at time t1 and moves to position point N at time t2.
In the embodiment of the invention, the depth map corresponding to the target area is acquired, the depth map is converted into a three-dimensional point cloud map, and the spatial region containing a moving object is identified in the three-dimensional point cloud map, so that the robot can detect the head-shoulder feature and the torso feature of a pedestrian in a relatively accurate spatial region, which improves the accuracy of pedestrian feature detection. A pedestrian detection and tracking algorithm based on multi-feature fusion is realized by detecting, in the region of interest, the torso feature associated with the head-shoulder feature and outputting pedestrian position information according to the moving track of the torso feature, so that even when the sensor cannot capture the whole image of a pedestrian, the robot can continue to track the pedestrian according to the detected torso feature. Meanwhile, although the human torso feature is easy to detect, its false-alarm rate is relatively high, whereas the false-alarm rate of head-shoulder detection is very low, so combining the two improves the accuracy of pedestrian detection.
As an embodiment of the present invention, fig. 4 shows a specific implementation flow of the pedestrian detection and tracking method S104 provided by the embodiment of the present invention, which is detailed as follows:
S1041: If the head-shoulder feature is detected to be present in the ROI, marking a torso candidate region associated with the head-shoulder feature in the ROI.
S1042: Continuously updating and identifying the torso feature in the torso candidate region.
S1043: If the head-shoulder feature is detected to be no longer present in the target area, performing pedestrian tracking based on the updated torso feature, and outputting pedestrian position information according to the moving track of the torso feature.
In an embodiment of the invention, when the head-shoulder feature is detected to be present in the ROI, an image segmentation operation is performed on the depth map. In the depth map, the depth values of the moving object are smaller than those of background objects and are continuous, so the torso candidate region in the depth map is screened out by examining the depth values of the point clouds.
In the embodiment of the invention, when the head-shoulder feature in the ROI is detected successfully, the torso candidate region corresponding to the head-shoulder position is marked as an authenticated torso feature. The depth value of each point cloud in the torso candidate region and the color feature of the corresponding partial region in the color map are extracted. While the head-shoulder feature is still tracked and detected, steps S1041 to S1042 are repeated to continuously update the depth values and color features corresponding to the torso feature. Pedestrian tracking based on the updated torso feature begins only when the head-shoulder feature is detected to be no longer present in the target area (for example, when a pedestrian walks out of the RGBD sensor's field of view).
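The depth-based screening of the torso candidate region can be sketched as keeping the pixels whose depth stays close to the depth at the detected head-shoulder position; the band width is an assumed parameter.

import numpy as np

def torso_candidate_mask(depth, seed_depth, band=0.3):
    # Keep pixels whose depth lies within `band` metres of the depth at the
    # head-shoulder position (`seed_depth`): the moving object is closer than
    # the background and its depth values are continuous. `band` is an
    # assumed width, not a value from the patent.
    valid = depth > 0                      # ignore pixels with no reading
    return valid & (np.abs(depth - seed_depth) <= band)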
As another embodiment of the present invention, as shown in fig. 5, after S1043, the method further includes:
S1044: Analyzing the pedestrian position information to obtain the torso feature corresponding to each position point.
S1045: Performing data association processing on the torso features corresponding to the position points so as to remove repeated pedestrian position information.
After the pedestrian position information is output according to the moving track of the torso feature, all position points contained in the pedestrian position information are parsed and the torso feature corresponding to each position point is read.
In the embodiment of the invention, the torso feature and the head-shoulder feature corresponding to each position point are compared, and data association processing is performed through a preset algorithm to remove repeated pedestrian position information and obtain unlabeled pedestrian position information.
In the embodiment of the present invention, unlabeled pedestrian position information means that the output pedestrian position information does not include identification mark information of the pedestrian, but only feature information indicating whether the moving object is a pedestrian. Labeled pedestrian position information means that the output pedestrian position information includes identification mark information of the pedestrian; for example, pedestrian A walks from position point M to position point N.
For example, when a person walks within the field of view of the RGBD sensor, the unlabeled pedestrian position information states that someone is at a certain position at a certain moment, whereas the labeled pedestrian position information states that pedestrian A is at a certain position at a certain moment.
The torso object associated with the output unlabeled pedestrian position information is continuously tracked in a detect-to-track manner to obtain labeled pedestrian position information. The labeled pedestrian position information is then analyzed and processed, and the motion state of the pedestrian, including the moving speed, starting position and end position, is output.
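A minimal sketch of the data association step is given below: unlabeled detections within a gating distance of an existing track reuse its label, which both removes duplicate position information and yields labeled pedestrian position information in a detect-to-track manner. The greedy nearest-neighbour rule and the gate distance are assumptions, not the patent's exact preset algorithm.

import numpy as np

def associate(tracks, detections, gate=0.5):
    # `tracks` maps pedestrian id -> last known (x, y) position; `detections`
    # is a list of new unlabeled (x, y) positions. A detection within `gate`
    # metres of a track keeps that track's id (duplicate removal); the rest
    # open new tracks (new pedestrian labels).
    next_id = max(tracks, default=-1) + 1
    labeled = {}
    for det in detections:
        det = np.asarray(det, dtype=float)
        if tracks:
            tid, pos = min(tracks.items(),
                           key=lambda kv: np.linalg.norm(det - kv[1]))
            if np.linalg.norm(det - pos) <= gate:
                labeled[tid] = det         # same pedestrian: reuse the label
                continue
        labeled[next_id] = det             # unmatched: new pedestrian label
        next_id += 1
    tracks.update(labeled)
    return labeled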
In the embodiment of the invention, the efficient multi-feature-fusion pedestrian detection and tracking algorithm relaxes the constraints on the installation position of the RGBD sensor on the robot, reduces the demand on the robot's computing resources, and lowers the cost of applying the pedestrian detection and tracking algorithm to a service robot. Performing pedestrian tracking and detection in a multi-feature-fusion manner improves the success rate and robustness of the robot in detecting and tracking pedestrians, thereby expanding the application scenarios of the service robot.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
Fig. 6 shows a block diagram of a pedestrian detection and tracking device provided in an embodiment of the present invention, corresponding to the pedestrian detection and tracking method provided in an embodiment of the present invention. For convenience of explanation, only the portions related to the present embodiment are shown.
Referring to fig. 6, the apparatus includes:
and the acquisition unit 61 is used for acquiring the depth map corresponding to the target area.
A converting unit 62, configured to convert the depth map into a three-dimensional point cloud map, and identify a spatial region containing a moving object in the three-dimensional point cloud map.
A determining unit 63, configured to determine a region of interest ROI in the depth map corresponding to the spatial region, and detect a head-shoulder feature in the ROI.
And the detection unit 64 is used for detecting and tracking a torso feature present in the ROI if the head-shoulder feature is detected to be present in the ROI, and outputting pedestrian position information according to the moving track of the torso feature.
Optionally, the detecting unit 64 includes:
and the marking subunit is used for marking a drive candidate region in the ROI, which is associated with the head-shoulder feature, if the head-shoulder feature is detected to exist in the ROI.
And the identification subunit is used for continuously updating and identifying the dribble characteristics in the dribble candidate area.
And the output subunit is used for executing pedestrian tracking based on the updated drying feature if the head and shoulder feature is detected not to exist in the target area, and outputting pedestrian position information according to the moving track of the drying feature.
Optionally, the conversion unit 62 includes:
and the acquisition subunit is used for acquiring the height value of each point cloud in the three-dimensional point cloud picture.
And the determining subunit is used for determining a height interval to which the height value of the point cloud belongs, and outputting the scene category corresponding to the point cloud according to a corresponding relation between a pre-stored scene category and the height interval.
And the screening subunit is used for screening the point clouds of which the scene categories are moving objects, and selecting the spatial regions to which the screened point clouds belong in the three-dimensional point cloud picture.
Optionally, the screening subunit is specifically configured to:
taking each point cloud whose scene category is the moving-object category as a moving-object point cloud;
performing a top-down projection and a horizontal projection on the moving-object point clouds in the three-dimensional point cloud map respectively to obtain a projection intersection space;
and cutting the projection intersection space according to the height information of each moving-object point cloud in the projection intersection space to obtain a spatial region containing a single moving object.
Optionally, the detecting unit 64 further includes:
and the analysis subunit is used for analyzing the pedestrian position information to obtain the drying characteristics corresponding to each position point.
And the processing subunit is used for performing data association processing on the drying characteristics corresponding to each position point so as to remove the repeated pedestrian position information.
In the embodiment of the invention, the depth map corresponding to the target area is acquired, the depth map is converted into a three-dimensional point cloud map, and the spatial region containing a moving object is identified in the three-dimensional point cloud map, so that the robot can detect the head-shoulder feature and the torso feature of a pedestrian in a relatively accurate spatial region, which improves the accuracy of pedestrian feature detection. By detecting the head-shoulder feature in the region of interest, then detecting the torso feature associated with the head-shoulder feature, and outputting pedestrian position information according to the moving track of the torso feature, a pedestrian detection and tracking algorithm based on multi-feature fusion is realized, so that even when the sensor cannot capture the whole image of a pedestrian, the robot can continue to track the pedestrian according to the detected torso feature, thereby improving the accuracy of pedestrian detection.
Fig. 7 is a schematic diagram of a terminal device according to an embodiment of the present invention. As shown in fig. 7, the terminal device 7 of this embodiment includes: a processor 70, a memory 71 and a computer program 72, such as an image capturing program, stored in said memory 71 and executable on said processor 70. The processor 70, when executing the computer program 72, implements the steps in each of the above-described embodiments of the pedestrian detection and tracking method, such as steps S101 to S104 shown in fig. 1. Alternatively, the processor 70, when executing the computer program 72, implements the functions of the modules/units in the above-described device embodiments, such as the functions of the units 61 to 64 shown in fig. 6.
Illustratively, the computer program 72 may be partitioned into one or more modules/units that are stored in the memory 71 and executed by the processor 70 to implement the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution process of the computer program 72 in the terminal device 7.
The terminal device 7 may be a desktop computer, a notebook, a palm computer, a cloud server or another computing device, and may include, but is not limited to, the processor 70 and the memory 71. It will be appreciated by those skilled in the art that fig. 7 is merely an example of the terminal device 7 and does not constitute a limitation of it; the terminal device may comprise more or fewer components than shown, combine some components, or use different components, and may, for example, further comprise input/output devices, network access devices, buses and the like.
The Processor 70 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 71 may be an internal storage unit of the terminal device 7, such as a hard disk or a memory of the terminal device 7. The memory 71 may also be an external storage device of the terminal device 7, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the terminal device 7. Further, the memory 71 may also include both an internal storage unit and an external storage device of the terminal device 7. The memory 71 is used for storing the computer program and other programs and data required by the terminal device. The memory 71 may also be used to temporarily store data that has been output or is to be output.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be substantially implemented or contributed to by the prior art, or all or part of the technical solution may be embodied in a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (10)

1. A pedestrian detection tracking method, comprising:
acquiring a depth map corresponding to a target area;
converting the depth map into a three-dimensional point cloud map, and identifying a spatial region containing a moving object in the three-dimensional point cloud map;
determining a region of interest (ROI) corresponding to the spatial region in the depth map, and detecting a head-shoulder feature in the ROI;
if the head-shoulder feature is detected to be present in the ROI, detecting and tracking a torso feature present in the ROI, and outputting pedestrian position information according to the moving track of the torso feature.
2. The pedestrian detection and tracking method according to claim 1, wherein the detecting and tracking a torso feature present in the ROI if the head-shoulder feature is detected to be present in the ROI, and outputting pedestrian position information according to the moving track of the torso feature comprises:
if the head-shoulder feature is detected to be present in the ROI, marking a torso candidate region associated with the head-shoulder feature in the ROI;
continuously updating and identifying the torso feature in the torso candidate region;
and if the head-shoulder feature is detected to be no longer present in the target area, performing pedestrian tracking based on the updated torso feature, and outputting pedestrian position information according to the moving track of the torso feature.
3. The pedestrian detection and tracking method according to claim 1, wherein the converting the depth map into a three-dimensional point cloud map and identifying a spatial region containing a moving object in the three-dimensional point cloud map comprises:
acquiring a height value of each point cloud in the three-dimensional point cloud map;
determining a height interval to which the height value of the point cloud belongs, and outputting the scene category corresponding to the point cloud according to a correspondence between pre-stored scene categories and height intervals;
and screening out each point cloud whose scene category is the moving-object category, and selecting, in the three-dimensional point cloud map, the spatial region to which each screened point cloud belongs.
4. The pedestrian detection and tracking method according to claim 3, wherein the screening out each point cloud whose scene category is the moving-object category and selecting, in the three-dimensional point cloud map, the spatial region to which each screened point cloud belongs comprises:
taking each point cloud whose scene category is the moving-object category as a moving-object point cloud;
performing a top-down projection and a horizontal projection on the moving-object point clouds in the three-dimensional point cloud map respectively to obtain a projection intersection space;
and cutting the projection intersection space according to the height information of each moving-object point cloud in the projection intersection space to obtain a spatial region containing a single moving object.
5. The pedestrian detection and tracking method according to claim 2, wherein after the performing pedestrian tracking based on the updated torso feature and outputting pedestrian position information according to the moving track of the torso feature if the head-shoulder feature is detected to be absent from the target area, the method further comprises:
analyzing the pedestrian position information to obtain the torso feature corresponding to each position point;
and performing data association processing on the torso features corresponding to the position points so as to remove repeated pedestrian position information.
6. A pedestrian detection tracking device, comprising:
the acquisition unit is used for acquiring a depth map corresponding to the target area;
the conversion unit is used for converting the depth map into a three-dimensional point cloud map, and identifying a spatial region containing a moving object in the three-dimensional point cloud map;
the determining unit is used for determining a region of interest (ROI) corresponding to the spatial region in the depth map and detecting a head-shoulder feature in the ROI;
and the detection unit is used for detecting and tracking a torso feature present in the ROI if the head-shoulder feature is detected to be present in the ROI, and outputting pedestrian position information according to the moving track of the torso feature.
7. The pedestrian detection tracking device according to claim 6, wherein the detection unit includes:
a marking subunit, configured to mark a torso candidate region associated with the head-shoulder feature in the ROI if the head-shoulder feature is detected to be present in the ROI;
the identification subunit is used for continuously updating and identifying the torso feature in the torso candidate region;
and the output subunit is used for performing pedestrian tracking based on the updated torso feature if the head-shoulder feature is detected to be no longer present in the target area, and outputting pedestrian position information according to the moving track of the torso feature.
8. The pedestrian detection tracking device according to claim 6, wherein the conversion unit includes:
the acquiring subunit is used for acquiring a height value of each point cloud in the three-dimensional point cloud map;
a determining subunit, configured to determine a height interval to which the height value of the point cloud belongs, and output the scene category corresponding to the point cloud according to a correspondence between pre-stored scene categories and height intervals;
and the screening subunit is used for screening out the point clouds whose scene category is the moving-object category, and selecting, in the three-dimensional point cloud map, the spatial regions to which the screened point clouds belong.
9. A terminal device comprising a memory and a processor, the memory storing a computer program operable on the processor, wherein the processor implements the steps of the method according to any one of claims 1 to 5 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 5.
CN201811641863.1A 2018-12-29 2018-12-29 Pedestrian detection tracking method, device, terminal equipment and medium Active CN111382637B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811641863.1A CN111382637B (en) 2018-12-29 2018-12-29 Pedestrian detection tracking method, device, terminal equipment and medium

Publications (2)

Publication Number Publication Date
CN111382637A (en) 2020-07-07
CN111382637B CN111382637B (en) 2023-08-08

Family

ID=71218311

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811641863.1A Active CN111382637B (en) 2018-12-29 2018-12-29 Pedestrian detection tracking method, device, terminal equipment and medium

Country Status (1)

Country Link
CN (1) CN111382637B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112307936A (en) * 2020-10-28 2021-02-02 江苏云从曦和人工智能有限公司 Passenger flow volume analysis method, system and device based on head and shoulder detection
CN112529953A (en) * 2020-12-17 2021-03-19 深圳市普渡科技有限公司 Elevator space state judgment method and device and storage medium
CN112597943A (en) * 2020-12-28 2021-04-02 北京眼神智能科技有限公司 Feature extraction method and device for pedestrian re-identification, electronic equipment and storage medium
CN112818896A (en) * 2021-02-18 2021-05-18 支付宝(杭州)信息技术有限公司 Biological identification method and device and electronic equipment
CN113762001A (en) * 2020-10-10 2021-12-07 北京京东乾石科技有限公司 Target detection method and device, electronic equipment and storage medium
CN113762001B (en) * 2020-10-10 2024-04-19 北京京东乾石科技有限公司 Target detection method and device, electronic equipment and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104318578A (en) * 2014-11-12 2015-01-28 苏州科达科技股份有限公司 Video image analyzing method and system
CN104835147A (en) * 2015-04-15 2015-08-12 中国科学院上海微系统与信息技术研究所 Method for detecting crowded people flow in real time based on three-dimensional depth map data
CN105389539A (en) * 2015-10-15 2016-03-09 电子科技大学 Three-dimensional gesture estimation method and three-dimensional gesture estimation system based on depth data
CN106355194A (en) * 2016-08-22 2017-01-25 广东华中科技大学工业技术研究院 Treatment method for surface target of unmanned ship based on laser imaging radar
CN107066935A (en) * 2017-01-25 2017-08-18 网易(杭州)网络有限公司 Hand gestures method of estimation and device based on deep learning
CN107977992A (en) * 2017-12-05 2018-05-01 深圳大学 A kind of building change detecting method and device based on unmanned plane laser radar
CN109102227A (en) * 2018-08-08 2018-12-28 天津航大航空设备有限公司 Luggage category detection method, self-help luggage equipment and storage medium
CN109101929A (en) * 2018-08-16 2018-12-28 新智数字科技有限公司 A kind of pedestrian counting method and device

Also Published As

Publication number Publication date
CN111382637B (en) 2023-08-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant