CN108830246B - Multi-dimensional motion feature visual extraction method for pedestrians in traffic environment - Google Patents
- Publication number: CN108830246B (granted); application CN201810661219.4A
- Authority
- CN
- China
- Prior art keywords
- pedestrian
- image
- posture
- motion
- frame
- Prior art date
- Legal status: Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/50—Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
Abstract
The invention discloses a visual extraction method for multi-dimensional motion features of pedestrians in a traffic environment, comprising the following steps. Step 1: construct a pedestrian motion database. Step 2: extract pedestrian detection frame images of the same pedestrian in continuous image frames. Step 3: extract HOG features of the motion energy map of the same pedestrian. Step 4: construct a pedestrian motion posture recognition model based on an Elman neural network. Step 5: judge the pedestrian posture in the current video using the Elman-based pedestrian motion posture recognition model. Step 6: calculate the instantaneous speed sequences of the pedestrian in the X-axis and Y-axis directions to obtain the pedestrian's real-time speed. Step 7: according to a three-dimensional scene of the intersection environment, obtain the position of the pedestrian in the image in real time, and combine it with the pedestrian's posture and real-time speed to obtain the pedestrian's real-time motion characteristics. The scheme offers high identification accuracy and good robustness, is convenient to apply, and has good potential for wider adoption.
Description
Technical Field
The invention belongs to the field of traffic monitoring, and particularly relates to a visual extraction method for multi-dimensional motion characteristics of pedestrians in a traffic environment.
Background
In recent years, with the rapid development of scientific technology, more and more intelligent methods are applied to the traffic aspect, especially the field of intelligent driving. Traffic safety is a constant topic, and in collision accidents, collisions between vehicles and pedestrians account for a large proportion. The timely detection and posture identification of pedestrians are the key points in the existing intelligent traffic active protection system. To achieve accurate identification, it is most important to extract the motion characteristics of the pedestrian.
Pedestrian posture identification methods fall into global feature methods and local feature methods. Global feature methods mostly use motion history images, accumulating the frame difference information of a video sequence into a single image; the frame differences carry some motion information but no shape information about the moving human body, and are easily corrupted by noise. Another approach extracts the static edge information of the pedestrian in each image frame, but combining the per-frame edges across frames must be done manually, which makes identification difficult. At present, pedestrian speed detection mostly relies on radar, which cannot be well combined with visual images.
Chinese patent CN105957103A proposes a vision-based motion feature extraction method comprising the following steps: 1. extract a motion vector for each pixel point from continuous frames; 2. extract feature points whose pixel values change strongly along the X, Y and T directions; 3. centered on each feature point, construct a cubic feature vector as a direction-amplitude histogram of the motion vectors; 4. form code vectors from the local descriptors via a clustering algorithm. This patent has the following problems: 1. when extracting a motion vector for every pixel point, the pixels are not effectively screened, so the data volume is large and the computation complex; 2. the clustering algorithm it applies is prone to converging to local optima.
In summary, it is urgently needed to provide a method for extracting pedestrian motion characteristics more accurately in a traffic environment.
Disclosure of Invention
The invention provides a visual extraction method for multi-dimensional motion characteristics of pedestrians in a traffic environment, and aims to accurately extract the postures of the pedestrians on a road, timely pre-warn vehicles on a traffic road and reduce traffic accidents.
A traffic environment pedestrian multi-dimensional motion feature visual extraction method comprises the following steps:
step 1: constructing a pedestrian motion database;
collecting, with a depth camera, videos of pedestrians in various motion postures from various shooting directions, together with the position of the road on which each pedestrian is located; the shooting directions comprise seven orientations relative to the lens (directly ahead, front-left, front-right, side-on, directly behind, rear-left and rear-right), and the postures comprise walking, running and standing;
step 2: extracting images of videos in a pedestrian motion database, preprocessing the extracted images to obtain a pedestrian detection frame of each frame of image, and extracting pedestrian detection frame images of the same pedestrian in continuous image frames;
step 3: converting each pedestrian detection frame image to grayscale, synthesizing a motion energy map from the grayscale images corresponding to the same pedestrian's detection frame images across the continuous image frames, and extracting the HOG feature of the motion energy map;
step 4: constructing a pedestrian motion posture recognition model based on an Elman neural network;
taking a motion energy map corresponding to each pedestrian in the continuous image frames as input data, taking the posture of the corresponding pedestrian as output data, and training the Elman neural network;
the standing posture corresponds to output [001], the walking posture to output [010], and the running posture to output [100];
the Elman neural network parameters are set as follows: the number of input layer nodes equals the number x of motion energy map pixels, the number of hidden layer nodes is 2x + 1, the number of output layer nodes is 3, the maximum iteration count is 1500, the learning rate is 0.001, and the error threshold is 0.00001;
step 5: judging the pedestrian posture in the current video by utilizing the pedestrian motion posture recognition model based on the Elman neural network;
extracting the pedestrian detection frame images of the same pedestrian in the continuous frame images from the current video as in step 2, and inputting them into the Elman-based pedestrian motion posture recognition model to obtain and discriminate the corresponding postures;
step 6: calculating a pixel coordinate change sequence of the vertex of the lower left corner of the pedestrian detection frame of the same pedestrian in the continuous frame images, and calculating to obtain instantaneous speed sequences of the pedestrian in the X-axis direction and the Y-axis direction to obtain the real-time speed of the pedestrian;
step 7: according to a three-dimensional scene of the intersection environment, obtaining the position of pedestrians in the image in real time, and combining it with the pedestrians' postures and real-time speeds to obtain their real-time motion characteristics.
A depth camera is used at the intersection to build the three-dimensional scene of the intersection environment and obtain the position of each pedestrian in the image in real time. The three-dimensional scene is divided into pedestrian roads and vehicle roads according to the actual road layout; when a person enters the scene, an ID is created for that person, and the person's motion characteristics are judged from continuous frame image information.
Further, optimizing the weight and the threshold of the Elman neural network in the pedestrian motion posture recognition model based on the Elman neural network by using a chicken swarm algorithm, and specifically comprising the following steps of:
step A1: taking the individual positions of the chicken flocks as the weight and the threshold of the Elman neural network, and initializing chicken flock parameters;
the population scale M takes a value in [20,100]; the search space dimension is j, where j equals the total number of Elman neural network weight and threshold parameters to be optimized; the maximum iteration count T takes a value in [400,1000]; the iteration counter t is initialized to 0; the proportion of cocks Pg is 20%, the proportion of hens Pm is 70%, and the proportion of chicks Px is 10%; mother hens are randomly selected from the hens with a proportion Pd of 10%;
step A2: setting a fitness function, and enabling the iteration time t to be 1;
sequentially substituting the weights and thresholds corresponding to each chicken swarm individual position into the pedestrian motion posture recognition model based on the Elman neural network; using the model determined by that individual position to detect the posture of the same pedestrian from the pedestrian detection frame images in the continuous frame images; and taking the reciprocal of the difference between the detected pedestrian posture value and the corresponding actual pedestrian posture value as the first fitness function f1(x);
The greater the fitness, the more excellent the individual;
step A3: constructing a chicken flock subgroup;
sorting all individuals by fitness value; the top M × Pg individuals are designated cocks, each of which serves as the head of one subgroup; the bottom M × Px individuals are designated chicks; the remaining individuals are designated hens;
dividing the chicken swarm into as many subgroups as there are cocks, each subgroup comprising one cock, several hens and several chicks; each chick randomly selects one hen within its subgroup as its mother, establishing the mother-child relationship;
step A4: updating the individual positions of the chicken flock and calculating the fitness of each individual at present;
the cock position is updated as

x_{i,j}(t+1) = x_{i,j}(t) × [1 + r(0, σ²)],

where x_{i,j}(t) denotes the position of cock individual i in the j-th dimension of the search space at iteration t, x_{i,j}(t+1) is the corresponding new position at iteration t+1, and r(0, σ²) is a random number obeying the normal distribution N(0, σ²) with mean 0 and variance σ²;

the hen position is updated as

x_{g,j}(t+1) = x_{g,j}(t) + L1 × rand(0,1) × [x_{i1,j}(t) − x_{g,j}(t)] + L2 × rand(0,1) × [x_{i2,j}(t) − x_{g,j}(t)],

where x_{g,j}(t) is the position of hen g in the j-th dimension at iteration t, x_{i1,j}(t) is the position of the unique cock i1 in the subgroup of hen g, x_{i2,j}(t) is the position of a randomly chosen cock i2 outside that subgroup, rand(0,1) is a random number drawn uniformly from (0,1), and L1 and L2 are position update coefficients expressing the influence on hen g of its own subgroup and of other subgroups, with L1 in [0.25,0.55] and L2 in [0.15,0.35];

the chick position is updated as

x_{l,j}(t+1) = ω × x_{l,j}(t) + α × [x_{gm,j}(t) − x_{l,j}(t)] + β × [x_{r,j}(t) − x_{l,j}(t)],

where x_{l,j}(t) is the position of chick l in the j-th dimension at iteration t, x_{gm,j}(t) is the position of the mother hen gm paired with chick l by the mother-child relationship, x_{r,j}(t) is the position of the unique cock in the chick's subgroup, and ω, α and β are respectively the chick's self-update coefficient (range [0.2,0.7]), the coefficient of following the mother hen (range [0.5,0.8]) and the coefficient of following the cock (range [0.8,1.5]);
Step A5: update each individual's historical best position and the swarm's global best position according to the fitness function; if the maximum iteration count has been reached, exit; otherwise set t = t + 1 and return to step A3. When the maximum iteration count is met, output the Elman neural network weights and thresholds corresponding to the best chicken swarm individual position, yielding the trained pedestrian motion posture recognition model based on the Elman neural network.
Wherein v_x^j and v_y^j respectively represent the instantaneous speeds of the pedestrian in the X-axis and Y-axis directions, obtained from the inter-frame displacements as

v_x^j = ΔW_j × m, v_y^j = ΔL_j × m,

where

ΔW_j = k|w_2 − w_1| = k|x_2 × P − x_1 × P|, ΔL_j = |f(l_2) − f(l_1)|, l_1 = (N − y_1) × P, l_2 = (N − y_2) × P;

the pixel coordinates of the pedestrian target point in the previous frame image and the current frame image are (x_1, y_1) and (x_2, y_2) respectively; l_1 and l_2 represent the distances between the pedestrian target point and the Y-axis edge of the display screen in the two adjacent frames, and f(·) maps this on-screen distance to an actual ground distance according to the depth-camera geometry;
k represents the ratio of the actual scene distance to the scene imaging distance in the display screen; M and N respectively represent the total numbers of pixels in the X-axis and Y-axis directions of the display screen; P represents the side length of each pixel, so that MP and NP are the total lengths of the screen's X and Y axes; ΔW_j and ΔL_j respectively represent the displacements of the pedestrian target point along the X-axis and Y-axis directions between the two adjacent frames;
AB represents the distance from the depth camera to the pedestrian, α the angle between the camera-pedestrian line and the ground plane, θ the angle between the camera-pedestrian line and the imaging plane, and m the number of frames captured per second.
And the values of AB, alpha and theta are obtained by real-time measurement by using a depth camera.
Further, according to the real-time motion characteristics of pedestrians, pedestrian behavior level early warning is issued to vehicles on the traffic road;
the behavior levels comprise three grades: safe, threat and danger;
safe behaviors include: a pedestrian standing more than one meter from the vehicle road; a pedestrian on the sidewalk more than one meter from the vehicle road, walking parallel to the road or facing away from it; and a pedestrian running while facing away from the road;
threat behaviors include: a pedestrian on the sidewalk within one meter of the vehicle road; a pedestrian standing in the pedestrian road; and a pedestrian running within one meter of the boundary between the pedestrian road and the vehicle road;
dangerous behaviors include: a pedestrian on the sidewalk running toward the vehicle road, and a pedestrian walking or running in the vehicle road;
when a pedestrian exhibiting a threat behavior walks faster than 1.9 m/s or runs faster than 8 m/s, the threat behavior is upgraded to a dangerous behavior.
The behavior levels describe how safe the state of each pedestrian in the traffic environment is; different behavior levels prompt the drivers of vehicles traveling in that environment, so as to ensure traffic safety.
further, the pedestrian target point is a lower left corner pixel point of the pedestrian detection frame image.
Further, preprocessing a pedestrian image frame, setting a pedestrian detection frame, a pedestrian target identifier and a pedestrian position label vector for the preprocessed image, and constructing a pedestrian track;
the pedestrian detection frame is a minimum circumscribed rectangle of a pedestrian outline in a pedestrian image frame;
the pedestrian target identification is a unique identification P of different pedestrians appearing in all the pedestrian image frames;
the expression form of the pedestrian position label vector is [ t, x, y, a, b ], t represents that the current pedestrian image frame belongs to the t-th frame in the monitoring video, x and y respectively represent the abscissa and the ordinate of the lower left corner of a pedestrian detection frame in the pedestrian image frame, and a and b respectively represent the length and the width of the pedestrian detection frame;
whether a pedestrian from the previous pedestrian image frame appears in the next frame determines the tracking result: if the pedestrian appears in the next frame, the tracking result for that pedestrian is 1, otherwise it is 0; if the tracking result is 1, the corresponding pedestrian position label vector from the next frame is appended to the pedestrian track.
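The pedestrian track construction above can be sketched as a small data structure (an illustration only; the class and method names, such as PedestrianTrack, are not from the patent):

```python
from dataclasses import dataclass, field

@dataclass
class PedestrianTrack:
    """Track of one pedestrian identifier P, stored as a list of position
    label vectors [t, x, y, a, b]: frame index, lower-left corner of the
    detection frame, and the frame's length and width."""
    pid: int
    labels: list = field(default_factory=list)

    def update(self, t, box, tracked):
        """Append the label vector only when the tracking result is 1,
        i.e. the pedestrian from the previous frame reappears."""
        if tracked:
            x, y, a, b = box
            self.labels.append([t, x, y, a, b])
        return len(self.labels)
```

A track thus grows only across frames where the pedestrian is re-detected; frames with tracking result 0 leave it unchanged.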
Advantageous effects
The invention provides a visual extraction method for multi-dimensional motion features of pedestrians in a traffic environment, comprising the following steps. Step 1: construct a pedestrian motion database. Step 2: extract images from the videos in the pedestrian motion database, preprocess them to obtain a pedestrian detection frame for each image frame, and extract the detection frame images of the same pedestrian across continuous frames. Step 3: convert each pedestrian detection frame image to grayscale, synthesize a motion energy map from the grayscale images of the same pedestrian across continuous frames, and extract the HOG feature of the motion energy map. Step 4: construct a pedestrian motion posture recognition model based on an Elman neural network. Step 5: judge the pedestrian posture in the current video using the Elman-based pedestrian motion posture recognition model. Step 6: calculate the pixel coordinate change sequence of the lower-left vertex of the same pedestrian's detection frame across continuous frames, and from it the instantaneous speed sequences in the X-axis and Y-axis directions, obtaining the pedestrian's real-time speed. Step 7: according to a three-dimensional scene of the intersection environment, obtain the pedestrian's position in the image in real time and combine it with the pedestrian's posture and real-time speed to obtain the pedestrian's real-time motion characteristics.
Compared with the prior art, the method has the following advantages:
1. the identification accuracy is high: the HOG features extracted from the synthesized motion energy map contain both the pedestrian motion information of the whole image sequence and the pedestrian's motion energy information; these representative features greatly facilitate pedestrian posture identification;
2. the application is convenient: the pedestrian speed calculation method provided by the invention is directly operated based on the visual image, thereby realizing the perfect combination of speed detection and image recognition and facilitating the use of users;
3. the invention realizes the posture identification of the pedestrian in the image and the speed calculation of the pedestrian, has complete network structure and can greatly facilitate users;
4. the robustness is good: the invention uses the neural network, has strong nonlinear fitting capability and has better robustness when dealing with the problems of illumination change, pedestrian shielding and the like.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
fig. 2 is a schematic diagram of a distance relationship between a depth camera and a pedestrian.
Detailed Description
The invention will be further described with reference to the following figures and examples.
As shown in fig. 1, a method for visually extracting multidimensional motion features of pedestrians in traffic environment includes the following steps:
step 1: constructing a pedestrian motion database;
collecting, with a depth camera, videos of pedestrians in various motion postures from various shooting directions, together with the position of the road on which each pedestrian is located; the shooting directions comprise seven orientations relative to the lens (directly ahead, front-left, front-right, side-on, directly behind, rear-left and rear-right), and the postures comprise walking, running and standing;
step 2: extracting images of videos in a pedestrian motion database, preprocessing the extracted images to obtain a pedestrian detection frame of each frame of image, and extracting pedestrian detection frame images of the same pedestrian in continuous image frames;
step 3: converting each pedestrian detection frame image to grayscale, synthesizing a motion energy map from the grayscale images corresponding to the same pedestrian's detection frame images across the continuous image frames, and extracting the HOG feature of the motion energy map;
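As a rough illustration of this step, the sketch below (not the patent's implementation; the frame-difference threshold, cell size and bin count are assumed values) synthesizes a motion energy map as the union of inter-frame difference silhouettes and computes a minimal HOG descriptor with NumPy:

```python
import numpy as np

def motion_energy_map(gray_frames, thresh=30):
    """Union of inter-frame difference silhouettes over a window of
    same-pedestrian detection-frame images (all resized to one shape)."""
    mem = np.zeros_like(gray_frames[0], dtype=np.uint8)
    for prev, cur in zip(gray_frames, gray_frames[1:]):
        diff = np.abs(cur.astype(np.int16) - prev.astype(np.int16))
        mem |= (diff > thresh).astype(np.uint8)   # accumulate moving pixels
    return mem * 255

def hog_feature(img, cell=8, bins=9):
    """Minimal HOG: per-cell histograms of gradient orientation,
    magnitude-weighted, L2-normalized over the whole vector."""
    gx = np.zeros(img.shape, dtype=float)
    gy = np.zeros(img.shape, dtype=float)
    gx[:, 1:-1] = img[:, 2:].astype(float) - img[:, :-2]   # central differences
    gy[1:-1, :] = img[2:, :].astype(float) - img[:-2, :]
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180             # unsigned orientation
    h, w = img.shape
    feats = []
    for i in range(0, h - cell + 1, cell):
        for j in range(0, w - cell + 1, cell):
            m = mag[i:i + cell, j:j + cell].ravel()
            a = ang[i:i + cell, j:j + cell].ravel()
            hist, _ = np.histogram(a, bins=bins, range=(0, 180), weights=m)
            feats.append(hist)
    v = np.concatenate(feats)
    n = np.linalg.norm(v)
    return v / n if n > 0 else v
```

In practice a library HOG (e.g. with block normalization) would replace `hog_feature`; this version only shows the cell-histogram idea on the synthesized map.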
step 4: constructing a pedestrian motion posture recognition model based on an Elman neural network;
taking a motion energy map corresponding to each pedestrian in the continuous image frames as input data, taking the posture of the corresponding pedestrian as output data, and training the Elman neural network;
the standing posture corresponds to output [001], the walking posture to output [010], and the running posture to output [100];
the Elman neural network parameters are set as follows: the number of input layer nodes equals the number x of motion energy map pixels, the number of hidden layer nodes is 2x + 1, the number of output layer nodes is 3, the maximum iteration count is 1500, the learning rate is 0.001, and the error threshold is 0.00001;
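For illustration, a minimal Elman forward pass with the layer sizing given above (hidden = 2x + 1 for x inputs, 3 outputs) is sketched below; the weights are random placeholders that training would fit, so this shows the structure only, not the trained model:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

class ElmanSketch:
    """Minimal Elman (simple recurrent) network: input -> hidden, with a
    context copy of the previous hidden state fed back -> 3-way output."""
    def __init__(self, n_in, seed=0):
        rng = np.random.default_rng(seed)
        n_hid = 2 * n_in + 1                       # patent's sizing rule
        self.W_in = rng.normal(0, 0.1, (n_hid, n_in))
        self.W_ctx = rng.normal(0, 0.1, (n_hid, n_hid))  # context weights
        self.W_out = rng.normal(0, 0.1, (3, n_hid))
        self.b_hid = np.zeros(n_hid)
        self.b_out = np.zeros(3)
        self.context = np.zeros(n_hid)             # h(t-1), initially zero

    def step(self, x):
        h = np.tanh(self.W_in @ x + self.W_ctx @ self.context + self.b_hid)
        self.context = h                           # store for the next step
        return softmax(self.W_out @ h + self.b_out)
```

The argmax of the 3-way output maps to the [100]/[010]/[001] posture codes above.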
optimizing the weight and the threshold of the Elman neural network in the pedestrian motion posture recognition model based on the Elman neural network by using a chicken swarm algorithm, and specifically comprising the following steps:
step A1: taking the individual positions of the chicken flocks as the weight and the threshold of the Elman neural network, and initializing chicken flock parameters;
the population scale M takes a value in [20,100]; the search space dimension is j, where j equals the total number of Elman neural network weight and threshold parameters to be optimized; the maximum iteration count T takes a value in [400,1000]; the iteration counter t is initialized to 0; the proportion of cocks Pg is 20%, the proportion of hens Pm is 70%, and the proportion of chicks Px is 10%; mother hens are randomly selected from the hens with a proportion Pd of 10%;
step A2: setting a fitness function, and enabling the iteration time t to be 1;
sequentially substituting the weights and thresholds corresponding to each chicken swarm individual position into the pedestrian motion posture recognition model based on the Elman neural network; using the model determined by that individual position to detect the posture of the same pedestrian from the input pedestrian detection frame images in the continuous frame images; and taking the reciprocal of the difference between the detected pedestrian posture value and the corresponding actual pedestrian posture value as the first fitness function f1(x);
The greater the fitness, the more excellent the individual;
step A3: constructing a chicken flock subgroup;
sorting all individuals by fitness value; the top M × Pg individuals are designated cocks, each of which serves as the head of one subgroup; the bottom M × Px individuals are designated chicks; the remaining individuals are designated hens;
dividing the chicken swarm into as many subgroups as there are cocks, each subgroup comprising one cock, several hens and several chicks; each chick randomly selects one hen within its subgroup as its mother, establishing the mother-child relationship;
step A4: updating the individual positions of the chicken flock and calculating the fitness of each individual at present;
the cock position is updated as

x_{i,j}(t+1) = x_{i,j}(t) × [1 + r(0, σ²)],

where x_{i,j}(t) denotes the position of cock individual i in the j-th dimension of the search space at iteration t, x_{i,j}(t+1) is the corresponding new position at iteration t+1, and r(0, σ²) is a random number obeying the normal distribution N(0, σ²) with mean 0 and variance σ²;

the hen position is updated as

x_{g,j}(t+1) = x_{g,j}(t) + L1 × rand(0,1) × [x_{i1,j}(t) − x_{g,j}(t)] + L2 × rand(0,1) × [x_{i2,j}(t) − x_{g,j}(t)],

where x_{g,j}(t) is the position of hen g in the j-th dimension at iteration t, x_{i1,j}(t) is the position of the unique cock i1 in the subgroup of hen g, x_{i2,j}(t) is the position of a randomly chosen cock i2 outside that subgroup, rand(0,1) is a random number drawn uniformly from (0,1), and L1 and L2 are position update coefficients expressing the influence on hen g of its own subgroup and of other subgroups, with L1 in [0.25,0.55] and L2 in [0.15,0.35];

the chick position is updated as

x_{l,j}(t+1) = ω × x_{l,j}(t) + α × [x_{gm,j}(t) − x_{l,j}(t)] + β × [x_{r,j}(t) − x_{l,j}(t)],

where x_{l,j}(t) is the position of chick l in the j-th dimension at iteration t, x_{gm,j}(t) is the position of the mother hen gm paired with chick l by the mother-child relationship, x_{r,j}(t) is the position of the unique cock in the chick's subgroup, and ω, α and β are respectively the chick's self-update coefficient (range [0.2,0.7]), the coefficient of following the mother hen (range [0.5,0.8]) and the coefficient of following the cock (range [0.8,1.5]);
Step A5: update each individual's historical best position and the swarm's global best position according to the fitness function; if the maximum iteration count has been reached, exit; otherwise set t = t + 1 and return to step A3. When the maximum iteration count is met, output the Elman neural network weights and thresholds corresponding to the best chicken swarm individual position, yielding the trained pedestrian motion posture recognition model based on the Elman neural network.
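The chicken swarm steps A1-A5 can be sketched as a generic minimizer (a toy illustration under assumed mid-range coefficients, shown on a simple quadratic loss rather than the Elman weight/threshold fitness; since the patent's fitness is a reciprocal of the error, minimizing the loss is equivalent to maximizing fitness):

```python
import numpy as np

def cso_minimize(loss, dim, n=30, iters=100, seed=0):
    """Chicken swarm optimization sketch: 20% cocks, 70% hens, 10% chicks,
    roles reassigned by fitness ranking each iteration, best-so-far kept."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1, 1, (n, dim))
    n_r, n_c = int(0.2 * n), int(0.1 * n)
    L1, L2 = 0.4, 0.25           # hen coefficients, mid-range values
    w, a, b = 0.5, 0.6, 1.0      # chick self / mother / cock coefficients
    best = X[np.argmin([loss(x) for x in X])].copy()
    for _ in range(iters):
        f = np.array([loss(x) for x in X])
        order = np.argsort(f)                     # lowest loss = fittest
        cocks = order[:n_r]
        chicks = order[-n_c:]
        hens = order[n_r:n - n_c]
        sub = {h: rng.choice(cocks) for h in hens}     # hen -> subgroup head
        mother = {c: rng.choice(hens) for c in chicks} # chick -> mother hen
        Xn = X.copy()
        for r in cocks:                           # cock: local random walk
            Xn[r] = X[r] * (1 + rng.normal(0, 0.1, dim))
        for h in hens:                            # hen: own + other subgroup
            r1 = sub[h]
            r2 = rng.choice([r for r in cocks if r != r1]) if n_r > 1 else r1
            Xn[h] = (X[h] + L1 * rng.random(dim) * (X[r1] - X[h])
                          + L2 * rng.random(dim) * (X[r2] - X[h]))
        for c in chicks:                          # chick: self + mother + cock
            m_ = mother[c]
            Xn[c] = w * X[c] + a * (X[m_] - X[c]) + b * (X[sub[m_]] - X[c])
        X = Xn
        cand = X[np.argmin([loss(x) for x in X])]
        if loss(cand) < loss(best):               # elitism (step A5)
            best = cand.copy()
    return best
```

For the patent's use, `loss` would evaluate the Elman network error for the weight/threshold vector encoded by each individual position.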
step 5: judging the pedestrian posture in the current video by utilizing the pedestrian motion posture recognition model based on the Elman neural network;
extracting the pedestrian detection frame images of the same pedestrian in the continuous frame images from the current video as in step 2, and inputting them into the Elman-based pedestrian motion posture recognition model to obtain and discriminate the corresponding postures;
step 6: calculating a pixel coordinate change sequence of the vertex of the lower left corner of the pedestrian detection frame of the same pedestrian in the continuous frame images, and calculating to obtain instantaneous speed sequences of the pedestrian in the X-axis direction and the Y-axis direction to obtain the real-time speed of the pedestrian;
Wherein v_x^j and v_y^j respectively represent the instantaneous speeds of the pedestrian in the X-axis and Y-axis directions, obtained from the inter-frame displacements as

v_x^j = ΔW_j × m, v_y^j = ΔL_j × m,

where

ΔW_j = k|w_2 − w_1| = k|x_2 × P − x_1 × P|, ΔL_j = |f(l_2) − f(l_1)|, l_1 = (N − y_1) × P, l_2 = (N − y_2) × P;

the pixel coordinates of the pedestrian target point in the previous frame image and the current frame image are (x_1, y_1) and (x_2, y_2) respectively; l_1 and l_2 represent the distances between the pedestrian target point and the Y-axis edge of the display screen in the two adjacent frames, and f(·) maps this on-screen distance to an actual ground distance according to the depth-camera geometry;
k represents the ratio of the actual scene distance to the scene imaging distance in the display screen; M and N respectively represent the total numbers of pixels in the X-axis and Y-axis directions of the display screen; P represents the side length of each pixel, so that MP and NP are the total lengths of the screen's X and Y axes; ΔW_j and ΔL_j respectively represent the displacements of the pedestrian target point along the X-axis and Y-axis directions between the two adjacent frames;
as shown in fig. 2, AB represents the distance from the depth camera to the pedestrian, α the angle between the camera-pedestrian line and the ground plane, and θ the angle between the camera-pedestrian line and the imaging plane; the values of AB, α and θ are obtained by real-time measurement with the depth camera, and m is the number of frames captured per second.
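A numerical sketch of the speed computation above (the depth-dependent mapping f(·) is passed in by the caller, since its closed form depends on the camera geometry of fig. 2; the identity default below is a placeholder assumption, as are the function and parameter names):

```python
def pedestrian_speed(p1, p2, P, k, N, fps, f=lambda l: l):
    """Instantaneous speed of a pedestrian target point between two adjacent
    frames. p1=(x1,y1) and p2=(x2,y2) are pixel coordinates of the detection
    frame's lower-left corner, P the pixel side length, k the actual-to-screen
    distance ratio (X axis), N the vertical pixel count, and fps the number of
    frames captured per second (m in the text)."""
    x1, y1 = p1
    x2, y2 = p2
    dW = k * abs(x2 * P - x1 * P)        # X-axis displacement Delta-W_j
    l1 = (N - y1) * P                    # screen distance to Y edge, frame 1
    l2 = (N - y2) * P                    # screen distance to Y edge, frame 2
    dL = abs(f(l2) - f(l1))              # Y-axis displacement Delta-L_j via f
    return dW * fps, dL * fps            # (v_x, v_y)
```

Applied per consecutive frame pair, this yields the instantaneous speed sequences of step 6.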
And 7: according to a three-dimensional scene under an intersection environment, position information of pedestrians in the image is obtained in real time, and real-time motion characteristics of the pedestrians are obtained by combining the postures and the real-time speeds of the pedestrians.
The intersection camera is a depth camera, with which a three-dimensional scene of the intersection environment is built and the position information of pedestrians in the image is obtained in real time. The three-dimensional scene is divided into a pedestrian road and a vehicle road according to the actual road conditions. When a person enters the three-dimensional scene, an ID is created for that person, and the person's motion characteristics are judged from consecutive frame images.
Carrying out pedestrian behavior level early warning on vehicles on a traffic road according to real-time motion characteristics of pedestrians;
the behavior levels comprise three levels of security, threat and danger;
the safe behaviors include: a pedestrian in a standing posture more than one meter away from the traffic road; a pedestrian on the sidewalk more than one meter away from the traffic road in a walking posture parallel to the traffic road or facing away from it; and a pedestrian in a running posture facing away from the traffic road;
the threat behaviors include: a pedestrian on the sidewalk within one meter of the traffic road; a pedestrian standing within the pedestrian road; and a pedestrian in a running posture within one meter of the boundary between the pedestrian road and the traffic road;
the dangerous behaviors include: a pedestrian on the sidewalk walking or running toward the traffic road; and a pedestrian in a walking or running posture within the traffic road;
when a pedestrian exhibiting a threat behavior walks faster than 1.9 m/s or runs faster than 8 m/s, the threat behavior is upgraded to a dangerous behavior.
The behavior levels characterize the safety of a pedestrian's state in the traffic environment; the different levels prompt the drivers of vehicles travelling in that environment, so as to ensure traffic safety.
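A rough sketch of the three-level rules reads as follows. The zone names, the argument encoding, the rule precedence, and the handling of cases the text leaves open (e.g. standing inside the traffic road) are assumptions; the one-meter, 1.9 m/s, and 8 m/s thresholds come from the text.

```python
def behavior_level(zone, posture, dist_to_road, toward_road=False, speed=0.0):
    """zone: 'sidewalk' | 'pedestrian_road' | 'vehicle_road';
    dist_to_road: metres from the target point to the traffic (vehicle) road."""
    if zone == "vehicle_road" or (
        zone == "sidewalk" and toward_road and posture in ("walking", "running")
    ):
        return "danger"
    threat = (
        (zone == "sidewalk" and dist_to_road <= 1.0)
        or (zone == "pedestrian_road" and posture == "standing")
        or (dist_to_road <= 1.0 and posture == "running")
    )
    if threat:
        # a threat behavior escalates at high speed (1.9 m/s walking, 8 m/s running)
        if (posture == "walking" and speed > 1.9) or (posture == "running" and speed > 8.0):
            return "danger"
        return "threat"
    return "safe"

level = behavior_level("sidewalk", "running", 0.5)   # within one metre -> "threat"
```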
In this embodiment, the lower left corner pixel point of the pedestrian detection frame image is used as the pedestrian target point.
Preprocessing a pedestrian image frame, setting a pedestrian detection frame, a pedestrian target identifier and a pedestrian position tag vector for the preprocessed image, and constructing a pedestrian track;
the pedestrian detection frame is a minimum circumscribed rectangle of a pedestrian outline in a pedestrian image frame;
the pedestrian target identification is a unique identification P of different pedestrians appearing in all the pedestrian image frames;
the expression form of the pedestrian position label vector is [ t, x, y, a, b ], t represents that the current pedestrian image frame belongs to the t-th frame in the monitoring video, x and y respectively represent the abscissa and the ordinate of the lower left corner of a pedestrian detection frame in the pedestrian image frame, and a and b respectively represent the length and the width of the pedestrian detection frame;
the appearance result of the pedestrian in the previous frame of pedestrian image in the next frame of pedestrian image means that if the pedestrian in the previous frame of pedestrian image appears in the next frame of pedestrian image, the tracking result of the pedestrian is 1, otherwise, the tracking result is 0; and if the pedestrian tracking result is 1, adding the corresponding pedestrian position label vector appearing in the pedestrian image of the next frame into the pedestrian track.
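The track bookkeeping described above can be sketched directly; the identifier string and the coordinate values are made-up examples.

```python
from collections import defaultdict

# track store: pedestrian identifier P -> ordered position tag vectors [t, x, y, a, b]
tracks = defaultdict(list)

def update_track(pid, tag_vector, tracking_result):
    """Append the next frame's tag vector only when the tracking result is 1
    (the pedestrian from the previous frame reappears in the next frame)."""
    if tracking_result == 1:
        tracks[pid].append(tag_vector)

# made-up example: pedestrian "P1" tracked for two frames, then lost
update_track("P1", [1, 40, 200, 60, 30], 1)
update_track("P1", [2, 44, 201, 60, 30], 1)
update_track("P1", [3, 0, 0, 0, 0], 0)   # tracking result 0: not added to the track
```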
The specific embodiments described herein are merely illustrative of the spirit of the invention. Various modifications or additions may be made to the described embodiments or alternatives may be employed by those skilled in the art without departing from the spirit or ambit of the invention as defined in the appended claims.
Claims (6)
1. A traffic environment pedestrian multi-dimensional motion feature visual extraction method is characterized by comprising the following steps:
step 1: constructing a pedestrian motion database;
collecting videos of pedestrians in various motion postures, in various shooting directions of a depth camera, and at various road positions, wherein the shooting directions comprise seven directions relative to the lens: directly ahead, front-left, front-right, side-on, directly behind, rear-left, and rear-right, and the postures comprise walking, running, and standing;
step 2: extracting images of videos in a pedestrian motion database, preprocessing the extracted images to obtain a pedestrian detection frame of each frame of image, and extracting pedestrian detection frame images of the same pedestrian in continuous image frames;
step 3: carrying out graying processing on each pedestrian detection frame image, synthesizing a motion energy map from the grayscale images corresponding to the pedestrian detection frame images of the same pedestrian in consecutive image frames, and extracting the HOG features of the motion energy map;
step 4: constructing a pedestrian motion posture recognition model based on an Elman neural network;
taking a motion energy map corresponding to each pedestrian in the continuous image frames as input data, taking the posture of the corresponding pedestrian as output data, and training the Elman neural network;
the standing posture corresponds to the output [001], the walking posture to [010], and the running posture to [100];
the Elman neural network parameters are set as follows: the number of input layer nodes corresponds to the number x of motion energy map pixels, the number of hidden layer nodes is 2x+1, the number of output layer nodes is 3, the maximum number of iterations is 1500, the learning rate is 0.001, and the threshold is 0.00001;
step 5: judging the pedestrian posture in the current video by using the Elman-neural-network-based pedestrian motion posture recognition model;
extracting the pedestrian detection frame images of the same pedestrian in consecutive frame images from the current video according to step 2, and inputting them into the Elman-neural-network-based pedestrian motion posture recognition model to obtain the corresponding posture, thereby distinguishing the pedestrian's posture;
step 6: calculating a pixel coordinate change sequence of the vertex of the lower left corner of the pedestrian detection frame of the same pedestrian in the continuous frame images, and calculating to obtain instantaneous speed sequences of the pedestrian in the X-axis direction and the Y-axis direction to obtain the real-time speed of the pedestrian;
step 7: according to the three-dimensional scene of the intersection environment, obtaining the position information of pedestrians in the image in real time, and obtaining the real-time motion characteristics of the pedestrians by combining their postures and real-time speeds.
2. The method according to claim 1, wherein a chicken swarm optimization algorithm is used to optimize the weights and thresholds of the Elman neural network in the Elman-neural-network-based pedestrian motion posture recognition model, with the following specific steps:
Step A1: taking the chicken swarm individual positions as the weights and thresholds of the Elman neural network, and initializing the chicken swarm parameters;
the population size M is in [20,100]; the search space dimension is j, where j equals the total number of Elman neural network weight and threshold parameters to be optimized; the maximum number of iterations T is in [400,1000]; the iteration counter t is initialized to 0; the proportion of cocks Pg is 20%, the proportion of hens Pm is 70%, and the proportion of chicks Px is 10%; mother hens are randomly selected from the hens with a proportion Pd of 10%;
Step A2: setting the fitness function, and setting the iteration counter t to 1;
the weights and thresholds corresponding to each chicken swarm individual position are substituted, in turn, into the Elman-neural-network-based pedestrian motion posture recognition model; the model determined by each individual position is used to judge the pedestrian posture for the input pedestrian detection frame images of the same pedestrian in consecutive frame images; the reciprocal of the difference between the detected posture values and the corresponding actual posture values for those images is taken as the first fitness function f1(x);
Step A3: constructing the chicken swarm subgroups;
sorting all individuals by fitness value; the top M×Pg individuals are judged to be cocks, each cock serving as the head of a subgroup; the bottom M×Px individuals are judged to be chicks; the remaining individuals are judged to be hens;
dividing the chicken swarm into subgroups according to the number of cocks, each subgroup comprising one cock, several chicks, and several hens; each chick randomly selects one mother hen within its subgroup to establish a mother-child relationship;
Step A4: updating the chicken swarm individual positions and calculating each individual's current fitness;
the cock position is updated as x_{i,j}^{t+1} = x_{i,j}^t × (1 + r(0, σ²)), wherein x_{i,j}^t denotes the position of cock i in the j-th dimension at iteration t, x_{i,j}^{t+1} is the cock's new position at iteration t+1, and r(0, σ²) is a Gaussian random number with mean 0 and variance σ²;
the hen position is updated as x_{g,j}^{t+1} = x_{g,j}^t + L1·rand(0,1)·(x_{i1,j}^t − x_{g,j}^t) + L2·rand(0,1)·(x_{i2,j}^t − x_{g,j}^t), wherein x_{g,j}^t is the position of hen g in the j-th dimension at iteration t, x_{i1,j}^t is the position of the unique cock i1 in hen g's subgroup, x_{i2,j}^t is the position of a randomly selected cock i2 outside that subgroup, rand(0,1) is a random number drawn uniformly from (0,1), and L1 and L2 are position-update coefficients reflecting the influence of the hen's own subgroup and of other subgroups, with L1 in [0.25,0.55] and L2 in [0.15,0.35];
the chick position is updated as x_{l,j}^{t+1} = ω·x_{l,j}^t + α·(x_{gm,j}^t − x_{l,j}^t) + β·(x_{r,j}^t − x_{l,j}^t), wherein x_{l,j}^t is the position of chick l in the j-th dimension at iteration t, x_{gm,j}^t is the position of its mother hen gm under the established mother-child relationship, x_{r,j}^t is the position of the unique cock in the chick's subgroup, and ω, α, and β are, respectively, the chick's self-update coefficient in [0.2,0.7], mother-following coefficient in [0.5,0.8], and cock-following coefficient in [0.8,1.5];
Step A5: updating each individual's best position and the swarm's global best position according to the fitness function; if the maximum number of iterations has been reached, exit; otherwise set t = t + 1 and return to step A3. Upon reaching the maximum number of iterations, output the Elman neural network weights and thresholds corresponding to the best chicken swarm individual position, yielding the Elman-neural-network-based pedestrian motion posture recognition model.
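Steps A1-A5 can be condensed into a toy sketch. The role proportions (20% cocks, 10% chicks) and the midpoints of the coefficient ranges (L1≈0.4, L2≈0.25, ω≈0.5, α≈0.65, β≈1.0) follow the text, but the value of σ², the regrouping on every iteration, the simplified cock selection for hens and chicks, and the toy fitness function are all simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def cso(fitness, dim, M=20, T=100):
    """Toy chicken swarm optimization, maximizing `fitness` over R^dim."""
    X = rng.uniform(-1.0, 1.0, (M, dim))   # positions = candidate weight/threshold vectors
    best = X[0].copy()
    best_f = fitness(best)
    for _ in range(T):
        f = np.array([fitness(x) for x in X])
        j = int(np.argmax(f))
        if f[j] > best_f:                   # track the global best before moving anyone
            best, best_f = X[j].copy(), f[j]
        order = np.argsort(-f)              # sort by descending fitness
        cocks = order[: int(0.2 * M)]       # top 20% -> cocks
        chicks = order[-int(0.1 * M):]      # bottom 10% -> chicks
        hens = order[int(0.2 * M): -int(0.1 * M)]
        for i in cocks:                     # cock: multiplicative Gaussian self-search
            X[i] = X[i] * (1 + rng.normal(0.0, 0.1, dim))
        for i in hens:                      # hen: follow two cocks (own-/other-subgroup roles simplified)
            c1, c2 = rng.choice(cocks, 2, replace=False)
            X[i] += 0.4 * rng.random(dim) * (X[c1] - X[i]) + 0.25 * rng.random(dim) * (X[c2] - X[i])
        for i in chicks:                    # chick: weighted self + mother-hen + cock terms
            mother, cock = rng.choice(hens), rng.choice(cocks)
            X[i] = 0.5 * X[i] + 0.65 * (X[mother] - X[i]) + 1.0 * (X[cock] - X[i])
    return best, best_f

# toy run: maximizing -sum(x^2), i.e. driving the vector toward the origin
best, best_f = cso(lambda x: -np.sum(x ** 2), dim=3)
```

In the patent's setting, `fitness` would be the first fitness function f1(x) obtained by decoding each position into Elman weights and thresholds and scoring the posture predictions.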
wherein v_X and v_Y respectively represent the instantaneous speeds of the pedestrian in the X-axis and Y-axis directions, obtained by scaling the per-frame displacements ΔW_j and ΔL_j by the frame rate m;
the pixel coordinates of the pedestrian target point in the previous frame image and the current frame image are (x_1, y_1) and (x_2, y_2), respectively; l_1 and l_2 respectively represent the distance between the pedestrian target point and the Y-axis edge of the display screen in the two adjacent frame images;
k represents the ratio of the actual scene distance to the scene imaging distance in the display screen; M and N respectively represent the total number of pixel points in the X-axis and Y-axis directions of the display screen; P represents the side length of each pixel point in the display screen, so MP and NP are the total lengths of the X axis and Y axis of the whole screen, respectively; ΔW_j and ΔL_j respectively represent the displacements of the pedestrian target point in the X-axis and Y-axis directions between the two adjacent frame images;
AB represents the distance from the depth camera to the pedestrian, α represents the angle between the camera-to-pedestrian line and the ground plane, θ is the angle between the camera-to-pedestrian line and the imaging plane, and m is the frame rate.
4. The method according to any one of claims 1 to 3, characterized in that pedestrian behavior level early warning is performed on vehicles on a traffic road according to real-time motion characteristics of pedestrians;
the behavior levels comprise three levels of security, threat and danger;
the safe behaviors include: a pedestrian in a standing posture more than one meter away from the traffic road; a pedestrian on the sidewalk more than one meter away from the traffic road in a walking posture parallel to the traffic road or facing away from it; and a pedestrian in a running posture facing away from the traffic road;
the threat behaviors include: a pedestrian on the sidewalk within one meter of the traffic road; a pedestrian standing within the pedestrian road; and a pedestrian in a running posture within one meter of the boundary between the pedestrian road and the traffic road;
the dangerous behaviors include: a pedestrian on the sidewalk walking or running toward the traffic road; and a pedestrian in a walking or running posture within the traffic road;
when a pedestrian exhibiting a threat behavior walks faster than 1.9 m/s or runs faster than 8 m/s, the threat behavior is upgraded to a dangerous behavior.
5. The method of claim 4, wherein the pedestrian target point is a lower left corner pixel point of a pedestrian detection frame image.
6. The method according to claim 5, wherein the pedestrian image frame is preprocessed, and a pedestrian detection frame, a pedestrian target identifier, and a pedestrian position tag vector are set for the preprocessed image to construct a pedestrian track;
the pedestrian detection frame is a minimum circumscribed rectangle of a pedestrian outline in a pedestrian image frame;
the pedestrian target identification is a unique identification P of different pedestrians appearing in all the pedestrian image frames;
the expression form of the pedestrian position label vector is [ t, x, y, a, b ], t represents that the current pedestrian image frame belongs to the t-th frame in the monitoring video, x and y respectively represent the abscissa and the ordinate of the lower left corner of a pedestrian detection frame in the pedestrian image frame, and a and b respectively represent the length and the width of the pedestrian detection frame;
the appearance result of the pedestrian in the previous frame of pedestrian image in the next frame of pedestrian image means that if the pedestrian in the previous frame of pedestrian image appears in the next frame of pedestrian image, the tracking result of the pedestrian is 1, otherwise, the tracking result is 0; and if the pedestrian tracking result is 1, adding the corresponding pedestrian position label vector appearing in the pedestrian image of the next frame into the pedestrian track.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810661219.4A CN108830246B (en) | 2018-06-25 | 2018-06-25 | Multi-dimensional motion feature visual extraction method for pedestrians in traffic environment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810661219.4A CN108830246B (en) | 2018-06-25 | 2018-06-25 | Multi-dimensional motion feature visual extraction method for pedestrians in traffic environment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108830246A CN108830246A (en) | 2018-11-16 |
CN108830246B true CN108830246B (en) | 2022-02-15 |
Family
ID=64138303
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810661219.4A Active CN108830246B (en) | 2018-06-25 | 2018-06-25 | Multi-dimensional motion feature visual extraction method for pedestrians in traffic environment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108830246B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109558505A (en) * | 2018-11-21 | 2019-04-02 | Baidu Online Network Technology (Beijing) Co., Ltd. | Visual search method, apparatus, computer equipment and storage medium |
CN111265218A (en) * | 2018-12-05 | 2020-06-12 | Alibaba Group Holding Limited | Motion attitude data processing method and device and electronic equipment |
CN110427800A (en) | 2019-06-17 | 2019-11-08 | Ping An Technology (Shenzhen) Co., Ltd. | Video object acceleration detection method, apparatus, server and storage medium |
CN110632636B (en) * | 2019-09-11 | 2021-10-22 | Guilin University of Electronic Technology | Carrier attitude estimation method based on Elman neural network |
CN111338344A (en) * | 2020-02-28 | 2020-06-26 | Beijing Xiaoma Huixing Technology Co., Ltd. | Vehicle control method and device and vehicle |
CN115092091A (en) * | 2022-07-11 | 2022-09-23 | China FAW Co., Ltd. | Vehicle and pedestrian protection system and method based on Internet of vehicles |
CN116935447B (en) * | 2023-09-19 | 2023-12-26 | Huazhong University of Science and Technology | Self-adaptive teacher-student structure-based unsupervised domain pedestrian re-recognition method and system |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN206033008U (en) * | 2016-03-09 | 2017-03-22 | 秀景A.I.D 股份有限公司 | Automatic hand track sterilization equipment of power type |
CN106789214A (en) * | 2016-12-12 | 2017-05-31 | 广东工业大学 | Network situation awareness method and device based on the sine-cosine algorithm |
CN106875424A (en) * | 2017-01-16 | 2017-06-20 | 西北工业大学 | A kind of urban environment driving vehicle Activity recognition method based on machine vision |
CN107122707A (en) * | 2017-03-17 | 2017-09-01 | 山东大学 | Video pedestrian based on macroscopic features compact representation recognition methods and system again |
CN107126224A (en) * | 2017-06-20 | 2017-09-05 | 中南大学 | A kind of real-time monitoring of track train driver status based on Kinect and method for early warning and system |
CN107153800A (en) * | 2017-05-04 | 2017-09-12 | 天津工业大学 | Reader antenna optimization deployment scheme for a UHF RFID positioning system based on an improved chicken swarm algorithm |
CN107203753A (en) * | 2017-05-25 | 2017-09-26 | 西安工业大学 | A kind of action identification method based on fuzzy neural network and graph model reasoning |
CN107657232A (en) * | 2017-09-28 | 2018-02-02 | 南通大学 | A kind of pedestrian's intelligent identification Method and its system |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9821470B2 (en) * | 2014-09-17 | 2017-11-21 | Brain Corporation | Apparatus and methods for context determination using real time sensor data |
- 2018-06-25: Application CN201810661219.4A filed in China (CN); granted as patent CN108830246B, status Active
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN206033008U (en) * | 2016-03-09 | 2017-03-22 | 秀景A.I.D 股份有限公司 | Automatic hand track sterilization equipment of power type |
CN106789214A (en) * | 2016-12-12 | 2017-05-31 | 广东工业大学 | Network situation awareness method and device based on the sine-cosine algorithm |
CN106875424A (en) * | 2017-01-16 | 2017-06-20 | 西北工业大学 | A kind of urban environment driving vehicle Activity recognition method based on machine vision |
CN107122707A (en) * | 2017-03-17 | 2017-09-01 | 山东大学 | Video pedestrian based on macroscopic features compact representation recognition methods and system again |
CN107153800A (en) * | 2017-05-04 | 2017-09-12 | 天津工业大学 | Reader antenna optimization deployment scheme for a UHF RFID positioning system based on an improved chicken swarm algorithm |
CN107203753A (en) * | 2017-05-25 | 2017-09-26 | 西安工业大学 | A kind of action identification method based on fuzzy neural network and graph model reasoning |
CN107126224A (en) * | 2017-06-20 | 2017-09-05 | 中南大学 | A kind of real-time monitoring of track train driver status based on Kinect and method for early warning and system |
CN107657232A (en) * | 2017-09-28 | 2018-02-02 | 南通大学 | A kind of pedestrian's intelligent identification Method and its system |
Non-Patent Citations (4)
Title |
---|
Assessment of human locomotion by using an insole measurement system and artificial neural networks; Kuan Zhang et al; Journal of Biomechanics; 2005-11-30; vol. 38, no. 11; 2276-2287 *
Application of the Elman neural network to regional velocity field modeling; Nie Jianliang et al; Journal of Geodesy and Geodynamics; 2017-11-20; vol. 37, no. 10; 1015-1019 *
Pedestrian detection based on walking topology analysis; Zuo Hang et al; Journal of Optoelectronics·Laser; 2010-05-31; vol. 21, no. 5; 749-753 *
Research on gait recognition based on contour features and multifractal analysis; Zi Chunyuan; China Masters' Theses Full-text Database (Information Science and Technology); 2017-10-15; no. 10; I138-189 *
Also Published As
Publication number | Publication date |
---|---|
CN108830246A (en) | 2018-11-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108830246B (en) | Multi-dimensional motion feature visual extraction method for pedestrians in traffic environment | |
EP3614308B1 (en) | Joint deep learning for land cover and land use classification | |
CN110175576A (en) | A kind of driving vehicle visible detection method of combination laser point cloud data | |
CN106875424B (en) | A kind of urban environment driving vehicle Activity recognition method based on machine vision | |
CN105260699B (en) | A kind of processing method and processing device of lane line data | |
Guan et al. | Robust traffic-sign detection and classification using mobile LiDAR data with digital images | |
CN108830171B (en) | Intelligent logistics warehouse guide line visual detection method based on deep learning | |
CN106682586A (en) | Method for real-time lane line detection based on vision under complex lighting conditions | |
CN102385690B (en) | Target tracking method and system based on video image | |
CN103049751A (en) | Improved weighting region matching high-altitude video pedestrian recognizing method | |
CN110379168B (en) | Traffic vehicle information acquisition method based on Mask R-CNN | |
CN110232389A (en) | A kind of stereoscopic vision air navigation aid based on green crop feature extraction invariance | |
CN109255298A (en) | Safety cap detection method and system in a kind of dynamic background | |
CN105404857A (en) | Infrared-based night intelligent vehicle front pedestrian detection method | |
CN108428254A (en) | The construction method and device of three-dimensional map | |
Chao et al. | Multi-lane detection based on deep convolutional neural network | |
CN105279769A (en) | Hierarchical particle filtering tracking method combined with multiple features | |
CN107315998A (en) | Vehicle class division method and system based on lane line | |
Zhang et al. | Gc-net: Gridding and clustering for traffic object detection with roadside lidar | |
CN106056078A (en) | Crowd density estimation method based on multi-feature regression ensemble learning | |
CN105335751B (en) | A kind of berth aircraft nose wheel localization method of view-based access control model image | |
CN113092807B (en) | Urban overhead road vehicle speed measuring method based on multi-target tracking algorithm | |
CN113255779B (en) | Multi-source perception data fusion identification method, system and computer readable storage medium | |
CN108805907B (en) | Pedestrian posture multi-feature intelligent identification method | |
CN108830248B (en) | Pedestrian local feature big data hybrid extraction method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||