CN108830246B - Multi-dimensional motion feature visual extraction method for pedestrians in traffic environment - Google Patents

Multi-dimensional motion feature visual extraction method for pedestrians in traffic environment

Info

Publication number
CN108830246B
CN108830246B (application CN201810661219.4A)
Authority
CN
China
Prior art keywords
pedestrian
image
posture
motion
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810661219.4A
Other languages
Chinese (zh)
Other versions
CN108830246A (en)
Inventor
刘辉
李燕飞
韩宇阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central South University
Original Assignee
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central South University filed Critical Central South University
Priority to CN201810661219.4A priority Critical patent/CN108830246B/en
Publication of CN108830246A publication Critical patent/CN108830246A/en
Application granted granted Critical
Publication of CN108830246B publication Critical patent/CN108830246B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20: Movements or behaviour, e.g. gesture recognition
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/40: Extraction of image or video features
    • G06V10/50: Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a traffic environment pedestrian multi-dimensional motion feature visual extraction method, which comprises the following steps: step 1: constructing a pedestrian motion database; step 2: extracting pedestrian detection frame images of the same pedestrian in consecutive image frames; step 3: extracting the HOG features of the motion energy map of the same pedestrian; step 4: constructing a pedestrian motion posture recognition model based on an Elman neural network; step 5: judging the pedestrian posture in the current video with the Elman neural network-based pedestrian motion posture recognition model; step 6: calculating the instantaneous speed sequences of the pedestrian in the X-axis and Y-axis directions to obtain the real-time speed of the pedestrian; step 7: obtaining the position of the pedestrian in the image in real time from a three-dimensional scene of the intersection environment, and combining it with the pedestrian's posture and real-time speed to obtain the pedestrian's real-time motion features. The scheme offers high recognition accuracy, good robustness and convenient application, and therefore has good prospects for application and popularization.

Description

Multi-dimensional motion feature visual extraction method for pedestrians in traffic environment
Technical Field
The invention belongs to the field of traffic monitoring, and particularly relates to a visual extraction method for multi-dimensional motion characteristics of pedestrians in a traffic environment.
Background
In recent years, with the rapid development of science and technology, more and more intelligent methods have been applied to transportation, especially in the field of intelligent driving. Traffic safety remains a constant concern, and collisions between vehicles and pedestrians account for a large proportion of collision accidents. Timely pedestrian detection and posture recognition are therefore central to existing intelligent traffic active protection systems, and accurate recognition depends above all on extracting the motion characteristics of the pedestrian.
Pedestrian posture recognition methods fall into global feature methods and local feature methods. Global feature methods mostly use motion history images, in which the frame-difference information of a video sequence is accumulated into a single image; the frame difference carries some motion information but not the shape of the moving body, and it is easily disturbed by noise. Another approach extracts the static edge information of the pedestrian in each frame, but the per-frame results must then be combined manually, which makes recognition difficult. At present, pedestrian speed detection mostly relies on radar and cannot be combined well with visual images.
Chinese patent CN105957103A proposes a vision-based motion feature extraction method comprising the following steps: 1. extracting a motion vector for each pixel from consecutive frames; 2. extracting feature points whose pixel values change strongly in the X, Y and T directions; 3. constructing, centered on each feature point, a cuboid feature vector of direction-amplitude histograms from the motion vectors; 4. forming code vectors from the local descriptors through a clustering algorithm. This patent has the following problems: 1. when extracting a motion vector for every pixel, the pixels are not screened effectively, so the data volume is large and the computation is complex; 2. the clustering algorithm it applies is prone to converging to local optima.
In summary, it is urgently needed to provide a method for extracting pedestrian motion characteristics more accurately in a traffic environment.
Disclosure of Invention
The invention provides a visual extraction method for multi-dimensional motion characteristics of pedestrians in a traffic environment, and aims to accurately extract the postures of the pedestrians on a road, timely pre-warn vehicles on a traffic road and reduce traffic accidents.
A traffic environment pedestrian multi-dimensional motion feature visual extraction method comprises the following steps:
step 1: constructing a pedestrian motion database;
collecting videos, with a depth camera, of pedestrians in various motion postures from various shooting directions, together with the position of the road the pedestrians are on; the shooting directions comprise seven directions relative to the lens: directly ahead, front-left, front-right, side-on, directly behind, rear-left and rear-right; the postures comprise walking, running and standing;
step 2: extracting images of videos in a pedestrian motion database, preprocessing the extracted images to obtain a pedestrian detection frame of each frame of image, and extracting pedestrian detection frame images of the same pedestrian in continuous image frames;
step 3: carrying out graying processing on each pedestrian detection frame image, synthesizing a motion energy map from the grayscale images corresponding to the pedestrian detection frame images of the same pedestrian in consecutive image frames, and extracting the HOG features of the motion energy map;
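As an illustration of step 3, the following sketch (Python, assuming OpenCV and scikit-image are available) accumulates the grayscale detection-frame images of one pedestrian into a motion energy map and extracts its HOG features; the crop size, binarization threshold and HOG parameters are illustrative assumptions, and the energy map is built here as a normalized sum of binarized frame differences since the patent does not spell out the synthesis formula.

```python
import cv2
import numpy as np
from skimage.feature import hog

def motion_energy_hog(crops, size=(64, 128)):
    """crops: BGR pedestrian detection-frame images of one pedestrian taken from
    consecutive frames. Returns the HOG feature vector of their motion energy map
    (here: the normalized sum of binarized inter-frame differences)."""
    grays = [cv2.resize(cv2.cvtColor(c, cv2.COLOR_BGR2GRAY), size) for c in crops]
    energy = np.zeros(size[::-1], dtype=np.float32)
    for prev, curr in zip(grays, grays[1:]):
        diff = cv2.absdiff(curr, prev)                      # inter-frame motion
        _, mask = cv2.threshold(diff, 25, 1, cv2.THRESH_BINARY)
        energy += mask                                       # accumulate motion energy
    energy = cv2.normalize(energy, None, 0, 1, cv2.NORM_MINMAX)
    return hog(energy, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm='L2-Hys')
```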
step 4: constructing a pedestrian motion posture recognition model based on an Elman neural network;
taking a motion energy map corresponding to each pedestrian in the continuous image frames as input data, taking the posture of the corresponding pedestrian as output data, and training the Elman neural network;
the standing posture output corresponds to [001], the walking posture output corresponds to [010], and the running posture output corresponds to [100 ];
the Elman neural network parameters are set as follows: the number of input-layer nodes corresponds to the number x of motion energy map pixels, the number of hidden-layer nodes is 2x+1, the number of output-layer nodes is 3, the maximum number of iterations is 1500, the learning rate is 0.001, and the error threshold is 0.00001;
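For reference, a minimal Elman (simple recurrent) network forward pass is sketched below in numpy with the layer sizes quoted above (x input nodes, 2x+1 hidden nodes, 3 output nodes); the weight initialization, the tanh activation and the softmax output are assumptions for illustration, not the patent's exact formulation.

```python
import numpy as np

class ElmanNet:
    """Minimal Elman network: the hidden state is fed back through a context layer."""
    def __init__(self, n_in, n_out=3):
        n_hid = 2 * n_in + 1                       # hidden nodes = 2x + 1
        rng = np.random.default_rng(0)
        self.W_in = rng.normal(0, 0.1, (n_hid, n_in))
        self.W_ctx = rng.normal(0, 0.1, (n_hid, n_hid))   # context (recurrent) weights
        self.W_out = rng.normal(0, 0.1, (n_out, n_hid))
        self.b_hid = np.zeros(n_hid)
        self.b_out = np.zeros(n_out)
        self.context = np.zeros(n_hid)             # copy of the previous hidden state

    def forward(self, x):
        h = np.tanh(self.W_in @ x + self.W_ctx @ self.context + self.b_hid)
        self.context = h                            # context layer stores h for the next step
        z = self.W_out @ h + self.b_out
        return np.exp(z) / np.exp(z).sum()          # scores for (running, walking, standing) per the one-hot coding above
```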
step 5: judging the pedestrian posture in the current video by utilizing a pedestrian motion posture identification model based on an Elman neural network;
extracting the pedestrian detection frame images of the same pedestrian in the continuous frame images from the current video according to the step 2, inputting the images into a pedestrian motion posture recognition model based on an Elman neural network to obtain corresponding postures, and distinguishing the postures;
step 6: calculating a pixel coordinate change sequence of the vertex of the lower left corner of the pedestrian detection frame of the same pedestrian in the continuous frame images, and calculating to obtain instantaneous speed sequences of the pedestrian in the X-axis direction and the Y-axis direction to obtain the real-time speed of the pedestrian;
step 7: according to a three-dimensional scene under an intersection environment, position information of pedestrians in the image is obtained in real time, and real-time motion characteristics of the pedestrians are obtained by combining the postures and the real-time speeds of the pedestrians.
A depth camera is used as the intersection camera to build the three-dimensional scene of the intersection environment and to obtain the position information of pedestrians in the image in real time; the three-dimensional scene is divided into a pedestrian road and a vehicle road according to the actual road layout, an ID is created for each person who enters the scene, and that person's motion characteristics are judged from the information in consecutive image frames.
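A possible bookkeeping structure for this intersection scene is sketched below (Python); the class and field names, and the assumption that the road regions expose a contains(x, y) test, are illustrative only.

```python
from dataclasses import dataclass, field
from itertools import count

@dataclass
class PedestrianTrack:
    pid: int                                        # ID created when the pedestrian enters the scene
    observations: list = field(default_factory=list)  # per-frame tuples (t, x, y, region)

class IntersectionScene:
    """Holds the pedestrian-road / vehicle-road partition and all active tracks."""
    def __init__(self, pedestrian_region, vehicle_region):
        self.pedestrian_region = pedestrian_region  # assumed region object with contains(x, y)
        self.vehicle_region = vehicle_region
        self._ids = count(1)
        self.tracks = {}

    def observe(self, pid, t, x, y):
        if pid is None:                             # a new person entered the scene
            pid = next(self._ids)
            self.tracks[pid] = PedestrianTrack(pid)
        region = ("vehicle road" if self.vehicle_region.contains(x, y)
                  else "pedestrian road")
        self.tracks[pid].observations.append((t, x, y, region))
        return pid
```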
Further, optimizing the weight and the threshold of the Elman neural network in the pedestrian motion posture recognition model based on the Elman neural network by using a chicken swarm algorithm, and specifically comprising the following steps of:
step A1: taking the individual positions of the chicken flocks as the weight and the threshold of the Elman neural network, and initializing chicken flock parameters;
the population size M lies in [20,100]; the search space dimension is j, where j equals the total number of Elman neural network weights and thresholds to be optimized; the maximum number of iterations T lies in [400,1000]; the iteration counter is t, with initial value 0; the proportion of roosters Pg is 20%, the proportion of hens Pm is 70%, and the proportion of chicks Px is 10%; mother hens are selected at random from the hens with proportion Pd equal to 10%;
step A2: setting a fitness function and letting the iteration counter t be 1;
the weights and thresholds corresponding to each chicken swarm individual's position are substituted in turn into the Elman neural network-based pedestrian motion posture recognition model; the model determined by that individual's position is used to predict the posture of the input pedestrian from the pedestrian detection frame images of the same pedestrian in consecutive frames, and the reciprocal of the difference between the predicted posture value and the corresponding actual posture value is taken as the first fitness function f1(x);
the greater the fitness value, the better the individual;
step A3: constructing a chicken flock subgroup;
all individuals are ranked by fitness value; the M×Pg individuals with the highest fitness are designated roosters, and each rooster heads one subgroup; the M×Px individuals with the lowest fitness are designated chicks; the remaining individuals are designated hens;
dividing the chicken group into subgroups according to the number of the cocks, wherein one subgroup comprises one cock, a plurality of chickens and a plurality of hens, and each chicken randomly selects one hen in the subgroup to construct a hen-offspring relationship;
step A4: updating the individual positions of the chicken flock and calculating the fitness of each individual at present;
rooster (cock) position update formula:
x_{i,j}^{t+1} = x_{i,j}^{t} × (1 + r(0, σ²))
where x_{i,j}^{t} denotes the position of rooster individual i in dimension j of the search space at the t-th iteration, x_{i,j}^{t+1} is the corresponding new position at iteration t+1, and r(0, σ²) is a random number following a normal distribution N(0, σ²) with mean 0 and variance σ²;
hen position update formula:
x_{g,j}^{t+1} = x_{g,j}^{t} + L1 × rand(0,1) × (x_{i1,j}^{t} - x_{g,j}^{t}) + L2 × rand(0,1) × (x_{i2,j}^{t} - x_{g,j}^{t})
where x_{g,j}^{t} is the position of hen g in dimension j at the t-th iteration, x_{i1,j}^{t} is the position of the unique rooster i1 in the subgroup to which hen g belongs, x_{i2,j}^{t} is the position of a randomly chosen rooster i2 outside that subgroup, rand(0,1) is a random number drawn uniformly from (0,1), and L1, L2 are the position update coefficients expressing the influence of the hen's own subgroup and of other subgroups, with L1 in [0.25, 0.55] and L2 in [0.15, 0.35];
chick position update formula:
x_{l,j}^{t+1} = ω × x_{l,j}^{t} + α × (x_{gm,j}^{t} - x_{l,j}^{t}) + β × (x_{i,j}^{t} - x_{l,j}^{t})
where x_{l,j}^{t} is the position of chick l in dimension j at the t-th iteration, x_{gm,j}^{t} is the position of the mother hen gm paired with chick l through the mother-offspring relationship, x_{i,j}^{t} is the position of the unique rooster in the subgroup to which chick l belongs, and ω, α, β are respectively the chick self-update coefficient in [0.2, 0.7], the hen-following coefficient in [0.5, 0.8] and the rooster-following coefficient in [0.8, 1.5];
Step A5: and updating the individual optimal position and the chicken swarm whole individual optimal position according to the fitness function, judging whether the maximum iteration times is reached, if so, quitting, otherwise, making t equal to t +1, and turning to the step A3 until the maximum iteration times is met, outputting the weight and the threshold of the Elman neural network corresponding to the optimal chicken swarm individual position, and obtaining the pedestrian motion posture recognition model based on the Elman neural network.
Further, the real-time speed of the pedestrian is obtained from the instantaneous speed sequences in the X-axis and Y-axis directions over m frames [formulas given as images in the original];
where v_xj and v_yj respectively denote the instantaneous speeds of the pedestrian in the X-axis and Y-axis directions,
ΔWj = k|w2 - w1| = k|x2×P - x1×P|, ΔLj = |f(l2) - f(l1)|, l1 = (N - y1)×P, l2 = (N - y2)×P,
the pixel coordinates of the pedestrian target point in the previous frame image and the current frame image are (x1, y1) and (x2, y2) respectively; l1 and l2 respectively denote the distance between the pedestrian target point and the Y-axis edge of the display screen in the two adjacent frame images;
k denotes the ratio of the actual scene distance to the imaged scene distance on the display screen; M and N respectively denote the total number of pixels of the display screen in the X-axis and Y-axis directions; P denotes the length of each pixel of the display screen, so MP and NP are the total lengths of the screen along the X axis and the Y axis respectively; ΔWj and ΔLj respectively denote the displacements of the pedestrian target point along the X-axis and Y-axis directions between the two adjacent frame images;
AB denotes the distance from the depth camera to the pedestrian, α denotes the angle between the line connecting the depth camera and the pedestrian and the ground plane, θ denotes the angle between that line and the imaging plane, and m is the number of frames.
The values of AB, α and θ are obtained by real-time measurement with the depth camera.
Further, according to the real-time motion characteristics of the pedestrians, carrying out pedestrian behavior level early warning on the vehicles on the traffic road;
the behavior levels comprise three levels: safety, threat and danger;
the safety behaviors include: a pedestrian standing more than one meter away from the vehicle road; a pedestrian on the sidewalk, more than one meter away from the vehicle road, walking parallel to the vehicle road or facing away from it; and a pedestrian running while facing away from the vehicle road;
the threat behaviors include: a pedestrian on the sidewalk within one meter of the vehicle road; a pedestrian standing within the pedestrian road; and a pedestrian running within one meter of the boundary between the pedestrian road and the vehicle road;
the dangerous behaviors include: a pedestrian on the sidewalk running toward the vehicle road; a pedestrian running in the vehicle road; and a pedestrian walking in the vehicle road;
when a pedestrian exhibiting a threat behavior walks faster than 1.9 m/s or runs faster than 8 m/s, the threat behavior is upgraded to a dangerous behavior.
The behavior levels describe how safe the pedestrian's state is in the traffic environment; the different levels are used to prompt the drivers of vehicles travelling in that environment so as to ensure traffic safety;
further, the pedestrian target point is a lower left corner pixel point of the pedestrian detection frame image.
Further, preprocessing a pedestrian image frame, setting a pedestrian detection frame, a pedestrian target identifier and a pedestrian position label vector for the preprocessed image, and constructing a pedestrian track;
the pedestrian detection frame is a minimum circumscribed rectangle of a pedestrian outline in a pedestrian image frame;
the pedestrian target identification is a unique identification P of different pedestrians appearing in all the pedestrian image frames;
the expression form of the pedestrian position label vector is [ t, x, y, a, b ], t represents that the current pedestrian image frame belongs to the t-th frame in the monitoring video, x and y respectively represent the abscissa and the ordinate of the lower left corner of a pedestrian detection frame in the pedestrian image frame, and a and b respectively represent the length and the width of the pedestrian detection frame;
the appearance result of the pedestrian in the previous frame of pedestrian image in the next frame of pedestrian image means that if the pedestrian in the previous frame of pedestrian image appears in the next frame of pedestrian image, the tracking result of the pedestrian is 1, otherwise, the tracking result is 0; and if the pedestrian tracking result is 1, adding the corresponding pedestrian position label vector appearing in the pedestrian image of the next frame into the pedestrian track.
Advantageous effects
The invention provides a traffic environment pedestrian multi-dimensional motion feature visual extraction method, which comprises the following steps: step 1: constructing a pedestrian motion database; step 2: extracting images from the videos in the pedestrian motion database, preprocessing the extracted images to obtain a pedestrian detection frame for each frame, and extracting the pedestrian detection frame images of the same pedestrian in consecutive image frames; step 3: converting each pedestrian detection frame image to grayscale, synthesizing a motion energy map from the grayscale images of the same pedestrian in consecutive image frames, and extracting the HOG features of the motion energy map; step 4: constructing a pedestrian motion posture recognition model based on an Elman neural network; step 5: judging the pedestrian posture in the current video with the Elman neural network-based pedestrian motion posture recognition model; step 6: calculating the pixel coordinate change sequence of the lower-left vertex of the pedestrian detection frame of the same pedestrian in consecutive frames, and from it the instantaneous speed sequences in the X-axis and Y-axis directions and the real-time speed of the pedestrian; step 7: according to a three-dimensional scene of the intersection environment, obtaining the position information of the pedestrian in the image in real time, and combining the posture and real-time speed of the pedestrian to obtain the pedestrian's real-time motion features.
Compared with the prior art, the method has the following advantages:
1. the identification accuracy is high: the HOG characteristics of the synthesized motion energy map extracted by the invention not only comprise the pedestrian motion information of the whole image sequence, but also comprise the motion energy information of the pedestrian, and the characteristics are representative, so that the gesture identification of the pedestrian can be greatly facilitated;
2. the application is convenient: the pedestrian speed calculation method provided by the invention is directly operated based on the visual image, thereby realizing the perfect combination of speed detection and image recognition and facilitating the use of users;
3. the invention realizes the posture identification of the pedestrian in the image and the speed calculation of the pedestrian, has complete network structure and can greatly facilitate users;
4. the robustness is good: the invention uses the neural network, has strong nonlinear fitting capability and has better robustness when dealing with the problems of illumination change, pedestrian shielding and the like.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
fig. 2 is a schematic diagram of a distance relationship between a depth camera and a pedestrian.
Detailed Description
The invention will be further described with reference to the following figures and examples.
As shown in fig. 1, a method for visually extracting multidimensional motion features of pedestrians in traffic environment includes the following steps:
step 1: constructing a pedestrian motion database;
collecting videos, with a depth camera, of pedestrians in various motion postures from various shooting directions, together with the position of the road the pedestrians are on; the shooting directions comprise seven directions relative to the lens: directly ahead, front-left, front-right, side-on, directly behind, rear-left and rear-right; the postures comprise walking, running and standing;
step 2: extracting images of videos in a pedestrian motion database, preprocessing the extracted images to obtain a pedestrian detection frame of each frame of image, and extracting pedestrian detection frame images of the same pedestrian in continuous image frames;
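As one possible realization of step 2, the sketch below uses OpenCV's built-in HOG pedestrian detector to produce a detection frame for every video frame; the detector choice and its parameters are assumptions, since the patent only requires that per-frame pedestrian detection frames be obtained.

```python
import cv2

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

def detection_frames(video_path):
    """Yield (frame_index, frame, [(x, y, w, h), ...]) for every frame of the video."""
    cap = cv2.VideoCapture(video_path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        boxes, _ = hog.detectMultiScale(frame, winStride=(8, 8), padding=(8, 8), scale=1.05)
        yield idx, frame, [tuple(b) for b in boxes]
        idx += 1
    cap.release()
```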
step 3: carrying out graying processing on each pedestrian detection frame image, synthesizing a motion energy map from the grayscale images corresponding to the pedestrian detection frame images of the same pedestrian in consecutive image frames, and extracting the HOG features of the motion energy map;
step 4: constructing a pedestrian motion posture recognition model based on an Elman neural network;
taking a motion energy map corresponding to each pedestrian in the continuous image frames as input data, taking the posture of the corresponding pedestrian as output data, and training the Elman neural network;
the standing posture output corresponds to [001], the walking posture output corresponds to [010], and the running posture output corresponds to [100 ];
the Elman neural network parameters are set as follows: the number of input-layer nodes corresponds to the number x of motion energy map pixels, the number of hidden-layer nodes is 2x+1, the number of output-layer nodes is 3, the maximum number of iterations is 1500, the learning rate is 0.001, and the error threshold is 0.00001;
optimizing the weight and the threshold of the Elman neural network in the pedestrian motion posture recognition model based on the Elman neural network by using a chicken swarm algorithm, and specifically comprising the following steps:
step A1: taking the individual positions of the chicken flocks as the weight and the threshold of the Elman neural network, and initializing chicken flock parameters;
the population size M lies in [20,100]; the search space dimension is j, where j equals the total number of Elman neural network weights and thresholds to be optimized; the maximum number of iterations T lies in [400,1000]; the iteration counter is t, with initial value 0; the proportion of roosters Pg is 20%, the proportion of hens Pm is 70%, and the proportion of chicks Px is 10%; mother hens are selected at random from the hens with proportion Pd equal to 10%;
step A2: setting a fitness function and letting the iteration counter t be 1;
the weights and thresholds corresponding to each chicken swarm individual's position are substituted in turn into the Elman neural network-based pedestrian motion posture recognition model; the model determined by that individual's position is used to predict the posture of the input pedestrian from the pedestrian detection frame images of the same pedestrian in consecutive frames, and the reciprocal of the difference between the predicted posture value and the corresponding actual posture value is taken as the first fitness function f1(x);
the greater the fitness value, the better the individual;
step A3: constructing a chicken flock subgroup;
all individuals are ranked by fitness value; the M×Pg individuals with the highest fitness are designated roosters, and each rooster heads one subgroup; the M×Px individuals with the lowest fitness are designated chicks; the remaining individuals are designated hens;
dividing the chicken group into subgroups according to the number of the cocks, wherein one subgroup comprises one cock, a plurality of chickens and a plurality of hens, and each chicken randomly selects one hen in the subgroup to construct a hen-offspring relationship;
step A4: updating the individual positions of the chicken flock and calculating the fitness of each individual at present;
rooster (cock) position update formula:
x_{i,j}^{t+1} = x_{i,j}^{t} × (1 + r(0, σ²))
where x_{i,j}^{t} denotes the position of rooster individual i in dimension j of the search space at the t-th iteration, x_{i,j}^{t+1} is the corresponding new position at iteration t+1, and r(0, σ²) is a random number following a normal distribution N(0, σ²) with mean 0 and variance σ²;
hen position update formula:
x_{g,j}^{t+1} = x_{g,j}^{t} + L1 × rand(0,1) × (x_{i1,j}^{t} - x_{g,j}^{t}) + L2 × rand(0,1) × (x_{i2,j}^{t} - x_{g,j}^{t})
where x_{g,j}^{t} is the position of hen g in dimension j at the t-th iteration, x_{i1,j}^{t} is the position of the unique rooster i1 in the subgroup to which hen g belongs, x_{i2,j}^{t} is the position of a randomly chosen rooster i2 outside that subgroup, rand(0,1) is a random number drawn uniformly from (0,1), and L1, L2 are the position update coefficients expressing the influence of the hen's own subgroup and of other subgroups, with L1 in [0.25, 0.55] and L2 in [0.15, 0.35];
chick position update formula:
x_{l,j}^{t+1} = ω × x_{l,j}^{t} + α × (x_{gm,j}^{t} - x_{l,j}^{t}) + β × (x_{i,j}^{t} - x_{l,j}^{t})
where x_{l,j}^{t} is the position of chick l in dimension j at the t-th iteration, x_{gm,j}^{t} is the position of the mother hen gm paired with chick l through the mother-offspring relationship, x_{i,j}^{t} is the position of the unique rooster in the subgroup to which chick l belongs, and ω, α, β are respectively the chick self-update coefficient in [0.2, 0.7], the hen-following coefficient in [0.5, 0.8] and the rooster-following coefficient in [0.8, 1.5];
Step A5: and updating the individual optimal position and the chicken swarm whole individual optimal position according to the fitness function, judging whether the maximum iteration times is reached, if so, quitting, otherwise, making t equal to t +1, and turning to the step A3 until the maximum iteration times is met, outputting the weight and the threshold of the Elman neural network corresponding to the optimal chicken swarm individual position, and obtaining the pedestrian motion posture recognition model based on the Elman neural network.
step 5: judging the pedestrian posture in the current video by utilizing a pedestrian motion posture identification model based on an Elman neural network;
extracting the pedestrian detection frame images of the same pedestrian in the continuous frame images from the current video according to the step 2, inputting the images into a pedestrian motion posture recognition model based on an Elman neural network to obtain corresponding postures, and distinguishing the postures;
step 6: calculating a pixel coordinate change sequence of the vertex of the lower left corner of the pedestrian detection frame of the same pedestrian in the continuous frame images, and calculating to obtain instantaneous speed sequences of the pedestrian in the X-axis direction and the Y-axis direction to obtain the real-time speed of the pedestrian;
the pedestrian real-time speed is obtained from the instantaneous speed sequences in the X-axis and Y-axis directions over m frames [formulas given as images in the original];
where v_xj and v_yj respectively denote the instantaneous speeds of the pedestrian in the X-axis and Y-axis directions,
ΔWj = k|w2 - w1| = k|x2×P - x1×P|, ΔLj = |f(l2) - f(l1)|, l1 = (N - y1)×P, l2 = (N - y2)×P,
the pixel coordinates of the pedestrian target point in the previous frame image and the current frame image are (x1, y1) and (x2, y2) respectively; l1 and l2 respectively denote the distance between the pedestrian target point and the Y-axis edge of the display screen in the two adjacent frame images;
k denotes the ratio of the actual scene distance to the imaged scene distance on the display screen; M and N respectively denote the total number of pixels of the display screen in the X-axis and Y-axis directions; P denotes the length of each pixel of the display screen, so MP and NP are the total lengths of the screen along the X axis and the Y axis respectively; ΔWj and ΔLj respectively denote the displacements of the pedestrian target point along the X-axis and Y-axis directions between the two adjacent frame images;
as shown in fig. 2, AB denotes the distance from the depth camera to the pedestrian, α denotes the angle between the line connecting the depth camera and the pedestrian and the ground plane, and θ denotes the angle between that line and the imaging plane; the values of AB, α and θ are obtained by real-time measurement with the depth camera, and m is the number of frames.
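The per-frame-pair speed computation described above can be organized as in the following sketch (Python). The instantaneous speed is taken here as displacement divided by the inter-frame interval, and the mapping f from on-screen vertical distance to ground distance is left as a pass-through placeholder, because both are given only as formula images in the original and depend on the camera geometry (AB, α, θ).

```python
def instantaneous_speeds(points, k, P, N, fps, f=lambda l: l):
    """points: pixel coordinates (x, y) of the pedestrian target point in consecutive frames.
    k: actual-distance / on-screen-distance ratio, P: pixel side length, N: number of screen
    pixels along Y, fps: frame rate. f maps on-screen vertical distance to ground distance
    (placeholder here, to be replaced by the camera-geometry mapping).
    Returns per-frame-pair speed components (v_x, v_y)."""
    dt = 1.0 / fps
    speeds = []
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        dW = k * abs(x2 * P - x1 * P)                 # X-axis displacement
        l1, l2 = (N - y1) * P, (N - y2) * P           # distances to the screen's Y-axis edge
        dL = abs(f(l2) - f(l1))                       # Y-axis (depth) displacement
        speeds.append((dW / dt, dL / dt))
    return speeds
```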
step 7: according to a three-dimensional scene under an intersection environment, position information of pedestrians in the image is obtained in real time, and real-time motion characteristics of the pedestrians are obtained by combining the postures and the real-time speeds of the pedestrians.
A depth camera is used as the intersection camera to build the three-dimensional scene of the intersection environment and to obtain the position information of pedestrians in the image in real time; the three-dimensional scene is divided into a pedestrian road and a vehicle road according to the actual road layout, an ID is created for each person who enters the scene, and that person's motion characteristics are judged from the information in consecutive image frames.
Carrying out pedestrian behavior level early warning on vehicles on a traffic road according to real-time motion characteristics of pedestrians;
the behavior levels comprise three levels: safety, threat and danger;
the safety behaviors include: a pedestrian standing more than one meter away from the vehicle road; a pedestrian on the sidewalk, more than one meter away from the vehicle road, walking parallel to the vehicle road or facing away from it; and a pedestrian running while facing away from the vehicle road;
the threat behaviors include: a pedestrian on the sidewalk within one meter of the vehicle road; a pedestrian standing within the pedestrian road; and a pedestrian running within one meter of the boundary between the pedestrian road and the vehicle road;
the dangerous behaviors include: a pedestrian on the sidewalk running toward the vehicle road; a pedestrian running in the vehicle road; and a pedestrian walking in the vehicle road;
when a pedestrian exhibiting a threat behavior walks faster than 1.9 m/s or runs faster than 8 m/s, the threat behavior is upgraded to a dangerous behavior.
The behavior levels describe how safe the pedestrian's state is in the traffic environment; the different levels are used to prompt the drivers of vehicles travelling in that environment so as to ensure traffic safety;
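The behavior-level rules above can be read as a small decision table; a hedged sketch follows (Python) giving a simplified reading of these rules, in which the region, heading and distance inputs are assumed to come from the three-dimensional scene, presence in the vehicle road always counts as dangerous, and the 1.9 m/s and 8 m/s thresholds are those quoted above.

```python
def behavior_level(region, dist_to_vehicle_road, posture, heading, speed):
    """region: 'pedestrian road' or 'vehicle road' (the sidewalk is part of the
    pedestrian road); heading: 'toward road', 'away from road' or 'parallel';
    posture: 'standing', 'walking' or 'running'; speed in m/s.
    Returns 'safety', 'threat' or 'danger'."""
    if region == "vehicle road":
        return "danger"                       # walking or running on the carriageway
    if posture == "running" and heading == "toward road":
        return "danger"                       # running toward the vehicle road
    level = "safety"
    if dist_to_vehicle_road <= 1.0:
        level = "threat"                      # within one meter of the vehicle road
    if level == "threat" and ((posture == "walking" and speed > 1.9) or
                              (posture == "running" and speed > 8.0)):
        level = "danger"                      # speed-based upgrade of a threat behavior
    return level
```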
in this example, the lower left corner pixel point of the pedestrian detection frame image is used as the pedestrian target point.
Preprocessing a pedestrian image frame, setting a pedestrian detection frame, a pedestrian target identifier and a pedestrian position tag vector for the preprocessed image, and constructing a pedestrian track;
the pedestrian detection frame is a minimum circumscribed rectangle of a pedestrian outline in a pedestrian image frame;
the pedestrian target identification is a unique identification P of different pedestrians appearing in all the pedestrian image frames;
the expression form of the pedestrian position label vector is [ t, x, y, a, b ], t represents that the current pedestrian image frame belongs to the t-th frame in the monitoring video, x and y respectively represent the abscissa and the ordinate of the lower left corner of a pedestrian detection frame in the pedestrian image frame, and a and b respectively represent the length and the width of the pedestrian detection frame;
the appearance result of the pedestrian in the previous frame of pedestrian image in the next frame of pedestrian image means that if the pedestrian in the previous frame of pedestrian image appears in the next frame of pedestrian image, the tracking result of the pedestrian is 1, otherwise, the tracking result is 0; and if the pedestrian tracking result is 1, adding the corresponding pedestrian position label vector appearing in the pedestrian image of the next frame into the pedestrian track.
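A data-structure sketch for the pedestrian track described above follows (Python); the label vector layout [t, x, y, a, b] and the 0/1 tracking result are taken from the description, while the matching test deciding whether the same pedestrian reappears is left as an assumed callable.

```python
from collections import defaultdict

tracks = defaultdict(list)   # pedestrian target identifier P -> list of label vectors [t, x, y, a, b]

def update_tracks(t, detections, match):
    """detections: {P: (x, y, a, b)} for frame t, keyed by pedestrian target identifier.
    match(P, box) -> 1 if the pedestrian P from the previous frame appears in this frame
    (the tracking result), else 0. A label vector is appended only when the pedestrian
    is new or the tracking result is 1."""
    for P, (x, y, a, b) in detections.items():
        if not tracks[P] or match(P, (x, y, a, b)) == 1:
            tracks[P].append([t, x, y, a, b])
```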
The specific embodiments described herein are merely illustrative of the spirit of the invention. Various modifications or additions may be made to the described embodiments or alternatives may be employed by those skilled in the art without departing from the spirit or ambit of the invention as defined in the appended claims.

Claims (6)

1. A traffic environment pedestrian multi-dimensional motion feature visual extraction method is characterized by comprising the following steps:
step 1: constructing a pedestrian motion database;
collecting videos, with a depth camera, of pedestrians in various motion postures from various shooting directions, together with the position of the road the pedestrians are on; the shooting directions comprise seven directions relative to the lens: directly ahead, front-left, front-right, side-on, directly behind, rear-left and rear-right; the postures comprise walking, running and standing;
step 2: extracting images of videos in a pedestrian motion database, preprocessing the extracted images to obtain a pedestrian detection frame of each frame of image, and extracting pedestrian detection frame images of the same pedestrian in continuous image frames;
step 3: carrying out graying processing on each pedestrian detection frame image, synthesizing a motion energy map of a grayscale image corresponding to the pedestrian detection frame image of the same pedestrian in a continuous image frame, and extracting the HOG characteristic of the motion energy map;
step 4: constructing a pedestrian motion posture recognition model based on an Elman neural network;
taking a motion energy map corresponding to each pedestrian in the continuous image frames as input data, taking the posture of the corresponding pedestrian as output data, and training the Elman neural network;
the standing posture output corresponds to [001], the walking posture output corresponds to [010], and the running posture output corresponds to [100 ];
the Elman neural network parameter setting method comprises the steps that the number of nodes of an input layer corresponds to the number x of motion energy image pixels, the number of nodes of a hidden layer is 2x +1, the number of nodes of an output layer is 3, the maximum iteration number is 1500, the learning rate is 0.001, and the threshold value is 0.00001;
step 5: judging the pedestrian posture in the current video by utilizing a pedestrian motion posture identification model based on an Elman neural network;
extracting the pedestrian detection frame images of the same pedestrian in the continuous frame images from the current video according to the step 2, inputting the images into a pedestrian motion posture recognition model based on an Elman neural network to obtain corresponding postures, and distinguishing the postures;
step 6: calculating a pixel coordinate change sequence of the vertex of the lower left corner of the pedestrian detection frame of the same pedestrian in the continuous frame images, and calculating to obtain instantaneous speed sequences of the pedestrian in the X-axis direction and the Y-axis direction to obtain the real-time speed of the pedestrian;
step 7: according to a three-dimensional scene under an intersection environment, position information of pedestrians in the image is obtained in real time, and real-time motion characteristics of the pedestrians are obtained by combining the postures and the real-time speeds of the pedestrians.
2. The method according to claim 1, wherein a chicken flock algorithm is used for optimizing the weight and threshold of the Elman neural network in the pedestrian motion posture recognition model based on the Elman neural network, and the specific steps are as follows:
step A1: taking the individual positions of the chicken flocks as the weight and the threshold of the Elman neural network, and initializing chicken flock parameters;
the population size M lies in [20,100]; the search space dimension is j, where j equals the total number of Elman neural network weights and thresholds to be optimized; the maximum number of iterations T lies in [400,1000]; the iteration counter is t, with initial value 0; the proportion of roosters Pg is 20%, the proportion of hens Pm is 70%, and the proportion of chicks Px is 10%; mother hens are selected at random from the hens with proportion Pd equal to 10%;
step A2: setting a fitness function and letting the iteration counter t be 1;
the weights and thresholds corresponding to each chicken swarm individual's position are substituted in turn into the Elman neural network-based pedestrian motion posture recognition model; the model determined by that individual's position is used to predict the posture of the input pedestrian from the pedestrian detection frame images of the same pedestrian in consecutive frames, and the reciprocal of the difference between the predicted posture value and the corresponding actual posture value is taken as the first fitness function f1(x);
Step A3: constructing a chicken flock subgroup;
all individuals are ranked by fitness value; the M×Pg individuals with the highest fitness are designated roosters, and each rooster heads one subgroup; the M×Px individuals with the lowest fitness are designated chicks; the remaining individuals are designated hens;
dividing the chicken group into subgroups according to the number of the cocks, wherein one subgroup comprises one cock, a plurality of chickens and a plurality of hens, and each chicken randomly selects one hen in the subgroup to construct a hen-offspring relationship;
step A4: updating the individual positions of the chicken flock and calculating the fitness of each individual at present;
rooster (cock) position update formula:
x_{i,j}^{t+1} = x_{i,j}^{t} × (1 + r(0, σ²))
where x_{i,j}^{t} denotes the position of rooster individual i in dimension j of the search space at the t-th iteration, x_{i,j}^{t+1} is the corresponding new position at iteration t+1, and r(0, σ²) is a random number following a normal distribution N(0, σ²) with mean 0 and variance σ²;
hen position update formula:
x_{g,j}^{t+1} = x_{g,j}^{t} + L1 × rand(0,1) × (x_{i1,j}^{t} - x_{g,j}^{t}) + L2 × rand(0,1) × (x_{i2,j}^{t} - x_{g,j}^{t})
where x_{g,j}^{t} is the position of hen g in dimension j at the t-th iteration, x_{i1,j}^{t} is the position of the unique rooster i1 in the subgroup to which hen g belongs, x_{i2,j}^{t} is the position of a randomly chosen rooster i2 outside that subgroup, rand(0,1) is a random number drawn uniformly from (0,1), and L1, L2 are the position update coefficients expressing the influence of the hen's own subgroup and of other subgroups, with L1 in [0.25, 0.55] and L2 in [0.15, 0.35];
chick position update formula:
x_{l,j}^{t+1} = ω × x_{l,j}^{t} + α × (x_{gm,j}^{t} - x_{l,j}^{t}) + β × (x_{i,j}^{t} - x_{l,j}^{t})
where x_{l,j}^{t} is the position of chick l in dimension j at the t-th iteration, x_{gm,j}^{t} is the position of the mother hen gm paired with chick l through the mother-offspring relationship, x_{i,j}^{t} is the position of the unique rooster in the subgroup to which chick l belongs, and ω, α, β are respectively the chick self-update coefficient in [0.2, 0.7], the hen-following coefficient in [0.5, 0.8] and the rooster-following coefficient in [0.8, 1.5];
Step A5: and updating the individual optimal position and the chicken swarm whole individual optimal position according to the fitness function, judging whether the maximum iteration times is reached, if so, quitting, otherwise, making t equal to t +1, and turning to the step A3 until the maximum iteration times is met, outputting the weight and the threshold of the Elman neural network corresponding to the optimal chicken swarm individual position, and obtaining the pedestrian motion posture recognition model based on the Elman neural network.
3. The method of claim 1, wherein the pedestrian real-time speed is
obtained from the instantaneous speed sequences of the pedestrian in the X-axis and Y-axis directions over m frames [formulas given as images in the original];
where v_xj and v_yj respectively denote the instantaneous speeds of the pedestrian in the X-axis and Y-axis directions,
ΔWj = k|w2 - w1| = k|x2×P - x1×P|, ΔLj = |f(l2) - f(l1)|, l1 = (N - y1)×P, l2 = (N - y2)×P,
the pixel coordinates of the pedestrian target point in the previous frame image and the current frame image are (x1, y1) and (x2, y2) respectively; l1 and l2 respectively denote the distance between the pedestrian target point and the Y-axis edge of the display screen in the two adjacent frame images;
k denotes the ratio of the actual scene distance to the imaged scene distance on the display screen; M and N respectively denote the total number of pixels of the display screen in the X-axis and Y-axis directions; P denotes the length of each pixel of the display screen, so MP and NP are the total lengths of the screen along the X axis and the Y axis respectively; ΔWj and ΔLj respectively denote the displacements of the pedestrian target point along the X-axis and Y-axis directions between the two adjacent frame images;
AB denotes the distance from the depth camera to the pedestrian, α denotes the angle between the line connecting the depth camera and the pedestrian and the ground plane, θ denotes the angle between that line and the imaging plane, and m is the number of frames.
4. The method according to any one of claims 1 to 3, characterized in that pedestrian behavior level early warning is performed on vehicles on a traffic road according to real-time motion characteristics of pedestrians;
the behavior levels comprise three levels: safety, threat and danger;
the safety behaviors include: a pedestrian standing more than one meter away from the vehicle road; a pedestrian on the sidewalk, more than one meter away from the vehicle road, walking parallel to the vehicle road or facing away from it; and a pedestrian running while facing away from the vehicle road;
the threat behaviors include: a pedestrian on the sidewalk within one meter of the vehicle road; a pedestrian standing within the pedestrian road; and a pedestrian running within one meter of the boundary between the pedestrian road and the vehicle road;
the dangerous behaviors include: a pedestrian on the sidewalk running toward the vehicle road; a pedestrian running in the vehicle road; and a pedestrian walking in the vehicle road;
when the walking speed of the pedestrian in the threatening behavior is more than 1.9m/s or the running speed is more than 8m/s, the threatening behavior is upgraded to dangerous behavior.
5. The method of claim 4, wherein the pedestrian target point is a lower left corner pixel point of a pedestrian detection frame image.
6. The method according to claim 5, characterized in that, the pedestrian image frame is preprocessed, and a pedestrian detection frame, a pedestrian target identification and a pedestrian position label vector are arranged on the preprocessed image to construct a pedestrian track;
the pedestrian detection frame is a minimum circumscribed rectangle of a pedestrian outline in a pedestrian image frame;
the pedestrian target identification is a unique identification P of different pedestrians appearing in all the pedestrian image frames;
the expression form of the pedestrian position label vector is [ t, x, y, a, b ], t represents that the current pedestrian image frame belongs to the t-th frame in the monitoring video, x and y respectively represent the abscissa and the ordinate of the lower left corner of a pedestrian detection frame in the pedestrian image frame, and a and b respectively represent the length and the width of the pedestrian detection frame;
the appearance result of the pedestrian in the previous frame of pedestrian image in the next frame of pedestrian image means that if the pedestrian in the previous frame of pedestrian image appears in the next frame of pedestrian image, the tracking result of the pedestrian is 1, otherwise, the tracking result is 0; and if the pedestrian tracking result is 1, adding the corresponding pedestrian position label vector appearing in the pedestrian image of the next frame into the pedestrian track.
CN201810661219.4A 2018-06-25 2018-06-25 Multi-dimensional motion feature visual extraction method for pedestrians in traffic environment Active CN108830246B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810661219.4A CN108830246B (en) 2018-06-25 2018-06-25 Multi-dimensional motion feature visual extraction method for pedestrians in traffic environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810661219.4A CN108830246B (en) 2018-06-25 2018-06-25 Multi-dimensional motion feature visual extraction method for pedestrians in traffic environment

Publications (2)

Publication Number Publication Date
CN108830246A CN108830246A (en) 2018-11-16
CN108830246B true CN108830246B (en) 2022-02-15

Family

ID=64138303

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810661219.4A Active CN108830246B (en) 2018-06-25 2018-06-25 Multi-dimensional motion feature visual extraction method for pedestrians in traffic environment

Country Status (1)

Country Link
CN (1) CN108830246B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109558505A (en) * 2018-11-21 2019-04-02 百度在线网络技术(北京)有限公司 Visual search method, apparatus, computer equipment and storage medium
CN111265218A (en) * 2018-12-05 2020-06-12 阿里巴巴集团控股有限公司 Motion attitude data processing method and device and electronic equipment
CN110427800A (en) 2019-06-17 2019-11-08 平安科技(深圳)有限公司 Video object acceleration detection method, apparatus, server and storage medium
CN110632636B (en) * 2019-09-11 2021-10-22 桂林电子科技大学 Carrier attitude estimation method based on Elman neural network
CN111338344A (en) * 2020-02-28 2020-06-26 北京小马慧行科技有限公司 Vehicle control method and device and vehicle
CN115092091A (en) * 2022-07-11 2022-09-23 中国第一汽车股份有限公司 Vehicle and pedestrian protection system and method based on Internet of vehicles
CN116935447B (en) * 2023-09-19 2023-12-26 华中科技大学 Self-adaptive teacher-student structure-based unsupervised domain pedestrian re-recognition method and system

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN206033008U (en) * 2016-03-09 2017-03-22 秀景A.I.D 股份有限公司 Power-driven automatic hand-tracking sterilization device
CN106789214A (en) * 2016-12-12 2017-05-31 广东工业大学 Network situation awareness method and device based on the sine cosine algorithm
CN106875424A (en) * 2017-01-16 2017-06-20 西北工业大学 Machine vision-based behavior recognition method for vehicles driving in an urban environment
CN107122707A (en) * 2017-03-17 2017-09-01 山东大学 Video pedestrian re-identification method and system based on compact representation of macroscopic features
CN107126224A (en) * 2017-06-20 2017-09-05 中南大学 Kinect-based real-time monitoring and early warning method and system for rail train driver status
CN107153800A (en) * 2017-05-04 2017-09-12 天津工业大学 Reader antenna optimization deployment scheme for a UHF RFID positioning system based on an improved chicken swarm algorithm
CN107203753A (en) * 2017-05-25 2017-09-26 西安工业大学 Action recognition method based on fuzzy neural network and graph model reasoning
CN107657232A (en) * 2017-09-28 2018-02-02 南通大学 Pedestrian intelligent identification method and system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9821470B2 (en) * 2014-09-17 2017-11-21 Brain Corporation Apparatus and methods for context determination using real time sensor data

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Assessment of human locomotion by using an insole measurement system and artificial neural networks; Kuan Zhang et al.; Journal of Biomechanics; 2005-11-30; Vol. 38, No. 11; 2276-2287 *
Application of the Elman neural network in regional velocity field modeling; Nie Jianliang et al.; Journal of Geodesy and Geodynamics; 2017-11-20; Vol. 37, No. 10; 1015-1019 *
Pedestrian detection based on walking topology analysis; Zuo Hang et al.; Journal of Optoelectronics·Laser; 2010-05-31; Vol. 21, No. 5; 749-753 *
Research on gait recognition based on contour features and multifractal analysis; Zi Chunyuan; China Master's Theses Full-text Database (Information Science and Technology); 2017-10-15, No. 10; I138-189 *

Also Published As

Publication number Publication date
CN108830246A (en) 2018-11-16

Similar Documents

Publication Publication Date Title
CN108830246B (en) Multi-dimensional motion feature visual extraction method for pedestrians in traffic environment
EP3614308B1 (en) Joint deep learning for land cover and land use classification
CN110175576A (en) A kind of driving vehicle visible detection method of combination laser point cloud data
CN106875424B (en) A kind of urban environment driving vehicle Activity recognition method based on machine vision
CN105260699B (en) A kind of processing method and processing device of lane line data
Guan et al. Robust traffic-sign detection and classification using mobile LiDAR data with digital images
CN108830171B (en) Intelligent logistics warehouse guide line visual detection method based on deep learning
CN106682586A (en) Method for real-time lane line detection based on vision under complex lighting conditions
CN102385690B (en) Target tracking method and system based on video image
CN103049751A (en) Improved weighting region matching high-altitude video pedestrian recognizing method
CN110379168B (en) Traffic vehicle information acquisition method based on Mask R-CNN
CN110232389A (en) A kind of stereoscopic vision air navigation aid based on green crop feature extraction invariance
CN109255298A (en) Safety cap detection method and system in a kind of dynamic background
CN105404857A (en) Infrared-based night intelligent vehicle front pedestrian detection method
CN108428254A (en) The construction method and device of three-dimensional map
Chao et al. Multi-lane detection based on deep convolutional neural network
CN105279769A (en) Hierarchical particle filtering tracking method combined with multiple features
CN107315998A (en) Vehicle class division method and system based on lane line
Zhang et al. Gc-net: Gridding and clustering for traffic object detection with roadside lidar
CN106056078A (en) Crowd density estimation method based on multi-feature regression ensemble learning
CN105335751B (en) Vision-based nose wheel localization method for a parked aircraft
CN113092807B (en) Urban overhead road vehicle speed measuring method based on multi-target tracking algorithm
CN113255779B (en) Multi-source perception data fusion identification method, system and computer readable storage medium
CN108805907B (en) Pedestrian posture multi-feature intelligent identification method
CN108830248B (en) Pedestrian local feature big data hybrid extraction method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant