WO2018059300A1 - Method and apparatus for predicting walking behavior, data processing apparatus, and electronic device - Google Patents

Method and apparatus for predicting walking behavior, data processing apparatus, and electronic device

Info

Publication number
WO2018059300A1
WO2018059300A1 · PCT/CN2017/102706 · CN2017102706W
Authority
WO
WIPO (PCT)
Prior art keywords
information
walking behavior
target object
walking
time period
Prior art date
Application number
PCT/CN2017/102706
Other languages
English (en)
French (fr)
Inventor
伊帅
李鸿升
王晓刚
Original Assignee
北京市商汤科技开发有限公司
Priority date
Filing date
Publication date
Application filed by 北京市商汤科技开发有限公司
Publication of WO2018059300A1
Priority to US16/174,852 (US10817714B2)

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133 - Distances to prototypes
    • G06F18/24143 - Distances to neighbourhood prototypes, e.g. restricted Coulomb energy networks [RCEN]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 - Computing arrangements using knowledge-based models
    • G06N5/04 - Inference or reasoning models
    • G06N5/046 - Forward inferencing; Production systems
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components, by matching or filtering
    • G06V10/449 - Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451 - Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454 - Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/50 - Context or environment of the image
    • G06V20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 - Movements or behaviour, e.g. gesture recognition
    • G06V40/23 - Recognition of whole body movements, e.g. for sport training
    • G06V40/25 - Recognition of walking or running movements, e.g. gait recognition
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20081 - Training; Learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20084 - Artificial neural networks [ANN]

Definitions

  • The present application relates to computer vision technology, and more particularly to a method and apparatus for predicting walking behavior, a data processing apparatus, and an electronic device.
  • Pedestrian walking behavior models can be applied in many fields, such as walking behavior prediction, pedestrian detection and tracking, crowd behavior analysis, and abnormal behavior detection.
  • The embodiments of the present application provide a technical solution for pedestrian walking behavior prediction.
  • A method for predicting walking behavior, including the operations described below.
  • A device for predicting walking behavior, including:
  • a behavior coding unit configured to encode the walking behavior information of at least one target object in the target scene in a historical time period M, and obtain a first offset matrix indicating the walking behavior information of the at least one target object in the historical time period M;
  • a neural network configured to receive and process the first offset matrix, and output a second offset matrix indicating the walking behavior information of the at least one target object in a future time period M';
  • a behavior decoding unit configured to decode the second offset matrix to obtain the walking behavior prediction information of the at least one target object in the future time period M'.
  • A data processing apparatus including the apparatus for predicting walking behavior.
  • An electronic device including the data processing apparatus described in the above embodiments.
  • A computer storage medium for storing computer-readable instructions, the instructions including the encoding, prediction, and decoding instructions described below.
  • A computer device including:
  • a memory storing executable instructions; and one or more processors in communication with the memory to execute the executable instructions, thereby performing the operations corresponding to the method for predicting walking behavior of any of the above embodiments of the present application.
  • A method based on deep learning is proposed, in which the walking behavior of at least one target object in the target scene in a historical time period M is encoded to obtain a first offset matrix indicating the walking behavior of the at least one target object in the historical time period M, which is input into a neural network to obtain a second offset matrix indicating the walking behavior of the at least one target object in a future time period M'; the second offset matrix is decoded to obtain the walking behavior of the at least one target object in the future time period M'.
  • The embodiments of the present application consider the influence of a target object's walking behavior over a past period on its walking behavior over a future period. Because the walking behaviors of target objects in the same scene may influence one another, the embodiments also consider the influence of the walking behavior of other possible target objects (e.g., pedestrians) in the same scene on the walking behavior of a given target object (for example, the current pedestrian whose future walking behavior needs to be predicted), so that the factors that may affect a given target object's future walking behavior can be considered at the same time, making the prediction of the target object's future walking behavior more accurate and reliable.
  • In addition, the embodiments of the present application can analyze the walking behavior of at least one target object in the scene simultaneously and give the prediction results for the future walking trajectories of the at least one target object at one time; prediction is not limited to walking behavior prediction for a single target object, and prediction efficiency is high.
  • FIG. 1 is a flow chart of an embodiment of a method for predicting walking behavior of the present application.
  • FIG. 2 is a flowchart of an embodiment of acquiring a first offset matrix in the embodiment of the present application.
  • FIG. 3 is a flowchart of an embodiment of obtaining the second offset matrix in an embodiment of the present application.
  • FIG. 4 is a flowchart of an embodiment of obtaining the walking behavior of all target objects in a future time period M' in an embodiment of the present application.
  • FIG. 5 is a flowchart of another embodiment of the method for predicting walking behavior of the present application.
  • FIG. 6 is a flowchart of an embodiment of performing neural network training in an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of an embodiment of a prediction device for walking behavior of the present application.
  • FIG. 8 is a schematic structural diagram of another embodiment of a prediction device for walking behavior according to the present application.
  • FIG. 9 is a schematic structural diagram of an application embodiment of an electronic device according to the present application.
  • Embodiments of the present application can be applied to computer systems/servers that can operate with numerous other general-purpose or special-purpose computing system environments or configurations.
  • Examples of well-known computing systems, environments, and/or configurations suitable for use with computer systems/servers include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, networked personal computers, minicomputer systems, mainframe computer systems, and distributed cloud computing technology environments including any of the above, and the like.
  • The computer system/server can be described in the general context of computer-system-executable instructions (such as program modules) being executed by a computer system.
  • Generally, program modules may include routines, programs, target programs, components, logic, data structures, and the like that perform particular tasks or implement particular abstract data types.
  • The computer system/server can be implemented in a distributed cloud computing environment in which tasks are performed by remote processing devices linked through a communication network.
  • In a distributed cloud computing environment, program modules may be located on local or remote computing system storage media including storage devices.
  • The method for predicting walking behavior of this embodiment includes the following operations.
  • In operation 102, the walking behavior information of at least one target object in the target scene in a historical time period M is encoded, and a first offset matrix indicating the walking behavior of the at least one target object during the historical time period M is obtained.
  • The target scene is the scene in which the target object whose walking behavior needs to be predicted is located, such as a station or a factory.
  • As a specific example rather than a limitation, the target object of the embodiments of the present application is specifically a pedestrian; it may also be any other object or animal whose walking behavior needs to be predicted, for example, goods-handling devices (e.g., robots) for items in an e-commerce warehouse, self-driving vehicles, and the like.
  • In operation 104, the first offset matrix is input to a deep neural network (for example, but not limited to, a convolutional neural network, CNN), the first offset matrix is processed by the deep neural network, and a second offset matrix indicating the walking behavior information of the at least one target object in the future time period M' is output.
  • In operation 106, the second offset matrix is decoded to obtain the walking behavior prediction information of the at least one target object in the future time period M'.
  • Based on the method for predicting walking behavior provided by the above embodiments of the present application, a deep-learning-based method is proposed: the walking behavior of at least one target object in the target scene in a historical time period M is encoded to obtain a first offset matrix indicating the walking behavior of the at least one target object in the historical time period M, which is input into a neural network (such as a CNN) to obtain a second offset matrix indicating the walking behavior of the at least one target object in the future time period M'; the second offset matrix is decoded to obtain the walking behavior of the at least one target object in the future time period M'.
  • As set forth above, the embodiments consider the influence of a target object's walking behavior over a past period on its future walking behavior, as well as the influence of the walking behavior of other possible target objects (e.g., pedestrians) in the same scene on the walking behavior of a given target object (for example, the current pedestrian whose future walking behavior needs to be predicted), so that the factors that may affect a given target object's future walking behavior can be considered at the same time, making the prediction of the target object's future walking behavior more accurate and reliable.
  • The embodiments of the present application can analyze the walking behavior of at least one target object in the scene simultaneously and give the prediction results for the future walking trajectories of the at least one target object at one time, so prediction efficiency is high.
  • In another specific example of the method embodiments, the at least one target object includes the target object(s) whose walking behavior needs to be predicted, and there may be one or more such target objects.
  • That is, the embodiments of the present application can predict the walking behavior of multiple target objects in the future time period M' simultaneously and complete the prediction task for the walking behavior of the multiple target objects at one time, without having to run the prediction multiple times, so prediction efficiency is high.
  • In addition, the at least one target object may include some or all of the target objects in the target scene.
  • When the at least one target object includes all target objects in the target scene, the influence of the walking behavior of all other target objects (e.g., pedestrians) in the same scene on the walking behavior of a given target object (for example, the current pedestrian whose future walking behavior needs to be predicted) is taken into account while all target objects in the scene are predicted, so that all factors that may affect a given target object's future walking behavior can be considered at the same time, making the prediction of future walking behavior more accurate and reliable; moreover, the prediction task for the walking behavior of all target objects can be completed at one time, comprehensively predicting the possible walking behavior of every target object in the target scene in the future time period M'.
  • The walking behavior information or the walking behavior prediction information may include, for example but not limited to, any one or more of the following: walking path information, walking direction information, and walking speed information.
  • The walking behavior information encoded in operation 102 may be the same as or different from the walking behavior prediction information obtained by decoding in operation 106.
  • For example, the walking behavior information encoded in operation 102 may be walking path information, while the walking behavior prediction information obtained by decoding in operation 106 may be walking path information, walking direction information, or walking speed information. That is, based on the embodiments of the present application, the walking path information, walking direction information, and/or walking speed information of each target object in the target scene in the future time period M' may be predicted from the walking behavior information of each target object in the target scene in the historical time period M.
  • In the following, the case where the walking behavior information encoded in operation 102 and the walking behavior information obtained by decoding in operation 106 are both walking path information is taken as an example. Since all target objects in the target scene include the at least one target object, the walking path information can be collected per unit time and includes direction information. A person skilled in the art will appreciate from the description of the embodiments of the present application that the embodiments apply equally when the walking behavior information encoded in operation 102 and the walking behavior prediction information decoded in operation 106 are walking direction information or walking speed information.
  • FIG. 2 is a flowchart of an embodiment of acquiring a first offset matrix in the embodiment of the present application.
  • As shown in FIG. 2, the operation 102 may be implemented as follows:
  • the walking behavior information of each target object in the target scene in the historical time period M is acquired;
  • for each target object, the walking behavior information of the target object in the historical time period M is represented by a displacement vector, and the value of the displacement vector is assigned to the current position of the target object;
  • the displacement vectors of the target objects are integrated to obtain the first offset matrix.
  • In another embodiment, the position of each target object is assigned the value of that target object's displacement vector. To distinguish positions in the target scene that contain a target object from those that do not (i.e., the background of the target scene), 1 may optionally be added to every element of all displacement vectors, ensuring that all elements of the displacement vectors are greater than 0; this distinguishes the target objects from the background in the target scene and helps subsequently identify target objects in the target scene.
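  • To make this encoding step concrete, the following is a minimal sketch, not the patent's reference implementation; the two-channel layout, the grid resolution, and the sample positions and displacements are assumptions for illustration.

```python
import numpy as np

def encode_first_offset_matrix(positions, displacements, grid_hw):
    """Build the first offset matrix for a historical time period M.

    positions:     (N, 2) int array, current (x, y) grid cell of each target object
    displacements: (N, 2) float array, displacement vector of each object over M
    grid_hw:       (H, W) spatial size of the target scene grid
    """
    H, W = grid_hw
    # Two channels: displacement along x and along y; background cells stay 0.
    offset = np.zeros((2, H, W), dtype=np.float32)
    for (x, y), d in zip(positions, displacements):
        # Optionally add 1 to every element so occupied cells are > 0,
        # distinguishing target objects from the background.
        offset[:, y, x] = d + 1.0
    return offset

# Hypothetical example: three pedestrians on a 16x16 grid.
pos = np.array([[2, 3], [8, 8], [12, 5]])
disp = np.array([[1.5, 0.0], [-0.5, 2.0], [0.0, -1.0]])
first_offset = encode_first_offset_matrix(pos, disp, (16, 16))
print(first_offset.shape)  # (2, 16, 16)
```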
  • In yet another specific example of the method embodiments, the neural network that processes the first offset matrix may specifically include a first sub-CNN, a bitwise addition unit, and a second sub-CNN.
  • FIG. 3 is a flowchart of an embodiment of acquiring a second offset matrix in the embodiment of the present application.
  • As shown in FIG. 3, the operation 104 may be implemented as follows:
  • The first offset matrix is used as the input of the first sub-CNN, and the walking behavior of the at least one target object in the historical time period M is classified by the first sub-CNN to obtain a walking behavior feature map.
  • Using the bitwise addition unit, the preset position information map of the target scene and the walking behavior feature map are added at corresponding positions to obtain scene walking behavior information.
  • The position information map includes position information of the spatial structures in the target scene. The spatial structures here may specifically be structures that affect the walking behavior of target objects in the target scene, for example, the position information of the entrances and exits of the target scene and the position information of obstacles in the target scene; they may also be the entire spatial structure of the target scene. The position information map is obtained by training on samples of the target scene.
  • By adding the position information map of the target scene and the walking behavior feature map at corresponding positions, the resulting scene walking behavior information contains the position information of the entire target scene, thereby taking into account the influence of each specific structure in the target scene on the walking behavior of the target objects.
  • The scene walking behavior information is used as the input of the second sub-CNN, which obtains, for each class of walking behavior of the at least one target object in the historical time period M, the influence information of that class on the first offset matrix in the future time period M', and determines the second offset matrix from this information; for example, the influence information of the various classes of walking behavior of the at least one target object in the historical time period M on the first offset matrix in the future time period M' is integrated to obtain the second offset matrix.
  • In a further embodiment, an operation of modeling the spatial structure information of the target scene in advance to obtain the position information map of the target scene may further be included.
  • The first sub-CNN may specifically include multiple cascaded CNN layers, for example, three CNN layers; each of the CNN layers in the first sub-CNN may include multiple convolution filters, for example, 64 convolution filters, each of size 3×3. And/or, the second sub-CNN may also specifically include multiple CNN layers, for example, three CNN layers; each of the CNN layers in the second sub-CNN may also include multiple convolution filters, for example, 64 convolution filters, each of size 3×3.
  • In the first sub-CNN, the bottom CNN layer may coarsely divide the walking behavior of the at least one target object, for example, into target objects moving upward and target objects moving downward; the next-to-bottom CNN layer further divides the coarse result of the bottom layer, for example, into target objects moving up-left, straight up, and up-right; and the upper CNN layers can filter out walking behaviors with distinct properties, for example, pedestrians running quickly or pedestrians turning quickly.
  • The closer a CNN layer is to the top, the more specific the walking behaviors it distinguishes.
  • The second sub-CNN further sorts and integrates the classification results of the first sub-CNN, that is, it integrates the target objects of each class of walking behavior with the target objects whose walking behavior needs to be predicted; each CNN layer of the second sub-CNN performs information fusion according to each sub-class of walking behavior, and the degree of integration increases toward the top layer.
  • For example, the bottom CNN layer in the second sub-CNN may combine the influences of all target objects moving up-left; the next-to-bottom CNN layer may combine the influences of all target objects moving up-left, straight up, and up-right; and the upper CNN layer may combine the walking behaviors of all target objects in the target scene to produce the output result of the second sub-CNN.
  • In this way, the CNN layers of the first sub-CNN complete the step-by-step classification of the walking behavior of all target objects, and the CNN layers of the second sub-CNN then integrate them step by step.
  • The more complex the structure of a neural network (such as a CNN), that is, the more layers and parameters it has, the harder it is to train, the more likely it is to fail to converge, and the more storage resources it occupies.
  • Conversely, the simpler the structure, the fewer the layers and parameters, and the weaker the computational and analytical capability, so processing performance cannot be guaranteed.
  • Setting each CNN layer to 64 convolution filters can meet the requirements of network processing performance, network structure complexity, and number of training samples.
  • In a further embodiment, the above neural network may further include a first pooling unit and a second pooling unit.
  • The first pooling unit (i.e., a max-pooling layer) performs maximum downsampling on the walking behavior feature map to obtain a new walking behavior feature map whose spatial size is smaller than that of the original walking behavior feature map.
  • After the second sub-CNN obtains the second offset matrix, the second pooling unit may perform convolutional upsampling on the second offset matrix to obtain a second offset matrix of the same size as the first offset matrix.
  • For example, the size of the first offset matrix and the spatial sizes of the position information map and the walking behavior feature map may be denoted X×Y; with an exemplary maximum downsampling scale of 2, the spatial size of the new walking behavior feature map is X/2×Y/2, the convolutional upsampling scale is correspondingly 2, and the size of the second offset matrix obtained by convolutional upsampling is restored to X×Y.
  • Through maximum downsampling, the size of the walking behavior feature map can be reduced so that the neural network (such as a CNN) can process more walking behavior data; the convolutional upsampling performed after the second offset matrix is obtained restores it to the original spatial size, so that the walking behavior output finally obtained by the embodiments of the present application matches the spatial size of the input walking behavior.
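  • Putting the pieces above together, the following is a minimal PyTorch sketch of the described topology. The patent specifies the three-layer sub-CNNs with 64 filters of size 3×3 and the 2× downsampling/upsampling, but does not name a framework; the 2-channel offset matrix, the 16×16 grid, and the learned position information map sized to the pooled feature map are assumptions of this sketch.

```python
import torch
import torch.nn as nn

class WalkingBehaviorCNN(nn.Module):
    """Sketch: first sub-CNN -> max-pool -> add location map -> second sub-CNN -> upsample."""

    def __init__(self, in_channels=2, grid_hw=(16, 16)):
        super().__init__()
        H, W = grid_hw
        # First sub-CNN: three cascaded layers, 64 filters of 3x3 each.
        self.first_sub_cnn = nn.Sequential(
            nn.Conv2d(in_channels, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        )
        # First pooling unit: maximum downsampling, scale 2 (X x Y -> X/2 x Y/2).
        self.max_pool = nn.MaxPool2d(2)
        # Position information map of the target scene, learned from scene samples
        # (the patent says it is obtained by training); sized to the pooled map.
        self.location_map = nn.Parameter(torch.zeros(1, 64, H // 2, W // 2))
        # Second sub-CNN: three layers, 64 filters of 3x3 each.
        self.second_sub_cnn = nn.Sequential(
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, in_channels, 3, padding=1),
        )
        # Second pooling unit: convolutional upsampling back to X x Y.
        self.upsample = nn.ConvTranspose2d(in_channels, in_channels, 2, stride=2)

    def forward(self, first_offset):
        feat = self.first_sub_cnn(first_offset)   # walking behavior feature map
        feat = self.max_pool(feat)                # smaller feature map
        scene = feat + self.location_map          # bitwise (element-wise) addition
        second_offset = self.second_sub_cnn(scene)
        return self.upsample(second_offset)       # second offset matrix, X x Y

net = WalkingBehaviorCNN()
x = torch.zeros(1, 2, 16, 16)
print(net(x).shape)  # torch.Size([1, 2, 16, 16])
```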
  • In the embodiments of the present application, the position information map has the same size as the walking behavior feature map, so the position information map of the target scene and the walking behavior feature map can be added at corresponding positions.
  • For example, if the maximum downsampling result of the walking behavior feature map is the matrix [1 2 3; 4 5 6] and the position information map is the all-ones matrix [1 1 1; 1 1 1], adding them at corresponding positions yields [2 3 4; 5 6 7].
  • FIG. 4 is a flowchart of an embodiment of obtaining the walking behavior of all target objects in a future time period M' in an embodiment of the present application.
  • As shown in FIG. 4, the operation 106 may be implemented by decoding the second offset matrix into displacement vectors indicating the walking behavior of the target objects in the future time period M' and obtaining the corresponding walking behavior prediction information, as described for the behavior decoding unit below.
  • FIG. 5 is a flow chart of another embodiment of a method for predicting walking behavior of the present application.
  • In this embodiment, the target objects are pedestrians, the walking behavior information of all target objects in the target scene in the historical time period M is encoded as the first offset matrix input, and the walking behavior information is walking path information; on this basis, a specific implementation of the embodiments is further described. As shown in FIG. 5, this embodiment includes:
  • The walking path information of each target object in the historical time period M is represented by a displacement vector.
  • The walking behavior feature map is subjected to maximum downsampling to obtain a new walking behavior feature map whose spatial size is smaller than that of the original walking behavior feature map.
  • Using the bitwise addition unit, the preset position information map of the target scene and the walking behavior feature map are added at corresponding positions to obtain scene walking behavior information.
  • The position information map includes position information of the spatial structures in the target scene that may affect pedestrian walking behavior.
  • The influence information of the various walking behaviors of all target objects in the historical time period M on the first offset matrix in the future time period M' is integrated to obtain the second offset matrix.
  • Before the above embodiments are carried out, an operation of performing network training on an initial neural network to obtain the neural network (such as a CNN) may further be included.
  • The initial neural network includes the following units: an initial first sub-CNN, an initial second sub-CNN, an initial first pooling unit and an initial second pooling unit, and an initial bitwise addition unit.
  • The initial first sub-CNN, the initial second sub-CNN, the initial first pooling unit, the initial second pooling unit, and the initial bitwise addition unit may be trained iteratively in sequence: when the training result of the unit currently being trained satisfies a predetermined convergence condition, iterative training of the next unit begins.
  • The training result satisfies the predetermined convergence condition when, for example, the deviation between the output of the unit currently being trained and a preset output is less than a first preset threshold, and/or the number of iterations of training performed on the current unit reaches a second preset threshold.
  • FIG. 6 is a flowchart of an embodiment of performing neural network (e.g., CNN) training in an embodiment of the present application.
  • As shown in FIG. 6, the initial neural network (such as an initial CNN) may specifically be trained in the following manner.
  • In response to the training result of the initial first sub-CNN satisfying the preset convergence condition, the first sub-CNN is obtained from the initial first sub-CNN; the network parameters of the first sub-CNN are kept unchanged, and network training of the initial second sub-CNN begins.
  • In response to the training result of the initial second sub-CNN satisfying the preset convergence condition, the second sub-CNN is obtained from the initial second sub-CNN; the network parameters of the first sub-CNN and the second sub-CNN are kept unchanged, and network training of the initial first pooling unit and the initial second pooling unit begins.
  • In response to the training results of the initial first pooling unit and the initial second pooling unit satisfying the preset convergence condition, the first pooling unit and the second pooling unit are obtained from the initial first pooling unit and the initial second pooling unit respectively; the network parameters of the first sub-CNN, the second sub-CNN, the first pooling unit, and the second pooling unit are kept unchanged, and network training of the initial bitwise addition unit begins.
  • The second pooling unit restores the input information to the original spatial size, and training the initial first pooling unit and the initial second pooling unit ensures that the walking behavior output finally obtained by the embodiments of the present application matches the spatial size of the input walking behavior.
  • In response to the training result of the initial bitwise addition unit satisfying the preset convergence condition, the bitwise addition unit is obtained from the initial bitwise addition unit; the network parameters of the first sub-CNN, the second sub-CNN, the first pooling unit, the second pooling unit, and the bitwise addition unit are kept unchanged, network training of the initial neural network (such as the initial CNN) is completed, and the neural network (such as a CNN) is obtained.
  • In the embodiments of the present application, the initial first sub-CNN, the initial second sub-CNN, the initial first pooling unit and the initial second pooling unit, and the initial bitwise addition unit are trained in sequence; after each stage converges, the network parameters of the already-trained network layers are kept unchanged, and training of the next network layer is added step by step.
  • When the error rate on the training samples can no longer decline, the convergence condition has been met, and the next layer is added to the training to further reduce the error rate. This makes the training process more stable and prevents the addition of a new network layer from destroying the previously trained network structure.
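  • A minimal sketch of this staged training scheme is given below, reusing the WalkingBehaviorCNN sketch above; the optimizer, learning rate, loss (mean squared error), and convergence threshold are assumptions, since the source only specifies the unit order, the freezing of already-trained parameters, and the deviation/iteration-count convergence conditions.

```python
import torch
import torch.nn.functional as F

def batches():
    # Hypothetical data source: (first offset matrix, target second offset matrix) pairs.
    while True:
        yield torch.randn(1, 2, 16, 16), torch.randn(1, 2, 16, 16)

def train_unit(unit_params, net, steps=500, tol=1e-4, lr=0.01):
    """Train only the given parameters; all previously trained units stay frozen."""
    for p in net.parameters():
        p.requires_grad_(False)
    for p in unit_params:
        p.requires_grad_(True)
    if not unit_params:                 # e.g., max-pooling has no learnable parameters
        return
    opt = torch.optim.SGD(unit_params, lr=lr)
    prev = float("inf")
    for _, (x, target) in zip(range(steps), batches()):
        loss = F.mse_loss(net(x), target)
        opt.zero_grad(); loss.backward(); opt.step()
        # Predetermined convergence condition: loss deviation below a preset
        # threshold, or the iteration budget (steps) is exhausted.
        if abs(prev - loss.item()) < tol:
            break
        prev = loss.item()

# Unit order from the description: first sub-CNN, second sub-CNN, the pooling
# units, then the bitwise addition unit (here the learned location map).
net = WalkingBehaviorCNN()  # from the sketch above
for unit in (list(net.first_sub_cnn.parameters()),
             list(net.second_sub_cnn.parameters()),
             list(net.upsample.parameters()),
             [net.location_map]):
    train_unit(unit, net)
```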
  • The foregoing program may be stored in a computer-readable storage medium; when executed, the program performs the steps of the foregoing method embodiments.
  • The foregoing storage medium includes media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disk.
  • FIG. 7 is a schematic structural diagram of an embodiment of a prediction device for walking behavior of the present application.
  • The apparatus of this embodiment can be used to implement the method for predicting walking behavior described above in the present application.
  • The apparatus for predicting walking behavior of this embodiment includes a behavior coding unit, a neural network (such as a CNN), and a behavior decoding unit, where:
  • the behavior coding unit is configured to encode the walking behavior information of at least one target object in the target scene in a historical time period M, to obtain a first offset matrix indicating the walking behavior information of the at least one target object in the historical time period M.
  • The target scene is the scene in which the target object whose walking behavior needs to be predicted is located, such as a station or a factory.
  • The at least one target object includes the target object(s) whose walking behavior needs to be predicted, and there may be one or more such target objects; that is, the embodiments of the present application can predict the walking behavior of multiple target objects in the future time period M' simultaneously and complete the prediction task for the walking behavior of the multiple target objects at one time.
  • In addition, the at least one target object may include some or all of the target objects in the target scene.
  • As a specific example rather than a limitation, the target object of the embodiments of the present application is specifically a pedestrian; it may also be any other object or animal whose walking behavior needs to be predicted.
  • The walking behavior information or the walking behavior prediction information may include, for example but not limited to, any one or more of the following: walking path information, walking direction information, and walking speed information.
  • The behavior coding unit may be specifically configured to: acquire the walking behavior information of each target object in the target scene in the historical time period M; for each target object, represent the walking behavior information of the target object in the historical time period M by a displacement vector; and determine the first offset matrix according to the displacement vectors of the target objects.
  • The deep neural network is configured to receive the first offset matrix and output a second offset matrix indicating the walking behavior information of the at least one target object in the future time period M'.
  • The walking behavior prediction information may include, for example but not limited to, any one or more of the following: walking path information, walking direction information, and walking speed information.
  • The behavior decoding unit may be specifically configured to: decode the second offset matrix to obtain displacement vectors indicating the walking behavior of the at least one target object in the future time period M'; acquire the walking behavior information corresponding to the displacement vectors of the walking behavior of the at least one target object in the future time period M'; and obtain, from that corresponding walking behavior information, the walking behavior prediction information of the at least one target object in the future time period M'.
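  • The decoding step can be sketched as the inverse of the earlier encoding sketch: read the predicted displacement vector at each target object's current position and undo the optional +1 offset. The helper below is illustrative, not the patent's implementation.

```python
import numpy as np

def decode_second_offset_matrix(second_offset, positions):
    """Inverse of the encoding sketch above: read each target object's
    predicted displacement vector out of the second offset matrix.

    second_offset: (2, H, W) array produced by the network
    positions:     (N, 2) int array, current (x, y) cell of each object
    """
    predictions = []
    for x, y in positions:
        d = second_offset[:, y, x].copy()
        d -= 1.0  # undo the +1 applied during encoding to mark occupied cells
        predictions.append(d)
    return np.stack(predictions)  # (N, 2) predicted displacements over M'

# Hypothetical usage with the earlier example positions:
# future_disp = decode_second_offset_matrix(net(x)[0].detach().numpy(), pos)
```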
  • The apparatus for predicting walking behavior provided by the above embodiments encodes the walking behavior of at least one target object in the target scene in a historical time period M to obtain a first offset matrix indicating the walking behavior of the at least one target object in the historical time period M, inputs it to a neural network (such as a CNN) to obtain a second offset matrix indicating the walking behavior of the at least one target object in the future time period M', and decodes the second offset matrix to obtain the walking behavior of the at least one target object in the future time period M'.
  • The embodiments of the present application consider the influence of a target object's walking behavior over a past period on its walking behavior over a future period. Because the walking behaviors of target objects in the same scene may influence one another, the embodiments also consider the influence of the walking behavior of other possible target objects (e.g., pedestrians) in the same scene on the walking behavior of a given target object (for example, the current pedestrian whose future walking behavior needs to be predicted), so that the factors that may affect a given target object's future walking behavior can be considered at the same time, making the prediction of the target object's future walking behavior more accurate and reliable.
  • The embodiments of the present application can analyze the walking behavior of at least one target object in the scene simultaneously and give the prediction results for the future walking trajectories of the at least one target object at one time; prediction is not limited to walking behavior prediction for a single target object, and prediction efficiency is high.
  • FIG. 8 is a schematic structural diagram of another embodiment of a prediction device for walking behavior according to the present application.
  • Compared with the embodiment shown in FIG. 7, the neural network (such as a CNN) in this embodiment includes a first sub-CNN, a bitwise addition unit, and a second sub-CNN, where:
  • the first sub-CNN is configured to receive the first offset matrix and classify the walking behavior information of the at least one target object in the historical time period M to obtain a walking behavior feature map.
  • The first sub-CNN may specifically include multiple CNN layers, for example, three CNN layers; each of the multiple CNN layers in the first sub-CNN may include multiple convolution filters, for example, 64 convolution filters, each of size 3×3.
  • The bitwise addition unit is configured to add the preset position information map of the target scene and the walking behavior feature map at corresponding positions to obtain scene walking behavior information; the position information map includes position information of the spatial structures in the target scene, which may specifically be structures that affect the walking behavior of target objects in the target scene, such as obstacles and entrances/exits in the target scene, or all spatial structures in the target scene.
  • The second sub-CNN is configured to receive the scene walking behavior information, determine the influence information of each class of walking behavior of the at least one target object in the historical time period M on the first offset matrix in the future time period M', and determine the second offset matrix based on the influence information.
  • The second sub-CNN may specifically include multiple CNN layers, for example, three CNN layers; each of the multiple CNN layers in the second sub-CNN may include multiple convolution filters, for example, 64 convolution filters, each of size 3×3.
  • In a further embodiment, the neural network may further include a first pooling unit and a second pooling unit, where:
  • the first pooling unit is configured to perform maximum downsampling on the walking behavior feature map obtained by the first sub-CNN to obtain a new walking behavior feature map whose spatial size is smaller than that of the original walking behavior feature map;
  • the second pooling unit is configured to perform convolutional upsampling on the second offset matrix after the second sub-CNN obtains it, to obtain a second offset matrix of the same size as the first offset matrix.
  • For example, the size of the first offset matrix and the spatial sizes of the position information map and the walking behavior feature map may be denoted X×Y; in one specific example, the maximum downsampling scale is 2, the spatial size of the new walking behavior feature map is X/2×Y/2, the convolutional upsampling scale is 2, and the size of the second offset matrix obtained by convolutional upsampling is restored to X×Y.
  • In a further embodiment, a network training unit may further be included, configured to perform network training on an initial neural network (such as an initial CNN) to obtain the neural network (such as a CNN).
  • The initial neural network (such as the initial CNN) includes the following units: an initial first sub-CNN, an initial second sub-CNN, an initial first pooling unit and an initial second pooling unit, and an initial bitwise addition unit.
  • The network training unit may be specifically configured to iteratively train the initial first sub-CNN, the initial second sub-CNN, the initial first pooling unit, the initial second pooling unit, and the initial bitwise addition unit in sequence, starting iterative training of the next unit when the training result of the unit currently being trained satisfies the predetermined convergence condition.
  • The network training unit may specifically perform network training on the initial neural network (such as the initial CNN) in the manner shown in FIG. 6.
  • The embodiments of the present application further provide a data processing apparatus, including the apparatus for predicting walking behavior provided by any of the foregoing embodiments of the present application.
  • The data processing apparatus in the embodiments of the present application may be any device having a data processing function, and may include, but is not limited to, an advanced reduced instruction set machine (ARM), a central processing unit (CPU), a graphics processing unit (GPU), and the like.
  • The data processing apparatus provided by the above embodiments of the present application includes the apparatus for predicting walking behavior provided by any of the above embodiments, and thus considers the influence of a target object's walking behavior over a past period on its walking behavior over a future period while also considering the influence of the walking behavior of other possible target objects in the same scene on the walking behavior of a given target object; it predicts at least one target object in the scene, so that the factors that may affect a given target object's future walking behavior can be considered at the same time, making the prediction of future walking behavior more accurate and reliable.
  • The embodiments of the present application analyze the walking behavior of at least one target object in the scene simultaneously and can give the prediction results for the future walking trajectories of the at least one target object in a unified manner; prediction is not performed per single target object, prediction efficiency is high, and the prediction task for the walking behavior of at least one target object can be completed at one time.
  • The embodiments of the present application further provide an electronic device, such as a mobile terminal, a personal computer (PC), a tablet computer, or a server, which is provided with the data processing apparatus of any of the above embodiments of the present application.
  • The electronic device provided by the above embodiments of the present application includes the data processing apparatus of the present application, and thus includes the apparatus for predicting walking behavior provided by any of the above embodiments; it considers the influence of a target object's walking behavior over a past period on its walking behavior over a future period, takes into account the influence of the walking behavior of other possible target objects in the same scene on the walking behavior of a given target object, and predicts at least one target object in the scene, so that the factors that may affect a given target object's future walking behavior can be considered at the same time, making the prediction of future walking behavior more accurate and reliable.
  • The embodiments of the present application analyze the walking behavior of at least one target object in the scene simultaneously and can give the prediction results for the future walking trajectories of the at least one target object in a unified manner rather than per single target object; prediction efficiency is high, and the prediction task for the walking behavior of at least one target object can be completed at one time.
  • FIG. 9 is a schematic structural diagram of an application embodiment of an electronic device according to the present application.
  • An electronic device for implementing an embodiment of the present application includes a central processing unit (CPU) or a graphics processing unit (GPU), which can perform various operations according to executable instructions stored in a read-only memory (ROM) or loaded from a storage portion into a random access memory (RAM).
  • The central processing unit or the graphics processing unit can communicate with the read-only memory and/or the random access memory to execute the executable instructions and thereby complete the operations corresponding to the method for predicting walking behavior provided by the embodiments of the present application, for example: encoding the walking behavior information of at least one target object in the target scene in a historical time period M, and obtaining a first offset matrix indicating the walking behavior information of the at least one target object in the historical time period M; inputting the first offset matrix to a deep neural network (such as a CNN), the deep neural network outputting a second offset matrix indicating the walking behavior information of the at least one target object in the future time period M'; and decoding the second offset matrix to obtain the walking behavior prediction information of the at least one target object in the future time period M'.
  • The CPU, GPU, ROM, and RAM are connected to one another through a bus.
  • An input/output (I/O) interface is also connected to the bus.
  • The following components are connected to the I/O interface: an input portion including a keyboard, a mouse, and the like; an output portion including a cathode ray tube (CRT), a liquid crystal display (LCD), and the like, and a speaker; a storage portion including a hard disk and the like; and a communication portion including a network interface card such as a LAN card or a modem.
  • The communication portion performs communication processing via a network such as the Internet.
  • A drive is also connected to the I/O interface as needed.
  • A removable medium, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive as needed, so that a computer program read from it can be installed into the storage portion as needed.
  • In particular, an embodiment of the present disclosure includes a computer program product comprising a computer program tangibly embodied on a machine-readable medium; the computer program comprises program code for executing the method illustrated in the flowchart, and the program code may include instructions corresponding to the steps of the method for predicting walking behavior provided by the embodiments of the present application, for example: an instruction for encoding the walking behavior information of at least one target object in the target scene in a historical time period M, to obtain a first offset matrix indicating the walking behavior information of the at least one target object in the historical time period M; an instruction for inputting the first offset matrix to a deep neural network (such as a CNN), the deep neural network outputting a second offset matrix indicating the walking behavior information of the at least one target object in a future time period M'; and an instruction for decoding the second offset matrix to obtain the walking behavior prediction information of the at least one target object in the future time period M'.
  • The computer program can be downloaded and installed from a network via the communication portion, and/or installed from a removable medium.
  • The above-described functions defined in the method of the present application are performed when the computer program is executed by a central processing unit (CPU) or a graphics processing unit (GPU).
  • The embodiments of the present application further provide a computer storage medium configured to store computer-readable instructions, the instructions including: an instruction for encoding the walking behavior information of at least one target object in a target scene in a historical time period M, to obtain a first offset matrix indicating the walking behavior information of the at least one target object in the historical time period M; an instruction for inputting the first offset matrix to a deep neural network (such as a CNN), the deep neural network outputting a second offset matrix indicating the walking behavior information of the at least one target object in a future time period M'; and an instruction for decoding the second offset matrix to obtain the walking behavior prediction information of the at least one target object in the future time period M'.
  • The embodiments of the present application further provide a computer device, including:
  • a memory storing executable instructions; and one or more processors in communication with the memory to execute the executable instructions, thereby performing the operations corresponding to the method for predicting walking behavior of any of the above embodiments of the present application.
  • The technical solution for predicting walking behavior in the embodiments of the present application may be applied to, for example but not limited to, one or more of the following scenarios.
  • In walking behavior prediction, the walking behavior prediction result of the embodiments of the present application can be used as an input to a neural network (such as a CNN) to predict the walking behavior of all pedestrians in the target scene over a longer period of time.
  • That is, the flow of the method embodiments of the present application may be iterated: the walking behavior prediction information output for the future time period M' is encoded again and input to the neural network (such as a CNN), and the resulting second offset matrix is decoded and output, yielding predictions of pedestrian walking behavior further into the future.
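  • A minimal sketch of this iteration, chaining the hypothetical encode/decode helpers and the model from the earlier sketches; the rounding of predicted displacements to grid cells and the number of iterations are assumptions.

```python
import numpy as np
import torch

def predict_long_horizon(net, positions, displacements, grid_hw, n_iters=4):
    """Iterate encode -> predict -> decode to push the horizon beyond M'.

    Reuses the hypothetical helpers sketched earlier:
    encode_first_offset_matrix, decode_second_offset_matrix, and the
    WalkingBehaviorCNN instance `net`.
    """
    trajectory = [positions.copy()]
    for _ in range(n_iters):
        first = encode_first_offset_matrix(positions, displacements, grid_hw)
        with torch.no_grad():
            second = net(torch.from_numpy(first)[None])[0].numpy()
        displacements = decode_second_offset_matrix(second, positions)
        # Advance each object by its predicted displacement, rounded to grid
        # cells and clipped to the scene boundary.
        H, W = grid_hw
        positions = np.clip(positions + np.rint(displacements).astype(int),
                            [0, 0], [W - 1, H - 1])
        trajectory.append(positions.copy())
    return trajectory
```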
  • In pedestrian tracking, the tracking algorithm often links a pedestrian's walking trajectory over a past period with the pedestrian's future trajectory.
  • Therefore, the prediction information of pedestrian walking trajectories can be used to assist in retrieving the pedestrians that currently need to be tracked.
  • In abnormal behavior detection, the embodiments of the present application can predict a pedestrian's future walking route and destination according to the pedestrian's walking route in the target scene over a past period.
  • If the pedestrian's actual walking route does not match the predicted result, or the pedestrian's destination is inconsistent with the predicted destination, the pedestrian's walking behavior has exceeded expectations, and it can be inferred that the pedestrian may exhibit abnormal behavior, for example, a sudden turn, a sudden accelerating run, or a sudden stop.
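  • As an illustration of this comparison, a minimal sketch; the deviation metric (maximum per-step Euclidean distance) and the threshold are hypothetical.

```python
import numpy as np

def is_abnormal(predicted_path, actual_path, threshold=2.0):
    """Flag abnormal behavior when the actual route deviates from the
    prediction by more than a (hypothetical) threshold, in grid cells."""
    predicted = np.asarray(predicted_path, dtype=float)
    actual = np.asarray(actual_path, dtype=float)
    deviation = np.linalg.norm(predicted - actual, axis=-1).max()
    return deviation > threshold

# Hypothetical example: a pedestrian who suddenly turns.
pred = [(2, 3), (3, 3), (4, 3), (5, 3)]
act = [(2, 3), (3, 3), (3, 6), (3, 9)]
print(is_abnormal(pred, act))  # True
```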
  • Behavior coding: in order to apply the deep learning framework to the modeling of pedestrian behavior, we propose the concept of behavior coding. Using behavior coding, pedestrian walking behavior information can be encoded unambiguously into the input and output of a deep neural network, and the approach can easily be extended to other areas.
  • The methods, systems, apparatuses, and devices of the present application may be implemented in a number of ways.
  • For example, the methods, systems, apparatuses, and devices of the present application can be implemented in software, hardware, firmware, or any combination of software, hardware, and firmware.
  • The above-described order of the steps of the method is for illustrative purposes only, and the steps of the method of the present application are not limited to the order specifically described above unless otherwise specifically stated.
  • In addition, the present application can also be implemented as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the method according to the present application.
  • Thus, the present application also covers a recording medium storing a program for executing the method according to the present application.

Abstract

The embodiments of the present application disclose a method and apparatus for predicting walking behavior, a data processing apparatus, and an electronic device. The method includes: encoding the walking behavior information of at least one target object in a target scene in a historical time period M, to obtain a first offset matrix indicating the walking behavior information of the at least one target object in the historical time period M; inputting the first offset matrix into a neural network, the neural network outputting a second offset matrix indicating the walking behavior information of the at least one target object in a future time period M'; and decoding the second offset matrix to obtain the walking behavior prediction information of the at least one target object in the future time period M'.

Description

Method and apparatus for predicting walking behavior, data processing apparatus, and electronic device
The present disclosure claims priority to Chinese Patent Application No. 201610868343.9, filed with the Chinese Patent Office on September 29, 2016 and entitled "Method and apparatus for predicting walking behavior, data processing apparatus, and electronic device", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to computer vision technology, and in particular to a method and apparatus for predicting walking behavior, a data processing apparatus, and an electronic device.
Background
The modeling of pedestrian walking behavior is an important problem in the fields of computer vision and intelligent video surveillance. Pedestrian walking behavior models have important applications in many fields, for example: walking behavior prediction, pedestrian detection and tracking, crowd behavior analysis, and abnormal behavior detection.
Summary
The embodiments of the present application provide a technical solution for pedestrian walking behavior prediction.
According to one aspect of the embodiments of the present application, a method for predicting walking behavior is provided, including:
encoding the walking behavior information of at least one target object in a target scene in a historical time period M, to obtain a first offset matrix indicating the walking behavior information of the at least one target object in the historical time period M;
inputting the first offset matrix into a neural network, the neural network outputting a second offset matrix indicating the walking behavior information of the at least one target object in a future time period M';
decoding the second offset matrix to obtain the walking behavior prediction information of the at least one target object in the future time period M'.
According to another aspect of the embodiments of the present application, an apparatus for predicting walking behavior is provided, including:
a behavior coding unit configured to encode the walking behavior information of at least one target object in a target scene in a historical time period M, to obtain a first offset matrix indicating the walking behavior information of the at least one target object in the historical time period M;
a neural network configured to receive and process the first offset matrix, and output a second offset matrix indicating the walking behavior information of the at least one target object in a future time period M';
a behavior decoding unit configured to decode the second offset matrix to obtain the walking behavior prediction information of the at least one target object in the future time period M'.
According to yet another aspect of the embodiments of the present application, a data processing apparatus is provided, including the apparatus for predicting walking behavior.
According to still another aspect of the embodiments of the present application, an electronic device is provided, including the data processing apparatus described in the above embodiments.
According to still another aspect of the embodiments of the present application, a computer storage medium is provided, configured to store computer-readable instructions, the instructions including:
an instruction for encoding the walking behavior information of at least one target object in a target scene in a historical time period M, to obtain a first offset matrix indicating the walking behavior information of the at least one target object in the historical time period M;
an instruction for inputting the first offset matrix into a neural network, the neural network outputting a second offset matrix indicating the walking behavior information of the at least one target object in a future time period M';
an instruction for decoding the second offset matrix to obtain the walking behavior prediction information of the at least one target object in the future time period M'.
According to still another aspect of the embodiments of the present application, a computer device is provided, including:
a memory storing executable instructions;
one or more processors in communication with the memory to execute the executable instructions, thereby completing the operations corresponding to the method for predicting walking behavior of any of the above embodiments of the present application.
Based on the method and apparatus for predicting walking behavior, the data processing apparatus and electronic device, and the computer storage medium and computer device provided by the above embodiments of the present application, a deep-learning-based method is proposed: the walking behavior of at least one target object in a target scene in a historical time period M is encoded to obtain a first offset matrix indicating the walking behavior of the at least one target object in the historical time period M, which is input into a neural network to obtain a second offset matrix indicating the walking behavior of the at least one target object in a future time period M'; the second offset matrix is decoded to obtain the walking behavior of the at least one target object in the future time period M'.
The embodiments of the present application consider the influence of a target object's walking behavior over a past period on its walking behavior over a future period. Because the walking behaviors of target objects in the same scene may influence one another, the embodiments also consider the influence of the walking behavior of other possible target objects (e.g., pedestrians) in the same scene on the walking behavior of a given target object (for example, the current pedestrian whose future walking behavior needs to be predicted), so that the factors that may affect a given target object's future walking behavior can be considered at the same time, making the prediction of the target object's walking behavior over a future period more accurate and reliable. In addition, the embodiments of the present application can analyze the walking behavior of at least one target object in the scene simultaneously and give the prediction results for the future walking trajectories of the at least one target object at one time; prediction is not limited to walking behavior prediction for a single target object, and prediction efficiency is high.
The technical solution of the present application is described in further detail below with reference to the accompanying drawings and embodiments.
Brief Description of the Drawings
The accompanying drawings, which constitute a part of the specification, describe the embodiments of the present application and, together with the description, serve to explain the principles of the present application.
The present application can be understood more clearly from the following detailed description with reference to the accompanying drawings, in which:
FIG. 1 is a flowchart of an embodiment of the method for predicting walking behavior of the present application.
FIG. 2 is a flowchart of an embodiment of obtaining the first offset matrix in an embodiment of the present application.
FIG. 3 is a flowchart of an embodiment of obtaining the second offset matrix in an embodiment of the present application.
FIG. 4 is a flowchart of an embodiment of obtaining the walking behavior of all target objects in a future time period M' in an embodiment of the present application.
FIG. 5 is a flowchart of another embodiment of the method for predicting walking behavior of the present application.
FIG. 6 is a flowchart of an embodiment of neural network training in an embodiment of the present application.
FIG. 7 is a schematic structural diagram of an embodiment of the apparatus for predicting walking behavior of the present application.
FIG. 8 is a schematic structural diagram of another embodiment of the apparatus for predicting walking behavior of the present application.
FIG. 9 is a schematic structural diagram of an application embodiment of the electronic device of the present application.
Detailed Description
Various exemplary embodiments of the present application will now be described in detail with reference to the accompanying drawings. It should be noted that, unless specifically stated otherwise, the relative arrangement of components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the present application.
Meanwhile, it should be understood that, for convenience of description, the sizes of the various parts shown in the drawings are not drawn according to actual proportional relationships.
The following description of at least one exemplary embodiment is merely illustrative and is in no way intended to limit the present application or its application or use.
Techniques, methods, and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate, such techniques, methods, and devices should be regarded as part of the specification.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be further discussed in subsequent drawings.
The embodiments of the present application can be applied to computer systems/servers, which can operate together with numerous other general-purpose or special-purpose computing system environments or configurations. Examples of well-known computing systems, environments, and/or configurations suitable for use with computer systems/servers include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, networked personal computers, minicomputer systems, mainframe computer systems, and distributed cloud computing technology environments including any of the above systems, and the like.
A computer system/server may be described in the general context of computer-system-executable instructions (such as program modules) executed by a computer system. Generally, program modules may include routines, programs, target programs, components, logic, data structures, and the like, which perform particular tasks or implement particular abstract data types. The computer system/server may be implemented in distributed cloud computing environments in which tasks are performed by remote processing devices linked through a communication network. In a distributed cloud computing environment, program modules may be located on local or remote computing system storage media including storage devices.
FIG. 1 is a flowchart of an embodiment of the method for predicting walking behavior of the present application. As shown in FIG. 1, the method for predicting walking behavior of this embodiment includes:
102: Encode walking behavior information of at least one target object in a target scene within a historical time period M to obtain a first offset matrix representing the walking behavior of the at least one target object within the historical time period M.
The target scene is the scene in which the target object whose walking behavior is to be predicted is located, for example, a station or a factory.
As a specific example rather than a limitation of the embodiments of the present application, the target object is specifically a pedestrian; it may also be any other object or animal whose walking behavior needs to be predicted, for example, a goods-handling device (for example, a robot) in an e-commerce warehouse, or a self-driving vehicle.
104: Input the first offset matrix into a deep neural network (for example, but not limited to, a convolutional neural network, CNN), the deep neural network processing the first offset matrix and outputting a second offset matrix representing the walking behavior information of the at least one target object within a future time period M'.
106: Decode the second offset matrix to obtain the walking behavior prediction information of the at least one target object within the future time period M'.
Based on the method for predicting walking behavior provided by the above embodiment of the present application, a deep-learning-based approach is proposed: the walking behavior of at least one target object in a target scene within a historical time period M is encoded to obtain a first offset matrix representing that walking behavior, which is input into a neural network (such as a CNN) to obtain a second offset matrix representing the walking behavior of the at least one target object within a future time period M'; the second offset matrix is then decoded to obtain the walking behavior of the at least one target object within the future time period M'. The embodiments of the present application take into account the influence of a target object's walking behavior over a past period on its walking behavior over a future period. Because the walking behaviors of target objects in the same scene may influence one another, the embodiments also take into account the influence of the walking behaviors of other possible target objects (for example, pedestrians) in the same scene on the walking behavior of a given target object (for example, the current pedestrian whose future walking behavior is to be predicted), so that the factors that may influence a target object's future walking behavior can be considered simultaneously, making the prediction more accurate and reliable. In addition, the embodiments can analyze the walking behaviors of at least one target object in the scene at the same time and give the future walking trajectory prediction results of the at least one target object in one pass, so the prediction efficiency is high.
In another specific example of the embodiments of the method for predicting walking behavior of the present application, the at least one target object includes the target object whose walking behavior is to be predicted; there may be one or more such target objects. That is, the embodiments of the present application can simultaneously predict the walking behaviors of multiple target objects within the future time period M', completing the prediction task for multiple target objects in one pass rather than in multiple separate passes, so the prediction efficiency is high.
In addition, the at least one target object may include some or all of the target objects in the target scene. When the at least one target object includes all of the target objects in the target scene, the influence of the walking behaviors of all other target objects (for example, pedestrians) in the same scene on the walking behavior of a given target object (for example, the current pedestrian whose future walking behavior is to be predicted) is considered, and predictions are made for all target objects in the scene simultaneously, so that all factors that may influence a target object's future walking behavior can be considered at the same time, making the prediction more accurate and reliable; moreover, the prediction task for all target objects can be completed in one pass, so that the possible walking behavior of every target object in the target scene within the future time period M' can be comprehensively predicted. As yet another specific example of the embodiments of the method for predicting walking behavior of the present application, the walking behavior information or the walking behavior prediction information may include, but is not limited to, any one or more of the following: walking path information, walking direction information, and walking speed information. The walking behavior information encoded in operation 102 and the walking behavior prediction information obtained by decoding in operation 106 may be the same or different. For example, the walking behavior information encoded in operation 102 may be walking path information, while the walking behavior prediction information obtained by decoding in operation 106 may be walking path information, walking direction information, or walking speed information. That is, based on the embodiments of the present application, the walking path information, walking direction information, and/or walking speed information of each target object in the target scene within the future time period M' can be predicted from the walking behavior information of each target object within the historical time period M. In the following embodiments, the case where the information encoded in operation 102 and obtained by decoding in operation 106 is walking path information is taken as an example; since all target objects in the target scene include the at least one target object, walking path information can be collected per unit time and contains direction information. Based on the description of the embodiments of the present application, those skilled in the art will appreciate that the embodiments are equally applicable when the walking behavior information in operation 102 and the walking behavior prediction information in operation 106 are walking direction information or walking speed information.
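By way of a non-limiting illustration (not part of the original disclosure), walking direction information and walking speed information can be derived from walking path information sampled once per unit time; the following NumPy sketch does this, with all function and variable names being ours:

```python
import numpy as np

def direction_and_speed(path, dt=1.0):
    """Derive walking direction and speed from a walking path.

    `path` is a (T, 2) array of (x, y) positions sampled once per unit
    time, matching the per-unit-time path convention described above.
    The sampling interval `dt` and all names are illustrative.
    """
    steps = np.diff(path, axis=0)                      # per-step displacement vectors
    speed = np.linalg.norm(steps, axis=1) / dt         # walking speed at each step
    direction = np.arctan2(steps[:, 1], steps[:, 0])   # heading angle in radians
    return direction, speed

# Example: a pedestrian walking diagonally for two steps, then turning.
path = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [2.0, 3.0]])
direction, speed = direction_and_speed(path)
```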
Based on the walking path information, walking direction information, and walking speed information of target objects within the future time period M' obtained by the embodiments of the present application, the walking trajectory of each target object in the target scene, how fast it walks at different moments, when it turns, and other information can be learned. FIG. 2 is a flowchart of an embodiment of obtaining the first offset matrix in an embodiment of the present application. As shown in FIG. 2, as an example of the embodiments of the method for predicting walking behavior of the present application, operation 102 can specifically be implemented as follows:
202: Obtain the walking path information of each target object in the target scene within the historical time period M.
204: For the walking path information of each target object within the historical time period M, represent the walking behavior information of the target object within the historical time period M by a displacement vector, and assign the value of the displacement vector to the current position of the target object.
206: Determine the first offset matrix according to the displacement vectors of the target objects.
For example, the displacement vectors of the target objects are combined to obtain the first offset matrix.
In another embodiment of the present application, the position of a target object is assigned the value of that object's displacement vector. To distinguish positions in the target scene where there is a target object from positions where there is none (that is, the background of the target scene), 1 may optionally be added to every element of all displacement vectors, ensuring that every element of every displacement vector is greater than 0. This distinguishes target objects from the background in the target scene and facilitates subsequently identifying target objects in the scene.
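As a non-limiting sketch of operations 202 to 206 together with the optional +1 shift (a grid-based layout and all names are our assumptions, not the patented encoding):

```python
import numpy as np

def encode_first_offset_matrix(trajectories, height, width):
    """Encode walking paths over period M into a first offset matrix.

    `trajectories` maps each target object to a (T, 2) array of integer
    (row, col) grid positions over the historical period M. The
    displacement vector (end minus start) represents the object's
    walking behavior and is written at the object's current (last)
    position; 1 is added to every element so encoded objects are
    distinguishable from the all-zero background, as described above.
    """
    offset = np.zeros((height, width, 2), dtype=np.float32)
    for traj in trajectories.values():
        displacement = traj[-1] - traj[0]   # walking behavior over period M
        r, c = traj[-1]                     # the object's current position
        offset[r, c] = displacement + 1.0   # +1 separates objects from background
    return offset
```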
In yet another specific example of the embodiments of the method for predicting walking behavior of the present application, the neural network that processes the first offset matrix may specifically include a first sub-CNN, a bitwise addition unit, and a second sub-CNN.
FIG. 3 is a flowchart of an embodiment of obtaining the second offset matrix in an embodiment of the present application. As shown in FIG. 3, as an example of the embodiments of the method for predicting walking behavior of the present application, operation 104 can specifically be implemented as follows:
302: Take the first offset matrix as the input of the first sub-CNN, and classify, using the first sub-CNN, the walking behaviors of the at least one target object within the historical time period M to obtain a walking behavior feature map.
304: Using the bitwise addition unit, add the preset position information map of the target scene and the walking behavior feature map at corresponding positions to obtain scene walking behavior information.
The position information map includes the position information of spatial structures in the target scene. The spatial structures here may specifically be the spatial structures that influence the walking behavior of target objects in the target scene, for example, the positions of the entrances and exits of the target scene and of obstacles in the target scene; they may also be all spatial structures in the target scene. The position information map is obtained by training on samples of the target scene.
Adding the position information map of the target scene and the walking behavior feature map at corresponding positions yields scene walking behavior information that contains the position information of the entire target scene, thereby taking into account the influence of each specific structure in the target scene on the walking behavior of target objects.
306: Take the scene walking behavior information as the input of the second sub-CNN, and use the second sub-CNN to obtain, for each class of walking behavior of the at least one target object within the historical time period M, its influence information on the first offset matrix within the future time period M', and determine the second offset matrix according to this information; for example, the influence information of each class of walking behavior of the at least one target object within the historical time period M on the first offset matrix within the future time period M' is combined to obtain the second offset matrix.
In a further embodiment based on the embodiment shown in FIG. 3, an operation of modeling the spatial structure information of the target scene in advance to obtain the position information map of the target scene may also be included.
In a further specific example based on the above embodiments of the method for predicting walking behavior of the present application, the first sub-CNN may specifically include multiple cascaded CNN layers, for example, three CNN layers; each CNN layer of the first sub-CNN may include multiple convolution filters, for example, 64 convolution filters, and the size of each convolution filter may be 3×3; and/or the second sub-CNN may also specifically include multiple CNN layers, for example, three CNN layers; each CNN layer of the second sub-CNN may also include multiple convolution filters, for example, 64 convolution filters, and the size of each convolution filter may be 3×3.
By way of example, the bottom CNN layer of the first sub-CNN may roughly divide the walking behaviors of the at least one target object, for example, into target objects walking upward and target objects walking downward; the next-to-bottom CNN layer may further divide the rough division results of the bottom layer, for example, into target objects walking toward the upper left, straight up, and upper right; and the upper CNN layer may filter out walking behaviors with different properties, for example, pedestrians running quickly or pedestrians turning quickly. In the first sub-CNN, the closer a CNN layer is to the top, the more specific the walking behaviors it filters out.
The second sub-CNN may further organize and integrate the classification results of the first sub-CNN, that is, integrate the influence that the target objects exhibiting each class of walking behavior have on the target object whose walking behavior is to be predicted. Each CNN layer of the second sub-CNN fuses information according to each subclass of walking behavior, and the degree of fusion increases toward the top layer. For example, the bottom CNN layer of the second sub-CNN may combine the influence of all target objects walking toward the upper left; the next-to-bottom CNN layer may combine the influence of all target objects walking toward the upper left, upper right, and straight up; and the upper CNN layer may combine the walking behaviors of all target objects in the target scene to obtain the output result of the second sub-CNN.
That is, the CNN layers of the first sub-CNN and of the second sub-CNN first progressively subdivide the walking behaviors of all target objects into finer classes and then progressively integrate them.
The more complex the structure of a neural network (such as a CNN), that is, the more layers and parameters it has, the harder it is to train: training tends not to converge, and more storage resources are occupied. Conversely, the simpler the network structure, that is, the fewer layers and parameters, the weaker the computing and analysis capability, so processing performance cannot be guaranteed. Experiments have shown that when the first sub-CNN and the second sub-CNN each use three CNN layers, the network training effect and the processing performance can both be ensured, achieving a balance between the two.
Generally, the number of convolution filters is an integer power of 2, for example, 32, 64, or 128. The more filters, the more complex the network and the stronger its processing capability, but network training then requires more samples. In the embodiments of the present application, each CNN layer includes 64 convolution filters, which simultaneously satisfies the requirements on network processing performance, network structure complexity, and sample quantity.
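For concreteness, a minimal PyTorch sketch of the two sub-CNNs as just described, each with three cascaded CNN layers of 64 convolution filters of size 3×3, is given below; the two-channel input and output (one channel per displacement component) and the ReLU activations are our assumptions, not fixed by the disclosure:

```python
import torch.nn as nn

first_sub_cnn = nn.Sequential(
    nn.Conv2d(2, 64, kernel_size=3, padding=1), nn.ReLU(),   # coarse division of behaviors
    nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),  # finer division
    nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),  # most specific behaviors
)

second_sub_cnn = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),  # fuse per-subclass influences
    nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),  # broader fusion
    nn.Conv2d(64, 2, kernel_size=3, padding=1),              # second offset matrix (2 channels)
)
```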
In addition, the above neural network (such as a CNN) may further include a first pooling unit and a second pooling unit. In yet another embodiment of the method for predicting walking behavior based on FIG. 3, after the walking behavior feature map is obtained through operation 302, the first pooling unit (that is, a max-pooling layer) may further be used to perform maximum-value downsampling on the walking behavior feature map to obtain a new walking behavior feature map whose spatial size is smaller than that of the original walking behavior feature map.
Correspondingly, after the second offset matrix is obtained through operation 306, the second pooling unit may further be used to perform convolutional upsampling on the second offset matrix to obtain a second offset matrix with the same size as the first offset matrix.
By way of example, the size of the first offset matrix and the spatial sizes of the position information map and the walking behavior feature map may be denoted X×Y. An exemplary scale for the maximum-value downsampling is 2, in which case the spatial size of the new walking behavior feature map is X/2×Y/2; the scale of the convolutional upsampling is correspondingly 2, so the size of the second offset matrix obtained by convolutional upsampling is restored to X×Y.
Performing maximum-value downsampling on the walking behavior feature map reduces its size, enabling the neural network (such as a CNN) to process more walking behavior data; performing convolutional upsampling after the second offset matrix is obtained restores it to the original spatial size, so that the walking behavior output result finally obtained by the embodiments of the present application has the same spatial size as the input walking behavior.
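A minimal sketch of the two pooling units at scale 2, so that an X×Y map becomes X/2×Y/2 and is later restored to X×Y; reading the convolutional upsampling as a learnable transposed convolution is our assumption:

```python
import torch
import torch.nn as nn

first_pooling = nn.MaxPool2d(kernel_size=2)                         # maximum-value downsampling
second_pooling = nn.ConvTranspose2d(2, 2, kernel_size=2, stride=2)  # convolutional upsampling

x = torch.randn(1, 2, 64, 64)   # e.g. X×Y = 64×64
down = first_pooling(x)         # -> 1×2×32×32
up = second_pooling(down)       # -> 1×2×64×64, same spatial size as the input
print(down.shape, up.shape)
```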
The position information map and the walking behavior feature map have the same size, which makes it possible to add the position information map of the target scene and the walking behavior feature map at corresponding positions. For example, if the (downsampled) walking behavior feature map is [1 2 3 4 5 6] and the position information map is [1 1 1 1 1 1], the result of adding the two element-wise is [1+1, 2+1, 3+1, 4+1, 5+1, 6+1] = [2 3 4 5 6 7].
FIG. 4 is a flowchart of an embodiment of obtaining the walking behaviors of all target objects within the future time period M' in an embodiment of the present application. As shown in FIG. 4, in yet another specific example based on the above embodiments of the method for predicting walking behavior of the present application, operation 106 can specifically be implemented as follows:
402: Decode the second offset matrix to obtain displacement vectors representing the walking behaviors of the at least one target object within the future time period M'.
404: Obtain the walking path information corresponding to the displacement vectors representing the walking behaviors of the at least one target object within the future time period M'.
406: Obtain the walking behaviors of the at least one target object within the future time period M' according to the walking path information corresponding to the displacement vectors of the at least one target object within the future time period M'.
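A minimal sketch of operations 402 to 406, inverting the hypothetical encoding sketched earlier (the single-step extrapolation and all names are ours):

```python
import numpy as np

def decode_second_offset_matrix(offset, current_positions):
    """Decode the second offset matrix into predicted positions.

    Reads the displacement vector stored at each object's current
    (row, col) position, undoes the +1 background shift of the
    encoding sketch, and extrapolates where the object will be at the
    end of the future period M'.
    """
    predictions = {}
    for obj_id, (r, c) in current_positions.items():
        displacement = offset[r, c] - 1.0   # undo the +1 background shift
        predictions[obj_id] = np.array([r, c], dtype=np.float32) + displacement
    return predictions
```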
FIG. 5 is a flowchart of another embodiment of the method for predicting walking behavior of the present application. Taking as an example the case where the target objects are pedestrians, the walking behavior information of all target objects in the target scene within the historical time period M is used as the input for the first offset matrix, and the walking behavior information is walking path information, this embodiment further describes the specific implementation of the above embodiments of the present application. As shown in FIG. 5, this embodiment includes:
502: Obtain the walking path information of each target object in the target scene within the historical time period M.
504: For the walking path information of each target object within the historical time period M, represent the walking path information of the target object within the historical time period M by a displacement vector.
506: Combine the displacement vectors of the target objects to obtain the first offset matrix.
508: Input the first offset matrix into the first sub-CNN, and classify, using the first sub-CNN, the walking path information of all target objects within the historical time period M to obtain a walking behavior feature map.
510: Using the first pooling unit, perform maximum-value downsampling on the walking behavior feature map to obtain a new walking behavior feature map whose spatial size is smaller than that of the original walking behavior feature map.
512: Using the bitwise addition unit, add the preset position information map of the target scene and the walking behavior feature map at corresponding positions to obtain scene walking behavior information.
The position information map includes the position information of the spatial structures in the target scene that may influence pedestrian walking behavior.
514: Input the scene walking behavior information into the second sub-CNN; using the second sub-CNN, determine the influence information of each class of walking behavior of all target objects within the historical time period M on the first offset matrix within the future time period M', and combine this influence information to obtain the second offset matrix.
516: Using the second pooling unit, perform convolutional upsampling on the second offset matrix to obtain a second offset matrix with the same size as the first offset matrix.
518: Decode the second offset matrix to obtain displacement vectors representing the walking behaviors of all target objects within the future time period M'.
520: Obtain the walking path information corresponding to the displacement vectors representing the walking behaviors of all target objects within the future time period M'.
522: Obtain the walking paths of all target objects in the target scene within the future time period M' according to the walking path information corresponding to the displacement vectors of all target objects within the future time period M'.
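Assembling the sketches above, the forward pass of operations 508 to 516 might look as follows in PyTorch; every detail beyond those stated in the text (channel widths, activations, modeling the position information map as a learned parameter at the pooled size) is our assumption:

```python
import torch
import torch.nn as nn

class WalkingBehaviorNet(nn.Module):
    """Sketch of the pipeline: first sub-CNN -> max-pooling -> bitwise
    (element-wise) addition of the scene position information map ->
    second sub-CNN -> convolutional upsampling."""

    def __init__(self, height, width):
        super().__init__()
        self.first_sub_cnn = nn.Sequential(
            nn.Conv2d(2, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        )
        self.first_pooling = nn.MaxPool2d(2)
        # Learned position information map of the target scene, kept at
        # the pooled spatial size so the element-wise addition is valid.
        self.position_map = nn.Parameter(torch.zeros(64, height // 2, width // 2))
        self.second_sub_cnn = nn.Sequential(
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 2, 3, padding=1),
        )
        self.second_pooling = nn.ConvTranspose2d(2, 2, 2, stride=2)

    def forward(self, first_offset):               # first_offset: N×2×X×Y
        feat = self.first_sub_cnn(first_offset)    # 508: walking behavior feature map
        feat = self.first_pooling(feat)            # 510: maximum-value downsampling
        feat = feat + self.position_map            # 512: bitwise addition of position map
        second_offset = self.second_sub_cnn(feat)  # 514: fuse per-class influences
        return self.second_pooling(second_offset)  # 516: restore the size to X×Y
```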
Further, yet another embodiment of the method for predicting walking behavior of the present application may also include an operation of performing network training on an initial neural network (such as an initial CNN) to obtain the above neural network (such as a CNN). The initial neural network (such as an initial CNN) includes the following units: an initial first sub-CNN, an initial second sub-CNN, an initial first pooling unit and an initial second pooling unit, and an initial bitwise addition unit.
In a specific example, the initial first sub-CNN, the initial second sub-CNN, the initial first pooling unit and the initial second pooling unit, and the initial bitwise addition unit may be iteratively trained in sequence, and the next unit is iteratively trained when the training result of the currently trained unit satisfies a predetermined convergence condition.
The training result satisfying the predetermined convergence condition may be, for example: the deviation between the output result of the currently trained unit and a preset output result is smaller than a first preset threshold; and/or the number of iterations of training of the currently trained unit reaches a second preset threshold.
FIG. 6 is a flowchart of an embodiment of neural network (such as CNN) training in an embodiment of the present application. As shown in FIG. 6, in this embodiment, the initial neural network (such as an initial CNN) is specifically trained as follows:
602: Perform network training on the initial first sub-CNN in the initial CNN.
604: In response to the training result of the initial first sub-CNN satisfying the preset convergence condition, obtain the first sub-CNN from the initial first sub-CNN, keep the network parameters of the first sub-CNN unchanged, and start network training on the initial second sub-CNN.
606: In response to the training result of the initial second sub-CNN satisfying the preset convergence condition, obtain the second sub-CNN from the initial second sub-CNN, keep the network parameters of the first sub-CNN and the second sub-CNN unchanged, and start network training on the initial first pooling unit and the initial second pooling unit.
608: In response to the training results of the initial first pooling unit and the initial second pooling unit satisfying the preset convergence condition, obtain the first pooling unit and the second pooling unit from the initial first pooling unit and the initial second pooling unit respectively, keep the network parameters of the first sub-CNN, the second sub-CNN, the first pooling unit, and the second pooling unit unchanged, and start network training on the initial bitwise addition unit.
Since the first pooling unit reduces the size of the walking behavior feature map and the second pooling unit restores the input information to the original spatial size, training the initial first pooling unit and the initial second pooling unit at the same time ensures that the walking behavior output result finally obtained by the embodiments of the present application has the same spatial size as the input walking behavior.
610: In response to the training result of the initial bitwise addition unit satisfying the preset convergence condition, obtain the bitwise addition unit from the initial bitwise addition unit, keep the network parameters of the first sub-CNN, the second sub-CNN, the first pooling unit, the second pooling unit, and the bitwise addition unit unchanged, and complete the network training of the initial neural network (such as the initial CNN) to obtain the neural network (such as a CNN).
Through the above embodiment, the initial first sub-CNN, the initial second sub-CNN, the initial first pooling unit and the initial second pooling unit, and the initial bitwise addition unit are trained in sequence; after each stage converges, the network parameters of the already trained layers are kept unchanged, and training of the next layer in the sequence is added step by step. When the error rate on the training samples can no longer decrease, the convergence condition has been reached and the next training stage is required for the error rate to decrease further. This makes the training process more stable and prevents a newly added network layer from destroying the previously trained network structure.
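A sketch of the stage-wise training of operations 602 to 610, freezing each unit once its stage converges; the model class comes from the sketch above, and the loss, optimizer, stand-in data, and convergence test are placeholders rather than the disclosed training setup:

```python
import torch
import torch.nn as nn

def train_stage(params, data, model, loss_fn, max_iters=10000, tol=1e-4):
    """Train only `params` until the loss change falls below `tol` or
    the iteration cap is reached (the convergence criteria above)."""
    optimizer = torch.optim.SGD(params, lr=0.01)
    prev_loss = float("inf")
    for step, (first_offset, target_offset) in enumerate(data):
        optimizer.zero_grad()
        loss = loss_fn(model(first_offset), target_offset)
        loss.backward()
        optimizer.step()
        if abs(prev_loss - loss.item()) < tol or step >= max_iters:
            break
        prev_loss = loss.item()

def freeze(module):
    for p in module.parameters():
        p.requires_grad_(False)

model = WalkingBehaviorNet(64, 64)  # from the sketch above
loss_fn = nn.MSELoss()
data = [(torch.randn(1, 2, 64, 64), torch.randn(1, 2, 64, 64))] * 100  # stand-in dataset

# 602-604: train the first sub-CNN, then freeze it.
train_stage(model.first_sub_cnn.parameters(), data, model, loss_fn)
freeze(model.first_sub_cnn)
# 606: train the second sub-CNN with earlier parameters fixed.
train_stage(model.second_sub_cnn.parameters(), data, model, loss_fn)
freeze(model.second_sub_cnn)
# 608: train the pooling units together (max-pooling has no parameters,
# so only the transposed convolution is updated here).
train_stage(model.second_pooling.parameters(), data, model, loss_fn)
freeze(model.second_pooling)
# 610: finally train the bitwise addition unit (the position map).
train_stage([model.position_map], data, model, loss_fn)
```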
Those of ordinary skill in the art will appreciate that all or some of the steps of the above method embodiments can be implemented by hardware related to program instructions. The aforementioned program can be stored in a computer-readable storage medium; when executed, the program performs the steps of the above method embodiments. The aforementioned storage medium includes various media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disk.
FIG. 7 is a schematic structural diagram of an embodiment of the apparatus for predicting walking behavior of the present application. The apparatus of this embodiment can be used to implement the above method embodiments of the present application. As shown in FIG. 7, the apparatus for predicting walking behavior of this embodiment includes a behavior encoding unit, a neural network (such as a CNN), and a behavior decoding unit, wherein:
the behavior encoding unit is configured to encode walking behavior information of at least one target object in a target scene within a historical time period M to obtain a first offset matrix representing the walking behavior information of the at least one target object within the historical time period M.
The target scene is the scene in which the target object whose walking behavior is to be predicted is located, for example, a station or a factory. The at least one target object includes the target object whose walking behavior is to be predicted; there may be one or more such target objects. That is, the embodiments of the present application can simultaneously predict the walking behaviors of multiple target objects within the future time period M', completing the prediction task for multiple target objects in one pass. In addition, the at least one target object may include some or all of the target objects in the target scene. As a specific example rather than a limitation of the embodiments of the present application, the target object is specifically a pedestrian, and may also be any other object or animal whose walking behavior needs to be predicted. The walking behavior information or the walking behavior prediction information may include, but is not limited to, any one or more of the following: walking path information, walking direction information, and walking speed information. By way of example, the behavior encoding unit may specifically be configured to: obtain the walking behavior information of each target object in the target scene within the historical time period M; for the walking behavior information of each target object within the historical time period M, represent the walking behavior information of the target object within the historical time period M by a displacement vector; and determine the first offset matrix according to the displacement vectors of the target objects.
The deep neural network is configured to receive the first offset matrix and output a second offset matrix representing the walking behavior information of the at least one target object within the future time period M'.
The behavior decoding unit is configured to decode the second offset matrix to obtain the walking behavior prediction information of the at least one target object within the future time period M'. The walking behavior prediction information may include, but is not limited to, any one or more of the following: walking path information, walking direction information, and walking speed information.
By way of example, the behavior decoding unit may specifically be configured to: decode the second offset matrix to obtain displacement vectors representing the walking behaviors of the at least one target object within the future time period M'; obtain the walking behavior information corresponding to the displacement vectors representing the walking behaviors of the at least one target object within the future time period M'; and obtain, according to the walking behavior information corresponding to the displacement vectors of the at least one target object within the future time period M', the walking behavior prediction information of the at least one target object within the future time period M'.
Based on the apparatus for predicting walking behavior provided by the above embodiment of the present application, the walking behavior of at least one target object in a target scene within a historical time period M is encoded to obtain a first offset matrix representing that walking behavior, which is input into a neural network (such as a CNN) to obtain a second offset matrix representing the walking behavior of the at least one target object within a future time period M'; the second offset matrix is then decoded to obtain the walking behavior of the at least one target object within the future time period M'. The embodiments take into account the influence of a target object's past walking behavior on its future walking behavior and, because the walking behaviors of target objects in the same scene may influence one another, also the influence of the walking behaviors of other possible target objects (for example, pedestrians) in the same scene on the walking behavior of a given target object (for example, the current pedestrian whose future walking behavior is to be predicted), so that the factors that may influence a target object's future walking behavior can be considered simultaneously, making the prediction more accurate and reliable. In addition, the embodiments can analyze the walking behaviors of at least one target object in the scene at the same time and give the future walking trajectory prediction results of the at least one target object in one pass; they are not limited to predicting the walking behavior of a single target object, and the prediction efficiency is high.
FIG. 8 is a schematic structural diagram of another embodiment of the apparatus for predicting walking behavior of the present application. As shown in FIG. 8, compared with the embodiment shown in FIG. 7, the neural network (such as a CNN) in this embodiment includes a first sub-CNN, a bitwise addition unit, and a second sub-CNN, wherein:
the first sub-CNN is configured to receive the first offset matrix and classify the walking behavior information of the at least one target object within the historical time period M to obtain a walking behavior feature map.
By way of example, the first sub-CNN may specifically include multiple CNN layers, for example, three CNN layers; each CNN layer of the first sub-CNN may include multiple convolution filters, for example, 64 convolution filters, and the size of each convolution filter may be 3×3.
The bitwise addition unit is configured to add the preset position information map of the target scene and the walking behavior feature map at corresponding positions to obtain scene walking behavior information. The position information map includes the position information of spatial structures in the target scene; the spatial structures here may specifically be the spatial structures that influence the walking behavior of target objects in the target scene, for example, obstacles and entrances/exits in the target scene, or may be all spatial structures in the target scene.
The second sub-CNN is configured to receive the scene walking behavior information, determine the influence information of each class of walking behavior of the at least one target object within the historical time period M on the first offset matrix within the future time period M', and determine the second offset matrix according to the influence information.
By way of example, the second sub-CNN may also specifically include multiple CNN layers, for example, three CNN layers; each CNN layer of the second sub-CNN may include multiple convolution filters, for example, 64 convolution filters, and the size of each convolution filter may be 3×3.
Further, referring again to FIG. 8, in yet another embodiment of the apparatus for predicting walking behavior of the present application, the neural network (such as a CNN) may further include a first pooling unit and a second pooling unit, wherein:
the first pooling unit is configured to perform maximum-value downsampling on the walking behavior feature map obtained by the first sub-CNN to obtain a new walking behavior feature map whose spatial size is smaller than that of the original walking behavior feature map.
The second pooling unit is configured to, after the second sub-CNN obtains the second offset matrix, perform convolutional upsampling on the second offset matrix to obtain a second offset matrix with the same size as the first offset matrix.
For example, the size of the first offset matrix and the spatial sizes of the position information map and the walking behavior feature map may be denoted X×Y. In a specific example, the scale of the maximum-value downsampling is 2, in which case the spatial size of the new walking behavior feature map is X/2×Y/2; the scale of the convolutional upsampling is 2, so the size of the second offset matrix obtained by convolutional upsampling is restored to X×Y.
Further, still another embodiment of the above apparatuses for predicting walking behavior of the present application may also include a network training unit configured to perform network training on an initial neural network (such as an initial CNN) to obtain the neural network (such as a CNN). The initial neural network (such as an initial CNN) includes the following units: an initial first sub-CNN, an initial second sub-CNN, an initial first pooling unit and an initial second pooling unit, and an initial bitwise addition unit. In a specific example, the network training unit may specifically be configured to iteratively train the initial first sub-CNN, the initial second sub-CNN, the initial first pooling unit and the initial second pooling unit, and the initial bitwise addition unit in sequence, and to iteratively train the next unit when the training result of the currently trained unit satisfies a predetermined convergence condition.
Further by way of example, the network training unit may specifically be configured to perform network training on the initial neural network (such as the initial CNN) in the manner shown in FIG. 6.
An embodiment of the present application further provides a data processing apparatus, including the apparatus for predicting walking behavior provided by any of the above embodiments of the present application.
Specifically, the data processing apparatus of the embodiments of the present application may be any apparatus having a data processing function, and may include, but is not limited to, an Advanced RISC Machine (ARM), a central processing unit (CPU), or a graphics processing unit (GPU).
The data processing apparatus provided based on the above embodiment of the present application includes the apparatus for predicting walking behavior provided by any of the above embodiments. It takes into account the influence of a target object's past walking behavior on its future walking behavior as well as the influence of the walking behaviors of other possible target objects in the same scene on the walking behavior of a given target object, and makes predictions for at least one target object in the scene simultaneously, so that at least one factor that may influence a target object's future walking behavior can be considered at the same time, making the prediction of the target object's walking behavior over a future period more accurate and reliable. In addition, the embodiments analyze the walking behaviors of at least one target object in the scene at the same time and can give the future walking trajectory prediction results of the at least one target object in a unified manner, rather than predicting on the basis of a single target object; the prediction efficiency is high, and the prediction task for at least one target object can be completed in one pass.
In addition, an embodiment of the present application further provides an electronic device, which may be, for example, a mobile terminal, a personal computer (PC), a tablet computer, or a server, and which is provided with the data processing apparatus of any of the above embodiments of the present application.
The electronic device provided based on the above embodiment of the present application includes the above data processing apparatus and thus the apparatus for predicting walking behavior provided by any of the above embodiments. It takes into account the influence of a target object's past walking behavior on its future walking behavior as well as the influence of the walking behaviors of other possible target objects in the same scene on the walking behavior of a given target object, and makes predictions for at least one target object in the scene simultaneously, so that at least one factor that may influence a target object's future walking behavior can be considered at the same time, making the prediction of the target object's walking behavior over a future period more accurate and reliable. In addition, the embodiments analyze the walking behaviors of at least one target object in the scene at the same time and can give the future walking trajectory prediction results of the at least one target object in a unified manner, rather than predicting on the basis of a single target object; the prediction efficiency is high, and the prediction task for at least one target object can be completed in one pass.
FIG. 9 is a schematic structural diagram of an application embodiment of the electronic device of the present application. As shown in FIG. 9, the electronic device for implementing the embodiments of the present application includes a central processing unit (CPU) or a graphics processing unit (GPU), which can perform various appropriate actions and processing according to executable instructions stored in a read-only memory (ROM) or loaded from a storage section into a random access memory (RAM). The central processing unit or the graphics processing unit can communicate with the read-only memory and/or the random access memory to execute executable instructions so as to perform the operations corresponding to the method for predicting walking behavior provided by the embodiments of the present application, for example: encoding walking behavior information of at least one target object in a target scene within a historical time period M to obtain a first offset matrix representing the walking behavior information of the at least one target object within the historical time period M; inputting the first offset matrix into a deep neural network (such as a CNN), the neural network (such as a CNN) outputting a second offset matrix representing the walking behavior information of the at least one target object within a future time period M'; and decoding the second offset matrix to obtain walking behavior prediction information of the at least one target object within the future time period M'.
In addition, the RAM can also store various programs and data required for system operation. The CPU, GPU, ROM, and RAM are connected to one another through a bus. An input/output (I/O) interface is also connected to the bus.
The following components are connected to the I/O interface: an input section including a keyboard, a mouse, and the like; an output section including a cathode-ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage section including a hard disk and the like; and a communication section including a network interface card such as a LAN card or a modem. The communication section performs communication processing via a network such as the Internet. A drive is also connected to the I/O interface as needed. A removable medium, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive as needed, so that a computer program read therefrom is installed into the storage section as needed.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowcharts can be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program containing program code for performing the method shown in the flowchart. The program code may include instructions corresponding to the steps of any method for predicting walking behavior provided by the embodiments of the present application, for example, an instruction for encoding walking behavior information of at least one target object in a target scene within a historical time period M to obtain a first offset matrix representing the walking behavior information of the at least one target object within the historical time period M; an instruction for inputting the first offset matrix into a deep neural network (such as a CNN), the deep neural network (such as a CNN) outputting a second offset matrix representing the walking behavior information of the at least one target object within a future time period M'; and an instruction for decoding the second offset matrix to obtain walking behavior prediction information of the at least one target object within the future time period M'. The computer program can be downloaded and installed from a network through the communication section and/or installed from a removable medium. When the computer program is executed by the central processing unit (CPU) or the graphics processing unit (GPU), the above functions defined in the method of the present application are performed.
An embodiment of the present application further provides a computer storage medium for storing computer-readable instructions, the instructions including: an instruction for encoding walking behavior information of at least one target object in a target scene within a historical time period M to obtain a first offset matrix representing the walking behavior information of the at least one target object within the historical time period M; an instruction for inputting the first offset matrix into a deep neural network (such as a CNN), the deep neural network (such as a CNN) outputting a second offset matrix representing the walking behavior information of the at least one target object within a future time period M'; and an instruction for decoding the second offset matrix to obtain walking behavior prediction information of the at least one target object within the future time period M'.
In addition, an embodiment of the present application further provides a computer device, including:
a memory storing executable instructions; and
one or more processors communicating with the memory to execute the executable instructions so as to perform the operations corresponding to the method for predicting walking behavior of any of the above embodiments of the present application.
The technical solution for predicting walking behavior of the embodiments of the present application can be applied to, for example, but not limited to, one or more of the following scenarios:
1. It can be used to predict the future walking behaviors of all pedestrians in a scene under video surveillance.
2. The walking behavior prediction results of the embodiments of the present application can be used as the input of a neural network (such as a CNN) to predict the walking behaviors of all pedestrians in the target scene over a longer time.
Specifically, the flow of the embodiments of the method for predicting walking behavior of the present application can be iterated: the output walking behavior prediction information for the future time period M' is further encoded, then input into the neural network (such as a CNN) again, and the second offset matrix is decoded and output again, yielding prediction results of pedestrian walking behavior further into the future.
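A schematic loop for this scenario, reusing the hypothetical encoding and decoding sketches above as glue; the tensor conversions and the re-gridding of predicted positions are illustrative choices:

```python
import numpy as np
import torch

def predict_long_term(model, trajectories, k, height, width):
    """Iterate predict -> re-encode to extend the horizon over k
    successive periods M'."""
    for _ in range(k):
        offset = encode_first_offset_matrix(trajectories, height, width)
        x = torch.from_numpy(offset).permute(2, 0, 1).unsqueeze(0)  # 1×2×H×W
        second = model(x)[0].permute(1, 2, 0).detach().numpy()      # H×W×2
        current = {i: traj[-1] for i, traj in trajectories.items()}
        predicted = decode_second_offset_matrix(second, current)
        # Append each predicted position as the new "history" for the next pass.
        trajectories = {i: np.vstack([trajectories[i],
                                      np.round(predicted[i]).astype(int)])
                        for i in trajectories}
    return trajectories
```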
3. The walking behavior prediction results of the embodiments of the present application can be used to estimate the position information of all pedestrians in the target scene after a period of time.
4. The position information of all pedestrians in the target scene after a period of time can be used to correct errors occurring in tracking algorithms and help achieve better tracking results.
Specifically, when their confidence is low, most tracking algorithms match pedestrians based on appearance information in order to find them in future frames, and they often erroneously link the walking trajectory of pedestrian a over a past period with the trajectory of pedestrian b over a future period. With the pedestrian walking path prediction of the embodiments of the present application, a pedestrian's appearance and the walking path prediction results can be considered together, making the results more accurate; when the confidence of the tracking result is low, the predicted pedestrian walking trajectory can be used to help recover the pedestrian currently being tracked.
5. The present application can be used to detect certain abnormal behaviors occurring in the scene.
Since the embodiments of the present application can predict, from the walking routes of pedestrians in the target scene over a past period, their future walking routes and destinations, when a pedestrian's actual walking route does not match the prediction result, or the pedestrian's destination is highly inconsistent with the predicted destination, the pedestrian's walking behavior has gone beyond expectation, and it can be inferred that the pedestrian has exhibited abnormal behavior, for example, suddenly turning, suddenly breaking into a run, or suddenly stopping.
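For illustration, the deviation test of scenario 5 might be sketched as follows; the mean-deviation rule and the threshold are our choices, not prescribed by the embodiments:

```python
import numpy as np

def is_abnormal(predicted_path, actual_path, threshold=2.0):
    """Flag a pedestrian whose observed path strays too far from the
    predicted one (e.g. a sudden turn, sprint, or stop)."""
    deviation = np.linalg.norm(predicted_path - actual_path, axis=1)
    return deviation.mean() > threshold
```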
The embodiments of the present application have the following beneficial technical effects:
To apply the deep-learning framework to the modeling of pedestrian behavior, the concept of behavior encoding is proposed. With behavior encoding, pedestrian walking behavior information can be unambiguously encoded as the input and output of a deep neural network, and this encoding scheme can easily be extended to other fields.
Using deep learning makes the results of pedestrian walking behavior prediction more accurate and enables a better comprehensive analysis of the various influencing factors.
In addition, many existing methods can only predict the behavior of a single pedestrian, whereas the embodiments of the present application can simultaneously predict and analyze the walking behaviors of at least one pedestrian, or even all pedestrians, in the target scene.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar between the embodiments, reference may be made to one another. Since the system, apparatus, and device embodiments basically correspond to the method embodiments, their descriptions are relatively simple, and for relevant parts, reference may be made to the corresponding descriptions of the method embodiments.
The methods, systems, apparatuses, and devices of the present application may be implemented in many ways, for example, in software, hardware, firmware, or any combination of software, hardware, and firmware. The above order of the steps of the method is for illustration only, and the steps of the method of the present application are not limited to the order specifically described above unless otherwise specifically stated. Furthermore, in some embodiments, the present application may also be implemented as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the method according to the present application. Accordingly, the present application also covers a recording medium storing programs for executing the method according to the present application.
The description of the present application has been given for the sake of example and description; it is not exhaustive and does not limit the present application to the disclosed form. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiments were chosen and described in order to better explain the principles and practical applications of the present application and to enable those of ordinary skill in the art to understand the present application and to design various embodiments, with various modifications, suited to particular uses.

Claims (27)

  1. A method for predicting walking behavior, comprising:
    encoding walking behavior information of at least one target object in a target scene within a historical time period M to obtain first offset information representing the walking behavior information of the at least one target object within the historical time period M;
    inputting the first offset information into a neural network, the neural network outputting second offset information representing the walking behavior information of the at least one target object within a future time period M';
    decoding the second offset information to obtain walking behavior prediction information of the at least one target object within the future time period M'.
  2. The method according to claim 1, wherein the target scene is a scene in which a target object whose walking behavior is to be predicted is located;
    the at least one target object comprises some or all target objects in the target scene, and the at least one target object comprises the target object whose walking behavior is to be predicted.
  3. The method according to claim 1 or 2, wherein the target object comprises a pedestrian.
  4. The method according to any one of claims 1 to 3, wherein the walking behavior information or the walking behavior prediction information comprises any one or more of the following: walking path information, walking direction information, and walking speed information.
  5. The method according to any one of claims 1 to 4, wherein the encoding walking behavior information of at least one target object in a target scene within a historical time period M to obtain first offset information comprises:
    obtaining walking behavior information of each target object in the target scene within the historical time period M;
    for the walking behavior information of each target object within the historical time period M, representing the walking behavior information of the target object within the historical time period M by a displacement vector;
    determining, according to the displacement vectors of the target objects, a first offset matrix as the first offset information.
  6. The method according to any one of claims 1 to 5, wherein the neural network comprises a first sub-CNN, a bitwise addition unit, and a second sub-CNN;
    the inputting the first offset information into a neural network, the neural network outputting second offset information comprises:
    taking the first offset matrix serving as the first offset information as the input of the first sub-CNN, and classifying, by the first sub-CNN, the walking behavior information of the at least one target object within the historical time period M to obtain a walking behavior feature map;
    adding, by the bitwise addition unit, a preset position information map of the target scene and the walking behavior feature map at corresponding positions to obtain scene walking behavior information, the position information map comprising position information of spatial structures in the target scene;
    taking the scene walking behavior information as the input of the second sub-CNN, determining, by the second sub-CNN, influence information of each class of walking behavior of the at least one target object within the historical time period M on the first offset matrix within the future time period M', and determining, according to the influence information, a second offset matrix as the second offset information.
  7. The method according to claim 6, further comprising:
    determining the position information map of the target scene in advance according to spatial structure information of the target scene.
  8. The method according to claim 6 or 7, wherein the first sub-CNN comprises multiple cascaded CNN layers, each CNN layer of the first sub-CNN comprising multiple convolution filters; and/or
    the second sub-CNN comprises multiple cascaded CNN layers, each CNN layer of the second sub-CNN comprising multiple convolution filters.
  9. The method according to any one of claims 6 to 8, wherein the neural network further comprises a first pooling unit and a second pooling unit;
    after the walking behavior feature map is obtained, the method further comprises: performing, by the first pooling unit, maximum-value downsampling on the walking behavior feature map to obtain a new walking behavior feature map, a spatial size of the new walking behavior feature map being smaller than that of the walking behavior feature map;
    after the second offset matrix is obtained, the method further comprises: performing, by the second pooling unit, convolutional upsampling on the second offset matrix to obtain a second offset matrix with the same size as the first offset matrix.
  10. The method according to any one of claims 1 to 9, wherein the decoding the second offset information to obtain walking behavior prediction information of the at least one target object within the future time period M' comprises:
    decoding the second offset information to obtain displacement vectors representing the walking behaviors of the at least one target object within the future time period M';
    obtaining walking behavior information corresponding to the displacement vectors representing the walking behaviors of the at least one target object within the future time period M';
    obtaining, according to the walking behavior information corresponding to the displacement vectors of the at least one target object within the future time period M', the walking behavior prediction information of the at least one target object within the future time period M'.
  11. The method according to claim 9 or 10, further comprising:
    performing network training on an initial neural network in advance to obtain the neural network, the initial neural network comprising: an initial first sub-CNN, an initial second sub-CNN, an initial first pooling unit and an initial second pooling unit, and an initial bitwise addition unit.
  12. The method according to claim 11, wherein the performing network training on an initial neural network to obtain the neural network comprises:
    iteratively training the initial first sub-CNN, the initial second sub-CNN, the initial first pooling unit and the initial second pooling unit, and the initial bitwise addition unit in sequence, and iteratively training the next unit when a training result of the currently trained unit satisfies a predetermined convergence condition.
  13. The method according to claim 12, wherein the training result satisfying the predetermined convergence condition comprises:
    a deviation between an output result of the currently trained unit and a preset output result being smaller than a first preset threshold; and/or
    the number of iterations of training of the currently trained unit reaching a second preset threshold.
  14. An apparatus for predicting walking behavior, comprising:
    a behavior encoding unit configured to encode walking behavior information of at least one target object in a target scene within a historical time period M to obtain first offset information representing the walking behavior information of the at least one target object within the historical time period M;
    a neural network configured to receive the first offset information and output second offset information representing the walking behavior information of the at least one target object within a future time period M';
    a behavior decoding unit configured to decode the second offset information to obtain walking behavior prediction information of the at least one target object within the future time period M'.
  15. The apparatus according to claim 14, wherein the walking behavior information or the walking behavior prediction information comprises any one or more of the following: walking path information, walking direction information, and walking speed information.
  16. The apparatus according to claim 14 or 15, wherein the behavior encoding unit is specifically configured to:
    obtain walking behavior information of each target object in the target scene within the historical time period M;
    for the walking behavior information of each target object within the historical time period M, represent the walking behavior of the target object within the historical time period M by a displacement vector;
    determine, according to the displacement vectors of the target objects, a first offset matrix as the first offset information.
  17. The apparatus according to any one of claims 14 to 16, wherein the neural network comprises:
    a first sub-CNN configured to receive the first offset matrix serving as the first offset information and classify the walking behavior information of the at least one target object within the historical time period M to obtain a walking behavior feature map;
    a bitwise addition unit configured to add a preset position information map of the target scene and the walking behavior feature map at corresponding positions to obtain scene walking behavior information, the position information map comprising position information of spatial structures in the target scene;
    a second sub-CNN configured to receive the scene walking behavior information, determine influence information of each class of walking behavior of the at least one target object within the historical time period M on the first offset matrix within the future time period M', and determine, according to the influence information, a second offset matrix as the second offset information.
  18. The apparatus according to claim 17, wherein the first sub-CNN comprises multiple cascaded CNN layers, each CNN layer of the first sub-CNN comprising multiple convolution filters; and/or
    the second sub-CNN comprises multiple cascaded CNN layers, each CNN layer of the second sub-CNN comprising multiple convolution filters.
  19. The apparatus according to claim 17 or 18, wherein the neural network further comprises:
    a first pooling unit configured to perform maximum-value downsampling on the walking behavior feature map obtained by the first sub-CNN to obtain a new walking behavior feature map, a spatial size of the new walking behavior feature map being smaller than that of the walking behavior feature map;
    a second pooling unit configured to, after the second sub-CNN obtains the second offset matrix, perform convolutional upsampling on the second offset matrix to obtain a second offset matrix with the same size as the first offset matrix.
  20. The apparatus according to any one of claims 14 to 19, wherein the behavior decoding unit is specifically configured to:
    decode the second offset information to obtain displacement vectors representing the walking behaviors of the at least one target object within the future time period M';
    obtain walking behavior information corresponding to the displacement vectors representing the walking behaviors of the at least one target object within the future time period M';
    obtain, according to the walking behavior information corresponding to the displacement vectors of the at least one target object within the future time period M', the walking behavior prediction information of the at least one target object within the future time period M'.
  21. The apparatus according to claim 19 or 20, further comprising:
    a network training unit configured to perform network training on an initial neural network to obtain the neural network, the initial neural network comprising: an initial first sub-CNN, an initial second sub-CNN, an initial first pooling unit and an initial second pooling unit, and an initial bitwise addition unit.
  22. The apparatus according to claim 21, wherein the network training unit is specifically configured to:
    iteratively train the initial first sub-CNN, the initial second sub-CNN, the initial first pooling unit and the initial second pooling unit, and the initial bitwise addition unit in sequence, and iteratively train the next unit when a training result of the currently trained unit satisfies a predetermined convergence condition.
  23. A data processing apparatus, comprising the apparatus for predicting walking behavior according to any one of claims 14 to 22.
  24. The apparatus according to claim 23, wherein the data processing apparatus comprises an Advanced RISC Machine (ARM), a central processing unit (CPU), or a graphics processing unit (GPU).
  25. An electronic device, provided with the data processing apparatus according to claim 23 or 24.
  26. A computer program, comprising computer-readable code, wherein when the computer-readable code runs in a device, a processor in the device executes executable instructions for implementing the steps of the method for predicting walking behavior according to any one of claims 1 to 13.
  27. A computer-readable medium storing computer-readable code, wherein when the computer-readable code runs in a device, a processor in the device executes executable instructions for implementing the steps of the method for predicting walking behavior according to any one of claims 1 to 13.
PCT/CN2017/102706 2016-09-29 2017-09-21 Method and apparatus for predicting walking behaviors, data processing apparatus, and electronic device WO2018059300A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/174,852 US10817714B2 (en) 2016-09-29 2018-10-30 Method and apparatus for predicting walking behaviors, data processing apparatus, and electronic device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610868343.9 2016-09-29
CN201610868343.9A CN106504266B (zh) 2016-09-29 2016-09-29 Method and apparatus for predicting walking behaviors, data processing apparatus, and electronic device

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/174,852 Continuation US10817714B2 (en) 2016-09-29 2018-10-30 Method and apparatus for predicting walking behaviors, data processing apparatus, and electronic device

Publications (1)

Publication Number Publication Date
WO2018059300A1 true WO2018059300A1 (zh) 2018-04-05

Family

ID=58290085

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/102706 WO2018059300A1 (zh) 2017-09-21 Method and apparatus for predicting walking behaviors, data processing apparatus, and electronic device

Country Status (3)

Country Link
US (1) US10817714B2 (zh)
CN (1) CN106504266B (zh)
WO (1) WO2018059300A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111027487A (zh) * 2019-12-11 2020-04-17 Shandong University Behavior recognition system, method, medium, and device based on a multi-convolution-kernel residual network
CN111127510A (zh) * 2018-11-01 2020-05-08 Hangzhou Hikvision Digital Technology Co., Ltd. Method and apparatus for predicting the position of a target object
CN113128772A (zh) * 2021-04-24 2021-07-16 Sino-Singapore International Joint Research Institute Crowd size prediction method and apparatus based on a sequence-to-sequence model
CN113362367A (zh) * 2021-07-26 2021-09-07 Beijing University of Posts and Telecommunications Crowd trajectory prediction method based on multi-precision interaction

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106504266B (zh) 2016-09-29 2019-06-14 Beijing SenseTime Technology Development Co., Ltd. Method and apparatus for predicting walking behaviors, data processing apparatus, and electronic device
EP3552069B1 (en) * 2016-12-12 2022-03-30 Alarm.com Incorporated Drone pre-surveillance
EP3495220B1 (en) * 2017-12-11 2024-04-03 Volvo Car Corporation Path prediction for a vehicle
US10593321B2 (en) * 2017-12-15 2020-03-17 Mitsubishi Electric Research Laboratories, Inc. Method and apparatus for multi-lingual end-to-end speech recognition
US10937310B2 (en) * 2017-12-29 2021-03-02 Intel IP Corporation Control device and method for controlling a vehicle
CN108960160B (zh) * 2018-07-10 2021-03-09 Shenzhen Horizon Robotics Technology Co., Ltd. Method and apparatus for predicting a structured state quantity based on an unstructured prediction model
US11636681B2 (en) * 2018-11-21 2023-04-25 Meta Platforms, Inc. Anticipating future video based on present video
CN109878512A (zh) * 2019-01-15 2019-06-14 Beijing Baidu Netcom Science and Technology Co., Ltd. Automatic driving control method, apparatus, and device, and computer-readable storage medium
CN109815969A (zh) * 2019-03-05 2019-05-28 Shanghai Junyu Digital Technology Co., Ltd. Feature extraction method and apparatus based on artificial-intelligence image recognition
CN109948528B (zh) * 2019-03-18 2023-04-07 Nanjing Lijian Optoelectronic Technology Research Institute Co., Ltd. Robot behavior recognition method based on video classification
CN110751325B (zh) * 2019-10-16 2023-04-18 The Second Research Institute of the Civil Aviation Administration of China Suggestion generation method, transportation hub deployment method, apparatus, and storage medium
US11878684B2 (en) * 2020-03-18 2024-01-23 Toyota Research Institute, Inc. System and method for trajectory prediction using a predicted endpoint conditioned network
CN111524318B (zh) * 2020-04-26 2022-03-01 Shangji Huayun (Xiamen) Integrated Circuit Co., Ltd. Intelligent health status monitoring method and system based on behavior recognition
CN111639624B (zh) * 2020-06-10 2023-09-29 Shenzhen Shihai Technology Co., Ltd. Artificial-intelligence-based method and system for evaluating timely reinforcement capability in classroom teaching
CN112785075B (zh) * 2021-01-31 2022-11-25 Jiangsu Vocational College of Business Pedestrian behavior prediction method and system based on RFID positioning
CN113052401A (zh) * 2021-04-26 2021-06-29 Qingdao University Method for predicting the walking trajectory of a blind person, electronic device, and storage medium
CN113869170B (zh) * 2021-09-22 2024-04-23 Wuhan University Pedestrian trajectory prediction method based on a graph-partitioning convolutional neural network
CN115394024B (zh) * 2022-08-10 2024-02-23 Wuhan Fengli Optoelectronic Technology Co., Ltd. Walking monitoring and prediction method and apparatus based on a grating array
CN115512479B (zh) * 2022-09-09 2024-04-09 Beihai Guanbiao Smart Sound Valley Technology Co., Ltd. Method for managing reception information and back-end device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103907146A (zh) * 2011-09-20 2014-07-02 Toyota Motor Corporation Pedestrian behavior prediction device and pedestrian behavior prediction method
CN104915628A (zh) * 2014-03-14 2015-09-16 Ricoh Co., Ltd. Method and apparatus for predicting moving pedestrians based on scene modeling with a vehicle-mounted camera
CN105488794A (zh) * 2015-11-26 2016-04-13 Sun Yat-sen University Action prediction method and system based on spatial positioning and clustering
CN105976400A (zh) * 2016-05-10 2016-09-28 Beijing Kuangshi Technology Co., Ltd. Target tracking method and apparatus based on a neural network model
CN106504266A (zh) * 2016-09-29 2017-03-15 Beijing SenseTime Technology Development Co., Ltd. Method and apparatus for predicting walking behaviors, data processing apparatus, and electronic device

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7330566B2 (en) * 2003-05-15 2008-02-12 Microsoft Corporation Video-based gait recognition
US7212651B2 (en) * 2003-06-17 2007-05-01 Mitsubishi Electric Research Laboratories, Inc. Detecting pedestrians using patterns of motion and appearance in videos
WO2005036371A2 (en) * 2003-10-09 2005-04-21 Honda Motor Co., Ltd. Moving object detection using low illumination depth capable computer vision
JP4623135B2 (ja) * 2008-05-08 2011-02-02 Denso Corporation Image recognition device
JP6242563B2 (ja) * 2011-09-09 2017-12-06 MegaChips Corporation Object detection device
JP5964108B2 (ja) * 2012-03-30 2016-08-03 MegaChips Corporation Object detection device
JP6184877B2 (ja) * 2014-01-09 2017-08-23 Clarion Co., Ltd. External environment recognition device for vehicles
JP6454554B2 (ja) * 2015-01-20 2019-01-16 Clarion Co., Ltd. External environment recognition device for vehicles and vehicle behavior control device using the same
CN104850846B (zh) * 2015-06-02 2018-08-24 Shenzhen University Human behavior recognition method and recognition system based on a deep neural network
CN105069413B (zh) * 2015-07-27 2018-04-06 University of Electronic Science and Technology of China Human posture recognition method based on a deep convolutional neural network
CN105740773B (zh) * 2016-01-25 2019-02-01 Chongqing University of Technology Behavior recognition method based on deep learning and multi-scale information

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103907146A (zh) * 2011-09-20 2014-07-02 Toyota Motor Corporation Pedestrian behavior prediction device and pedestrian behavior prediction method
CN104915628A (zh) * 2014-03-14 2015-09-16 Ricoh Co., Ltd. Method and apparatus for predicting moving pedestrians based on scene modeling with a vehicle-mounted camera
CN105488794A (zh) * 2015-11-26 2016-04-13 Sun Yat-sen University Action prediction method and system based on spatial positioning and clustering
CN105976400A (zh) * 2016-05-10 2016-09-28 Beijing Kuangshi Technology Co., Ltd. Target tracking method and apparatus based on a neural network model
CN106504266A (zh) * 2016-09-29 2017-03-15 Beijing SenseTime Technology Development Co., Ltd. Method and apparatus for predicting walking behaviors, data processing apparatus, and electronic device

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111127510A (zh) * 2018-11-01 2020-05-08 Hangzhou Hikvision Digital Technology Co., Ltd. Method and apparatus for predicting the position of a target object
CN111127510B (zh) * 2018-11-01 2023-10-27 Hangzhou Hikvision Digital Technology Co., Ltd. Method and apparatus for predicting the position of a target object
CN111027487A (zh) * 2019-12-11 2020-04-17 Shandong University Behavior recognition system, method, medium, and device based on a multi-convolution-kernel residual network
CN111027487B (zh) * 2019-12-11 2023-04-28 Shandong University Behavior recognition system, method, medium, and device based on a multi-convolution-kernel residual network
CN113128772A (zh) * 2021-04-24 2021-07-16 Sino-Singapore International Joint Research Institute Crowd size prediction method and apparatus based on a sequence-to-sequence model
CN113128772B (zh) * 2021-04-24 2023-01-17 Sino-Singapore International Joint Research Institute Crowd size prediction method and apparatus based on a sequence-to-sequence model
CN113362367A (zh) * 2021-07-26 2021-09-07 Beijing University of Posts and Telecommunications Crowd trajectory prediction method based on multi-precision interaction
CN113362367B (zh) * 2021-07-26 2021-12-14 Beijing University of Posts and Telecommunications Crowd trajectory prediction method based on multi-precision interaction

Also Published As

Publication number Publication date
CN106504266B (zh) 2019-06-14
US10817714B2 (en) 2020-10-27
CN106504266A (zh) 2017-03-15
US20190073524A1 (en) 2019-03-07

Similar Documents

Publication Publication Date Title
WO2018059300A1 (zh) Method and apparatus for predicting walking behaviors, data processing apparatus, and electronic device
JP7203769B2 (ja) ボクセルベースのグランド平面推定およびオブジェクト区分化
US11216971B2 (en) Three-dimensional bounding box from two-dimensional image and point cloud data
US10902616B2 (en) Scene embedding for visual navigation
Pfeiffer et al. A data-driven model for interaction-aware pedestrian motion prediction in object cluttered environments
CN108780522B (zh) 用于视频理解的使用基于运动的注意力的递归网络
Zhou et al. Deep learning in next-frame prediction: A benchmark review
US10002309B2 (en) Real-time object analysis with occlusion handling
Weng et al. PTP: Parallelized tracking and prediction with graph neural networks and diversity sampling
US20210192358A1 (en) Graph neural network systems for behavior prediction and reinforcement learning in multple agent environments
US20190101919A1 (en) Trajectory Generation Using Temporal Logic and Tree Search
CN110646787A (zh) 自运动估计方法和设备以及模型训练方法和设备
US10599975B2 (en) Scalable parameter encoding of artificial neural networks obtained via an evolutionary process
CN111052128B (zh) 用于检测和定位视频中的对象的描述符学习方法
KR20230035403A (ko) 준-지도된(semi-supervised) 키포인트 기반 모델
CN110955965A (zh) 一种考虑交互作用的行人运动预测方法及系统
CN113516227A (zh) 一种基于联邦学习的神经网络训练方法及设备
Tawiah A review of algorithms and techniques for image-based recognition and inference in mobile robotic systems
EP4137997A1 (en) Methods and system for goal-conditioned exploration for object goal navigation
Syed et al. Semantic scene upgrades for trajectory prediction
Borkar et al. Path planning design for a wheeled robot: a generative artificial intelligence approach
Wang Res-FLNet: human-robot interaction and collaboration for multi-modal sensing robot autonomous driving tasks based on learning control algorithm
Li et al. Understanding Pedestrian Trajectory Prediction from the Planning Perspective
Parimi et al. Dynamic speed estimation of moving objects from camera data
Nguyen et al. Uncertainty-aware visually-attentive navigation using deep neural networks

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17854750

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17854750

Country of ref document: EP

Kind code of ref document: A1