CN116935280A - Behavior prediction method and system based on video analysis - Google Patents

Behavior prediction method and system based on video analysis

Info

Publication number
CN116935280A
CN116935280A CN202310934107.2A CN202310934107A CN116935280A CN 116935280 A CN116935280 A CN 116935280A CN 202310934107 A CN202310934107 A CN 202310934107A CN 116935280 A CN116935280 A CN 116935280A
Authority
CN
China
Prior art keywords
typical
network
user
mining
behavior
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310934107.2A
Other languages
Chinese (zh)
Inventor
曹国强
王武
肖彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenyang Zhongge Technology Co ltd
Original Assignee
Shenyang Zhongge Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenyang Zhongge Technology Co ltd
Priority to CN202310934107.2A
Publication of CN116935280A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G06V20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/50: Context or environment of the image
    • G06V20/52: Surveillance or monitoring of activities, e.g. for recognising suspicious objects

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a behavior prediction method and system based on video analysis, and relates to the technical field of data processing. In the invention, a target user monitoring video is acquired; an optimized behavior feature mining network is formed by performing a network optimization operation; the target user monitoring video is loaded into the optimized behavior feature mining network, and the optimized behavior feature mining network is used to mine a user behavior aggregation description vector corresponding to the target user monitoring video, wherein the user behavior aggregation description vector is formed by aggregating description vectors of at least two dimensions; and a behavior prediction operation is performed on the target user based on the user behavior aggregation description vector to output a target behavior prediction result corresponding to the target user. On this basis, the reliability of behavior prediction can be improved.

Description

Behavior prediction method and system based on video analysis
Technical Field
The invention relates to the technical field of data processing, in particular to a behavior prediction method and system based on video analysis.
Background
Video monitoring is an important monitoring means and is used in a variety of application scenarios, where it produces user monitoring videos. A user monitoring video can serve several purposes; for example, it can be analyzed to predict the behavior of the user it shows. In the prior art, however, behavior prediction based on user monitoring videos suffers from low reliability.
Disclosure of Invention
In view of the above, the present invention is directed to a behavior prediction method and system based on video analysis, so as to improve reliability of behavior prediction.
In order to achieve the above purpose, the embodiment of the present invention adopts the following technical scheme:
a behavior prediction method based on video analysis, comprising:
acquiring a target user monitoring video, wherein the target user monitoring video is formed by video monitoring of a target user and comprises multiple target user monitoring video frames;
performing network optimization operation to form an optimized behavior feature mining network;
loading the target user monitoring video to the optimized behavior feature mining network, mining user behavior aggregation description vectors corresponding to the target user monitoring video by utilizing the optimized behavior feature mining network, wherein the user behavior aggregation description vectors corresponding to the target user monitoring video are formed by aggregating description vectors of at least two dimensions;
and carrying out behavior prediction operation on the target user based on the user behavior aggregation description vector so as to output a target behavior prediction result corresponding to the target user, wherein the target behavior prediction result is used for reflecting the predicted behavior of the target user.
In some preferred embodiments, in the behavior prediction method based on video analysis, the step of forming an optimized behavior feature mining network by performing a network optimization operation includes:
determining a first cluster of typical videos, and loading the first cluster of typical videos into a candidate behavior feature mining network, wherein the first cluster of typical videos comprises a plurality of first sub-clusters of typical videos, the first sub-clusters of typical videos comprise a plurality of typical user monitoring videos, and typical users in each of the typical user monitoring videos have corresponding typical user identification data and typical user related identification data, the typical user identification data is used for reflecting user behavior type information of the typical users, and the typical user related identification data is used for reflecting information of video frame parts related to the typical users in the typical user monitoring videos;
performing a user related information mining operation on the typical user monitoring videos to output a typical user related description vector corresponding to each typical user monitoring video, performing a user information mining operation on the typical user monitoring videos to output a typical user description vector corresponding to each typical user monitoring video, and performing an aggregation operation on the typical user description vector and the typical user related description vector corresponding to the same typical user monitoring video to form a user behavior aggregation description vector corresponding to each typical user monitoring video;
determining related mining error parameters according to the typical user related description vector and the typical user related identification data corresponding to the same typical user monitoring video, determining user mining error parameters according to the typical user description vector and the typical user identification data corresponding to the same typical user monitoring video, and determining aggregation error parameters according to the user behavior aggregation description vectors corresponding to the same first sub-cluster of typical videos;
performing network optimization operation on the candidate behavior feature mining network according to the related mining error parameters, the user mining error parameters and the aggregation error parameters to form a pending behavior feature mining network corresponding to the candidate behavior feature mining network;
and taking the pending behavior feature mining network as the current candidate behavior feature mining network, iteratively repeating the step of determining a first cluster of typical videos and loading the first cluster of typical videos into the candidate behavior feature mining network, and, when the iteration stops, marking the currently output pending behavior feature mining network as the optimized behavior feature mining network.
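The iterative optimization described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the names `optimize_network`, `step` and `next_cluster`, the stopping rule (a total-error tolerance plus a round limit), and the margin-based form of `aggregation_error` are all assumptions, since the text only states that the aggregation error is computed from the aggregation description vectors of one sub-cluster and that the loop eventually stops.

```python
from typing import Callable, Sequence, Tuple, TypeVar

Network = TypeVar("Network")
Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two description vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def aggregation_error(anchor: Vector, similar: Vector, dissimilar: Vector,
                      margin: float = 1.0) -> float:
    """Hypothetical aggregation error for one sub-cluster of three videos.

    Assumption: a margin-based error that is zero once the similar video is
    closer to the anchor than the dissimilar one by at least `margin`.
    """
    gap = squared_distance(anchor, similar) - squared_distance(anchor, dissimilar)
    return max(0.0, gap + margin)


def optimize_network(candidate: Network,
                     step: Callable[[Network, object], Tuple[Network, float]],
                     next_cluster: Callable[[], object],
                     max_rounds: int = 100,
                     tolerance: float = 1e-3) -> Network:
    """Repeat: load a cluster, take one optimization step, promote the result.

    `step` returns the pending network and the total error of this round.
    The pending network becomes the candidate for the next round; when the
    loop stops, the last pending network is returned as the optimized one.
    """
    for _ in range(max_rounds):
        pending, total_error = step(candidate, next_cluster())
        candidate = pending
        if total_error < tolerance:  # assumed stopping rule
            break
    return candidate
```

In practice `step` would compute the related mining error, the user mining error and the aggregation error for the loaded cluster and update the network parameters; here it is left as a caller-supplied function so that the control flow stays visible.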
In some preferred embodiments, in the behavior prediction method based on video analysis, in a case that the description vector to be processed belongs to a typical user-related description vector, the identification data to be processed belongs to typical user-related identification data, and the error parameter to be processed belongs to a related mining error parameter; under the condition that the description vector to be processed belongs to the typical user description vector, the identification data to be processed belongs to the typical user identification data, and the error parameter to be processed belongs to the user mining error parameter;
the operation of determining the error parameter to be processed comprises the following steps:
determining a reference typical user description vector cluster corresponding to each type of identification data to be processed, wherein the reference typical user description vector cluster comprises a plurality of reference typical user description vectors corresponding to the same type of identification data to be processed, each reference typical user description vector is a center representative description vector that is extracted from the description vectors to be processed corresponding to that identification data and is most similar to those description vectors, and the center representative description vector is formed by aggregating the description vectors to be processed, in the set of description vectors to be processed, that correspond to the same type of identification data;
in each reference typical user description vector cluster, determining the reference typical user description vector that is most similar to the description vector to be processed corresponding to a typical user monitoring video to be processed, and marking it as the first reference typical user description vector corresponding to that typical user monitoring video;
and determining corresponding error parameters to be processed according to the difference between the description vector to be processed corresponding to the same typical user monitoring video and the first reference typical user description vector.
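A minimal sketch of this error computation is given below. Several details are assumptions made for illustration only: centers are formed as plain means over fixed-size groups of same-label vectors (`group_size` is hypothetical), similarity is cosine similarity, the nearest reference is searched within the cluster of the video's own label, and the difference is measured as a squared Euclidean distance. The text does not fix any of these choices.

```python
import math
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence

Vector = Sequence[float]


def mean_vector(vectors: Sequence[Vector]) -> List[float]:
    """Aggregate several description vectors into one center vector."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine_similarity(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_reference_clusters(vectors: Sequence[Vector],
                             labels: Sequence[Hashable],
                             group_size: int = 2) -> Dict[Hashable, List[List[float]]]:
    """Build one cluster of reference (center) vectors per label.

    Assumption: each consecutive group of `group_size` same-label vectors
    contributes one center, so a label can have several reference vectors.
    """
    by_label: Dict[Hashable, List[Vector]] = defaultdict(list)
    for vector, label in zip(vectors, labels):
        by_label[label].append(vector)
    return {
        label: [mean_vector(group[i:i + group_size])
                for i in range(0, len(group), group_size)]
        for label, group in by_label.items()
    }


def mining_error(vector: Vector, label: Hashable,
                 clusters: Dict[Hashable, List[List[float]]]) -> float:
    """Error between a vector and its most similar reference vector."""
    first_reference = max(clusters[label],
                          key=lambda ref: cosine_similarity(vector, ref))
    return sum((x - y) ** 2 for x, y in zip(vector, first_reference))
```

The same routine serves both cases named in the text: called with typical user related description vectors and their identification data it yields a related mining error, and called with typical user description vectors it yields a user mining error.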
In some preferred embodiments, in the behavior prediction method based on video analysis described above, the vector sizes of the description vector to be processed and the first reference typical user description vector are both equal to a target vector size; and the step of determining a corresponding error parameter to be processed according to the difference between the description vector to be processed and the first reference typical user description vector corresponding to the same typical user monitoring video includes:
determining any description vector whose vector size equals the target vector size, and marking, as target distribution coordinates, the distribution coordinates of the vector parameters in that description vector that are greater than or equal to a predetermined reference vector parameter;
and determining the corresponding error parameter to be processed according to the difference between the vector parameters of the description vector to be processed and of the first reference typical user description vector, corresponding to the same typical user monitoring video, on the same target distribution coordinates.
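The coordinate-restricted difference can be illustrated with the short sketch below. It assumes that the "any description vector" is drawn at random (here with a seeded generator for repeatability), that `reference_value` plays the role of the predetermined reference vector parameter, and that the difference is a sum of squared differences; the function names are hypothetical.

```python
import random
from typing import List, Sequence


def target_coordinates(size: int, reference_value: float = 0.5,
                       seed: int = 0) -> List[int]:
    """Pick the coordinates on which the error will be measured.

    Assumption: the arbitrary vector is sampled uniformly from [0, 1);
    coordinates whose value is >= reference_value are kept.
    """
    rng = random.Random(seed)
    arbitrary = [rng.random() for _ in range(size)]
    return [i for i, value in enumerate(arbitrary) if value >= reference_value]


def masked_error(vector: Sequence[float], reference: Sequence[float],
                 coordinates: Sequence[int]) -> float:
    """Squared difference between two vectors on the selected coordinates only."""
    if len(vector) != len(reference):
        raise ValueError("both vectors must have the target vector size")
    return sum((vector[i] - reference[i]) ** 2 for i in coordinates)
```

Restricting the comparison to a subset of coordinates means that each evaluation of the error looks at only part of the vector, which is similar in spirit to masking or dropout-style regularization; that interpretation is an observation, not a claim made by the text.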
In some preferred embodiments, in the behavior prediction method based on video analysis, the candidate behavior feature mining network includes a first sub-network, a second sub-network and an aggregation sub-network, where the first sub-network is used for mining typical user related description vectors, the second sub-network is used for mining typical user description vectors, and the aggregation sub-network is used for aggregating to form user behavior aggregation description vectors; the step of performing network optimization operation on the candidate behavior feature mining network according to the related mining error parameter, the user mining error parameter and the aggregation error parameter to form a pending behavior feature mining network corresponding to the candidate behavior feature mining network, includes:
optimizing the parameters of the first sub-network, the second sub-network and the aggregation sub-network according to the related mining error parameters and on the basis of a related learning parameter, optimizing the parameters of the second sub-network according to the user mining error parameters and on the basis of a user learning parameter, and optimizing the parameters of the first sub-network, the second sub-network and the aggregation sub-network according to the aggregation error parameters and on the basis of an aggregation learning parameter, to form a pending behavior feature mining network corresponding to the candidate behavior feature mining network, wherein the related learning parameter and the user learning parameter are both larger than the aggregation learning parameter.
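One way to read the learning parameters is as per-error learning rates in a gradient-descent update, sketched below. This reading, the scalar stand-ins for sub-network parameters, and the names `check_learning_rates` and `apply_updates` are assumptions for illustration; gradients are supplied by the caller, and which sub-networks each error touches is determined entirely by the keys the caller passes in.

```python
from typing import Dict


def check_learning_rates(rates: Dict[str, float]) -> None:
    """Enforce the stated constraint: related and user rates exceed the aggregation rate."""
    if not (rates["related"] > rates["aggregation"]
            and rates["user"] > rates["aggregation"]):
        raise ValueError("related and user learning rates must exceed the aggregation rate")


def apply_updates(params: Dict[str, float],
                  grads_by_error: Dict[str, Dict[str, float]],
                  rates: Dict[str, float]) -> Dict[str, float]:
    """Apply one gradient step per error term, each with its own learning rate.

    `params` maps a sub-network name to a (scalar, for brevity) parameter.
    `grads_by_error` maps an error name to the gradients it contributes.
    """
    check_learning_rates(rates)
    updated = dict(params)
    for error_name, grads in grads_by_error.items():
        for sub_network, grad in grads.items():
            updated[sub_network] -= rates[error_name] * grad
    return updated
```

With this arrangement the aggregation error nudges the parameters more gently than the two mining errors, which matches the ordering of the learning parameters required by the text.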
In some preferred embodiments, in the behavior prediction method based on video analysis, before the step of determining a first cluster of typical videos and loading the first cluster of typical videos into a candidate behavior feature mining network, the step of forming an optimized behavior feature mining network by performing a network optimization operation further includes:
determining a second cluster of typical videos, loading the second cluster of typical videos into a constructed original behavior feature mining network, and mining an original typical user description vector corresponding to each typical user monitoring video in the second cluster of typical videos, wherein the original behavior feature mining network comprises a key information mining unit and a linear integration unit;
calculating original mining error parameters according to the original typical user description vectors corresponding to the same typical video sub-cluster in the second cluster of typical videos;
performing network optimization operation on the original behavior feature mining network according to the original mining error parameters, and forming an optimized original behavior feature mining network when network optimization is finished;
and performing a parameter configuration operation on the key information mining unit in the target behavior feature mining network according to the key information mining unit in the optimized original behavior feature mining network, and performing a parameter configuration operation on the linear integration unit in the target behavior feature mining network according to the linear integration unit in the optimized original behavior feature mining network, to form a corresponding candidate behavior feature mining network.
In some preferred embodiments, in the behavior prediction method based on video analysis, the step of performing parameter configuration operation on the key information mining unit in the target behavior feature mining network according to the key information mining unit in the optimized original behavior feature mining network, and performing parameter configuration operation on the linear integration unit in the target behavior feature mining network according to the linear integration unit in the optimized original behavior feature mining network to form a corresponding candidate behavior feature mining network includes:
performing parameter configuration operation on the key information mining unit shared between the first sub-network and the second sub-network in the target behavior feature mining network according to the key information mining unit in the optimized original behavior feature mining network, and performing parameter configuration operation on the linear integration unit of the first sub-network and the linear integration unit of the second sub-network in the target behavior feature mining network according to the linear integration unit in the optimized original behavior feature mining network to form a corresponding candidate behavior feature mining network.
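The parameter configuration step amounts to initializing the candidate network from the pre-optimized original network. A minimal sketch follows, with network parameters represented as plain nested lists and dictionary keys (`key_info`, `linear`, `shared_key_info`, `first_linear`, `second_linear`) that are hypothetical names chosen for this illustration.

```python
import copy
from typing import Any, Dict


def configure_candidate(optimized_original: Dict[str, Any]) -> Dict[str, Any]:
    """Initialize a candidate network from the optimized original network.

    The key information mining unit is copied once and shared by the first
    and second sub-networks; the linear integration unit is copied twice so
    that each sub-network can later be optimized independently.
    """
    return {
        "shared_key_info": copy.deepcopy(optimized_original["key_info"]),
        "first_linear": copy.deepcopy(optimized_original["linear"]),
        "second_linear": copy.deepcopy(optimized_original["linear"]),
    }
```

Deep copies are used so that later updates to one sub-network's linear integration unit do not silently change the other sub-network or the original network.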
The embodiment of the invention also provides a behavior prediction system based on video analysis, which comprises:
the monitoring video acquisition module is used for acquiring a target user monitoring video, wherein the target user monitoring video is formed by carrying out video monitoring on a target user and comprises multi-frame target user monitoring video frames;
the network optimization module is used for forming an optimized behavior feature mining network by performing network optimization operation;
the behavior feature mining module is used for loading the target user monitoring video to the optimized behavior feature mining network, mining user behavior aggregation description vectors corresponding to the target user monitoring video by utilizing the optimized behavior feature mining network, wherein the user behavior aggregation description vectors corresponding to the target user monitoring video are formed by aggregating description vectors of at least two dimensions;
and the behavior prediction module is used for performing behavior prediction operation on the target user based on the user behavior aggregation description vector so as to output a target behavior prediction result corresponding to the target user, wherein the target behavior prediction result is used for reflecting the predicted behavior of the target user.
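The four modules can be pictured as four pluggable functions wired together in sequence. The sketch below is only an assumption about how such a system might be composed: the class name, field names and the toy stand-in functions (frame averaging, a threshold rule) are hypothetical and are not taken from the source.

```python
from dataclasses import dataclass
from typing import Callable, List

Frame = List[float]
Vector = List[float]
Network = Callable[[List[Frame]], Vector]


@dataclass
class BehaviorPredictionSystem:
    """Wires the four modules together; each module is a plain callable."""
    acquire_video: Callable[[], List[Frame]]                 # monitoring video acquisition module
    optimize_network: Callable[[], Network]                  # network optimization module
    mine_features: Callable[[Network, List[Frame]], Vector]  # behavior feature mining module
    predict_behavior: Callable[[Vector], str]                # behavior prediction module

    def run(self) -> str:
        video = self.acquire_video()
        network = self.optimize_network()
        aggregated = self.mine_features(network, video)
        return self.predict_behavior(aggregated)


def mean_frames(video: List[Frame]) -> Vector:
    """Toy stand-in for an optimized network: average the frames."""
    return [sum(column) / len(video) for column in zip(*video)]


# Toy wiring with made-up data and a made-up decision rule.
system = BehaviorPredictionSystem(
    acquire_video=lambda: [[0.0, 1.0], [2.0, 3.0]],
    optimize_network=lambda: mean_frames,
    mine_features=lambda network, video: network(video),
    predict_behavior=lambda vector: "moving" if sum(vector) > 2.0 else "still",
)
```

Calling `system.run()` executes the modules in the order listed in the text: acquisition, network optimization, feature mining, prediction.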
In some preferred embodiments, in the behavior prediction system based on video analysis, the network optimization module is specifically configured to:
determining a first cluster of typical videos, and loading the first cluster of typical videos into a candidate behavior feature mining network, wherein the first cluster of typical videos comprises a plurality of first sub-clusters of typical videos, the first sub-clusters of typical videos comprise a plurality of typical user monitoring videos, and typical users in each of the typical user monitoring videos have corresponding typical user identification data and typical user related identification data, the typical user identification data is used for reflecting user behavior type information of the typical users, and the typical user related identification data is used for reflecting information of video frame parts related to the typical users in the typical user monitoring videos;
performing a user related information mining operation on the typical user monitoring videos to output a typical user related description vector corresponding to each typical user monitoring video, performing a user information mining operation on the typical user monitoring videos to output a typical user description vector corresponding to each typical user monitoring video, and performing an aggregation operation on the typical user description vector and the typical user related description vector corresponding to the same typical user monitoring video to form a user behavior aggregation description vector corresponding to each typical user monitoring video;
determining related mining error parameters according to the typical user related description vector and the typical user related identification data corresponding to the same typical user monitoring video, determining user mining error parameters according to the typical user description vector and the typical user identification data corresponding to the same typical user monitoring video, and determining aggregation error parameters according to the user behavior aggregation description vectors corresponding to the same first sub-cluster of typical videos;
performing network optimization operation on the candidate behavior feature mining network according to the related mining error parameters, the user mining error parameters and the aggregation error parameters to form a pending behavior feature mining network corresponding to the candidate behavior feature mining network;
and taking the pending behavior feature mining network as the current candidate behavior feature mining network, iteratively repeating the step of determining a first cluster of typical videos and loading the first cluster of typical videos into the candidate behavior feature mining network, and, when the iteration stops, marking the currently output pending behavior feature mining network as the optimized behavior feature mining network.
In some preferred embodiments, in the above behavior prediction system based on video analysis, the network optimization module is specifically further configured to:
determining a second cluster of typical videos, loading the second cluster of typical videos into a constructed original behavior feature mining network, and mining an original typical user description vector corresponding to each typical user monitoring video in the second cluster of typical videos, wherein the original behavior feature mining network comprises a key information mining unit and a linear integration unit;
calculating original mining error parameters according to the original typical user description vectors corresponding to the same typical video sub-cluster in the second cluster of typical videos;
performing network optimization operation on the original behavior feature mining network according to the original mining error parameters, and forming an optimized original behavior feature mining network when network optimization is finished;
and performing parameter configuration operation on the key information mining unit in the target behavior feature mining network according to the optimized key information mining unit in the original behavior feature mining network, and performing parameter configuration operation on the linear integration unit in the target behavior feature mining network according to the optimized linear integration unit in the original behavior feature mining network to form a corresponding candidate behavior feature mining network.
With the behavior prediction method and system based on video analysis provided by the embodiments of the invention, a target user monitoring video is first acquired; an optimized behavior feature mining network is formed by performing a network optimization operation; the target user monitoring video is loaded into the optimized behavior feature mining network, which mines a user behavior aggregation description vector corresponding to the target user monitoring video, the user behavior aggregation description vector being formed by aggregating description vectors of at least two dimensions; and a behavior prediction operation is performed on the target user based on the user behavior aggregation description vector to output a target behavior prediction result corresponding to the target user. Because the user behavior aggregation description vector mined by the optimized behavior feature mining network is formed by aggregating description vectors of at least two dimensions, it carries richer information. This improves the reliability of the behavior prediction operation performed on it, and therefore the reliability of the resulting target behavior prediction result, which addresses the low reliability of the prior art.
In order to make the above objects, features and advantages of the present invention more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
Fig. 1 is a block diagram of a behavior prediction platform based on video analysis according to an embodiment of the present invention.
Fig. 2 is a flowchart illustrating steps included in a behavior prediction method based on video analysis according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of each module included in a behavior prediction system based on video analysis according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, but not all embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in fig. 1, the embodiment of the invention provides a behavior prediction platform based on video analysis. Wherein the behavior prediction platform based on video analysis may comprise a memory and a processor.
In detail, the memory and the processor are electrically connected directly or indirectly to realize transmission or interaction of data. For example, electrical connection may be made to each other via one or more communication buses or signal lines. The memory may store at least one software functional module (computer program) that may exist in the form of software or firmware. The processor may be configured to execute an executable computer program stored in the memory, thereby implementing a behavior prediction method based on video analysis provided by an embodiment of the present invention (as described below).
Alternatively, in some embodiments, the memory may be, but is not limited to, random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), and the like. The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), a system on chip (SoC), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
Alternatively, in some embodiments, the video analysis-based behavior prediction platform may be a server with data processing capabilities.
With reference to fig. 2, the embodiment of the invention further provides a behavior prediction method based on video analysis, which can be applied to the behavior prediction platform based on video analysis. The method steps defined by the flow related to the behavior prediction method based on the video analysis can be realized by the behavior prediction platform based on the video analysis. The specific flow shown in fig. 2 will be described in detail.
Step S110, a target user monitoring video is acquired.
In the embodiment of the invention, the behavior prediction platform based on video analysis can acquire the target user monitoring video. The target user monitoring video is formed by carrying out video monitoring on a target user and comprises a plurality of target user monitoring video frames.
Step S120, forming an optimized behavior feature mining network by performing network optimization operation.
In the embodiment of the invention, the behavior prediction platform based on video analysis can form an optimized behavior feature mining network by performing network optimization operation.
Step S130, loading the target user monitoring video into the optimized behavior feature mining network, and mining a user behavior aggregation description vector corresponding to the target user monitoring video by using the optimized behavior feature mining network.
In the embodiment of the invention, the behavior prediction platform based on video analysis can load the target user monitoring video into the optimized behavior feature mining network and use that network to mine the user behavior aggregation description vector corresponding to the target user monitoring video. The user behavior aggregation description vector corresponding to the target user monitoring video is formed by aggregating description vectors of at least two dimensions.
Step S140, performing a behavior prediction operation on the target user based on the user behavior aggregate description vector, so as to output a target behavior prediction result corresponding to the target user.
In the embodiment of the invention, the behavior prediction platform based on video analysis can perform a behavior prediction operation on the target user based on the user behavior aggregation description vector so as to output a target behavior prediction result corresponding to the target user. The target behavior prediction result is used for reflecting the predicted behavior of the target user. For example, the user behavior aggregation description vector may be loaded into a target behavior prediction network to predict the corresponding target behavior prediction result, where the target behavior prediction network may be optimized together with, or separately from, the optimized behavior feature mining network.
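Steps S130 and S140 can be sketched as an aggregation followed by a small prediction head. The sketch assumes that aggregation is either concatenation ("splice") or element-wise addition ("superpose"), and that the target behavior prediction network is a single linear layer followed by a softmax; the weights and labels in any call are made-up values, and the text does not specify the architecture of the prediction network.

```python
import math
from typing import List, Sequence, Tuple


def aggregate(user_vector: Sequence[float], related_vector: Sequence[float],
              mode: str = "splice") -> List[float]:
    """Combine the two description vectors into one aggregation vector."""
    if mode == "splice":
        return list(user_vector) + list(related_vector)
    if mode == "superpose":
        if len(user_vector) != len(related_vector):
            raise ValueError("superposition needs vectors of equal size")
        return [a + b for a, b in zip(user_vector, related_vector)]
    raise ValueError(f"unknown aggregation mode: {mode}")


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def predict(aggregated: Sequence[float], weights: Sequence[Sequence[float]],
            labels: Sequence[str]) -> Tuple[str, float]:
    """Hypothetical prediction head: one linear layer plus softmax."""
    scores = [sum(w * x for w, x in zip(row, aggregated)) for row in weights]
    probabilities = softmax(scores)
    best = max(range(len(labels)), key=probabilities.__getitem__)
    return labels[best], probabilities[best]
```

Concatenation keeps the two sources of information in separate coordinates at the cost of a longer vector, while superposition keeps the size fixed but requires both vectors to share a size.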
Because the user behavior aggregation description vector mined by the optimized behavior feature mining network for the target user monitoring video is formed by aggregating description vectors of at least two dimensions, it carries richer information. This improves the reliability of the behavior prediction operation performed on it, and therefore the reliability of the resulting target behavior prediction result, which addresses the low reliability of the prior art.
Optionally, in some embodiments, the step of forming the optimized behavior feature mining network by performing a network optimization operation may further include the following:
determining a first cluster of typical videos and loading the first cluster of typical videos into a candidate behavior feature mining network, where the first cluster of typical videos includes a plurality of first sub-clusters of typical videos, and each first sub-cluster of typical videos includes a plurality of typical user monitoring videos (for example, a sub-cluster may include 3 typical user monitoring videos: the similarity between the first and the second typical user monitoring videos may be higher, e.g., greater than a first similarity; the similarity between the first and the third typical user monitoring videos may be lower, e.g., less than a second similarity, the second similarity being less than the first similarity; and the similarity between the second and the third typical user monitoring videos may also be lower); the typical user in each typical user monitoring video has corresponding typical user identification data and typical user related identification data, where the typical user identification data is used to reflect user behavior type information of the typical user in the typical user monitoring video, and the typical user related identification data is used to reflect information of the video frame parts related to the typical user in the typical user monitoring video, such as the region surrounding the user in a video frame;
performing a user related information mining operation on the typical user monitoring videos to output a typical user related description vector corresponding to each typical user monitoring video, performing a user information mining operation on the typical user monitoring videos to output a typical user description vector corresponding to each typical user monitoring video, and performing an aggregation operation, such as a splicing or superposition operation, on the typical user description vector and the typical user related description vector corresponding to the same typical user monitoring video to form a user behavior aggregation description vector corresponding to each typical user monitoring video;
determining a related mining error parameter according to the difference between the typical user related description vector and the typical user related identification data corresponding to the same typical user monitoring video, determining a user mining error parameter according to the typical user description vector and the typical user identification data corresponding to the same typical user monitoring video, and determining an aggregation error parameter according to the user behavior aggregation description vectors corresponding to the same first sub-cluster of typical videos;
performing network optimization operation on the candidate behavior feature mining network according to the related mining error parameters, the user mining error parameters and the aggregation error parameters to form a pending behavior feature mining network corresponding to the candidate behavior feature mining network;
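taking the pending behavior feature mining network as the current candidate behavior feature mining network and cyclically executing the step of determining a first cluster of typical videos and loading the first cluster of typical videos into the candidate behavior feature mining network, and, when the cyclic execution stops (for example, when the number of times the step has been executed is greater than a preset number of times, or the determined error parameter is less than a preset value), marking the currently output pending behavior feature mining network as the optimized behavior feature mining network.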
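The cyclic optimization described in the steps above can be sketched in Python as follows. The text does not give the form of the aggregation error parameter, so the triplet-style margin error below is only an assumption suggested by the 3-video sub-cluster example (first and second videos similar, third dissimilar). The names `aggregation_error`, `optimize`, `toy_step`, `max_rounds` and `error_threshold` are hypothetical, and the "network" in the toy usage is a single number rather than a real model.

```python
import math
from typing import Callable, Sequence, Tuple, TypeVar

Network = TypeVar("Network")


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two description vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def aggregation_error(first: Sequence[float], second: Sequence[float],
                      third: Sequence[float], margin: float = 1.0) -> float:
    """Assumed triplet-style error over one sub-cluster: the aggregation
    vectors of the first and second videos should be closer to each other
    than the first is to the third, by at least `margin`."""
    return max(0.0, distance(first, second) - distance(first, third) + margin)


def optimize(candidate: Network,
             step: Callable[[Network], Tuple[Network, float]],
             max_rounds: int = 100,
             error_threshold: float = 1e-3) -> Tuple[Network, int]:
    """Repeat one optimization round until the preset round count is
    reached or the error falls below the preset value."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    rounds = 0
    while True:
        pending, error = step(candidate)  # one pass yields a pending network
        rounds += 1
        if rounds >= max_rounds or error < error_threshold:
            return pending, rounds        # marked as the optimized network
        candidate = pending               # pending becomes the next candidate


def toy_step(w: float) -> Tuple[float, float]:
    """Toy round: one gradient step on the error (w - 3)^2."""
    new_w = w - 0.1 * 2.0 * (w - 3.0)
    return new_w, (new_w - 3.0) ** 2
```

Calling `optimize(0.0, toy_step)` drives the toy parameter toward 3 and stops on the error threshold well before the round limit; with a real network, `step` would compute the related mining, user mining and aggregation error parameters and update the candidate network.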
Optionally, in some embodiments, when the description vector to be processed is a typical user related description vector, the identification data to be processed is the typical user related identification data and the error parameter to be processed is the related mining error parameter; when the description vector to be processed is a typical user description vector, the identification data to be processed is the typical user identification data and the error parameter to be processed is the user mining error parameter. On this basis, the operation of determining the error parameter to be processed may further include the following:
determining a reference typical user description vector cluster corresponding to each piece of identification data to be processed; the reference typical user description vector cluster comprises a plurality of reference typical user description vectors corresponding to the same type of identification data to be processed, each reference typical user description vector is a center representative description vector derived from the description vectors to be processed that correspond to that identification data to be processed, and the center representative description vector, namely a cluster center, is formed by aggregating the description vectors to be processed that correspond to the same type of identification data to be processed in the set of description vectors to be processed;
In each reference typical user description vector cluster, determining a reference typical user description vector which is most similar to a to-be-processed description vector corresponding to a to-be-processed typical user monitoring video, and marking the reference typical user description vector as a first reference typical user description vector corresponding to the to-be-processed typical user monitoring video;
and determining corresponding error parameters to be processed according to the difference between the description vector to be processed corresponding to the same typical user monitoring video and the first reference typical user description vector.
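The reference-vector procedure above can be sketched in Python under a few stated assumptions: a cluster center is taken to be the element-wise mean of a group of vectors, "most similar" is taken to mean smallest squared Euclidean distance, and the search is done inside the reference cluster of the video's own identification data. The function names and the `clusters` mapping are hypothetical.

```python
from typing import Dict, Hashable, List, Sequence

Vector = List[float]


def center(vectors: Sequence[Sequence[float]]) -> Vector:
    """Cluster center: element-wise mean of a non-empty group of vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance, used here as the (assumed) difference measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def first_reference(vector: Sequence[float],
                    cluster: Sequence[Sequence[float]]) -> Sequence[float]:
    """Reference vector in the cluster that is most similar to `vector`."""
    return min(cluster, key=lambda ref: squared_distance(vector, ref))


def pending_error(vector: Sequence[float], label: Hashable,
                  clusters: Dict[Hashable, Sequence[Sequence[float]]]) -> float:
    """Error parameter to be processed: difference between the vector and
    its first reference vector in the cluster for its identification data."""
    return squared_distance(vector, first_reference(vector, clusters[label]))
```

Keeping several centers per type of identification data, rather than one, lets a single behavior type have more than one typical appearance; the error then only pulls a vector toward the center it already resembles most.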
Optionally, in some embodiments, the vector sizes of the to-be-processed description vector and the first reference typical user description vector are target vector sizes, based on which the step of determining the corresponding to-be-processed error parameter according to the difference between the to-be-processed description vector and the first reference typical user description vector corresponding to the same typical user monitoring video may further include the following:
determining an arbitrary description vector whose vector size equals the target vector size, and marking, as target distribution coordinates, the distribution coordinates of those vector parameters in the arbitrary description vector that are greater than or equal to a predetermined reference vector parameter, wherein the arbitrary description vector can be adjusted and updated in different optimization stages;
And determining corresponding error parameters to be processed according to the difference between the corresponding description vector to be processed of the same typical user monitoring video and the vector parameters of the first reference typical user description vector on the same target distribution coordinates.
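The coordinate-selection step above can be sketched in Python as follows. The sketch assumes the arbitrary description vector is drawn uniformly at random from [0, 1) and that the difference is a sum of squared differences; neither choice is stated in the text, and the names `target_coordinates` and `masked_error` are hypothetical.

```python
import random
from typing import List, Sequence


def target_coordinates(size: int, reference: float,
                       rng: random.Random) -> List[int]:
    """Draw an arbitrary vector of the target size and keep the indices
    whose values are greater than or equal to the reference parameter."""
    arbitrary = [rng.random() for _ in range(size)]
    return [index for index, value in enumerate(arbitrary) if value >= reference]


def masked_error(vector: Sequence[float], reference_vector: Sequence[float],
                 coordinates: Sequence[int]) -> float:
    """Sum of squared differences taken only on the target coordinates."""
    if len(vector) != len(reference_vector):
        raise ValueError("both vectors must have the target vector size")
    return sum((vector[i] - reference_vector[i]) ** 2 for i in coordinates)
```

Redrawing the arbitrary vector at each optimization stage changes which coordinates contribute to the error, which behaves like a random mask over the description vector.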
Optionally, in some embodiments, the candidate behavior feature mining network includes a first sub-network, a second sub-network, and an aggregation sub-network, where the first sub-network is used to mine out typical user-related description vectors, the second sub-network is used to mine out typical user description vectors, and the aggregation sub-network is used to aggregate to form user behavior aggregate description vectors; the step of performing network optimization operation on the candidate behavior feature mining network according to the related mining error parameter, the user mining error parameter and the aggregation error parameter to form a pending behavior feature mining network corresponding to the candidate behavior feature mining network may further include the following contents:
optimizing the parameters of the first sub-network, the second sub-network and the aggregation sub-network according to the related mining error parameter and based on a related learning parameter (i.e. a learning rate), optimizing the parameters of the first sub-network, the second sub-network and the aggregation sub-network according to the user mining error parameter and based on a user learning parameter, and optimizing the parameters of the first sub-network, the second sub-network and the aggregation sub-network according to the aggregation error parameter and based on an aggregation learning parameter, to form a pending behavior feature mining network corresponding to the candidate behavior feature mining network, wherein the related learning parameter and the user learning parameter are both larger than the aggregation learning parameter.
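The use of a separate learning parameter for each error parameter can be sketched in Python as below. This is a plain gradient-descent illustration, not the patented procedure: parameters are stored as lists of floats keyed by sub-network name, the gradients of each error parameter are assumed to be supplied by the caller, and a sub-network without a gradient entry is left unchanged. The names `sgd_update` and `optimize_once` and the default learning rates are hypothetical.

```python
from typing import Dict, List

Params = Dict[str, List[float]]


def sgd_update(params: Params, grads: Params, learning_rate: float) -> Params:
    """Return new parameters after one gradient step; sub-networks that
    have no gradient entry are copied unchanged."""
    updated: Params = {}
    for name, values in params.items():
        gradient = grads.get(name)
        if gradient is None:
            updated[name] = list(values)
        else:
            updated[name] = [p - learning_rate * g
                             for p, g in zip(values, gradient)]
    return updated


def optimize_once(params: Params, related_grads: Params, user_grads: Params,
                  aggregation_grads: Params, related_lr: float = 0.1,
                  user_lr: float = 0.1, aggregation_lr: float = 0.01) -> Params:
    """Apply the three error parameters in turn, each with its own learning
    rate; the related and user rates must exceed the aggregation rate."""
    if not (related_lr > aggregation_lr and user_lr > aggregation_lr):
        raise ValueError("related_lr and user_lr must exceed aggregation_lr")
    params = sgd_update(params, related_grads, related_lr)
    params = sgd_update(params, user_grads, user_lr)
    return sgd_update(params, aggregation_grads, aggregation_lr)
```

A smaller aggregation learning parameter means the aggregation error adjusts the sub-networks more gently than the two mining errors do. In practice the gradients would usually be recomputed after each update rather than precomputed as in this sketch.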
Optionally, in some embodiments, before the step of determining the first cluster of typical videos and loading the first cluster of typical videos to load into the candidate behavior feature mining network, the step of forming an optimized behavior feature mining network by performing a network optimization operation may further include the following:
determining a second cluster of typical videos, loading the second cluster of the typical videos to load the second cluster of the typical videos into a constructed original behavior feature mining network, mining original typical user description vectors corresponding to each typical user monitoring video in the second cluster of the typical videos, wherein the original behavior feature mining network comprises a key information mining unit (coding unit) and a linear integration unit (MLP);
calculating an original mining error parameter according to the original typical user description vectors corresponding to the same typical video sub-cluster in the second cluster of typical videos;
performing network optimization operation on the original behavior feature mining network according to the original mining error parameters, and forming an optimized original behavior feature mining network when network optimization is finished;
And performing parameter configuration operation on the key information mining unit in the target behavior feature mining network according to the optimized key information mining unit in the original behavior feature mining network, and performing parameter configuration operation on the linear integration unit in the target behavior feature mining network according to the optimized linear integration unit in the original behavior feature mining network to form a corresponding candidate behavior feature mining network.
Optionally, in some embodiments, the step of performing parameter configuration operation on the key information mining unit in the target behavior feature mining network according to the key information mining unit in the optimized original behavior feature mining network, and performing parameter configuration operation on the linear integration unit in the target behavior feature mining network according to the linear integration unit in the optimized original behavior feature mining network to form a corresponding candidate behavior feature mining network may further include the following contents:
according to the optimized key information mining unit in the original behavior feature mining network, performing parameter configuration operation (parameter sharing between the first sub-network and the second sub-network) on the key information mining unit shared between the first sub-network and the second sub-network in the target behavior feature mining network, and according to the optimized linear integration unit in the original behavior feature mining network, performing parameter configuration operation on the linear integration unit of the first sub-network and the linear integration unit of the second sub-network in the target behavior feature mining network to form a corresponding candidate behavior feature mining network.
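The parameter configuration operation can be sketched in Python as follows. Parameters are represented as plain lists inside dictionaries, and the keys `encoder` (key information mining unit) and `mlp` (linear integration unit), as well as the function name `build_candidate`, are hypothetical. The sketch assumes that "shared" means both sub-networks reference one and the same set of encoder parameters, while each sub-network receives its own copy of the linear integration unit.

```python
import copy
from typing import Any, Dict


def build_candidate(pretrained: Dict[str, Any]) -> Dict[str, Any]:
    """Configure a candidate network from an optimized original network.

    The key information mining unit is copied once and shared by both
    sub-networks; the linear integration unit is copied separately for
    each sub-network so the two copies can diverge during later training.
    """
    shared_encoder = copy.deepcopy(pretrained["encoder"])
    return {
        "first": {"encoder": shared_encoder,
                  "mlp": copy.deepcopy(pretrained["mlp"])},
        "second": {"encoder": shared_encoder,
                   "mlp": copy.deepcopy(pretrained["mlp"])},
    }
```

Because the copies are deep, later optimization of the candidate network does not alter the optimized original behavior feature mining network.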
With reference to fig. 3, the embodiment of the invention further provides a behavior prediction system based on video analysis, which can be applied to the behavior prediction platform based on video analysis. Wherein, the behavior prediction system based on video analysis can comprise the following modules:
the monitoring video acquisition module is used for acquiring a target user monitoring video, wherein the target user monitoring video is formed by carrying out video monitoring on a target user and comprises multi-frame target user monitoring video frames;
the network optimization module is used for forming an optimized behavior feature mining network by performing network optimization operation;
the behavior feature mining module is used for loading the target user monitoring video to the optimized behavior feature mining network, mining user behavior aggregation description vectors corresponding to the target user monitoring video by utilizing the optimized behavior feature mining network, wherein the user behavior aggregation description vectors corresponding to the target user monitoring video are formed by aggregating description vectors of at least two dimensions;
and the behavior prediction module is used for performing behavior prediction operation on the target user based on the user behavior aggregation description vector so as to output a target behavior prediction result corresponding to the target user, wherein the target behavior prediction result is used for reflecting the predicted behavior of the target user.
Optionally, in some embodiments, the network optimization module is specifically configured to:
determining a first cluster of typical videos, and loading the first cluster of typical videos into a candidate behavior feature mining network, wherein the first cluster of typical videos comprises a plurality of first sub-clusters of typical videos, the first sub-clusters of typical videos comprise a plurality of typical user monitoring videos, and typical users in each of the typical user monitoring videos have corresponding typical user identification data and typical user related identification data, the typical user identification data is used for reflecting user behavior type information of the typical users, and the typical user related identification data is used for reflecting information of video frame parts related to the typical users in the typical user monitoring videos;
performing user related information mining operation on the typical user monitoring videos to output typical user related description vectors corresponding to each typical user monitoring video, performing user information mining operation on the typical user monitoring videos to output typical user description vectors corresponding to each typical user monitoring video, and performing aggregation operation on the same typical user description vectors corresponding to the typical user monitoring videos and the typical user related description vectors to form user behavior aggregation description vectors corresponding to each typical user monitoring video;
Determining relevant mining error parameters according to typical user related description vectors and typical user related identification data corresponding to the same typical user monitoring video, determining user mining error parameters according to typical user description vectors and typical user identification data corresponding to the same typical user monitoring video, and determining aggregation error parameters according to each user behavior aggregation description vector corresponding to the same typical video first sub-cluster;
performing network optimization operation on the candidate behavior feature mining network according to the related mining error parameters, the user mining error parameters and the aggregation error parameters to form a pending behavior feature mining network corresponding to the candidate behavior feature mining network;
and taking the pending behavior feature mining network as the current candidate behavior feature mining network and cyclically executing the step of determining a first cluster of typical videos and loading the first cluster of typical videos into the candidate behavior feature mining network, and, when the cyclic execution stops, marking the currently output pending behavior feature mining network as the optimized behavior feature mining network.
Optionally, in some embodiments, the network optimization module is specifically further configured to:
determining a second cluster of typical videos, loading the second cluster of the typical videos to load the second cluster of the typical videos into a constructed original behavior feature mining network, mining original typical user description vectors corresponding to each typical user monitoring video in the second cluster of the typical videos, wherein the original behavior feature mining network comprises a key information mining unit and a linear integration unit;
calculating an original mining error parameter according to the original typical user description vectors corresponding to the same typical video sub-cluster in the second cluster of typical videos;
performing network optimization operation on the original behavior feature mining network according to the original mining error parameters, and forming an optimized original behavior feature mining network when network optimization is finished;
and performing parameter configuration operation on the key information mining unit in the target behavior feature mining network according to the optimized key information mining unit in the original behavior feature mining network, and performing parameter configuration operation on the linear integration unit in the target behavior feature mining network according to the optimized linear integration unit in the original behavior feature mining network to form a corresponding candidate behavior feature mining network.
In summary, the behavior prediction method and system based on video analysis provided by the invention first collect a target user monitoring video; form an optimized behavior feature mining network by performing a network optimization operation; load the target user monitoring video into the optimized behavior feature mining network and use that network to mine the user behavior aggregation description vector corresponding to the target user monitoring video, the user behavior aggregation description vector being formed by aggregating description vectors of at least two dimensions; and perform a behavior prediction operation on the target user based on the user behavior aggregation description vector, so as to output a target behavior prediction result corresponding to the target user. Because the user behavior aggregation description vector is formed by aggregating description vectors of at least two dimensions, it carries richer information, so a behavior prediction operation performed on the basis of this vector is more reliable, which improves the reliability of the obtained target behavior prediction result and overcomes the defects in the prior art.
The above description covers only preferred embodiments of the present invention and is not intended to limit the present invention; those skilled in the art can make various modifications and variations to the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A behavior prediction method based on video analysis, comprising:
collecting a target user monitoring video, wherein the target user monitoring video is formed by performing video monitoring on a target user and comprises multiple target user monitoring video frames;
performing network optimization operation to form an optimized behavior feature mining network;
loading the target user monitoring video to the optimized behavior feature mining network, mining user behavior aggregation description vectors corresponding to the target user monitoring video by utilizing the optimized behavior feature mining network, wherein the user behavior aggregation description vectors corresponding to the target user monitoring video are formed by aggregating description vectors of at least two dimensions;
and carrying out behavior prediction operation on the target user based on the user behavior aggregation description vector so as to output a target behavior prediction result corresponding to the target user, wherein the target behavior prediction result is used for reflecting the predicted behavior of the target user.
2. The behavior prediction method based on video analysis according to claim 1, wherein the step of forming an optimized behavior feature mining network by performing a network optimization operation includes:
determining a first cluster of typical videos, and loading the first cluster of typical videos into a candidate behavior feature mining network, wherein the first cluster of typical videos comprises a plurality of first sub-clusters of typical videos, the first sub-clusters of typical videos comprise a plurality of typical user monitoring videos, and typical users in each of the typical user monitoring videos have corresponding typical user identification data and typical user related identification data, the typical user identification data is used for reflecting user behavior type information of the typical users, and the typical user related identification data is used for reflecting information of video frame parts related to the typical users in the typical user monitoring videos;
performing user related information mining operation on the typical user monitoring videos to output typical user related description vectors corresponding to each typical user monitoring video, performing user information mining operation on the typical user monitoring videos to output typical user description vectors corresponding to each typical user monitoring video, and performing aggregation operation on the same typical user description vectors corresponding to the typical user monitoring videos and the typical user related description vectors to form user behavior aggregation description vectors corresponding to each typical user monitoring video;
Determining relevant mining error parameters according to typical user related description vectors and typical user related identification data corresponding to the same typical user monitoring video, determining user mining error parameters according to typical user description vectors and typical user identification data corresponding to the same typical user monitoring video, and determining aggregation error parameters according to each user behavior aggregation description vector corresponding to the same typical video first sub-cluster;
performing network optimization operation on the candidate behavior feature mining network according to the related mining error parameters, the user mining error parameters and the aggregation error parameters to form a pending behavior feature mining network corresponding to the candidate behavior feature mining network;
and taking the pending behavior feature mining network as the current candidate behavior feature mining network and cyclically executing the step of determining a first cluster of typical videos and loading the first cluster of typical videos into the candidate behavior feature mining network, and, when the cyclic execution stops, marking the currently output pending behavior feature mining network as the optimized behavior feature mining network.
3. The behavior prediction method based on video analysis according to claim 2, wherein in the case where the description vector to be processed belongs to a typical user-related description vector, the identification data to be processed belongs to typical user-related identification data, and the error parameter to be processed belongs to a related mining error parameter; under the condition that the description vector to be processed belongs to the typical user description vector, the identification data to be processed belongs to the typical user identification data, and the error parameter to be processed belongs to the user mining error parameter;
the operation of determining the error parameter to be processed comprises the following steps:
determining a reference typical user description vector cluster corresponding to each piece of identification data to be processed; the reference typical user description vector cluster comprises a plurality of reference typical user description vectors corresponding to the same type of identification data to be processed, each reference typical user description vector is a center representative description vector that is derived from, and is most similar to, the description vectors to be processed that correspond to that identification data to be processed, and the center representative description vector is formed by aggregating the description vectors to be processed that correspond to the same type of identification data to be processed in the set of description vectors to be processed;
In each reference typical user description vector cluster, determining a reference typical user description vector which is most similar to a to-be-processed description vector corresponding to a to-be-processed typical user monitoring video, and marking the reference typical user description vector as a first reference typical user description vector corresponding to the to-be-processed typical user monitoring video;
and determining corresponding error parameters to be processed according to the difference between the description vector to be processed corresponding to the same typical user monitoring video and the first reference typical user description vector.
4. A video analysis based behavior prediction method according to claim 3, wherein the vector sizes of the description vector to be processed and the first reference representative user description vector are target vector sizes; and determining a corresponding error parameter to be processed according to the difference between the corresponding description vector to be processed of the same typical user monitoring video and the first reference typical user description vector, including:
determining any description vector with vector size equal to the target vector size, and marking distribution coordinates corresponding to vector parameters which are larger than or equal to predetermined reference vector parameters in the any description vector, so as to be the target distribution coordinates;
And determining corresponding error parameters to be processed according to the difference between the corresponding description vector to be processed of the same typical user monitoring video and the vector parameters of the first reference typical user description vector on the same target distribution coordinates.
5. The behavior prediction method based on video analysis according to claim 2, wherein the candidate behavior feature mining network comprises a first sub-network, a second sub-network and an aggregation sub-network, the first sub-network is used for mining typical user related description vectors, the second sub-network is used for mining typical user description vectors, and the aggregation sub-network is used for aggregating to form user behavior aggregation description vectors; the step of performing network optimization operation on the candidate behavior feature mining network according to the related mining error parameter, the user mining error parameter and the aggregation error parameter to form a pending behavior feature mining network corresponding to the candidate behavior feature mining network, includes:
and optimizing the parameters of the first sub-network, the second sub-network and the aggregation sub-network according to the relevant mining error parameters and on the basis of relevant learning parameters, optimizing the parameters of the second sub-network according to the user mining error parameters and on the basis of user learning parameters, and optimizing the parameters of the first sub-network, the second sub-network and the aggregation sub-network according to the aggregation error parameters and on the basis of aggregation learning parameters to form a pending behavior feature mining network corresponding to the candidate behavior feature mining network, wherein the relevant learning parameters and the user learning parameters are all larger than the aggregation learning parameters.
6. The method of behavior prediction based on video analysis of claim 2, wherein prior to the steps of determining a first cluster of typical videos and loading the first cluster of typical videos for loading into a candidate behavior feature mining network, the step of forming an optimized behavior feature mining network by performing a network optimization operation, further comprising:
determining a second cluster of typical videos, loading the second cluster of the typical videos to load the second cluster of the typical videos into a constructed original behavior feature mining network, mining original typical user description vectors corresponding to each typical user monitoring video in the second cluster of the typical videos, wherein the original behavior feature mining network comprises a key information mining unit and a linear integration unit;
calculating an original mining error parameter according to the original typical user description vectors corresponding to the same typical video sub-cluster in the second cluster of typical videos;
performing network optimization operation on the original behavior feature mining network according to the original mining error parameters, and forming an optimized original behavior feature mining network when network optimization is finished;
And performing parameter configuration operation on the key information mining unit in the target behavior feature mining network according to the optimized key information mining unit in the original behavior feature mining network, and performing parameter configuration operation on the linear integration unit in the target behavior feature mining network according to the optimized linear integration unit in the original behavior feature mining network to form a corresponding candidate behavior feature mining network.
7. The behavior prediction method based on video analysis according to claim 6, wherein the steps of performing parameter configuration operation on the key information mining unit in the target behavior feature mining network according to the key information mining unit in the optimized original behavior feature mining network, and performing parameter configuration operation on the linear integration unit in the target behavior feature mining network according to the linear integration unit in the optimized original behavior feature mining network to form a corresponding candidate behavior feature mining network include:
performing parameter configuration operation on the key information mining unit shared between the first sub-network and the second sub-network in the target behavior feature mining network according to the key information mining unit in the optimized original behavior feature mining network, and performing parameter configuration operation on the linear integration unit of the first sub-network and the linear integration unit of the second sub-network in the target behavior feature mining network according to the linear integration unit in the optimized original behavior feature mining network to form a corresponding candidate behavior feature mining network.
8. A behavior prediction system based on video analysis, comprising:
the monitoring video acquisition module is used for acquiring a target user monitoring video, wherein the target user monitoring video is formed by carrying out video monitoring on a target user and comprises multi-frame target user monitoring video frames;
the network optimization module is used for forming an optimized behavior feature mining network by performing network optimization operation;
the behavior feature mining module is used for loading the target user monitoring video to the optimized behavior feature mining network, mining user behavior aggregation description vectors corresponding to the target user monitoring video by utilizing the optimized behavior feature mining network, wherein the user behavior aggregation description vectors corresponding to the target user monitoring video are formed by aggregating description vectors of at least two dimensions;
and the behavior prediction module is used for performing behavior prediction operation on the target user based on the user behavior aggregation description vector so as to output a target behavior prediction result corresponding to the target user, wherein the target behavior prediction result is used for reflecting the predicted behavior of the target user.
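The four modules of claim 8 can be read as a simple pipeline. The sketch below is one possible arrangement under stated assumptions: every name (`BehaviorPredictionSystem`, `toy_network`, the callables) is hypothetical, frames are reduced to two-element feature lists, and the toy network and threshold are placeholders rather than the trained network the application describes.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Frame = List[float]   # assumption: one video frame as a flat feature list
Vector = List[float]  # a user behavior aggregation description vector

@dataclass
class BehaviorPredictionSystem:
    acquire_video: Callable[[], Sequence[Frame]]              # monitoring video acquisition module
    optimize_network: Callable[[], Callable[[Sequence[Frame]], Vector]]  # network optimization module
    predict_behavior: Callable[[Vector], str]                 # behavior prediction module

    def run(self) -> str:
        frames = self.acquire_video()
        network = self.optimize_network()
        # Behavior feature mining module: load the video into the optimized network.
        aggregated = network(frames)
        return self.predict_behavior(aggregated)

def toy_network(frames: Sequence[Frame]) -> Vector:
    # Stand-in for the user description vector (mean of the first feature).
    user_vec = [sum(f[0] for f in frames) / len(frames)]
    # Stand-in for the user related description vector (max of the second feature).
    related_vec = [max(f[1] for f in frames)]
    # Aggregate the two description vectors by concatenation (an assumption).
    return user_vec + related_vec

system = BehaviorPredictionSystem(
    acquire_video=lambda: [[0.2, 0.1], [0.4, 0.9]],
    optimize_network=lambda: toy_network,
    predict_behavior=lambda v: "active" if v[0] + v[1] > 1.0 else "idle",
)
result = system.run()
```

Passing the modules in as callables keeps the sketch independent of any particular model or video source.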
9. The video analysis-based behavior prediction system of claim 8, wherein the network optimization module is specifically configured to:
determining a first cluster of typical videos, and loading the first cluster of typical videos into a candidate behavior feature mining network, wherein the first cluster of typical videos comprises a plurality of first sub-clusters of typical videos, the first sub-clusters of typical videos comprise a plurality of typical user monitoring videos, and typical users in each of the typical user monitoring videos have corresponding typical user identification data and typical user related identification data, the typical user identification data is used for reflecting user behavior type information of the typical users, and the typical user related identification data is used for reflecting information of video frame parts related to the typical users in the typical user monitoring videos;
performing user related information mining operation on the typical user monitoring videos to output typical user related description vectors corresponding to each typical user monitoring video, performing user information mining operation on the typical user monitoring videos to output typical user description vectors corresponding to each typical user monitoring video, and performing aggregation operation on the typical user description vector and the typical user related description vector corresponding to the same typical user monitoring video to form user behavior aggregation description vectors corresponding to each typical user monitoring video;
determining relevant mining error parameters according to typical user related description vectors and typical user related identification data corresponding to the same typical user monitoring video, determining user mining error parameters according to typical user description vectors and typical user identification data corresponding to the same typical user monitoring video, and determining aggregation error parameters according to each user behavior aggregation description vector corresponding to the same typical video first sub-cluster;
performing network optimization operation on the candidate behavior feature mining network according to the related mining error parameters, the user mining error parameters and the aggregation error parameters to form a pending behavior feature mining network corresponding to the candidate behavior feature mining network;
and taking the pending behavior feature mining network as the current candidate behavior feature mining network, cyclically re-executing the step of determining a first cluster of typical videos and loading the first cluster of typical videos into the candidate behavior feature mining network, and marking the currently output pending behavior feature mining network as the optimized behavior feature mining network when the cyclic execution stops.
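The optimization cycle of claim 9 can be sketched as follows. This is a toy under several assumptions not found in the claims: each video is reduced to one scalar feature with scalar labels, the three error parameters are summed with equal weight, the aggregation error is taken as the spread of aggregated vectors inside a sub-cluster, the update is finite-difference gradient descent, and the cycle stops when the error change falls below `tol`. All function and parameter names are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumption: (feature x, user label, user-related label) per typical video.
Video = Tuple[float, float, float]

def total_error(params: Dict[str, float], sub_clusters: List[List[Video]]) -> float:
    user_err = related_err = agg_err = 0.0
    count = 0
    for cluster in sub_clusters:
        aggregated = []
        for x, y_user, y_rel in cluster:
            user_vec = params["w_user"] * x   # toy user description vector
            rel_vec = params["w_rel"] * x     # toy user related description vector
            user_err += (user_vec - y_user) ** 2      # user mining error
            related_err += (rel_vec - y_rel) ** 2     # related mining error
            aggregated.append((user_vec, rel_vec))    # aggregation by pairing
            count += 1
        # Aggregation error: squared distance to the sub-cluster centroid.
        cu = sum(a[0] for a in aggregated) / len(aggregated)
        cr = sum(a[1] for a in aggregated) / len(aggregated)
        agg_err += sum((a[0] - cu) ** 2 + (a[1] - cr) ** 2 for a in aggregated)
    return (user_err + related_err + agg_err) / count

def optimize(params, sub_clusters, lr=0.05, max_rounds=200, tol=1e-9, eps=1e-6):
    candidate = dict(params)
    previous = total_error(candidate, sub_clusters)
    for _ in range(max_rounds):
        grads = {}
        for name in candidate:
            shifted = dict(candidate)
            shifted[name] += eps
            grads[name] = (total_error(shifted, sub_clusters) - previous) / eps
        # The updated network is the "pending" network of this round ...
        pending = {n: candidate[n] - lr * grads[n] for n in candidate}
        current = total_error(pending, sub_clusters)
        # ... and becomes the candidate network of the next round.
        candidate = pending
        if abs(previous - current) < tol:
            break
        previous = current
    return candidate

sub_clusters = [
    [(1.0, 2.0, 1.0), (1.2, 2.4, 1.2)],
    [(0.5, 1.0, 0.5), (0.6, 1.2, 0.6)],
]
initial = {"w_user": 0.0, "w_rel": 0.0}
trained = optimize(initial, sub_clusters)
```

In this toy the labels were generated with `w_user = 2` and `w_rel = 1`, so the trained parameters should land close to those values; the aggregation term pulls them slightly toward zero.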
10. The video analysis-based behavior prediction system of claim 9, wherein the network optimization module is further specifically configured to:
determining a second cluster of typical videos, loading the second cluster of typical videos into a constructed original behavior feature mining network, and mining original typical user description vectors corresponding to each typical user monitoring video in the second cluster of typical videos, wherein the original behavior feature mining network comprises a key information mining unit and a linear integration unit;
calculating original mining error parameters according to each original typical user description vector corresponding to the same typical video sub-cluster in the second cluster of typical videos;
performing network optimization operation on the original behavior feature mining network according to the original mining error parameters, and forming an optimized original behavior feature mining network when network optimization is finished;
and performing parameter configuration operation on the key information mining unit in the target behavior feature mining network according to the optimized key information mining unit in the original behavior feature mining network, and performing parameter configuration operation on the linear integration unit in the target behavior feature mining network according to the optimized linear integration unit in the original behavior feature mining network to form a corresponding candidate behavior feature mining network.
CN202310934107.2A 2023-07-27 2023-07-27 Behavior prediction method and system based on video analysis Pending CN116935280A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310934107.2A CN116935280A (en) 2023-07-27 2023-07-27 Behavior prediction method and system based on video analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310934107.2A CN116935280A (en) 2023-07-27 2023-07-27 Behavior prediction method and system based on video analysis

Publications (1)

Publication Number Publication Date
CN116935280A true CN116935280A (en) 2023-10-24

Family

ID=88380405

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310934107.2A Pending CN116935280A (en) 2023-07-27 2023-07-27 Behavior prediction method and system based on video analysis

Country Status (1)

Country Link
CN (1) CN116935280A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117152668A (en) * 2023-10-30 2023-12-01 成都方顷科技有限公司 Intelligent logistics implementation method, device and equipment based on Internet of things

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117152668A (en) * 2023-10-30 2023-12-01 成都方顷科技有限公司 Intelligent logistics implementation method, device and equipment based on Internet of things
CN117152668B (en) * 2023-10-30 2024-02-06 成都方顷科技有限公司 Intelligent logistics implementation method, device and equipment based on Internet of things

Similar Documents

Publication Publication Date Title
CN116935280A (en) Behavior prediction method and system based on video analysis
CN115603973B (en) Heterogeneous security monitoring method and system based on government information network
CN116109630B (en) Image analysis method and system based on sensor acquisition and artificial intelligence
CN116664335B (en) Intelligent monitoring-based operation analysis method and system for semiconductor production system
CN116310914B (en) Unmanned aerial vehicle monitoring method and system based on artificial intelligence
CN116109988B (en) Anomaly monitoring method and system based on artificial intelligence and unmanned aerial vehicle
CN116681350A (en) Intelligent factory fault detection method and system
CN117197748A (en) Behavior management method and system based on video analysis
CN115185724A (en) Fault processing method, device, electronic equipment and storage medium
CN116996403B (en) Network traffic diagnosis method and system applying AI model
CN117115741A (en) User monitoring method and system based on intelligent building
CN115909215B (en) Edge intrusion early warning method and system based on target detection
CN117150123A (en) Resource allocation method and system based on cloud computing
CN116628181B (en) User control preference sensing method and system based on Internet of things
CN117218594A (en) Security monitoring data processing method and system
CN117253172A (en) Object association method and system based on video processing
CN117201149A (en) Data access method and system based on cloud computing
CN117671554A (en) Security monitoring method and system
CN117218597A (en) Behavior guiding method and system based on security monitoring
CN117278586A (en) Control method and system for Internet of things equipment
CN117152070A (en) Integral detection method and system for building engineering
CN117194525A (en) Data analysis method and system for multi-source service data
CN117422302A (en) Information prediction method and system based on wind control model
CN117221134A (en) State analysis method and system based on Internet
CN117351422A (en) Intelligent building monitoring method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination