CN113467740B - Video monitoring array display optimization method and device based on joint coding - Google Patents
Video monitoring array display optimization method and device based on joint coding
- Publication number
- CN113467740B (application number CN202110802969.0A)
- Authority
- CN
- China
- Prior art keywords
- encoder
- global
- monitoring
- sequence
- local
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/14—Digital output to display device ; Cooperation and interconnection of the display device with other functional units
- G06F3/1407—General aspects irrespective of display type, e.g. determination of decimal point position, display with fixed or driving decimal point, suppression of non-significant zeros
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a video monitoring array display optimization method and device based on joint coding, belonging to the technical field of artificial intelligence. The method comprises: constructing a global encoder and a local encoder; constructing, with a deep-learning recurrent neural network structure, a joint coding monitoring strategy recommendation model containing the global encoder and the local encoder; calculating a similarity score between the representation of the current monitoring sequence and each candidate item with a bilinear similarity function, and obtaining from each item's similarity score the probability that the corresponding monitoring picture appears next; and optimizing the display ordering of the video monitoring array based on the probability that each monitoring picture appears next. By constructing the joint coding monitoring strategy recommendation model containing the global encoder and the local encoder, the behavior of the monitoring personnel is visually analyzed, and the recurrent neural network structure then automatically captures and summarizes the optimized behavior of the monitoring personnel.
Description
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a video monitoring array display optimization method and device based on joint coding.
Background
In recent years, with the development of technology and the progress of society, video monitoring has developed rapidly and is increasingly applied in both traditional and non-traditional security fields. The video monitoring system is one of the most important security measures in use today. As the number of video monitoring points grows, the number of monitoring videos to be viewed far exceeds the number that the monitoring screens of a command center can display. Supervisors who carry out video round-robin inspection manually face high working intensity and low efficiency, and effective management and control are difficult to achieve. With the development of computer vision and artificial intelligence, the round-robin mechanism of intelligent monitoring systems has reduced the working intensity of supervisors to some extent and improved the efficiency of security management, but the current round-robin mechanism can cause serious information loss.
Existing techniques for ordering the display of a monitoring camera array follow two main approaches: fixed-rule ordering and abnormal-picture ordering. Abnormal-picture ordering can be done in two ways. The first calculates video weights based on image comparison: the weight of each terminal is calculated from the difference between successive video frames of a single video acquisition terminal, and the weights are then used to screen and determine the playing order of the video streams on the monitor screen. This method works well for round-robin inspection of monitoring pictures that remain mostly static over long periods, but has little effect on monitoring pictures that change continuously. The second uses background extraction to judge whether people, abnormal equipment and the like have intruded, and gives priority to the corresponding cameras in the round-robin inspection. This method places high demands on moving-object detection, and its false alarm rate is high because of the working environment of the cameras and the limitations of moving-object detection technology. Fixed-rule ordering performs round-robin inspection of fixed pictures at fixed intervals according to the existing experience of the monitoring personnel. It requires the personnel to be familiar with the areas and times at which risks tend to occur, and because the round-robin order is fixed, timed and targeted inspection of the high-risk areas of different time periods is not possible.
Both ordering methods have significant drawbacks. The fixed-rule round-robin technique depends heavily on the experience of the monitoring personnel and cannot accurately cover different risk areas at different times. The abnormal-picture round-robin technique is limited by the accuracy of intelligent image analysis, has a high false alarm rate, and impairs the monitoring personnel's judgment of risks.
The video monitoring array ordering problem can in theory be classified as a recommendation problem and can be effectively addressed by a recommendation algorithm. Monitoring system logs do not focus on information about the operator, because the display order of the monitoring cameras matters more to the monitoring system than operator information. The only information that can be effectively used is therefore the operator's viewing sequence and the corresponding viewing times. In this case, traditional recommendation methods often predict the next picture to display inaccurately, and their recommendations suffer from lag and repetition. A session-based recommendation system, however, can effectively solve these problems.
Disclosure of Invention
The invention provides a video monitoring array display optimization method and device based on joint coding. A joint coding monitoring strategy recommendation model containing a global encoder and a local encoder is constructed to visually analyze the behavior of the monitoring personnel, and a recurrent neural network structure is then used to automatically capture and summarize the optimized behavior of the monitoring personnel.
The specific technical scheme provided by the invention is as follows:
In one aspect, the invention provides a video monitoring array display optimization method based on joint coding, which comprises the following steps:
constructing a global encoder by taking the whole monitoring sequence as the input of the global encoder and the behavior characteristics of the monitoring personnel in the monitoring sequence as its output;
constructing a local encoder by using an object-level attention mechanism to dynamically select and linearly combine different parts of the input sequence;
constructing, with a deep-learning recurrent neural network structure, a joint coding monitoring strategy recommendation model containing the global encoder and the local encoder;
calculating a similarity score between the representation of the current monitoring sequence and each candidate item with a bilinear similarity function, and obtaining from each item's similarity score the probability that the corresponding monitoring picture appears next;
optimizing the display ordering of the video monitoring array based on the probability that each monitoring picture appears next.
Optionally, the building the global encoder specifically includes:
grouping the data sets according to the operation object organization, and sorting the grouped data sets according to the operation time, wherein one object organization arranged according to the time sequence corresponds to a sequence, and the data sets comprise a user name, an operation object, the operation object organization and the operation time;
using the sorted data sets, calculating the reset gate $r_t$ according to the formula $r_t = \sigma(W_r x_t + U_r h_{t-1})$, wherein $\sigma$ is the Sigmoid activation function, $x_t$ is the $t$-th input data of the global encoder, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, and $W_r$ and $U_r$ are weight matrices;
calculating the candidate behavior $\tilde{h}_t$ according to the formula $\tilde{h}_t = \tanh\left(W x_t + U (r_t \odot h_{t-1})\right)$, wherein $r_t$ is the reset gate, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, $x_t$ is the $t$-th input data of the global encoder, $W$ and $U$ are weight matrices, and $\odot$ is the Hadamard product;
calculating the update gate $z_t$ according to the formula $z_t = \sigma(W_z x_t + U_z h_{t-1})$, wherein $\sigma$ is the Sigmoid activation function, $x_t$ is the $t$-th input data of the global encoder, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, and $W_z$ and $U_z$ are weight matrices;
calculating the relation between the candidate behavior $\tilde{h}_t$ and the preceding behavior $h_{t-1}$ according to the formula $h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$, wherein $z_t$ is the update gate, $\tilde{h}_t$ is the candidate behavior, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, and $h_t$ is the output of the global encoder, which represents the characteristics of the operation sequence.
Optionally, the constructing a local encoder specifically includes:
grouping the data sets according to the operation object organization, and sorting the grouped data sets according to the operation time, wherein one object organization arranged according to the time sequence corresponds to a sequence, and the data sets comprise a user name, an operation object, the operation object organization and the operation time;
calculating the similarity between the global encoder hidden layer output $h_t^g$ and the local encoder hidden layer vector representation $h_j^l$ according to the formula $q(h_t^g, h_j^l) = v^T \sigma(A_1 h_t^g + A_2 h_j^l)$, wherein the matrix $A_1$ converts $h_t^g$ into a latent space, the matrix $A_2$ converts $h_j^l$ into the latent space, $\sigma$ is the Sigmoid activation function, and $v^T$ is a dimension conversion matrix;
calculating the weighting factor $\alpha_{tj}$ according to the formula $\alpha_{tj} = q(h_t^g, h_j^l)$, wherein $h_t^g$ is the global encoder hidden layer output and $h_j^l$ is the local encoder hidden layer vector representation;
calculating the intention coefficient $c_t^l$ of the monitoring personnel in the monitoring sequence according to the formula $c_t^l = \sum_{j=1}^{t} \alpha_{tj} h_j^l$, wherein $\alpha_{tj}$ is the weighting factor and $h_j^l$ is the local encoder hidden layer vector representation.
Optionally, the constructing a joint coding monitoring strategy recommendation model containing a global encoder and a local encoder specifically includes:
constructing, with a deep-learning recurrent neural network structure, a joint coding monitoring strategy recommendation model containing the global encoder and the local encoder, wherein the global encoder is used for summarizing the whole monitoring sequence, and the local encoder is used for adaptively selecting the important items in the current session.
Optionally, in the process of constructing the joint coding monitoring strategy recommendation model, the global encoder hidden state $h_t^g$ is integrated into $c_t$ to provide a sequential behavior representation for the joint coding monitoring strategy recommendation model. The global encoder hidden state $h_t^g$ plays a different role from the local encoder hidden state $h_t^l$: $h_t^l$ is used to calculate the attention weights with the previous hidden states, whereas $h_t^g$ is used to encode the entire sequence behavior.
In another aspect, the invention also provides a video monitoring array display optimization device based on joint coding, which comprises:
a global construction module, used for constructing a global encoder by taking the whole monitoring sequence as the input of the global encoder and the behavior characteristics of the monitoring personnel in the monitoring sequence as its output;
a local construction module, used for constructing a local encoder by using an object-level attention mechanism to dynamically select and linearly combine different parts of the input sequence;
a model construction module, used for constructing, with a deep-learning recurrent neural network structure, a joint coding monitoring strategy recommendation model containing the global encoder and the local encoder;
a similarity calculation module, used for calculating a similarity score between the representation of the current monitoring sequence and each candidate item with a bilinear similarity function, and obtaining from each item's similarity score the probability that the corresponding monitoring picture appears next;
a display ordering module, used for optimizing the display ordering of the video monitoring array based on the probability that each monitoring picture appears next.
Optionally, the global building module is specifically configured to:
grouping the data sets according to the operation object organization, and sorting the grouped data sets according to the operation time, wherein one object organization arranged according to the time sequence corresponds to a sequence, and the data sets comprise a user name, an operation object, the operation object organization and the operation time;
using the sorted data sets, calculating the reset gate $r_t$ according to the formula $r_t = \sigma(W_r x_t + U_r h_{t-1})$, wherein $\sigma$ is the Sigmoid activation function, $x_t$ is the $t$-th input data of the global encoder, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, and $W_r$ and $U_r$ are weight matrices;
calculating the candidate behavior $\tilde{h}_t$ according to the formula $\tilde{h}_t = \tanh\left(W x_t + U (r_t \odot h_{t-1})\right)$, wherein $r_t$ is the reset gate, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, $x_t$ is the $t$-th input data of the global encoder, $W$ and $U$ are weight matrices, and $\odot$ is the Hadamard product;
calculating the update gate $z_t$ according to the formula $z_t = \sigma(W_z x_t + U_z h_{t-1})$, wherein $\sigma$ is the Sigmoid activation function, $x_t$ is the $t$-th input data of the global encoder, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, and $W_z$ and $U_z$ are weight matrices;
calculating the relation between the candidate behavior $\tilde{h}_t$ and the preceding behavior $h_{t-1}$ according to the formula $h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$, wherein $z_t$ is the update gate, $\tilde{h}_t$ is the candidate behavior, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, and $h_t$ is the output of the global encoder, which represents the characteristics of the operation sequence.
Optionally, the local construction module is specifically configured to:
grouping the data sets according to the operation object organization, and sorting the grouped data sets according to the operation time, wherein one object organization arranged according to the time sequence corresponds to a sequence, and the data sets comprise a user name, an operation object, the operation object organization and the operation time;
calculating the similarity between the global encoder hidden layer output $h_t^g$ and the local encoder hidden layer vector representation $h_j^l$ according to the formula $q(h_t^g, h_j^l) = v^T \sigma(A_1 h_t^g + A_2 h_j^l)$, wherein the matrix $A_1$ converts $h_t^g$ into a latent space, the matrix $A_2$ converts $h_j^l$ into the latent space, $\sigma$ is the Sigmoid activation function, and $v^T$ is a dimension conversion matrix;
calculating the weighting factor $\alpha_{tj}$ according to the formula $\alpha_{tj} = q(h_t^g, h_j^l)$, wherein $h_t^g$ is the global encoder hidden layer output and $h_j^l$ is the local encoder hidden layer vector representation;
calculating the intention coefficient $c_t^l$ of the monitoring personnel in the monitoring sequence according to the formula $c_t^l = \sum_{j=1}^{t} \alpha_{tj} h_j^l$, wherein $\alpha_{tj}$ is the weighting factor and $h_j^l$ is the local encoder hidden layer vector representation.
Optionally, the model building module is specifically configured to:
constructing, with a deep-learning recurrent neural network structure, a joint coding monitoring strategy recommendation model containing the global encoder and the local encoder, wherein the global encoder is used for summarizing the whole monitoring sequence, and the local encoder is used for adaptively selecting the important items in the current session.
Optionally, in the process of constructing the joint coding monitoring strategy recommendation model, the global encoder hidden state $h_t^g$ is integrated into $c_t$ to provide a sequential behavior representation for the joint coding monitoring strategy recommendation model. The global encoder hidden state $h_t^g$ plays a different role from the local encoder hidden state $h_t^l$: $h_t^l$ is used to calculate the attention weights with the previous hidden states, whereas $h_t^g$ is used to encode the entire sequence behavior.
The beneficial effects of the invention are as follows:
The video monitoring array display optimization method based on joint coding provided by the embodiment of the invention constructs a global encoder by taking the whole monitoring sequence as the input of the global encoder and the behavior characteristics of the monitoring personnel in the monitoring sequence as its output; constructs a local encoder by using an object-level attention mechanism to dynamically select and linearly combine different parts of the input sequence; constructs, with a deep-learning recurrent neural network structure, a joint coding monitoring strategy recommendation model containing the global encoder and the local encoder; calculates a similarity score between the representation of the current monitoring sequence and each candidate item with a bilinear similarity function, and obtains from each item's similarity score the probability that the corresponding monitoring picture appears next; and optimizes the display ordering of the video monitoring array based on the probability that each monitoring picture appears next. By constructing the joint coding monitoring strategy recommendation model containing the global encoder and the local encoder, the behavior of the monitoring personnel is visually analyzed, and the recurrent neural network structure then automatically captures and summarizes the optimized behavior of the monitoring personnel.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a video surveillance array display optimization method based on joint coding according to an embodiment of the present invention;
FIG. 2 is a block diagram of a video surveillance array display optimization device based on joint coding according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a global encoder provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of a local encoder according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a joint coding monitoring strategy recommendation model according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the embodiments of the present invention will be described in further detail with reference to the accompanying drawings.
The following will describe in detail a video surveillance array display optimization method and apparatus based on joint coding according to an embodiment of the present invention with reference to fig. 1 to fig. 5.
Referring to fig. 1, fig. 3, fig. 4 and fig. 5, the video monitoring array display optimization method based on joint coding provided by the embodiment of the invention includes:
Step 100: constructing a global encoder by taking the whole monitoring sequence as the input of the global encoder and the behavior characteristics of the monitoring personnel in the monitoring sequence as its output;
Specifically, referring to FIG. 3, the data sets are grouped according to the operation object organization, and the grouped data sets are sorted according to the operation time, so that each object organization, arranged in time order, corresponds to one sequence, wherein the data sets comprise a user name, an operation object organization and the operation time;
using the sorted data sets, calculating the reset gate $r_t$ according to the formula $r_t = \sigma(W_r x_t + U_r h_{t-1})$, wherein $\sigma$ is the Sigmoid activation function, $x_t$ is the $t$-th input data of the global encoder, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, and $W_r$ and $U_r$ are weight matrices;
calculating the candidate behavior $\tilde{h}_t$ according to the formula $\tilde{h}_t = \tanh\left(W x_t + U (r_t \odot h_{t-1})\right)$, wherein $r_t$ is the reset gate, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, $x_t$ is the $t$-th input data of the global encoder, $W$ and $U$ are weight matrices, and $\odot$ is the Hadamard product;
calculating the update gate $z_t$ according to the formula $z_t = \sigma(W_z x_t + U_z h_{t-1})$, wherein $\sigma$ is the Sigmoid activation function, $x_t$ is the $t$-th input data of the global encoder, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, and $W_z$ and $U_z$ are weight matrices;
calculating the relation between the candidate behavior $\tilde{h}_t$ and the preceding behavior $h_{t-1}$ according to the formula $h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$, wherein $z_t$ is the update gate, $\tilde{h}_t$ is the candidate behavior, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, and $h_t$ is the output of the global encoder, which represents the characteristics of the operation sequence.
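As a minimal, dependency-free sketch of the gated update described in Step 100, the following Python code runs one reset gate, one update gate, a candidate behavior and a blended output per input. The function names, the dictionary of weights `p`, the use of `tanh` for the candidate, and the choice to weight the candidate by the update gate are assumptions for illustration; the embodiment does not prescribe a particular implementation.

```python
import math

def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]

def matvec(m, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * a for w, a in zip(row, v)) for row in m]

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def hadamard(u, v):
    return [a * b for a, b in zip(u, v)]

def gru_step(x_t, h_prev, p):
    # Reset gate r_t and update gate z_t, both Sigmoid-activated.
    r_t = sigmoid(add(matvec(p["W_r"], x_t), matvec(p["U_r"], h_prev)))
    z_t = sigmoid(add(matvec(p["W_z"], x_t), matvec(p["U_z"], h_prev)))
    # Candidate behavior; tanh is assumed here as the usual choice.
    pre = add(matvec(p["W"], x_t), matvec(p["U"], hadamard(r_t, h_prev)))
    h_cand = [math.tanh(a) for a in pre]
    # Blend the previous output with the candidate using the update gate.
    return [(1.0 - z) * h + z * c for z, h, c in zip(z_t, h_prev, h_cand)]

def encode(xs, p, hidden_size):
    # Run the encoder over a whole sequence, starting from a zero state.
    h = [0.0] * hidden_size
    states = []
    for x_t in xs:
        h = gru_step(x_t, h, p)
        states.append(h)
    return states
```

The last element returned by `encode` plays the role of the global encoder output for the whole sequence, while the full list of states is what an attention mechanism would later weight.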
The whole monitoring sequence is used as the input of the global encoder, and the behavior characteristics of the monitoring personnel in the sequence are used as its output. The data set contains 14 data items, including operation user, operation user IP, operation user MAC, operation user organization, operation service, operation action, operation object type, operation object organization, description, operation time, operation result, new value and original value.
The data sets are grouped according to the operation object organization, and the grouped data sets are sorted according to the operation time, so that each object organization, arranged in time order, corresponds to one sequence. In the global encoder, the input data is divided into fixed-size batches for training. The batch size determines the number of samples used in one training step and affects the degree of optimization of the model, the parameter settings of the model input layer, and the training speed. An appropriate batch size is chosen to strike the best balance between memory efficiency and memory capacity.
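The grouping, sorting and batching described above might be sketched as follows. The record field names (`object_org`, `object`, `time`), the use of the operation object as the sequence item, and ISO-formatted time strings are all hypothetical choices made for this example only.

```python
from collections import defaultdict

def build_sequences(records):
    """Group log records by operation object organization, then sort each
    group by operation time to obtain one sequence per organization."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["object_org"]].append(rec)
    return {
        org: [r["object"] for r in sorted(recs, key=lambda r: r["time"])]
        for org, recs in groups.items()
    }

def batches(items, batch_size):
    # Yield fixed-size batches; the last batch may be shorter.
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]
```

ISO-formatted timestamps sort correctly as plain strings, which keeps the sketch free of date parsing; real logs may need an explicit conversion.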
According to the embodiment of the invention, the global encoder splits each user's sequence data according to the characteristics of the user's click sequence, taking the previous click as input and the next click as output. The correspondence between input and output preserves the relevance between the data and solves the problem that large differences in sequence length make modeling difficult.
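One simple reading of this splitting, shown here only as an illustrative sketch, pairs every click with the click that follows it, so that sequences of any length yield training examples of the same shape. The function name `next_click_pairs` is hypothetical.

```python
def next_click_pairs(sequence):
    """Return (previous click, next click) training pairs for one sequence."""
    return list(zip(sequence[:-1], sequence[1:]))
```

For a sequence such as `["cam1", "cam3", "cam2"]`, this produces the pairs `("cam1", "cam3")` and `("cam3", "cam2")`; a sequence with a single click produces no pairs.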
Step 200: constructing a local encoder by using an object-level attention mechanism to dynamically select and linearly combine different parts of the input sequence;
Because the global encoder's vectorized summary of the whole monitoring sequence can hardly capture the intention of the monitoring personnel accurately, a local encoder oriented to video monitoring is designed on this basis; it has the advantage of adaptively capturing the intention of the monitoring personnel.
Referring to fig. 4, in the construction process of the local encoder, data sets are grouped according to operation object organizations, and the grouped data sets are ordered according to operation time, one object organization arranged according to time sequence corresponds to a sequence, wherein the data sets include a user name, an operation object organization, and an operation time. In the construction process of the local encoder, the adopted data set is the same as the data set adopted in the construction process of the global encoder, and the data preprocessing mode is the same as that of the global encoder.
Referring to FIG. 4, the similarity between the global encoder hidden layer output $h_t^g$ and the local encoder hidden layer vector representation $h_j^l$ is calculated according to the formula $q(h_t^g, h_j^l) = v^T \sigma(A_1 h_t^g + A_2 h_j^l)$, wherein the matrix $A_1$ converts $h_t^g$ into a latent space, the matrix $A_2$ converts $h_j^l$ into the latent space, $\sigma$ is the Sigmoid activation function, and $v^T$ is a dimension conversion matrix;
the weighting factor $\alpha_{tj}$ is calculated according to the formula $\alpha_{tj} = q(h_t^g, h_j^l)$, wherein $h_t^g$ is the global encoder hidden layer output and $h_j^l$ is the local encoder hidden layer vector representation;
the intention coefficient $c_t^l$ of the monitoring personnel in the monitoring sequence is calculated according to the formula $c_t^l = \sum_{j=1}^{t} \alpha_{tj} h_j^l$, wherein $\alpha_{tj}$ is the weighting factor and $h_j^l$ is the local encoder hidden layer vector representation.
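The attention computation of Step 200 can be illustrated with the short sketch below: each local hidden vector is scored against the global hidden output through two projection matrices and a Sigmoid, and the scores then weight a linear combination of the local hidden vectors. The function and argument names are assumptions, and the weights are left unnormalized because the text does not mention a normalization step.

```python
import math

def matvec(m, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * a for w, a in zip(row, v)) for row in m]

def attention_context(h_t_global, local_states, A1, A2, v):
    """Return (weights, context): one weighting factor per local hidden
    vector and the weighted sum of the local hidden vectors."""
    a1 = matvec(A1, h_t_global)  # global output projected to the latent space
    weights = []
    for h_j in local_states:
        a2 = matvec(A2, h_j)  # local vector projected to the latent space
        gate = [1.0 / (1.0 + math.exp(-(p + q))) for p, q in zip(a1, a2)]
        weights.append(sum(vi * gi for vi, gi in zip(v, gate)))
    dim = len(local_states[0])
    context = [
        sum(w * h_j[k] for w, h_j in zip(weights, local_states))
        for k in range(dim)
    ]
    return weights, context
```

Keeping the projections `A1` and `A2` separate lets the global and local hidden vectors have different sizes, as long as both map into the same latent dimension as `v`.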
Step 300: constructing, with a deep-learning recurrent neural network structure, a joint coding monitoring strategy recommendation model containing the global encoder and the local encoder;
Referring to FIG. 5, a joint coding monitoring strategy recommendation model containing the global encoder and the local encoder is constructed with a deep-learning recurrent neural network structure, wherein the global encoder is used for summarizing the whole monitoring sequence, and the local encoder is used for adaptively selecting the important items in the current session.
In the process of constructing the joint coding monitoring strategy recommendation model, the global encoder hidden state $h_t^g$ is integrated into $c_t$ to provide a sequential behavior representation for the joint coding monitoring strategy recommendation model. The global encoder hidden state $h_t^g$ plays a different role from the local encoder hidden state $h_t^l$: $h_t^l$ is used to calculate the attention weights with the previous hidden states, whereas $h_t^g$ is used to encode the entire sequence behavior.
The embodiment of the invention uses a deep-learning recurrent neural network structure to construct a joint coding monitoring strategy recommendation model containing the global encoder and the local encoder. For session-based camera monitoring tasks, the global encoder summarizes the entire monitoring sequence, while the local encoder adaptively selects the important items in the current session. The sequential behavior representation helps extract the user's main purpose in the current session. Embodiments of the invention therefore use the sequential behavior representation together with the previous hidden states to calculate the attention weight of each user click.
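One way to write this integration, assuming the global and local representations are combined by concatenation (the combining operator is not stated in the text and is an assumption taken from common hybrid encoder designs), is:

```latex
c_t = \left[\, h_t^{g} \,;\; c_t^{l} \,\right]
    = \left[\, h_t^{g} \,;\; \sum_{j=1}^{t} \alpha_{tj}\, h_j^{l} \right]
```

Under this assumption, the dimension of $c_t$ is the sum of the global and local hidden sizes, which is why a dimension conversion is needed before $c_t$ can be compared with item embeddings.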
Step 400: calculating a similarity score between the representation of the current monitoring sequence and each candidate item with a bilinear similarity function, and obtaining from each item's similarity score the probability that the corresponding monitoring picture appears next;
Step 500: optimizing the display ordering of the video monitoring array based on the probability that each monitoring picture appears next.
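Step 500 can be pictured with the following sketch, which sorts cameras by their predicted probability of appearing next and fills a display grid row by row. The grid layout, the tie-breaking by camera name and the function name `order_cameras` are assumptions; the embodiment only states that the ordering is based on the probability values.

```python
def order_cameras(probabilities, rows, cols):
    """Arrange the most probable cameras into a rows x cols display grid."""
    # Highest probability first; ties are broken by camera name for stability.
    ranked = sorted(probabilities, key=lambda cam: (-probabilities[cam], cam))
    shown = ranked[:rows * cols]
    return [shown[r * cols:(r + 1) * cols] for r in range(rows)]
```

Cameras that do not fit into the grid are simply left out of the current layout in this sketch; a real system could instead queue them for the next switching interval.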
Round-robin inspection means that the pictures of all installed cameras are displayed on a screen in camera order, switching to the picture of the next camera every few seconds or minutes. Round-robin inspection removes the need to click manually to switch pictures, and is generally suitable for night duty in community security rooms and electronic patrol in shopping-mall security rooms. The round-robin strategy refers to the display order of the cameras and the switching interval. A recurrent neural network (RNN) is a class of recursive neural network that takes sequence data as input, recurses along the direction in which the sequence evolves, and connects all its nodes in a chain. Because recurrent neural networks have memory, share parameters, and are Turing complete, they have certain advantages in learning the nonlinear characteristics of sequences.
The embodiment of the invention uses a bilinear decoding scheme, which reduces the number of parameters and improves the performance of the model. The similarity score $S_i$ between the representation $c_t$ of the current monitoring sequence and each candidate item $i$ is calculated with a bilinear similarity function according to the formula $S_i = \mathrm{emb}_i^{\top} B\, c_t$, where $\mathrm{emb}_i$ is the embedding of candidate item $i$ and $B$ is a dimension conversion matrix that converts $c_t$ to the same dimension as the embedding layer. Finally, the similarity score of each item is input to the softmax layer to obtain the probability that the corresponding camera picture will appear next.
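The bilinear scoring and softmax step can be sketched in plain Python as follows. This assumes the common bilinear form $S_i = \mathrm{emb}_i^{\top} B\, c_t$; the function names, the toy embeddings, and the values of `B` and `c_t` are illustrative assumptions, not values from the patent.

```python
import math


def bilinear_scores(embeddings, B, c_t):
    """Compute S_i = emb_i^T (B c_t) for every candidate embedding."""
    projected = [sum(b * c for b, c in zip(row, c_t)) for row in B]  # B c_t
    return [sum(e * p for e, p in zip(emb, projected)) for emb in embeddings]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy setup: 3 candidate cameras with 2-dimensional embeddings, and a
# 3-dimensional sequence representation c_t that B maps down to 2 dimensions.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
B = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
c_t = [2.0, 1.0, 5.0]

scores = bilinear_scores(embeddings, B, c_t)
probs = softmax(scores)
print(scores, probs)
```

Computing `B c_t` once and reusing it for every candidate keeps the cost of scoring linear in the number of candidates.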
The joint-coding-based video monitoring array display optimization method provided by the embodiment of the invention constructs a global encoder by taking the whole monitoring sequence as the input of the global encoder and the behavior characteristics of the monitoring personnel in the monitoring sequence as its output; constructs a local encoder by dynamically selecting and linearly combining different parts of the input sequence with an object-level attention mechanism; constructs a joint coding monitoring strategy recommendation model containing the global encoder and the local encoder by using a deep-learning recurrent neural network structure; calculates a similarity score by using a bilinear similarity function between the representation of the current monitoring sequence and each candidate item, and obtains from the similarity score of each item the probability that the corresponding monitoring picture appears next; and optimizes the video surveillance array display ordering based on the probability that each surveillance picture appears next. By constructing the joint coding monitoring strategy recommendation model containing the global encoder and the local encoder, the behavior of the monitoring personnel is analyzed visually, and the recurrent neural network structure is then used to automatically capture and summarize the optimized behavior of the monitoring personnel.
The embodiment of the invention uses the operation log of the monitoring system to learn the behavior of operators automatically and provides a video monitoring array display optimization method based on a global-local joint coding model. This addresses the shortcomings of the conventional polling mechanism, which relies heavily on the experience of the monitoring personnel and cannot poll the monitored area accurately. The global encoder summarizes the overall operation sequence and uses the GRU as its main unit; because the GRU has lower computational complexity and better scalability, longer operation sequences can be summarized. The local encoder adaptively selects the important items in the operation sequence, thereby capturing the operator's main purpose.
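The operation log can be turned into training sequences by grouping records by the organization of the operated object and sorting each group by operation time, using the four data set fields named in this document (user name, operation object, operation object organization, operation time). The sketch below is an assumption about one way to do this; the dictionary keys and sample records are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def build_sequences(records):
    """Group log records by organization and sort each group by operation time.

    Returns a dict mapping organization -> time-ordered list of operated cameras.
    """
    grouped = defaultdict(list)
    for record in records:
        grouped[record["organization"]].append(record)
    return {
        org: [r["camera"] for r in sorted(rows, key=lambda r: r["time"])]
        for org, rows in grouped.items()
    }


# Hypothetical log entries; field names are illustrative only.
log = [
    {"user": "op1", "camera": "cam_b", "organization": "north", "time": datetime(2021, 7, 15, 9, 5)},
    {"user": "op1", "camera": "cam_a", "organization": "north", "time": datetime(2021, 7, 15, 9, 0)},
    {"user": "op2", "camera": "cam_x", "organization": "south", "time": datetime(2021, 7, 15, 9, 2)},
]
sequences = build_sequences(log)
print(sequences)
```

Each resulting list is one monitoring sequence that could be fed to the encoders.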
The joint-coding-based video monitoring array display optimization method provided by the embodiment of the invention can be evaluated with two metrics, Recall@20 and MRR@20. Recall is defined as $\mathrm{Recall} = \frac{TP}{TP + FN}$, where TP is the number of positive samples predicted as positive and FN is the number of positive samples predicted as negative. Recall@20 is the proportion of correctly predicted items among the first 20 items when all predicted items are ranked by model score.
MRR (mean reciprocal rank) is a metric for measuring the effectiveness of search algorithms. It is widely used for problems that allow multiple results to be returned: the model assigns a confidence score to each returned result and ranks the results by confidence so that high-scoring results come first. Specifically, MRR is the average over queries of the reciprocal rank of the first correct answer (if the correct item is ranked outside the top 20, its reciprocal rank is taken as 0).
The MRR may be calculated using the following formula: $\mathrm{MRR} = \frac{1}{|Q|} \sum_{i=1}^{|Q|} \frac{1}{\mathrm{rank}_i}$, where $Q$ is the sample query set, $|Q|$ is the number of queries in $Q$, and $\mathrm{rank}_i$ is the rank of the first correct answer in the $i$-th query.
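Both metrics can be computed from ranked prediction lists in a few lines. The following is a minimal sketch that assumes one correct target per query; the function names and the toy data are illustrative, and `k=2` is used only to keep the example small (the document evaluates at `k=20`).

```python
def recall_at_k(ranked_lists, targets, k=20):
    """Fraction of queries whose target appears in the top-k ranked items."""
    hits = sum(1 for ranked, target in zip(ranked_lists, targets) if target in ranked[:k])
    return hits / len(targets)


def mrr_at_k(ranked_lists, targets, k=20):
    """Mean reciprocal rank; a target outside the top-k contributes 0."""
    total = 0.0
    for ranked, target in zip(ranked_lists, targets):
        top_k = ranked[:k]
        if target in top_k:
            total += 1.0 / (top_k.index(target) + 1)
    return total / len(targets)


# Four toy queries, each with target "a".
ranked_lists = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"], ["x", "y", "z"]]
targets = ["a", "a", "a", "a"]

print(recall_at_k(ranked_lists, targets, k=2))  # 2 of 4 targets are in the top 2
print(mrr_at_k(ranked_lists, targets, k=2))     # (1 + 1/2 + 0 + 0) / 4
```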
The joint-coding-based video monitoring array display optimization method provided by the embodiment of the invention achieves a Recall@20 of 48% and an MRR@20 of 22%, which is clearly better than the conventional method in the same scenario.
Based on the same inventive concept, referring to fig. 2, an embodiment of the present invention further provides a video surveillance array display optimization device based on joint coding, including:
a global construction module 110, configured to construct a global encoder by taking the entire monitoring sequence as an input of the global encoder, and taking the behavior characteristics of the monitoring personnel in the monitoring sequence as an output of the global encoder;
a local construction module 120 for dynamically selecting and linearly combining different portions of the input sequence to construct a local encoder using an object-level attention mechanism;
the model building module 130 is configured to build a joint coding monitoring policy recommendation model including a global encoder and a local encoder by using a deep learning cyclic neural network structure;
a similarity calculation module 140, configured to calculate a similarity score using the representation of the current monitoring sequence and a bilinear similarity function between each candidate item, and obtain a probability value of the next occurrence of the corresponding monitoring picture according to the similarity score of each item;
the display ranking module 150 is configured to optimize the video surveillance array display ranking based on the probability value that each surveillance picture appears next.
Optionally, the global building module 110 is specifically configured to:
grouping the data sets according to the operation object organization, and sorting the grouped data sets according to the operation time, wherein one object organization arranged according to the time sequence corresponds to a sequence, and the data sets comprise a user name, an operation object, the operation object organization and the operation time;
using the sorted data sets, calculating the reset gate $r_t$ according to the formula $r_t = \sigma(W_r x_t + U_r h_{t-1})$, wherein $\sigma$ is the Sigmoid activation function, $x_t$ is the $t$-th input data of the global encoder, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, and $W_r$ and $U_r$ are weight vectors;
calculating the candidate behavior $\hat{h}_t$ according to the formula $\hat{h}_t = \tanh\left(W x_t + U (r_t \odot h_{t-1})\right)$, wherein $r_t$ is the reset gate, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, $x_t$ is the $t$-th input data of the global encoder, $W$ and $U$ are weight vectors, and $\odot$ is the Hadamard product;
calculating the update gate $z_t$ according to the formula $z_t = \sigma(W_z x_t + U_z h_{t-1})$, wherein $\sigma$ is the Sigmoid activation function, $x_t$ is the $t$-th input data of the global encoder, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, and $W_z$ and $U_z$ are weight vectors;
calculating the relation between the candidate behavior $\hat{h}_t$ and the previous behavior $h_{t-1}$ according to the formula $h_t = (1 - z_t)\, h_{t-1} + z_t\, \hat{h}_t$, wherein $z_t$ is the update gate, $\hat{h}_t$ is the candidate behavior, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, and $h_t$ is the operation sequence feature output by the global encoder.
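The gate computations listed above follow the shape of a standard GRU update, which can be sketched in plain Python as below. This assumes the usual GRU form with a tanh candidate and treats the weights as matrices stored as nested lists; the helper names, the parameter dictionary layout, and the all-zero example weights are illustrative assumptions.

```python
import math


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def matvec(matrix, vector):
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def vec_add(a, b):
    return [x + y for x, y in zip(a, b)]


def gru_step(x_t, h_prev, params):
    """One GRU update producing h_t from input x_t and previous state h_prev."""
    # Reset gate: r_t = sigmoid(W_r x_t + U_r h_{t-1})
    r_t = [sigmoid(v) for v in vec_add(matvec(params["W_r"], x_t), matvec(params["U_r"], h_prev))]
    # Update gate: z_t = sigmoid(W_z x_t + U_z h_{t-1})
    z_t = [sigmoid(v) for v in vec_add(matvec(params["W_z"], x_t), matvec(params["U_z"], h_prev))]
    # Candidate: h_hat = tanh(W x_t + U (r_t * h_{t-1})), with * the Hadamard product
    gated = [r * h for r, h in zip(r_t, h_prev)]
    h_hat = [math.tanh(v) for v in vec_add(matvec(params["W"], x_t), matvec(params["U"], gated))]
    # Interpolation: h_t = (1 - z_t) * h_{t-1} + z_t * h_hat
    return [(1.0 - z) * h + z * c for z, h, c in zip(z_t, h_prev, h_hat)]


def encode_sequence(inputs, hidden_size, params):
    """Run the GRU over a sequence, starting from a zero state; return all states."""
    h_t = [0.0] * hidden_size
    states = []
    for x_t in inputs:
        h_t = gru_step(x_t, h_t, params)
        states.append(h_t)
    return states


def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]


# With all-zero weights both gates equal 0.5 and the candidate is 0,
# so the new state is simply half of the previous state.
zero_params = {"W_r": zeros(2, 3), "U_r": zeros(2, 2), "W_z": zeros(2, 3),
               "U_z": zeros(2, 2), "W": zeros(2, 3), "U": zeros(2, 2)}
h_next = gru_step([1.0, 2.0, 3.0], [0.2, -0.4], zero_params)
print(h_next)
```

In practice the weights would be learned, and a tensor library would replace the list arithmetic; the zero-weight case is shown only because its result is easy to check by hand.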
Optionally, the local construction module 120 is specifically configured to:
grouping the data sets according to the operation object organization, and sorting the grouped data sets according to the operation time, wherein one object organization arranged according to the time sequence corresponds to a sequence, and the data sets comprise a user name, an operation object, the operation object organization and the operation time;
calculating the similarity between the global encoder hidden layer output $h_t$ and the local encoder hidden layer vector representation $h_j$ according to the formula $q(h_t, h_j) = v^{\top} \sigma(A_1 h_t + A_2 h_j)$, wherein matrix $A_1$ is used for converting $h_t$ into a latent space, matrix $A_2$ is used for converting $h_j$ into the latent space, $\sigma$ is the Sigmoid activation function, and $v^{\top}$ is a dimension conversion matrix;
calculating the weighting factor $\alpha_{tj}$ according to the formula $\alpha_{tj} = q(h_t, h_j)$, wherein $h_t$ is the global encoder hidden layer output and $h_j$ is the local encoder hidden layer vector representation;
calculating the intention coefficient of the monitoring personnel in the monitoring sequence according to the formula $c_t^{l} = \sum_{j=1}^{t} \alpha_{tj} h_j$, wherein $\alpha_{tj}$ is the weighting factor and $h_j$ is the local encoder hidden layer vector representation.
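The attention steps above can be sketched as follows, assuming the weighting factor takes the form $\alpha_{tj} = v^{\top} \sigma(A_1 h_t + A_2 h_j)$ and the intention representation is the weighted sum of the hidden states. The function names and the toy matrices are hypothetical, and the last hidden state is used as $h_t$.

```python
import math


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def matvec(matrix, vector):
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def attention_weight(h_t, h_j, A1, A2, v):
    """alpha_tj = v^T sigmoid(A1 h_t + A2 h_j)."""
    latent = [sigmoid(a + b) for a, b in zip(matvec(A1, h_t), matvec(A2, h_j))]
    return sum(w * s for w, s in zip(v, latent))


def local_context(hidden_states, A1, A2, v):
    """Return the attention weights and the weighted sum of all hidden states."""
    h_t = hidden_states[-1]  # last hidden state acts as the query
    weights = [attention_weight(h_t, h_j, A1, A2, v) for h_j in hidden_states]
    context = [
        sum(w * h_j[d] for w, h_j in zip(weights, hidden_states))
        for d in range(len(h_t))
    ]
    return weights, context


# With zero A1 and A2 every latent entry is sigmoid(0) = 0.5, so with
# v = [1, 1] each weight equals 1.0 and the context is the plain sum.
A1 = [[0.0, 0.0], [0.0, 0.0]]
A2 = [[0.0, 0.0], [0.0, 0.0]]
v = [1.0, 1.0]
hidden_states = [[1.0, 0.0], [0.0, 2.0]]
weights, context = local_context(hidden_states, A1, A2, v)
print(weights, context)
```

Note that the weights in this form are not normalized to sum to 1; whether a normalization is applied is not stated in this text.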
Optionally, the model building module 130 is specifically configured to:
constructing a joint coding monitoring strategy recommendation model containing a global encoder and a local encoder by using a deep-learning recurrent neural network structure, wherein the global encoder is used for summarizing the whole monitoring sequence and the local encoder is used for adaptively selecting important items in the current session. In the process of constructing the joint coding monitoring strategy recommendation model, the hidden state of the global encoder is integrated into $c_t$ to provide a sequential behavior representation for the model. The hidden state of the global encoder plays a different role from that of the local encoder: the hidden state of the local encoder is used to calculate attention weights with the previous hidden states, whereas the hidden state of the global encoder is used to encode the entire sequence behavior.
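One simple way to integrate the global encoder state into $c_t$ is to concatenate it with the local attention context. Concatenation is an assumption here, since the text only says the state is "integrated"; the function name and the numeric values are illustrative.

```python
def joint_representation(global_hidden, local_context):
    """Concatenate the global encoder state and the local attention context into c_t.

    Concatenation is an assumed integration rule, not one stated in the source.
    """
    return list(global_hidden) + list(local_context)


# Hypothetical 2-dimensional global state and 2-dimensional local context.
c_t = joint_representation([0.1, 0.2], [0.3, 0.4])
print(c_t)
```

Under this assumption $c_t$ has twice the hidden size, which is consistent with the need for a dimension conversion matrix before comparing $c_t$ with item embeddings.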
The embodiment of the invention also provides a joint-coding-based video monitoring array display optimization device, which uses the operation log of the monitoring system to construct a joint coding model, uses the global encoder to summarize the operation sequence, and uses the local encoder to adaptively select the important items in the operation sequence and thereby capture the operator's main purpose. This effectively addresses the shortcomings of the conventional polling mechanism, which relies heavily on the experience of the monitoring personnel and cannot poll the monitored area accurately.
It will be apparent to those skilled in the art that various modifications and variations can be made to the embodiments of the present invention without departing from the spirit or scope of the embodiments of the invention. Thus, if such modifications and variations of the embodiments of the present invention fall within the scope of the claims and the equivalents thereof, the present invention is also intended to include such modifications and variations.
Claims (4)
1. The video monitoring array display optimization method based on joint coding is characterized by comprising the following steps of:
by taking the whole monitoring sequence as the input of the global encoder, the behavior characteristics of monitoring personnel in the monitoring sequence are taken as the output of the global encoder, and the global encoder is constructed;
dynamically selecting and linearly combining different parts of the input sequence by adopting an object-level attention mechanism to construct a local encoder;
constructing a joint coding monitoring strategy recommendation model containing a global coder and a local coder by using a deep learning cyclic neural network structure;
calculating a similarity score by using the representation form of the current monitoring sequence and a bilinear similarity function between each candidate item, and obtaining a probability value of the next occurrence of the corresponding monitoring picture according to the similarity score of each item;
optimizing video surveillance array display ordering based on probability values that each surveillance picture appears next;
the building global encoder specifically includes:
grouping the data sets according to the operation object organization, and sorting the grouped data sets according to the operation time, wherein one object organization arranged according to the time sequence corresponds to a sequence, and the data sets comprise a user name, an operation object, the operation object organization and the operation time;
using the sorted data sets, calculating the reset gate $r_t$ according to the formula $r_t = \sigma(W_r x_t + U_r h_{t-1})$, wherein $\sigma$ is the Sigmoid activation function, $x_t$ is the $t$-th input data of the global encoder, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, and $W_r$ and $U_r$ are weight vectors;
calculating the candidate behavior $\hat{h}_t$ according to the formula $\hat{h}_t = \tanh\left(W x_t + U (r_t \odot h_{t-1})\right)$, wherein $r_t$ is the reset gate, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, $x_t$ is the $t$-th input data of the global encoder, $W$ and $U$ are weight vectors, and $\odot$ is the Hadamard product;
calculating the update gate $z_t$ according to the formula $z_t = \sigma(W_z x_t + U_z h_{t-1})$, wherein $\sigma$ is the Sigmoid activation function, $x_t$ is the $t$-th input data of the global encoder, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, and $W_z$ and $U_z$ are weight vectors;
calculating the relation between the candidate behavior $\hat{h}_t$ and the previous behavior $h_{t-1}$ according to the formula $h_t = (1 - z_t)\, h_{t-1} + z_t\, \hat{h}_t$, wherein $z_t$ is the update gate, $\hat{h}_t$ is the candidate behavior, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, and $h_t$ is the operation sequence feature output by the global encoder;
the construction of the local encoder comprises in particular:
grouping the data sets according to the operation object organization, and sorting the grouped data sets according to the operation time, wherein one object organization arranged according to the time sequence corresponds to a sequence, and the data sets comprise a user name, an operation object, the operation object organization and the operation time;
calculating the similarity between the global encoder hidden layer output $h_t$ and the local encoder hidden layer vector representation $h_j$ according to the formula $q(h_t, h_j) = v^{\top} \sigma(A_1 h_t + A_2 h_j)$, wherein matrix $A_1$ is used for converting $h_t$ into a latent space, matrix $A_2$ is used for converting $h_j$ into the latent space, $\sigma$ is the Sigmoid activation function, and $v^{\top}$ is a dimension conversion matrix;
calculating the weighting factor $\alpha_{tj}$ according to the formula $\alpha_{tj} = q(h_t, h_j)$, wherein $h_t$ is the global encoder hidden layer output and $h_j$ is the local encoder hidden layer vector representation;
calculating the intention coefficient of the monitoring personnel in the monitoring sequence according to the formula $c_t^{l} = \sum_{j=1}^{t} \alpha_{tj} h_j$, wherein $\alpha_{tj}$ is the weighting factor and $h_j$ is the local encoder hidden layer vector representation;
the construction of the joint coding monitoring strategy recommendation model containing the global coder and the local coder specifically comprises the following steps:
and constructing a joint coding monitoring strategy recommendation model containing a global encoder and a local encoder by using a deep learning cyclic neural network structure, wherein the global encoder is used for summarizing the whole monitoring sequence, and the local encoder is used for adaptively selecting important items in the current session.
2. The video surveillance array display optimization method of claim 1, wherein, in the process of constructing the joint coding monitoring strategy recommendation model, the hidden state of the global encoder is integrated into $c_t$ to provide a sequential behavior representation for the joint coding monitoring strategy recommendation model; the hidden state of the global encoder plays a different role from that of the local encoder: the hidden state of the local encoder is used to calculate attention weights with the previous hidden states, whereas the hidden state of the global encoder is used to encode the entire sequence behavior.
3. A video monitoring array display optimization device based on joint coding, characterized in that the video monitoring array display optimization device comprises:
the global construction module is used for constructing a global encoder by taking the whole monitoring sequence as the input of the global encoder and taking the behavior characteristics of monitoring personnel in the monitoring sequence as the output of the global encoder;
the local construction module is used for dynamically selecting and linearly combining different parts of the input sequence by adopting an object-level attention mechanism to construct a local encoder;
the model construction module is used for constructing a joint coding monitoring strategy recommendation model containing a global coder and a local coder by utilizing a deep learning cyclic neural network structure;
the similarity calculation module is used for calculating a similarity score by using the representation form of the current monitoring sequence and a bilinear similarity function between each candidate item, and obtaining a probability value of the next occurrence of the corresponding monitoring picture according to the similarity score of each item;
the display ordering module is used for optimizing the display ordering of the video monitoring array based on the probability value of each monitoring picture appearing next;
the global construction module is specifically configured to:
grouping the data sets according to the operation object organization, and sorting the grouped data sets according to the operation time, wherein one object organization arranged according to the time sequence corresponds to a sequence, and the data sets comprise a user name, an operation object, the operation object organization and the operation time;
using the sorted data sets, calculating the reset gate $r_t$ according to the formula $r_t = \sigma(W_r x_t + U_r h_{t-1})$, wherein $\sigma$ is the Sigmoid activation function, $x_t$ is the $t$-th input data of the global encoder, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, and $W_r$ and $U_r$ are weight vectors;
calculating the candidate behavior $\hat{h}_t$ according to the formula $\hat{h}_t = \tanh\left(W x_t + U (r_t \odot h_{t-1})\right)$, wherein $r_t$ is the reset gate, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, $x_t$ is the $t$-th input data of the global encoder, $W$ and $U$ are weight vectors, and $\odot$ is the Hadamard product;
calculating the update gate $z_t$ according to the formula $z_t = \sigma(W_z x_t + U_z h_{t-1})$, wherein $\sigma$ is the Sigmoid activation function, $x_t$ is the $t$-th input data of the global encoder, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, and $W_z$ and $U_z$ are weight vectors;
calculating the relation between the candidate behavior $\hat{h}_t$ and the previous behavior $h_{t-1}$ according to the formula $h_t = (1 - z_t)\, h_{t-1} + z_t\, \hat{h}_t$, wherein $z_t$ is the update gate, $\hat{h}_t$ is the candidate behavior, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, and $h_t$ is the operation sequence feature output by the global encoder;
the local construction module is specifically configured to:
grouping the data sets according to the operation object organization, and sorting the grouped data sets according to the operation time, wherein one object organization arranged according to the time sequence corresponds to a sequence, and the data sets comprise a user name, an operation object, the operation object organization and the operation time;
calculating the similarity between the global encoder hidden layer output $h_t$ and the local encoder hidden layer vector representation $h_j$ according to the formula $q(h_t, h_j) = v^{\top} \sigma(A_1 h_t + A_2 h_j)$, wherein matrix $A_1$ is used for converting $h_t$ into a latent space, matrix $A_2$ is used for converting $h_j$ into the latent space, $\sigma$ is the Sigmoid activation function, and $v^{\top}$ is a dimension conversion matrix;
calculating the weighting factor $\alpha_{tj}$ according to the formula $\alpha_{tj} = q(h_t, h_j)$, wherein $h_t$ is the global encoder hidden layer output and $h_j$ is the local encoder hidden layer vector representation;
calculating the intention coefficient of the monitoring personnel in the monitoring sequence according to the formula $c_t^{l} = \sum_{j=1}^{t} \alpha_{tj} h_j$, wherein $\alpha_{tj}$ is the weighting factor and $h_j$ is the local encoder hidden layer vector representation;
the model construction module is specifically used for:
and constructing a joint coding monitoring strategy recommendation model containing a global encoder and a local encoder by using a deep learning cyclic neural network structure, wherein the global encoder is used for summarizing the whole monitoring sequence, and the local encoder is used for adaptively selecting important items in the current session.
4. The video surveillance array display optimization device of claim 3, wherein, in the process of constructing the joint coding monitoring strategy recommendation model, the hidden state of the global encoder is integrated into $c_t$ to provide a sequential behavior representation for the joint coding monitoring strategy recommendation model; the hidden state of the global encoder plays a different role from that of the local encoder: the hidden state of the local encoder is used to calculate attention weights with the previous hidden states, whereas the hidden state of the global encoder is used to encode the entire sequence behavior.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110802969.0A CN113467740B (en) | 2021-07-15 | 2021-07-15 | Video monitoring array display optimization method and device based on joint coding |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113467740A CN113467740A (en) | 2021-10-01 |
CN113467740B true CN113467740B (en) | 2024-02-02 |
Family
ID=77880520
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110802969.0A Active CN113467740B (en) | 2021-07-15 | 2021-07-15 | Video monitoring array display optimization method and device based on joint coding |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113467740B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6795567B1 (en) * | 1999-09-16 | 2004-09-21 | Hewlett-Packard Development Company, L.P. | Method for efficiently tracking object models in video sequences via dynamic ordering of features |
US10289912B1 (en) * | 2015-04-29 | 2019-05-14 | Google Llc | Classifying videos using neural networks |
CN110119467A (en) * | 2019-05-14 | 2019-08-13 | 苏州大学 | A kind of dialogue-based item recommendation method, device, equipment and storage medium |
CN110955826A (en) * | 2019-11-08 | 2020-04-03 | 上海交通大学 | Recommendation system based on improved recurrent neural network unit |
CN111080400A (en) * | 2019-11-25 | 2020-04-28 | 中山大学 | Commodity recommendation method and system based on gate control graph convolution network and storage medium |
WO2020104590A2 (en) * | 2018-11-21 | 2020-05-28 | Deepmind Technologies Limited | Aligning sequences by generating encoded representations of data items |
CN112488014A (en) * | 2020-12-04 | 2021-03-12 | 重庆邮电大学 | Video prediction method based on gated cyclic unit |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3586276A1 (en) * | 2017-02-24 | 2020-01-01 | Google LLC | Sequence processing using online attention |
US20180247199A1 (en) * | 2017-02-24 | 2018-08-30 | Qualcomm Incorporated | Method and apparatus for multi-dimensional sequence prediction |
Non-Patent Citations (4)
Title |
---|
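Sequence Generation Network Based on Hierarchical Attention for Multi-Charge Prediction; baosen ma; IEEE; full text * |
Sequence stream recommendation algorithm based on recurrent temporal convolutional network; 李太松; 贺泽宇; 王冰; 颜永红; 唐向红; 计算机科学 (No. 03); full text * |
Research and application of recommendation technology based on deep learning; 史冬霞; 中国优秀硕士论文全文数据库·信息科技辑; full text * |
Intelligent recognition of power grid monitoring alarm events integrating knowledge base and deep learning; 孙国强 et al.; 电力自动化设备; Vol. 40 (No. 4); full text * |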
Also Published As
Publication number | Publication date |
---|---|
CN113467740A (en) | 2021-10-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||