CN113467740B - Video monitoring array display optimization method and device based on joint coding

Info

Publication number
CN113467740B
Authority
CN
China
Prior art keywords
encoder
global
monitoring
sequence
local
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110802969.0A
Other languages
Chinese (zh)
Other versions
CN113467740A (en)
Inventor
孙国强
刘保臣
杨志刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao Bo Tian Tian Tong Information Technology Co ltd
Original Assignee
Qingdao Bo Tian Tian Tong Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao Bo Tian Tian Tong Information Technology Co ltd
Priority to CN202110802969.0A
Publication of CN113467740A
Application granted
Publication of CN113467740B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/14 Digital output to display device; Cooperation and interconnection of the display device with other functional units
    • G06F3/1407 General aspects irrespective of display type, e.g. determination of decimal point position, display with fixed or driving decimal point, suppression of non-significant zeros
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The invention discloses a video monitoring array display optimization method and device based on joint coding, belonging to the technical field of artificial intelligence. The method comprises: constructing a global encoder and a local encoder; constructing a joint coding monitoring strategy recommendation model containing the global encoder and the local encoder by using a deep learning recurrent neural network structure; calculating a similarity score with a bilinear similarity function between the representation of the current monitoring sequence and each candidate item, and obtaining, from the similarity score of each item, the probability that the corresponding monitoring picture appears next; and optimizing the display ordering of the video monitoring array based on the probability that each monitoring picture appears next. By constructing the joint coding monitoring strategy recommendation model containing the global encoder and the local encoder, the behavior of monitoring personnel is visually analyzed, and the optimized behavior of the monitoring personnel is then automatically captured and summarized by the recurrent neural network structure.

Description

Video monitoring array display optimization method and device based on joint coding
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a video monitoring array display optimization method and device based on joint coding.
Background
In recent years, with advances in technology and society, video monitoring has developed rapidly and is increasingly applied in both traditional and non-traditional security fields. The video monitoring system is one of the most important security measures in use today. As the number of video monitoring points grows, the number of video feeds that must be watched far exceeds the number that the monitoring screens of a command center can display. Supervisors then have to carry out round inspection of the video feeds manually, which is labor-intensive and inefficient and makes effective management and control difficult. With the development of computer vision and artificial intelligence, the round-robin mechanism of intelligent monitoring systems has reduced the workload of supervisors to some extent and improved the efficiency of security management, but current round-robin mechanisms can cause serious information loss.
Existing monitoring camera array ordering and display technologies follow two main approaches: ordering by fixed rules and ordering by abnormal pictures. The abnormal-picture approach has two variants. The first calculates video weights from image comparison: a weight is calculated for each terminal from the difference between earlier and later frames of that single video acquisition terminal, and the weights are then used to screen the video streams and determine their playing order on the monitor screen. This works well for round inspection of pictures that remain in a long-term "dynamic static" state, but has little effect on pictures that change continuously. The second uses background extraction to judge whether people, abnormal equipment and the like have intruded, and gives priority to the corresponding cameras in the round inspection. This places high demands on moving-object detection, and the false alarm rate is high because of the working environment of the cameras and the limitations of moving-object detection. The fixed-rule approach cycles through fixed pictures at fixed intervals according to the experience of the monitoring personnel. It requires the personnel to be familiar with the areas and times at which risks tend to occur, and because the round-inspection order is fixed, it cannot inspect high-risk areas at the specific times and places where they arise in different time periods.
Both ordering and display methods have significant disadvantages. Round-robin ordering based on fixed rules places high demands on the experience of monitoring personnel and cannot accurately inspect different risk areas at different times. Round inspection based on abnormal pictures depends on the accuracy of intelligent image analysis, has a high false alarm rate, and interferes with the risk judgment of monitoring personnel.
The ordering and display of a video monitoring array can in principle be treated as a recommendation problem and can be addressed effectively with a recommendation algorithm. Monitoring system logs do not focus on information about the operator, because for the monitoring system the display order of the cameras matters more than who the operator is. The only information that can be used effectively is therefore the operator's viewing sequence and the corresponding viewing times. In this case, a traditional recommendation method often predicts the next picture to display inaccurately, and its recommendations suffer from lag and repetition. A session-based recommendation system can effectively solve these problems.
Disclosure of Invention
The invention provides a video monitoring array display optimization method and device based on joint coding. A joint coding monitoring strategy recommendation model containing a global encoder and a local encoder is constructed to visually analyze the behavior of monitoring personnel, and a recurrent neural network structure is then used to automatically capture and summarize the optimized behavior of the monitoring personnel.
The specific technical scheme provided by the invention is as follows:
In one aspect, the invention provides a video monitoring array display optimization method based on joint coding, which comprises the following steps:
constructing a global encoder by taking the whole monitoring sequence as the input of the global encoder and the behavior characteristics of monitoring personnel in the monitoring sequence as its output;
dynamically selecting and linearly combining different parts of the input sequence by adopting an object-level attention mechanism to construct a local encoder;
constructing a joint coding monitoring strategy recommendation model containing the global encoder and the local encoder by using a deep learning recurrent neural network structure;
calculating a similarity score with a bilinear similarity function between the representation of the current monitoring sequence and each candidate item, and obtaining, from the similarity score of each item, the probability that the corresponding monitoring picture appears next;
optimizing the display ordering of the video monitoring array based on the probability that each monitoring picture appears next.
Optionally, the building the global encoder specifically includes:
grouping the data sets according to the operation object organization, and sorting the grouped data sets according to the operation time, wherein one object organization arranged according to the time sequence corresponds to a sequence, and the data sets comprise a user name, an operation object, the operation object organization and the operation time;
using the ordered data sets, calculating the reset gate $r_t$ according to the formula $r_t = \sigma(W_r x_t + U_r h_{t-1})$, wherein $\sigma$ is the Sigmoid activation function, $x_t$ is the t-th input of the global encoder, $h_{t-1}$ is the (t-1)-th output of the global encoder, and $W_r$ and $U_r$ are weight matrices;
calculating the candidate behavior $\hat{h}_t$ according to the formula $\hat{h}_t = \tanh(W x_t + U (r_t \odot h_{t-1}))$, wherein $r_t$ is the reset gate, $h_{t-1}$ is the (t-1)-th output of the global encoder, $x_t$ is the t-th input of the global encoder, $W$ and $U$ are weight matrices, and $\odot$ denotes the Hadamard product;
calculating the update gate $z_t$ according to the formula $z_t = \sigma(W_z x_t + U_z h_{t-1})$, wherein $\sigma$ is the Sigmoid activation function, $x_t$ is the t-th input of the global encoder, $h_{t-1}$ is the (t-1)-th output of the global encoder, and $W_z$ and $U_z$ are weight matrices;
calculating the relation between the candidate behavior $\hat{h}_t$ and the preceding behavior $h_{t-1}$ according to the formula $h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \hat{h}_t$, wherein $z_t$ is the update gate, $\hat{h}_t$ is the candidate behavior, $h_{t-1}$ is the (t-1)-th output of the global encoder, and $h_t$ is the output of the global encoder, which characterizes the operation sequence.
Optionally, the constructing a local encoder specifically includes:
grouping the data sets according to the operation object organization, and sorting the grouped data sets according to the operation time, wherein one object organization arranged according to the time sequence corresponds to a sequence, and the data sets comprise a user name, an operation object, the operation object organization and the operation time;
calculating the similarity between the global encoder hidden layer output $h_t$ and the local encoder hidden layer vector representation $h_j$ according to the formula $q(h_t, h_j) = v^T \sigma(A_1 h_t + A_2 h_j)$, wherein matrix $A_1$ converts $h_t$ into a latent space, matrix $A_2$ converts $h_j$ into the latent space, $\sigma$ is the Sigmoid activation function, and $v^T$ is a dimension conversion matrix;
calculating the weighting factor $\alpha_{tj}$ according to the formula $\alpha_{tj} = q(h_t, h_j)$, wherein $h_t$ is the global encoder hidden layer output and $h_j$ is the local encoder hidden layer vector representation;
calculating the intention coefficient of the monitoring personnel in the monitoring sequence according to the formula $c_t^l = \sum_{j=1}^{t} \alpha_{tj} h_j$, wherein $\alpha_{tj}$ is the weighting factor and $h_j$ is the local encoder hidden layer vector representation.
Optionally, the constructing a joint coding monitoring policy recommendation model including a global encoder and a local encoder specifically includes:
and constructing a joint coding monitoring strategy recommendation model containing a global encoder and a local encoder by using a deep learning cyclic neural network structure, wherein the global encoder is used for summarizing the whole monitoring sequence, and the local encoder is used for adaptively selecting important items in the current session.
Optionally, in the process of constructing the joint coding monitoring strategy recommendation model, the hidden state $h_t^g$ of the global encoder is integrated into $c_t$ to provide a sequential behavior representation for the model. The hidden state $h_t^g$ of the global encoder plays a different role from the hidden state $h_t^l$ of the local encoder: $h_t^l$ is used to calculate the attention weights with respect to the previous hidden states, whereas $h_t^g$ is used to encode the entire sequence behavior.
In another aspect, the invention also provides a video monitoring array display optimization device based on joint coding, which comprises:
the global construction module is used for constructing a global encoder by taking the whole monitoring sequence as the input of the global encoder and taking the behavior characteristics of monitoring personnel in the monitoring sequence as the output of the global encoder;
the local construction module is used for dynamically selecting and linearly combining different parts of the input sequence by adopting an object-level attention mechanism to construct a local encoder;
the model construction module is used for constructing a joint coding monitoring strategy recommendation model containing the global encoder and the local encoder by using a deep learning recurrent neural network structure;
the similarity calculation module is used for calculating a similarity score by using the representation form of the current monitoring sequence and a bilinear similarity function between each candidate item, and obtaining a probability value of the next occurrence of the corresponding monitoring picture according to the similarity score of each item;
and the display ordering module is used for optimizing the display ordering of the video monitoring array based on the probability value of each monitoring picture appearing next.
Optionally, the global building module is specifically configured to:
grouping the data sets according to the operation object organization, and sorting the grouped data sets according to the operation time, wherein one object organization arranged according to the time sequence corresponds to a sequence, and the data sets comprise a user name, an operation object, the operation object organization and the operation time;
using the ordered data sets, calculating the reset gate $r_t$ according to the formula $r_t = \sigma(W_r x_t + U_r h_{t-1})$, wherein $\sigma$ is the Sigmoid activation function, $x_t$ is the t-th input of the global encoder, $h_{t-1}$ is the (t-1)-th output of the global encoder, and $W_r$ and $U_r$ are weight matrices;
calculating the candidate behavior $\hat{h}_t$ according to the formula $\hat{h}_t = \tanh(W x_t + U (r_t \odot h_{t-1}))$, wherein $r_t$ is the reset gate, $h_{t-1}$ is the (t-1)-th output of the global encoder, $x_t$ is the t-th input of the global encoder, $W$ and $U$ are weight matrices, and $\odot$ denotes the Hadamard product;
calculating the update gate $z_t$ according to the formula $z_t = \sigma(W_z x_t + U_z h_{t-1})$, wherein $\sigma$ is the Sigmoid activation function, $x_t$ is the t-th input of the global encoder, $h_{t-1}$ is the (t-1)-th output of the global encoder, and $W_z$ and $U_z$ are weight matrices;
calculating the relation between the candidate behavior $\hat{h}_t$ and the preceding behavior $h_{t-1}$ according to the formula $h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \hat{h}_t$, wherein $z_t$ is the update gate, $\hat{h}_t$ is the candidate behavior, $h_{t-1}$ is the (t-1)-th output of the global encoder, and $h_t$ is the output of the global encoder, which characterizes the operation sequence.
Optionally, the local construction module is specifically configured to:
grouping the data sets according to the operation object organization, and sorting the grouped data sets according to the operation time, wherein one object organization arranged according to the time sequence corresponds to a sequence, and the data sets comprise a user name, an operation object, the operation object organization and the operation time;
calculating the similarity between the global encoder hidden layer output $h_t$ and the local encoder hidden layer vector representation $h_j$ according to the formula $q(h_t, h_j) = v^T \sigma(A_1 h_t + A_2 h_j)$, wherein matrix $A_1$ converts $h_t$ into a latent space, matrix $A_2$ converts $h_j$ into the latent space, $\sigma$ is the Sigmoid activation function, and $v^T$ is a dimension conversion matrix;
calculating the weighting factor $\alpha_{tj}$ according to the formula $\alpha_{tj} = q(h_t, h_j)$, wherein $h_t$ is the global encoder hidden layer output and $h_j$ is the local encoder hidden layer vector representation;
calculating the intention coefficient of the monitoring personnel in the monitoring sequence according to the formula $c_t^l = \sum_{j=1}^{t} \alpha_{tj} h_j$, wherein $\alpha_{tj}$ is the weighting factor and $h_j$ is the local encoder hidden layer vector representation.
Optionally, the model building module is specifically configured to:
and constructing a joint coding monitoring strategy recommendation model containing a global encoder and a local encoder by using a deep learning cyclic neural network structure, wherein the global encoder is used for summarizing the whole monitoring sequence, and the local encoder is used for adaptively selecting important items in the current session.
Optionally, in the process of constructing the joint coding monitoring strategy recommendation model, the hidden state $h_t^g$ of the global encoder is integrated into $c_t$ to provide a sequential behavior representation for the model. The hidden state $h_t^g$ of the global encoder plays a different role from the hidden state $h_t^l$ of the local encoder: $h_t^l$ is used to calculate the attention weights with respect to the previous hidden states, whereas $h_t^g$ is used to encode the entire sequence behavior.
The beneficial effects of the invention are as follows:
The video monitoring array display optimization method based on joint coding provided by the embodiment of the invention comprises: constructing a global encoder by taking the whole monitoring sequence as the input of the global encoder and the behavior characteristics of monitoring personnel in the monitoring sequence as its output; dynamically selecting and linearly combining different parts of the input sequence by adopting an object-level attention mechanism to construct a local encoder; constructing a joint coding monitoring strategy recommendation model containing the global encoder and the local encoder by using a deep learning recurrent neural network structure; calculating a similarity score with a bilinear similarity function between the representation of the current monitoring sequence and each candidate item, and obtaining, from the similarity score of each item, the probability that the corresponding monitoring picture appears next; and optimizing the display ordering of the video monitoring array based on the probability that each monitoring picture appears next. By constructing the joint coding monitoring strategy recommendation model containing the global encoder and the local encoder, the behavior of monitoring personnel is visually analyzed, and the optimized behavior of the monitoring personnel is then automatically captured and summarized by the recurrent neural network structure.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a video surveillance array display optimization method based on joint coding according to an embodiment of the present invention;
FIG. 2 is a block diagram of a video surveillance array display optimization device based on joint coding according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a global encoder provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of a local encoder according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a joint coding monitoring policy recommendation model according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the embodiments of the present invention will be described in further detail with reference to the accompanying drawings.
The following will describe in detail a video surveillance array display optimization method and apparatus based on joint coding according to an embodiment of the present invention with reference to fig. 1 to fig. 5.
Referring to fig. 1, fig. 3, fig. 4 and fig. 5, the video monitoring array display optimization method based on joint coding provided by the embodiment of the invention includes:
Step 100: constructing a global encoder by taking the whole monitoring sequence as the input of the global encoder and the behavior characteristics of monitoring personnel in the monitoring sequence as its output;
specifically, referring to fig. 3, the data sets are grouped according to the operation object organization, and the grouped data sets are ordered according to the operation time, and one object organization arranged according to the time sequence corresponds to one sequence, wherein the data sets comprise a user name, an operation object organization and the operation time;
using the ordered data sets, calculating the reset gate $r_t$ according to the formula $r_t = \sigma(W_r x_t + U_r h_{t-1})$, wherein $\sigma$ is the Sigmoid activation function, $x_t$ is the t-th input of the global encoder, $h_{t-1}$ is the (t-1)-th output of the global encoder, and $W_r$ and $U_r$ are weight matrices;
calculating the candidate behavior $\hat{h}_t$ according to the formula $\hat{h}_t = \tanh(W x_t + U (r_t \odot h_{t-1}))$, wherein $r_t$ is the reset gate, $h_{t-1}$ is the (t-1)-th output of the global encoder, $x_t$ is the t-th input of the global encoder, $W$ and $U$ are weight matrices, and $\odot$ denotes the Hadamard product;
calculating the update gate $z_t$ according to the formula $z_t = \sigma(W_z x_t + U_z h_{t-1})$, wherein $\sigma$ is the Sigmoid activation function, $x_t$ is the t-th input of the global encoder, $h_{t-1}$ is the (t-1)-th output of the global encoder, and $W_z$ and $U_z$ are weight matrices;
calculating the relation between the candidate behavior $\hat{h}_t$ and the preceding behavior $h_{t-1}$ according to the formula $h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \hat{h}_t$, wherein $z_t$ is the update gate, $\hat{h}_t$ is the candidate behavior, $h_{t-1}$ is the (t-1)-th output of the global encoder, and $h_t$ is the output of the global encoder, which characterizes the operation sequence.
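The gate computations of the global encoder can be illustrated with a minimal, dependency-free sketch. It assumes the standard GRU form (reset gate, tanh candidate state, update gate, then interpolation between the previous state and the candidate); the use of tanh for the candidate, the parameter dictionary `p`, and the helper names `gru_step`, `random_params` and `encode` are assumptions made for illustration and do not come from the patent text.

```python
import math
import random


def sigmoid(x):
    """Logistic function used by the reset and update gates."""
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def gru_step(x_t, h_prev, p):
    """One GRU update; p maps W_r, U_r, W, U, W_z, U_z to weight matrices."""
    # Reset gate: r_t = sigmoid(W_r x_t + U_r h_{t-1})
    r = [sigmoid(a + b)
         for a, b in zip(matvec(p["W_r"], x_t), matvec(p["U_r"], h_prev))]
    # Candidate state: tanh(W x_t + U (r_t * h_{t-1})), with * element-wise
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    cand = [math.tanh(a + b)
            for a, b in zip(matvec(p["W"], x_t), matvec(p["U"], gated))]
    # Update gate: z_t = sigmoid(W_z x_t + U_z h_{t-1})
    z = [sigmoid(a + b)
         for a, b in zip(matvec(p["W_z"], x_t), matvec(p["U_z"], h_prev))]
    # New state: h_t = (1 - z_t) * h_{t-1} + z_t * candidate
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, cand)]


def random_params(input_dim, hidden_dim, seed=0):
    """Small random weights, only to make the sketch executable."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                for _ in range(rows)]

    return {
        "W_r": mat(hidden_dim, input_dim), "U_r": mat(hidden_dim, hidden_dim),
        "W": mat(hidden_dim, input_dim), "U": mat(hidden_dim, hidden_dim),
        "W_z": mat(hidden_dim, input_dim), "U_z": mat(hidden_dim, hidden_dim),
    }


def encode(inputs, p, hidden_dim):
    """Run the GRU over a sequence and return every hidden state."""
    h = [0.0] * hidden_dim
    states = []
    for x_t in inputs:
        h = gru_step(x_t, h, p)
        states.append(h)
    return states


if __name__ == "__main__":
    params = random_params(input_dim=3, hidden_dim=2)
    # Three one-hot "clicks" on hypothetical cameras 0, 2 and 1.
    clicks = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
    print(encode(clicks, params, hidden_dim=2)[-1])
```

The last state returned by `encode` plays the role of the global encoder output; in a real system the weights would be learned rather than drawn at random.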
The whole monitoring sequence is used as the input of the global encoder, and the behavior characteristics of the monitoring personnel in the sequence are used as its output. The data set contains 14 data items, including the operation user, operation user IP, operation user MAC, operation user organization, operation service, operation action, operation object type, operation object organization, description, operation time, operation result, new value and original value.
The data sets are grouped by operation object organization, and each group is sorted by operation time, so that each object organization arranged in time order corresponds to one sequence. In the global encoder, the input data is divided into fixed-size batches for training. The batch size determines the number of samples used in one training step and affects the degree of optimization of the model as well as the parameter settings and speed of the model input layer. An appropriate batch size is chosen to strike the best balance between memory efficiency and memory capacity.
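The grouping, ordering and batching described above can be sketched as follows. The record field names (`user`, `object`, `object_org`, `time`) and the function names are hypothetical, and timestamps are assumed to be strings in a format that sorts chronologically.

```python
from collections import defaultdict


def build_sequences(records):
    """Group log records by operation-object organization, then order each
    group by operation time; each group yields one sequence of objects."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["object_org"]].append(rec)
    sequences = {}
    for org, recs in groups.items():
        recs.sort(key=lambda r: r["time"])
        sequences[org] = [r["object"] for r in recs]
    return sequences


def make_batches(samples, batch_size):
    """Split training samples into fixed-size batches (the last may be short)."""
    return [samples[i:i + batch_size]
            for i in range(0, len(samples), batch_size)]


if __name__ == "__main__":
    logs = [
        {"user": "u1", "object": "cam_3", "object_org": "gate",
         "time": "2021-07-01 08:02:00"},
        {"user": "u1", "object": "cam_1", "object_org": "gate",
         "time": "2021-07-01 08:00:00"},
        {"user": "u2", "object": "cam_7", "object_org": "lobby",
         "time": "2021-07-01 08:01:00"},
    ]
    print(build_sequences(logs))
    print(make_batches(list(range(5)), batch_size=2))
```

The batch size is left as a parameter because the text treats it as a tuning choice rather than a fixed value.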
According to the embodiment of the invention, the global encoder splits each user's sequence data set according to the characteristics of the user's click sequence, taking the previous click as input and the next click as output. The correspondence between input and output preserves the relevance between data items, which solves the problem that sequences of widely differing lengths are difficult to model.
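A minimal sketch of this splitting follows. `next_click_pairs` pairs each click with the click that follows it, as the text describes. `prefix_pairs` is a common variant in session-based recommendation that keeps the whole prefix as input; it is included only as an assumption for comparison, and both function names are hypothetical.

```python
def next_click_pairs(sequence):
    """Pair each click with the click that follows it: (previous, next)."""
    return list(zip(sequence, sequence[1:]))


def prefix_pairs(sequence):
    """Variant (an assumption, not stated in the text): use the whole prefix
    seen so far as input and the following click as the target."""
    return [(sequence[:i], sequence[i]) for i in range(1, len(sequence))]


if __name__ == "__main__":
    clicks = ["cam_1", "cam_3", "cam_2", "cam_5"]
    print(next_click_pairs(clicks))
    print(prefix_pairs(clicks))
```

Either way, a sequence of any length is reduced to uniform training samples, which is how the length differences between sequences stop being a modeling obstacle.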
Step 200: dynamically selecting and linearly combining different parts of the input sequence by adopting an object-level attention mechanism to construct a local encoder;
Because the global encoder's vectorized summary of the whole monitoring sequence can hardly capture the intention of the monitoring personnel accurately, a local encoder oriented to video monitoring is designed on top of that summary; its advantage is that it adaptively captures the intention of the monitoring personnel.
Referring to fig. 4, in the construction process of the local encoder, data sets are grouped according to operation object organizations, and the grouped data sets are ordered according to operation time, one object organization arranged according to time sequence corresponds to a sequence, wherein the data sets include a user name, an operation object organization, and an operation time. In the construction process of the local encoder, the adopted data set is the same as the data set adopted in the construction process of the global encoder, and the data preprocessing mode is the same as that of the global encoder.
Referring to FIG. 4, the similarity between the global encoder hidden layer output $h_t$ and the local encoder hidden layer vector representation $h_j$ is calculated according to the formula $q(h_t, h_j) = v^T \sigma(A_1 h_t + A_2 h_j)$, wherein matrix $A_1$ converts $h_t$ into a latent space, matrix $A_2$ converts $h_j$ into the latent space, $\sigma$ is the Sigmoid activation function, and $v^T$ is a dimension conversion matrix;
calculating the weighting factor $\alpha_{tj}$ according to the formula $\alpha_{tj} = q(h_t, h_j)$, wherein $h_t$ is the global encoder hidden layer output and $h_j$ is the local encoder hidden layer vector representation;
calculating the intention coefficient of the monitoring personnel in the monitoring sequence according to the formula $c_t^l = \sum_{j=1}^{t} \alpha_{tj} h_j$, wherein $\alpha_{tj}$ is the weighting factor and $h_j$ is the local encoder hidden layer vector representation.
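The attention step of the local encoder can be sketched in plain Python. The sketch assumes the additive form in which each weight is $v^T \sigma(A_1 h_t + A_2 h_j)$ and the intention vector is the weighted sum of the hidden states, with no softmax normalization of the weights; the function name `local_encoder` and the argument layout are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function applied element-wise inside the attention score."""
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def local_encoder(hidden_states, A1, A2, v):
    """Return (attention weights, intention vector) for one sequence.

    hidden_states: list of hidden vectors h_1..h_t; the last one is h_t.
    A1, A2: matrices mapping h_t and h_j into a shared latent space.
    v: vector reducing the latent representation to a scalar weight.
    """
    h_t = hidden_states[-1]
    proj_t = matvec(A1, h_t)
    alphas = []
    for h_j in hidden_states:
        proj_j = matvec(A2, h_j)
        latent = [sigmoid(a + b) for a, b in zip(proj_t, proj_j)]
        # alpha_tj = v^T sigmoid(A1 h_t + A2 h_j)
        alphas.append(sum(vi * li for vi, li in zip(v, latent)))
    dim = len(h_t)
    # Weighted sum of the hidden states: sum_j alpha_tj * h_j
    c_local = [sum(a * h[k] for a, h in zip(alphas, hidden_states))
               for k in range(dim)]
    return alphas, c_local


if __name__ == "__main__":
    states = [[0.1, 0.4], [0.3, -0.2], [0.5, 0.0]]
    A1 = [[0.2, -0.1], [0.0, 0.3]]
    A2 = [[0.1, 0.1], [-0.2, 0.4]]
    v = [0.7, -0.3]
    print(local_encoder(states, A1, A2, v))
```

Positions whose hidden state resembles the final state in the latent space receive larger weights, which is how the local encoder emphasizes the clicks most related to the operator's current intention.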
Step 300: constructing a joint coding monitoring strategy recommendation model containing a global coder and a local coder by using a deep learning cyclic neural network structure;
referring to fig. 5, a joint coding monitoring policy recommendation model is constructed with a global encoder and a local encoder by using a deep learning cyclic neural network structure, wherein the global encoder is used for summarizing the whole monitoring sequence, and the local encoder is used for adaptively selecting important items in the current session.
In the process of constructing the joint coding monitoring strategy recommendation model, the hidden state $h_t^g$ of the global encoder is integrated into $c_t$ to provide a sequential behavior representation for the model. The hidden state $h_t^g$ of the global encoder plays a different role from the hidden state $h_t^l$ of the local encoder: $h_t^l$ is used to calculate the attention weights with respect to the previous hidden states, whereas $h_t^g$ is used to encode the entire sequence behavior.
The embodiment of the invention uses a deep learning recurrent neural network structure to construct a joint coding monitoring strategy recommendation model containing a global encoder and a local encoder. For session-based camera monitoring tasks, the global encoder summarizes the entire monitoring sequence, while the local encoder adaptively selects the important items in the current session. The sequential behavior helps to extract the user's main purpose in the current session. The embodiment of the invention therefore uses the representation of the sequential behavior together with the previous hidden states to calculate the attention weight of each user click.
Step 400: calculating a similarity score by using the representation form of the current monitoring sequence and a bilinear similarity function between each candidate item, and obtaining a probability value of the next occurrence of the corresponding monitoring picture according to the similarity score of each item;
Step 500: the video surveillance array display ordering is optimized based on the probability value that each surveillance picture appears next.
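Step 500 can be illustrated with a small sketch that turns next-picture probabilities into a display order. The camera identifiers, the number of screen slots and the grid layout are hypothetical; the text only states that the ordering is driven by the probability of each picture appearing next.

```python
def order_display(probabilities, slots):
    """Return the camera ids with the highest next-appearance probability,
    most likely first, limited to the number of available screen slots."""
    ranked = sorted(probabilities, key=lambda cam: probabilities[cam],
                    reverse=True)
    return ranked[:slots]


def layout_grid(camera_ids, columns):
    """Arrange the ordered cameras row by row on a monitor wall."""
    return [camera_ids[i:i + columns]
            for i in range(0, len(camera_ids), columns)]


if __name__ == "__main__":
    probs = {"cam_a": 0.10, "cam_b": 0.45, "cam_c": 0.30, "cam_d": 0.15}
    top = order_display(probs, slots=4)
    print(layout_grid(top, columns=2))
```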
Round inspection means that the pictures of all installed cameras are displayed on a screen in camera order, switching to the picture of the next camera every few seconds or minutes. Round inspection removes the need to click manually to switch pictures and is generally suitable for night duty in community security rooms and electronic patrol in shopping-mall security rooms. The round-robin strategy refers to the display order of the cameras in the round inspection and the switching interval. A recurrent neural network (RNN) is a class of recursive neural network that takes sequence data as input, recurses along the direction in which the sequence evolves, and connects all of its nodes in a chain. Because it has memory, shares parameters and is Turing complete, a recurrent neural network has certain advantages in learning the nonlinear characteristics of sequences.
The embodiment of the invention uses a bilinear decoding scheme, which reduces the number of parameters and improves the performance of the model. The similarity score $S_i$ is calculated with a bilinear similarity function between the representation of the current monitoring sequence and each candidate item, according to the formula $S_i = \mathrm{emb}_i^{T} B c_t$, wherein $\mathrm{emb}_i$ is the embedding of candidate item i and $B$ is a dimension conversion matrix that converts $c_t$ to the same dimension as the embedding layer $\mathrm{emb}_i$. Finally, the similarity score of each item is fed into a softmax layer to obtain the probability that the corresponding camera picture appears next.
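The decoding step can be sketched as follows. The sketch assumes that the sequence representation $c_t$ is the concatenation of the global encoder state and the local intention vector, and that the score of item i is $\mathrm{emb}_i^{T} B c_t$; the concatenation, the toy dimensions and the function names are assumptions for illustration.

```python
import math


def bilinear_scores(embeddings, B, c_t):
    """Score every candidate item as emb_i^T (B c_t)."""
    projected = [sum(b * c for b, c in zip(row, c_t)) for row in B]
    return [sum(e * p for e, p in zip(emb, projected)) for emb in embeddings]


def softmax(scores):
    """Turn scores into probabilities; subtracting the max avoids overflow."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    h_global = [2.0, 1.0]          # global encoder state (toy values)
    c_local = [0.5, 0.5]           # local intention vector (toy values)
    c_t = h_global + c_local       # assumed concatenation, dimension 4
    B = [[1.0, 0.0, 0.0, 0.0],     # maps dimension 4 to embedding dimension 2
         [0.0, 1.0, 0.0, 0.0]]
    item_embeddings = [[1.0, 0.0], [0.0, 1.0]]
    print(softmax(bilinear_scores(item_embeddings, B, c_t)))
```

Because $B$ has only (embedding dimension) x (representation dimension) entries, the number of decoder parameters does not grow with the number of candidate cameras, which is the parameter saving the text refers to.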
The embodiment of the invention uses the operation log of the monitoring system to learn the behavior of operators automatically, and provides a video monitoring array display optimization method based on a global-local joint coding model. This overcomes the defects of the conventional round-robin mechanism, which places high demands on the experience of the monitoring personnel and cannot cycle through the monitored areas accurately. The global encoder summarizes the overall operation sequence. A GRU is used as the main unit of the global encoder because it has lower computational complexity and higher scalability, so that longer operation sequences can be summarized. The local encoder adaptively selects important items in the operation sequence and thereby captures the main purpose of the operator.
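As described for the construction modules, the operation log is grouped by operation-object organization and each group is sorted by operation time, so that one organization corresponds to one sequence. A minimal sketch of that preprocessing follows; the record field names (`user`, `object`, `organization`, `time`) and the sample records are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

def build_sequences(records):
    """Group log records by organization and order each group by operation time."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["organization"]].append(rec)
    # One time-ordered list of operation objects per organization.
    return {
        org: [r["object"] for r in sorted(recs, key=lambda r: r["time"])]
        for org, recs in groups.items()
    }

# Hypothetical log entries: user name, operation object, organization, operation time.
records = [
    {"user": "op1", "object": "cam2", "organization": "gate", "time": datetime(2021, 7, 1, 9, 5)},
    {"user": "op1", "object": "cam1", "organization": "gate", "time": datetime(2021, 7, 1, 9, 0)},
    {"user": "op2", "object": "cam7", "organization": "lobby", "time": datetime(2021, 7, 1, 9, 2)},
]
sequences = build_sequences(records)
```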
The video monitoring array display optimization method based on joint coding provided by the embodiment of the invention can be evaluated with two metrics, Recall@20 and MRR@20. Recall is calculated as $Recall = \frac{TP}{TP + FN}$, where TP is the number of positive samples predicted as positive and FN is the number of positive samples predicted as negative. Recall@20 represents the proportion of correctly predicted items within the first 20 items when all predicted items are ranked by model score.
MRR (mean reciprocal rank) is a metric for measuring the effectiveness of search algorithms and is widely used for problems that allow multiple results to be returned: the model assigns a confidence score to each returned result, and the results are then ranked so that higher-scoring results come first. Specifically, MRR is the average over all queries of the reciprocal rank of the first correct answer (if the correct item is ranked outside the top 20, its reciprocal rank is 0).
The MRR may be calculated using the following formula: $MRR = \frac{1}{|Q|} \sum_{i=1}^{|Q|} \frac{1}{rank_i}$, where $Q$ is the sample query set, $|Q|$ is the number of queries in $Q$, and $rank_i$ is the rank of the first correct answer in the $i$-th query.
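Both metrics can be computed from ranked prediction lists, as in the sketch below. It assumes each query has exactly one correct next item, so that Recall@20 reduces to the fraction of queries whose correct item is in the top 20; the function names and sample data are hypothetical.

```python
def recall_at_k(ranked_lists, targets, k=20):
    """Fraction of queries whose target appears among the top-k ranked items."""
    hits = sum(1 for ranked, t in zip(ranked_lists, targets) if t in ranked[:k])
    return hits / len(targets)

def mrr_at_k(ranked_lists, targets, k=20):
    """Mean reciprocal rank; a target ranked outside the top k contributes 0."""
    total = 0.0
    for ranked, t in zip(ranked_lists, targets):
        top = ranked[:k]
        if t in top:
            total += 1.0 / (top.index(t) + 1)  # ranks are 1-based
    return total / len(targets)

# Hypothetical predictions for three queries; the third target is never returned.
ranked_lists = [
    ["cam3", "cam1", "cam2"],
    ["cam2", "cam3", "cam1"],
    ["cam1", "cam2", "cam3"],
]
targets = ["cam1", "cam2", "cam9"]

recall = recall_at_k(ranked_lists, targets)
mrr = mrr_at_k(ranked_lists, targets)
```

In this toy data the targets sit at ranks 2 and 1 and the third is missing, so the reciprocal ranks are 1/2, 1 and 0.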
The video monitoring array display optimization method based on joint coding provided by the embodiment of the invention achieves a Recall@20 of 48% and an MRR@20 of 22%, which is clearly superior to the traditional method in the same scenario.
Based on the same inventive concept, referring to fig. 2, an embodiment of the present invention further provides a video surveillance array display optimization device based on joint coding, including:
a global construction module 110, configured to construct a global encoder by taking the entire monitoring sequence as the input of the global encoder and the behavior characteristics of the monitoring personnel in the monitoring sequence as the output of the global encoder;
a local construction module 120, configured to construct a local encoder by using an object-level attention mechanism to dynamically select and linearly combine different portions of the input sequence;
a model building module 130, configured to build a joint coding monitoring strategy recommendation model including the global encoder and the local encoder by using a deep learning recurrent neural network structure;
a similarity calculation module 140, configured to calculate a similarity score using a bilinear similarity function between the representation of the current monitoring sequence and each candidate item, and to obtain the probability value of the next occurrence of the corresponding monitoring picture according to the similarity score of each item;
a display ranking module 150, configured to optimize the video monitoring array display ordering based on the probability value that each monitoring picture appears next.
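The description does not spell out how the display ordering is derived from the probabilities. One plausible reading, shown here only as a sketch under that assumption, is to fill the array slots with the camera views in descending order of predicted probability; the function name, the slot count and the probabilities are hypothetical.

```python
def order_display(probabilities, slots):
    """Return camera ids for the display array, most likely next view first."""
    ranked = sorted(probabilities, key=lambda cam: probabilities[cam], reverse=True)
    # Keep only as many views as the array can show.
    return ranked[:slots]

# Hypothetical next-view probabilities produced by the recommendation model.
probabilities = {"cam1": 0.10, "cam2": 0.55, "cam3": 0.05, "cam4": 0.30}
layout = order_display(probabilities, slots=3)
```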
Optionally, the global construction module 110 is specifically configured to:
grouping the data set by operation-object organization, and sorting each group by operation time, wherein one object organization arranged in time order corresponds to one sequence, and the data set comprises a user name, an operation object, the operation-object organization and the operation time;
using the sorted data set, calculating the reset gate $r_t$ according to the formula $r_t = \sigma(W_r x_t + U_r h_{t-1})$, wherein $\sigma$ is the Sigmoid activation function, $x_t$ is the $t$-th input data of the global encoder, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, and $W_r$ and $U_r$ are weight vectors;
calculating the candidate behavior $\tilde{h}_t$ according to the formula $\tilde{h}_t = \tanh(W x_t + U(r_t \odot h_{t-1}))$, wherein $r_t$ is the reset gate, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, $x_t$ is the $t$-th input data of the global encoder, $W$ and $U$ are weight vectors, and $\odot$ is the Hadamard product;
calculating the update gate $z_t$ according to the formula $z_t = \sigma(W_z x_t + U_z h_{t-1})$, wherein $\sigma$ is the Sigmoid activation function, $x_t$ is the $t$-th input data of the global encoder, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, and $W_z$ and $U_z$ are weight vectors;
calculating the relation between the candidate behavior $\tilde{h}_t$ and the preceding behavior $h_{t-1}$ according to the formula $h_t = (1 - z_t) h_{t-1} + z_t \tilde{h}_t$, wherein $z_t$ is the update gate, $\tilde{h}_t$ is the candidate behavior, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, and $h_t$ is the operation sequence feature output by the global encoder.
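The gate computations of the global encoder can be sketched as a single recurrent step. The sketch assumes the standard GRU form, with a tanh activation for the candidate behavior and the interpolation $h_t = (1 - z_t) h_{t-1} + z_t \tilde{h}_t$. It stores the weights as matrices, and the parameter dictionary, the all-zero toy weights and the helper names are hypothetical.

```python
import math

def sigmoid(x):
    """Logistic function used by the reset and update gates."""
    return 1.0 / (1.0 + math.exp(-x))

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def gru_step(x_t, h_prev, p):
    """One GRU update; p holds the weights W_r, U_r, W_z, U_z, W, U."""
    r_t = [sigmoid(a + b) for a, b in zip(matvec(p["W_r"], x_t), matvec(p["U_r"], h_prev))]
    z_t = [sigmoid(a + b) for a, b in zip(matvec(p["W_z"], x_t), matvec(p["U_z"], h_prev))]
    gated = [r * h for r, h in zip(r_t, h_prev)]  # Hadamard product of r_t and h_{t-1}
    h_cand = [math.tanh(a + b) for a, b in zip(matvec(p["W"], x_t), matvec(p["U"], gated))]
    # Interpolate between the previous state and the candidate behavior.
    return [(1.0 - z) * h + z * c for z, h, c in zip(z_t, h_prev, h_cand)]

def encode(sequence, h0, p):
    """Run the GRU over a whole sequence and return every hidden state."""
    states, h = [], h0
    for x_t in sequence:
        h = gru_step(x_t, h, p)
        states.append(h)
    return states

# Hypothetical toy setup: 2-dimensional inputs and states, all weights zero,
# so both gates equal 0.5 and the candidate behavior is 0.
zeros = [[0.0, 0.0], [0.0, 0.0]]
params = {name: zeros for name in ("W_r", "U_r", "W_z", "U_z", "W", "U")}
states = encode([[1.0, 0.0], [0.0, 1.0]], h0=[1.0, -2.0], p=params)
```

With zero weights each step simply halves the previous state, which makes the interpolation in the last formula easy to check by hand.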
Optionally, the local construction module 120 is specifically configured to:
grouping the data set by operation-object organization, and sorting each group by operation time, wherein one object organization arranged in time order corresponds to one sequence, and the data set comprises a user name, an operation object, the operation-object organization and the operation time;
calculating the similarity between the global encoder hidden layer output $h_t^g$ and the local encoder hidden layer vector representation $h_j^l$ according to the formula $q(h_t^g, h_j^l) = v^{\top} \sigma(A_1 h_t^g + A_2 h_j^l)$, wherein matrix $A_1$ is used to convert $h_t^g$ into a latent space, matrix $A_2$ is used to convert $h_j^l$ into a latent space, $\sigma$ is the Sigmoid activation function, and $v^{\top}$ is a dimension conversion matrix;
calculating the weighting factor $\alpha_{tj}$ according to the formula $\alpha_{tj} = q(h_t^g, h_j^l)$, wherein $h_t^g$ is the global encoder hidden layer output and $h_j^l$ is the local encoder hidden layer vector representation;
calculating the intention coefficient of the monitoring personnel in the monitoring sequence according to the formula $c_t^l = \sum_{j=1}^{t} \alpha_{tj} h_j^l$, wherein $\alpha_{tj}$ is the weighting factor and $h_j^l$ is the local encoder hidden layer vector representation.
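The attention computation of the local encoder can be sketched as follows. It assumes the weighting factor has the form $\alpha_{tj} = v^{\top} \sigma(A_1 h_t + A_2 h_j)$ and that the weights are used without further normalization; the function names and the toy matrices are hypothetical.

```python
import math

def sigmoid(x):
    """Logistic function applied element-wise inside the attention score."""
    return 1.0 / (1.0 + math.exp(-x))

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def attention_context(h_t, hidden_states, A1, A2, v):
    """Return (alphas, c_local) with alpha_tj = v^T sigmoid(A1 h_t + A2 h_j)."""
    a1 = matvec(A1, h_t)  # latent-space projection of the final hidden state
    alphas = []
    for h_j in hidden_states:
        a2 = matvec(A2, h_j)  # latent-space projection of hidden state j
        alphas.append(sum(vi * sigmoid(x + y) for vi, x, y in zip(v, a1, a2)))
    dim = len(hidden_states[0])
    # Weighted linear combination of the hidden states.
    c_local = [sum(a * h[d] for a, h in zip(alphas, hidden_states)) for d in range(dim)]
    return alphas, c_local

# Hypothetical toy values: zero projections give sigmoid(0) = 0.5 per latent unit,
# so with v = [1, 1] every weighting factor equals 1.
zeros = [[0.0, 0.0], [0.0, 0.0]]
hidden_states = [[1.0, 2.0], [3.0, 4.0]]
alphas, c_local = attention_context(hidden_states[-1], hidden_states, zeros, zeros, [1.0, 1.0])
```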
Optionally, the model building module 130 is specifically configured to:
constructing, by using a deep learning recurrent neural network structure, a joint coding monitoring strategy recommendation model containing a global encoder and a local encoder, wherein the global encoder is used for summarizing the whole monitoring sequence, and the local encoder is used for adaptively selecting important items in the current session. In the process of constructing the joint coding monitoring strategy recommendation model, the hidden state $h_t^g$ of the global encoder is integrated into $c_t$ in order to provide a sequential behavior representation for the model. The hidden state $h_t^g$ of the global encoder plays a different role from the hidden state $h_t^l$ of the local encoder: $h_t^l$ is used to calculate the attention weights with the previous hidden states, whereas $h_t^g$ is used to encode the entire sequence behavior.
The video monitoring array display optimization device based on joint coding provided by the embodiment of the invention uses the operation log of the monitoring system to construct a joint coding model, summarizes the operation sequence with the global encoder, and adaptively selects important items in the operation sequence with the local encoder to capture the main purpose of the operator. It can thereby effectively overcome the defects of the conventional round-robin mechanism, which places high demands on the experience of the monitoring personnel and cannot cycle through the monitored areas accurately.
It will be apparent to those skilled in the art that various modifications and variations can be made to the embodiments of the present invention without departing from the spirit or scope of the embodiments of the invention. Thus, if such modifications and variations of the embodiments of the present invention fall within the scope of the claims and the equivalents thereof, the present invention is also intended to include such modifications and variations.

Claims (4)

1. A video monitoring array display optimization method based on joint coding, characterized by comprising the following steps:
constructing a global encoder by taking the whole monitoring sequence as the input of the global encoder and the behavior characteristics of monitoring personnel in the monitoring sequence as the output of the global encoder;
constructing a local encoder by using an object-level attention mechanism to dynamically select and linearly combine different parts of the input sequence;
constructing a joint coding monitoring strategy recommendation model containing the global encoder and the local encoder by using a deep learning recurrent neural network structure;
calculating a similarity score by using a bilinear similarity function between the representation of the current monitoring sequence and each candidate item, and obtaining a probability value of the next occurrence of the corresponding monitoring picture according to the similarity score of each item;
optimizing the video monitoring array display ordering based on the probability value that each monitoring picture appears next;
wherein constructing the global encoder specifically includes:
grouping the data set by operation-object organization, and sorting each group by operation time, wherein one object organization arranged in time order corresponds to one sequence, and the data set comprises a user name, an operation object, the operation-object organization and the operation time;
using the sorted data set, calculating the reset gate $r_t$ according to the formula $r_t = \sigma(W_r x_t + U_r h_{t-1})$, wherein $\sigma$ is the Sigmoid activation function, $x_t$ is the $t$-th input data of the global encoder, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, and $W_r$ and $U_r$ are weight vectors;
calculating the candidate behavior $\tilde{h}_t$ according to the formula $\tilde{h}_t = \tanh(W x_t + U(r_t \odot h_{t-1}))$, wherein $r_t$ is the reset gate, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, $x_t$ is the $t$-th input data of the global encoder, $W$ and $U$ are weight vectors, and $\odot$ is the Hadamard product;
calculating the update gate $z_t$ according to the formula $z_t = \sigma(W_z x_t + U_z h_{t-1})$, wherein $\sigma$ is the Sigmoid activation function, $x_t$ is the $t$-th input data of the global encoder, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, and $W_z$ and $U_z$ are weight vectors;
calculating the relation between the candidate behavior $\tilde{h}_t$ and the preceding behavior $h_{t-1}$ according to the formula $h_t = (1 - z_t) h_{t-1} + z_t \tilde{h}_t$, wherein $z_t$ is the update gate, $\tilde{h}_t$ is the candidate behavior, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, and $h_t$ is the operation sequence feature output by the global encoder;
wherein constructing the local encoder specifically includes:
grouping the data set by operation-object organization, and sorting each group by operation time, wherein one object organization arranged in time order corresponds to one sequence, and the data set comprises a user name, an operation object, the operation-object organization and the operation time;
calculating the similarity between the global encoder hidden layer output $h_t^g$ and the local encoder hidden layer vector representation $h_j^l$ according to the formula $q(h_t^g, h_j^l) = v^{\top} \sigma(A_1 h_t^g + A_2 h_j^l)$, wherein matrix $A_1$ is used to convert $h_t^g$ into a latent space, matrix $A_2$ is used to convert $h_j^l$ into a latent space, $\sigma$ is the Sigmoid activation function, and $v^{\top}$ is a dimension conversion matrix;
calculating the weighting factor $\alpha_{tj}$ according to the formula $\alpha_{tj} = q(h_t^g, h_j^l)$, wherein $h_t^g$ is the global encoder hidden layer output and $h_j^l$ is the local encoder hidden layer vector representation;
calculating the intention coefficient of the monitoring personnel in the monitoring sequence according to the formula $c_t^l = \sum_{j=1}^{t} \alpha_{tj} h_j^l$, wherein $\alpha_{tj}$ is the weighting factor and $h_j^l$ is the local encoder hidden layer vector representation;
wherein constructing the joint coding monitoring strategy recommendation model containing the global encoder and the local encoder specifically includes:
constructing the joint coding monitoring strategy recommendation model containing the global encoder and the local encoder by using a deep learning recurrent neural network structure, wherein the global encoder is used for summarizing the whole monitoring sequence, and the local encoder is used for adaptively selecting important items in the current session.
2. The video monitoring array display optimization method of claim 1, wherein in the process of constructing the joint coding monitoring strategy recommendation model, the hidden state $h_t^g$ of the global encoder is integrated into $c_t$ in order to provide a sequential behavior representation for the joint coding monitoring strategy recommendation model; the hidden state $h_t^g$ of the global encoder plays a different role from the hidden state $h_t^l$ of the local encoder: $h_t^l$ is used to calculate the attention weights with the previous hidden states, and $h_t^g$ is used to encode the entire sequence behavior.
3. A video monitoring array display optimization device based on joint coding, characterized in that the video monitoring array display optimization device includes:
a global construction module, configured to construct a global encoder by taking the whole monitoring sequence as the input of the global encoder and the behavior characteristics of monitoring personnel in the monitoring sequence as the output of the global encoder;
a local construction module, configured to construct a local encoder by using an object-level attention mechanism to dynamically select and linearly combine different parts of the input sequence;
a model construction module, configured to construct a joint coding monitoring strategy recommendation model containing the global encoder and the local encoder by using a deep learning recurrent neural network structure;
a similarity calculation module, configured to calculate a similarity score by using a bilinear similarity function between the representation of the current monitoring sequence and each candidate item, and to obtain a probability value of the next occurrence of the corresponding monitoring picture according to the similarity score of each item;
a display ordering module, configured to optimize the display ordering of the video monitoring array based on the probability value that each monitoring picture appears next;
the global construction module is specifically configured to:
grouping the data set by operation-object organization, and sorting each group by operation time, wherein one object organization arranged in time order corresponds to one sequence, and the data set comprises a user name, an operation object, the operation-object organization and the operation time;
using the sorted data set, calculating the reset gate $r_t$ according to the formula $r_t = \sigma(W_r x_t + U_r h_{t-1})$, wherein $\sigma$ is the Sigmoid activation function, $x_t$ is the $t$-th input data of the global encoder, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, and $W_r$ and $U_r$ are weight vectors;
calculating the candidate behavior $\tilde{h}_t$ according to the formula $\tilde{h}_t = \tanh(W x_t + U(r_t \odot h_{t-1}))$, wherein $r_t$ is the reset gate, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, $x_t$ is the $t$-th input data of the global encoder, $W$ and $U$ are weight vectors, and $\odot$ is the Hadamard product;
calculating the update gate $z_t$ according to the formula $z_t = \sigma(W_z x_t + U_z h_{t-1})$, wherein $\sigma$ is the Sigmoid activation function, $x_t$ is the $t$-th input data of the global encoder, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, and $W_z$ and $U_z$ are weight vectors;
calculating the relation between the candidate behavior $\tilde{h}_t$ and the preceding behavior $h_{t-1}$ according to the formula $h_t = (1 - z_t) h_{t-1} + z_t \tilde{h}_t$, wherein $z_t$ is the update gate, $\tilde{h}_t$ is the candidate behavior, $h_{t-1}$ is the $(t-1)$-th output data of the global encoder, and $h_t$ is the operation sequence feature output by the global encoder;
the local construction module is specifically configured to:
grouping the data set by operation-object organization, and sorting each group by operation time, wherein one object organization arranged in time order corresponds to one sequence, and the data set comprises a user name, an operation object, the operation-object organization and the operation time;
calculating the similarity between the global encoder hidden layer output $h_t^g$ and the local encoder hidden layer vector representation $h_j^l$ according to the formula $q(h_t^g, h_j^l) = v^{\top} \sigma(A_1 h_t^g + A_2 h_j^l)$, wherein matrix $A_1$ is used to convert $h_t^g$ into a latent space, matrix $A_2$ is used to convert $h_j^l$ into a latent space, $\sigma$ is the Sigmoid activation function, and $v^{\top}$ is a dimension conversion matrix;
calculating the weighting factor $\alpha_{tj}$ according to the formula $\alpha_{tj} = q(h_t^g, h_j^l)$, wherein $h_t^g$ is the global encoder hidden layer output and $h_j^l$ is the local encoder hidden layer vector representation;
calculating the intention coefficient of the monitoring personnel in the monitoring sequence according to the formula $c_t^l = \sum_{j=1}^{t} \alpha_{tj} h_j^l$, wherein $\alpha_{tj}$ is the weighting factor and $h_j^l$ is the local encoder hidden layer vector representation;
the model construction module is specifically configured to:
constructing the joint coding monitoring strategy recommendation model containing the global encoder and the local encoder by using a deep learning recurrent neural network structure, wherein the global encoder is used for summarizing the whole monitoring sequence, and the local encoder is used for adaptively selecting important items in the current session.
4. The video monitoring array display optimization device of claim 3, wherein in the process of constructing the joint coding monitoring strategy recommendation model, the hidden state $h_t^g$ of the global encoder is integrated into $c_t$ in order to provide a sequential behavior representation for the joint coding monitoring strategy recommendation model; the hidden state $h_t^g$ of the global encoder plays a different role from the hidden state $h_t^l$ of the local encoder: $h_t^l$ is used to calculate the attention weights with the previous hidden states, and $h_t^g$ is used to encode the entire sequence behavior.
CN202110802969.0A 2021-07-15 2021-07-15 Video monitoring array display optimization method and device based on joint coding Active CN113467740B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110802969.0A CN113467740B (en) 2021-07-15 2021-07-15 Video monitoring array display optimization method and device based on joint coding

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110802969.0A CN113467740B (en) 2021-07-15 2021-07-15 Video monitoring array display optimization method and device based on joint coding

Publications (2)

Publication Number Publication Date
CN113467740A CN113467740A (en) 2021-10-01
CN113467740B true CN113467740B (en) 2024-02-02

Family

ID=77880520

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110802969.0A Active CN113467740B (en) 2021-07-15 2021-07-15 Video monitoring array display optimization method and device based on joint coding

Country Status (1)

Country Link
CN (1) CN113467740B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6795567B1 (en) * 1999-09-16 2004-09-21 Hewlett-Packard Development Company, L.P. Method for efficiently tracking object models in video sequences via dynamic ordering of features
US10289912B1 (en) * 2015-04-29 2019-05-14 Google Llc Classifying videos using neural networks
CN110119467A (en) * 2019-05-14 2019-08-13 苏州大学 A kind of dialogue-based item recommendation method, device, equipment and storage medium
CN110955826A (en) * 2019-11-08 2020-04-03 上海交通大学 Recommendation system based on improved recurrent neural network unit
CN111080400A (en) * 2019-11-25 2020-04-28 中山大学 Commodity recommendation method and system based on gate control graph convolution network and storage medium
WO2020104590A2 (en) * 2018-11-21 2020-05-28 Deepmind Technologies Limited Aligning sequences by generating encoded representations of data items
CN112488014A (en) * 2020-12-04 2021-03-12 重庆邮电大学 Video prediction method based on gated cyclic unit

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3586276A1 (en) * 2017-02-24 2020-01-01 Google LLC Sequence processing using online attention
US20180247199A1 (en) * 2017-02-24 2018-08-30 Qualcomm Incorporated Method and apparatus for multi-dimensional sequence prediction

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
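Sequence Generation Network Based on Hierarchical Attention for Multi-Charge Prediction; Baosen Ma; IEEE; entire document *
Sequential stream recommendation algorithm based on recurrent temporal convolutional network; 李太松; 贺泽宇; 王冰; 颜永红; 唐向红; Computer Science (No. 03); entire document *
Research and application of recommendation technology based on deep learning; 史冬霞; China Excellent Master's Theses Full-text Database, Information Science and Technology; entire document *
Intelligent recognition of power grid monitoring alarm events fusing knowledge base and deep learning; 孙国强 et al.; Electric Power Automation Equipment; Vol. 40 (No. 4); entire document *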

Also Published As

Publication number Publication date
CN113467740A (en) 2021-10-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant