CN114815755A - Method for establishing distributed real-time intelligent monitoring system based on intelligent cooperative reasoning

Method for establishing distributed real-time intelligent monitoring system based on intelligent cooperative reasoning

Info

Publication number
CN114815755A
Authority
CN
China
Prior art keywords
execution
network
time
transverse
camera
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210576950.3A
Other languages
Chinese (zh)
Inventor
胡清华
王卓航
王晓飞
赵云凤
刘志成
仇超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Application filed by Tianjin University
Priority to CN202210576950.3A
Publication of CN114815755A
Legal status: Pending

Classifications

    • G — PHYSICS
    • G05 — CONTROLLING; REGULATING
    • G05B — CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B 19/00 — Programme-control systems
    • G05B 19/02 — Programme-control systems, electric
    • G05B 19/418 — Total factory control, i.e. centrally controlling a plurality of machines, e.g. direct or distributed numerical control [DNC], flexible manufacturing systems [FMS], integrated manufacturing systems [IMS] or computer integrated manufacturing [CIM]
    • G05B 19/4185 — Total factory control characterised by the network communication
    • G05B 2219/00 — Program-control systems
    • G05B 2219/30 — Nc systems
    • G05B 2219/31 — From computer integrated manufacturing till monitoring
    • G05B 2219/31088 — Network communication between supervisor and cell, machine group

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Manufacturing & Machinery (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method for establishing a distributed real-time intelligent monitoring system based on intelligent cooperative reasoning, which comprises the following steps: establishing a horizontal and vertical segmentation model based on a deep neural network by using a horizontal segmentation algorithm and a vertical segmentation algorithm; constructing a Markov decision process by using a transverse division point decision, a transverse execution node decision and a longitudinal execution node decision; the base station utilizes a DDQN algorithm to construct a division point execution equipment decision model by taking the minimized task processing time difference as a target; each monitoring terminal inputs a video stream into the decision-making model and uploads the video stream to a transverse execution node, the transverse execution node performs transverse division and execution by using a transverse division model, and the executed network parameters are sent to a longitudinal execution node; the vertical execution node vertically divides and executes the vertical execution network according to the vertical division algorithm; and the cloud receives the execution results from each longitudinal execution node, and completes the cross-camera track matching by using a track matching algorithm. The invention improves the effective utilization rate of system computing power.

Description

Method for establishing distributed real-time intelligent monitoring system based on intelligent cooperative reasoning
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to a method for establishing a distributed real-time intelligent monitoring system based on intelligent cooperative reasoning.
Background
An Intelligent monitoring System (ISS) is an important application combining Deep Learning (DL) and the Internet of Things (IoT), and Multi-Target Multi-Camera Tracking (MTMCT) has been widely recognized as a promising solution for intelligent monitoring systems. However, terminal devices typically have little memory, tight power budgets and limited computing power, so an Artificial Intelligence (AI) model deployed on such resource-limited devices must still guarantee acceptable inference delay and accuracy. In order to fully utilize the computing power of the system and minimize its computing delay, real-time video analysis systems based on edge computing have become a new research hotspot in recent years. However, the existing system architectures lack a fine-grained treatment of the deep learning model, and the cooperation problem among camera clusters is ignored when the model is split.
The key driver behind the success of MTMCT is the explosive development of deep learning techniques, such as image recognition, target detection and target tracking, together with the Internet of Things. Real-time video analysis based on the Internet of Things can be realized by using dynamic computation offloading and resource allocation. Real-time video stream analysis architectures based on edge computing have verified the feasibility of edge computing in the vision field. However, the wide range of application scenarios leads to different architectures in edge-cloud clusters, and a fine-grained treatment of Deep Neural Network (DNN) models is lacking. In the field of DNN model partitioning, taking the cooperation between multi-layer clusters into account remains an open problem. Thus, edge real-time video analytics systems still face three challenges: 1. offloading the complex DNN model to limited edge devices; 2. offloading across the tiers or nodes of heterogeneous edge-cloud clusters; 3. relieving the computing pressure on the cloud, fully utilizing the computing power of the system, and minimizing the computing delay in multi-video-stream scenarios. Specifically:
firstly, data in the MTMCT requires a huge computing power to completely release the potential, and cloud computing is one of the solutions. However, with the proliferation of the number of surveillance cameras, it is challenging to process such huge data only through cloud computing, which faces huge transmission pressures, high latency, expensive cost, and low security.
Secondly, real-time video analysis systems based on edge computing have become a research hotspot. Common system designs focus on solving the multi-target multi-camera tracking task with a collaborative learning strategy; however, the deep neural network still runs in the cloud, which limits system performance. Existing systems adaptively balance workloads between smart cameras and partition workloads between cameras and edge clusters to optimize system performance. However, the nature of DNNs has not been considered: artificial intelligence models deployed on these resource-constrained devices must keep the number of parameters low, and therefore have to tolerate higher inference delays and lower accuracy. It is therefore meaningful to combine DNN partitioning with cross-camera video processing.
Driven by the above trends, much research has focused on splitting DNNs to reduce delay and save resources, with two basic strategies: horizontal segmentation and vertical segmentation. For the former, the horizontal division of the DNN exploits the characteristics of the DNN to design a coarse-grained, layer-level computation division strategy. By treating the DNN as a Directed Acyclic Graph (DAG), the latency-minimization problem is transformed into an equivalent minimal partitioning problem. However, horizontally split DNNs do not enable parallel execution of the model network and increase the communication cost of intermediate parameters between devices. For vertical segmentation, the feature maps of convolutional layers are partitioned according to the computing power of each node, and the outputs of all nodes are then combined at the host. Vertical partitioning applies a scalable fused tile partitioning (FTP) policy for convolutional layers to minimize memory footprint and reduce transmission between nodes. Vertical segmentation, while enabling both intra-layer and inter-layer parallel execution, requires a more detailed treatment of models that contain residual structures and attention mechanisms.
The traditional cloud computing mode cannot handle such a huge data volume and suffers from heavy transmission pressure, high latency, high cost and low security. Fig. 1 shows the process of deep-learning-based MTMCT computation. The common MTMCT system flow is divided into the following three major parts: the multiple cameras generate video stream data in real time and upload the video streams to the cloud center for algorithm execution; the MTMCT algorithm deployed in the cloud center performs Target Detection, that is, a common deep neural network such as YOLOv4 is used to detect targets such as people and bicycles in the video, where the network structure of YOLOv4 is shown in fig. 3; the MTMCT algorithm then performs Tracker Association using the Deepsort algorithm, that is, corresponding track features are generated for the targets in each video, the algorithm compares the tracks of the same camera at successive times, matches the currently generated target tracks with the target tracks detected in the past, and performs track matching across cameras, finally obtaining the tracking results of the targets within a single camera and across the videos of multiple cameras.
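For orientation, this cloud-centric MTMCT flow can be summarized in the following minimal sketch; all function names (detect_targets, update_tracks, match_across_cameras) are hypothetical placeholders standing in for a YOLOv4-style detector and Deepsort-style association, not the API of any specific library.

```python
def detect_targets(frame):
    """Stand-in for a YOLOv4-style detector; returns a list of bounding boxes."""
    return []  # a real system would run the DNN here

def update_tracks(camera_id, detections, per_camera_tracks):
    """Stand-in for Deepsort-style association of detections with past tracks."""
    per_camera_tracks.setdefault(camera_id, []).append(detections)
    return per_camera_tracks[camera_id]

def match_across_cameras(per_camera_tracks):
    """Stand-in for cross-camera trajectory matching on the accumulated tracks."""
    return per_camera_tracks

def mtmct_step(frames_by_camera, per_camera_tracks):
    # 1) per-camera target detection, 2) per-camera track association,
    # 3) cross-camera trajectory matching in the cloud center.
    for cam_id, frame in frames_by_camera.items():
        detections = detect_targets(frame)
        update_tracks(cam_id, detections, per_camera_tracks)
    return match_across_cameras(per_camera_tracks)

tracks = {}
print(mtmct_step({"cam0": None, "cam1": None}, tracks))
```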
Disclosure of Invention
Aiming at the technical problem of lightweight model inference on resource-limited ISS computing nodes, the invention provides a method for establishing a distributed real-time intelligent monitoring system based on intelligent cooperative inference, which realizes an Edge-Intelligence collaborative-inference based distributed real-time intelligent surveillance system (EI-ISS) and is suitable for performing adaptive computation offloading and resource allocation for cooperative multi-camera tracking tasks on an edge-cloud system. In order to solve the above technical problems, the technical scheme adopted by the invention is as follows:
a method for establishing an intelligent cooperative lightweight model of a distributed intelligent monitoring system comprises the following steps:
s1, constructing a video monitoring system comprising a monitoring terminal, a base station and a cloud;
s2, establishing a horizontal segmentation model based on a deep neural network by using a horizontal segmentation and vertical segmentation algorithm, wherein the output of the horizontal segmentation model comprises a horizontal execution network and a vertical execution network;
s3, constructing a transverse division point decision, a transverse execution node decision and a longitudinal execution node decision of the transverse division model into a Markov decision process;
s4, the cloud utilizes a DDQN multi-agent deep reinforcement learning algorithm and takes the minimized task processing time difference as a target function to construct a division point execution equipment decision model;
s5, each monitoring terminal respectively inputs the generated video stream into a division point execution equipment decision model, uploads the video stream to a corresponding transverse execution node according to a transverse execution node decision generated by the division point execution equipment decision model, and the transverse execution node completes transverse division and execution of the model by using the transverse division model and sends network parameters after transverse execution to a corresponding longitudinal execution node;
s6, the longitudinal execution node respectively utilizes the longitudinal division algorithm to longitudinally divide and execute the respective longitudinal execution network according to the longitudinal execution node decision generated by the division point execution equipment decision model;
and S7, the cloud receives the execution results from the longitudinal execution nodes and completes cross-camera track matching by using a track matching algorithm.
The step S2 includes the following steps:

S2.1, constructing the deep neural network model as a directed acyclic graph;

S2.2, obtaining the longest path from the input layer to the output layer of the deep neural network model by using a depth-first algorithm;

S2.3, dividing the longest path obtained in step S2.2 into a transverse execution network G_{p_h}^0 and a vertical execution network G_{p_h}^1 by using the transverse division algorithm, where the network parameters of the transverse execution network G_{p_h}^0 and the vertical execution network G_{p_h}^1 are expressed as

ω_{p_h}^n = λ_{ResUnit}^n · ω_{ResUnit} + λ_{CBL}^n · ω_{CBL} + λ_{CBM}^n · ω_{CBM}, n ∈ {0, 1},

where, when n = 0, ω_{p_h}^0 denotes the network parameters of the transverse execution network G_{p_h}^0 divided by the transverse division point p_h, and when n = 1, ω_{p_h}^1 denotes the network parameters of the vertical execution network G_{p_h}^1 divided by p_h; λ_{ResUnit}^n denotes the proportion of ResUnit network structures among the network layers assigned by the transverse division point p_h to G_{p_h}^n, and ω_{ResUnit} denotes the parameters of the ResUnit network structure; λ_{CBL}^n denotes the proportion of CBL (conv+bn+leakyRelu) structures in G_{p_h}^n, and ω_{CBL} denotes the parameters of the CBL network structure; λ_{CBM}^n denotes the proportion of CBM (conv+bn+mish) structures in G_{p_h}^n, and ω_{CBM} denotes the parameters of the CBM network structure.
The step S3 includes the following steps:

S3.1, constructing a state space, where the state of camera d_m at time t is

s_m^t = ( T_m, T_{y_m^0, y_m^1}^{tr}, T_{m, y_m^0}^{exe}, T_{m, y_m^0}^{wait}, o_m^t, p_m ),

in which T_m denotes the execution time of the detection task of camera d_m's video stream; T_{y_m^0, y_m^1}^{tr} denotes the transmission delay of camera d_m's task between the execution node y_m^0 and the execution node y_m^1; T_{m, y_m^0}^{exe} denotes the execution time of the detection task of camera d_m's video stream on the execution node y_m^0; T_{m, y_m^0}^{wait} denotes the waiting time of the detection task of camera d_m's video stream on the execution node y_m^0; o_m^t denotes the state of the base stations and the cloud under the influence of the agents other than the agent of camera d_m, including the number of tasks waiting to be executed, the task amounts, and the predicted completion time of the task currently executed in the base stations and the cloud; p_m denotes the network corresponding to camera d_m's video stream; s_m^t denotes the state of camera d_m at time t; y_m^0, y_m^1 ∈ ε ∪ {c}, where ε denotes the set of base stations and c denotes the cloud; and d_m ∈ 𝒟, where 𝒟 is the set of monitoring terminals;

S3.2, constructing an action space, whose expression is

a_m^t = ( p_m, y_m^0, y_m^1 ),

where a_m^t denotes the action of camera d_m at time t;

S3.3, constructing a reward function r_m^t(s_0^t, …, s_{M-1}^t, a_0^t, …, a_{M-1}^t), which reflects the difference between the completion times of the detection tasks generated by all cameras at time t, i.e. the difference between the maximum execution time T^max and the minimum execution time T^min of the detection tasks of the cameras' video streams; here μ denotes a penalty weight, ρ denotes the tolerance of the maximum time difference for all cameras to complete their detection tasks at time t, Z denotes the maximum time tolerance for a single camera to complete its detection task, a_0^t and a_{M-1}^t denote the actions of cameras d_0 and d_{M-1} at time t, s_0^t and s_{M-1}^t denote the states of cameras d_0 and d_{M-1} at time t, and r_m^t denotes the reward of camera d_m at time t.
In step S3.1, the execution time T_m of the detection task of camera d_m's video stream is calculated as

T_m = T_{m, y_m^0}^{tr} + T_{m, y_m^0}^{wait} + T_{m, y_m^0}^{exe} + T_{y_m^0, y_m^1}^{tr} + T_{m, y_m^1}^{exe},

where T_{m, y_m^1}^{exe} denotes the execution time of camera d_m's detection task on the vertical execution node y_m^1, and T_{m, y_m^0}^{tr} denotes the transmission delay from camera d_m to the horizontal execution node y_m^0.

The waiting time T_{m, y_m^0}^{wait} of camera d_m's detection task on the horizontal execution node y_m^0 is determined by the detection tasks of the other cameras d_{m′} that are dispatched to the same node and queued ahead of it, where T_{m′, y_m^0}^{tr} denotes the transmission delay of camera d_{m′}'s detection task to the transverse execution node y_m^0, and 𝟙[·] is an indicator function.

The transmission delay T_{m, y_m^0}^{tr} from camera d_m to the horizontal execution node y_m^0 is calculated as

T_{m, y_m^0}^{tr} = K_m / b_{y_m^0, m},

where b_{y_m^0, m} denotes the bandwidth between camera d_m and the horizontal execution node y_m^0, and K_m denotes the actual amount of video-stream data transmitted by camera d_m per second.
The execution time T_{m, y_m^0}^{exe} of camera d_m's detection task on the horizontal execution node y_m^0 is obtained from a regression fit of the execution time whose inputs are: the actual amount of video-stream data K_m transmitted by camera d_m per second; the network parameters ω_{p_m}^0 of the transverse execution network G_{p_m}^0 divided at the transverse division point p_m of the network corresponding to camera d_m's detection task; the output parameters φ_{p_m}^0 produced by G_{p_m}^0 after execution on the horizontal execution node y_m^0; and the computing power C_{y_m^0} of the horizontal execution node y_m^0. The fit uses a regression coefficient of camera d_m with respect to the execution time, a regression coefficient of the output parameters φ_{p_m}^0 with respect to the execution time, a regression coefficient of the transverse execution network G_{p_m}^0 with respect to the execution time, regression coefficients of G_{p_m}^0 on the node y_m^0, a constant term of the time-regression fit, and a regression fitting function for the computing power required to compute the task.
The transmission delay T_{y_m^0, y_m^1}^{tr} of camera d_m's detection task between the horizontal execution node y_m^0 and the vertical execution node y_m^1 is calculated as

T_{y_m^0, y_m^1}^{tr} = φ_{p_m}^0 / b_{y_m^0, y_m^1},

where b_{y_m^0, y_m^1} denotes the bandwidth between the horizontal execution node y_m^0 and the vertical execution node y_m^1, and φ_{p_m}^0 denotes the output parameters of the transverse execution network G_{p_m}^0, divided at the transverse division point p_m of the network corresponding to camera d_m's detection task, after execution on the horizontal execution node y_m^0.

In step S4, the objective function is

min_{P, Y_0, Y_1} T_{Total},

where T_{Total} represents the maximum processing time difference of the video detection tasks periodically generated by the camera cluster, P represents the set of transverse division points of the deep neural network model, Y_0 represents the set of transverse execution nodes of the transverse execution networks divided by the transverse division points in P, Y_1 represents the set of longitudinal execution nodes of the longitudinal execution networks divided by the transverse division points in P, ρ represents the tolerance of the maximum time difference for all cameras to complete their detection tasks at time t, T^max and T^min denote the maximum and minimum execution times of the detection tasks of the video streams of the cameras d_m ∈ 𝒟, and 𝒟 is the set of monitoring terminals.
The step S6 includes the following steps:

S6.1, dividing the longitudinal execution network into l × l fusion areas according to an l × l grid method;

S6.2, establishing the relationship between the network parameters, the input and the output of the longitudinal execution network by using a regression function. The execution time T_{m, y_m^1}^{exe} of camera d_m's detection task on the vertical execution node y_m^1 is obtained from a regression fit whose inputs are: the intermediate parameters φ_{p_m}^0 output by the transverse execution network G_{p_m}^0 after execution on the horizontal execution node y_m^0; the network parameters ω_{p_m}^1 of the vertical execution network G_{p_m}^1 divided at the transverse division point p_m of the network corresponding to camera d_m's detection task; the output parameters φ_{p_m}^1 of G_{p_m}^1 after execution on the vertical execution node y_m^1; and the computing power C_{y_m^1} of the vertical execution node y_m^1. The fit uses a regression coefficient relating the intermediate parameters output by G_{p_m}^0 on node y_m^0 to the execution time of the longitudinal execution network, a regression coefficient of the output parameters of G_{p_m}^1 on node y_m^1, a regression coefficient of the parameters of G_{p_m}^1, a regression-fit constant term on node y_m^1, and a regression fitting function for the computing power required to compute the task on node y_m^1;

S6.3, each vertical execution node y_m^1 performs parallel computation on the received longitudinal execution network, and the computed results are merged and output.
The invention has the beneficial effects that:
the computing power requirement, the time cost and the network parameters of the complex DNN target identification algorithm are modeled and represented by using a regression equation, the transverse segmentation and vertical segmentation methods are combined, the computing power of the base station is fully utilized, and the computing time cost of the complex DNN model is reduced. In addition, the method adopts a solution based on deep reinforcement learning to obtain an approximate optimal solution of dynamic resource allocation, accelerates the cooperative reasoning speed of the edge cloud system, improves the effective utilization rate of system computing power, and has better effect than a baseline through experiments. The model is divided into two parts by the transverse division algorithm according to the DNNs level granularity, so that the computing capacity utilization rate of the system is improved, and the higher cloud service cost is reduced. The longitudinal segmentation algorithm is based on the FTP strategy, so that the memory occupation can be minimized, the transmission between nodes can be reduced, and the parallel computation of the model can be realized.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the prior art descriptions will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is a flow of a deep learning based MTMCT system.
Fig. 2 is a workflow flow of a base station and cloud based MTMCT system.
Fig. 3 is a network structure diagram of YOLOv4 in the prior art.
Fig. 4 is a schematic diagram of segmentation based on the horizontal segmentation method and the vertical segmentation method.
FIG. 5 is a schematic diagram of a multi-agent reinforcement learning training system.
Fig. 6 is a flow chart of modeling according to the present application.
Fig. 7 is a graph showing the effect of the present application compared with other algorithms.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without inventive effort based on the embodiments of the present invention, are within the scope of the present invention.
A method for establishing a distributed real-time intelligent monitoring system based on intelligent collaborative reasoning, as shown in fig. 1 and 2, includes the following steps:
s1, constructing a video monitoring system comprising a monitoring terminal, a base station and a cloud;
The video monitoring system consists of the monitoring terminals, the base stations and the cloud. 𝒟 = {d_0, d_1, …, d_{M-1}} represents the set of monitoring terminals, i.e. cameras, each of which is an agent, and M denotes the number of monitoring terminals in 𝒟; ε = {e_0, e_1, …, e_{N-1}} represents the set of base stations, and N denotes the number of base stations in ε; c represents the cloud. Each monitoring terminal is wirelessly connected with all base stations, any two base stations are connected by wire, and all base stations are connected to the cloud by wire, so that video streams can be transmitted. Each base station and the cloud are execution nodes responsible for detecting and computing the video streams shot by the cameras so as to realize target tracking. The computing power of execution node i is denoted C_i, i ∈ ε ∪ {c}, and represents the per-bit processing speed of its CPU or GPU. b_{i,m} denotes the bandwidth (in kb/s) between camera d_m and execution node i. Camera d_m generates a video stream of x_m frames per second; because the memory of the execution nodes is limited and a larger batch size consumes more memory and computing power during image detection, it is assumed that each camera drops some frames per second, with the frame-drop rate set to F. The number of valid data frames per second of camera d_m is f_m = (1 − F)·x_m, and the actual amount of transmitted video-stream data is K_m = k·f_m, where k is the average data volume per video frame. If execution node i executes the detection and tracking task of camera d_m's video stream (hereinafter referred to as the detection task of the camera), the transmission delay of camera d_m's video stream to execution node i is expressed as T_{m,i}^{tr} = K_m / b_{i,m}.
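The quantities defined above (effective frame rate, transmitted data volume and upload delay) follow directly from the stated formulas; the sketch below simply restates them in code, and the sample numbers are illustrative assumptions rather than values from the patent.

```python
def effective_frames_per_second(x_m: float, F: float) -> float:
    """f_m = (1 - F) * x_m: frames kept per second after dropping at rate F."""
    return (1.0 - F) * x_m

def transmitted_data_per_second(f_m: float, k: float) -> float:
    """K_m = k * f_m: data volume sent per second, k = average size per frame."""
    return k * f_m

def upload_delay(K_m: float, b_im: float) -> float:
    """T_tr = K_m / b_im: transmission delay to execution node i over bandwidth b_im."""
    return K_m / b_im

# Illustrative (assumed) numbers: 30 fps, 20% frame drop, 50 kb per frame, 10_000 kb/s link.
f_m = effective_frames_per_second(30, 0.2)
K_m = transmitted_data_per_second(f_m, 50)
print(upload_delay(K_m, 10_000))  # seconds of upload delay per second of video
```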
S2, designing a horizontal segmentation model based on a deep neural network by using a horizontal segmentation algorithm, where the output of the horizontal segmentation model comprises a horizontal execution network and a vertical execution network, as shown in FIG. 4; this comprises the following steps:

S2.1, constructing the Deep Neural Network (DNN) model as a directed acyclic graph;

The directed acyclic graph is denoted G = (V, E), where V represents the set of vertices, i.e. the network structure layers of the DNN model, and E represents the set of edges; an edge from vertex v_{j′} to vertex v_j in E indicates that vertex v_j takes the output of vertex v_{j′} in the network structure of the DNN model as its input. The present application uses YOLOv4 as the DNN model of the cross-camera tracking task. As shown in FIG. 3, YOLOv4 contains a large number of residual structures, and the YOLOv4 network is mainly built from three network structures used as network units: CBL, CBM and ResUnit. The network structure of the YOLOv4 network is prior art and is not described in detail in this application.

S2.2, obtaining the longest path from the input layer to the output layer of the DNN model by using a depth-first algorithm;

S2.3, dividing the longest path obtained in step S2.2 into a transverse execution network G_{p_h}^0 and a vertical execution network G_{p_h}^1 by using the transverse division algorithm. The network parameters of the transverse execution network G_{p_h}^0 and the vertical execution network G_{p_h}^1 are expressed as

ω_{p_h}^n = λ_{ResUnit}^n · ω_{ResUnit} + λ_{CBL}^n · ω_{CBL} + λ_{CBM}^n · ω_{CBM}, n ∈ {0, 1},

where, when n = 0, ω_{p_h}^0 denotes the network parameters of the transverse execution network G_{p_h}^0 divided by the transverse division point p_h, and when n = 1, ω_{p_h}^1 denotes the network parameters of the vertical execution network G_{p_h}^1 divided by p_h; P denotes the set of transverse division points, p_h ∈ P, and |P| denotes the number of elements in P; λ_{ResUnit}^n denotes the proportion of ResUnit network structures among the network layers assigned by the transverse division point p_h to G_{p_h}^n, and ω_{ResUnit} denotes the parameters of the ResUnit network structure; λ_{CBL}^n denotes the proportion of CBL (conv+bn+leakyRelu) structures in the network layers of G_{p_h}^n, and ω_{CBL} denotes the parameters of the CBL network structure; λ_{CBM}^n denotes the proportion of CBM (conv+bn+mish) structures in the network layers of G_{p_h}^n, and ω_{CBM} denotes the parameters of the CBM network structure.
S3, constructing the transverse division point decision, the transverse execution node decision and the longitudinal execution node decision of the transverse segmentation model as a Markov decision process, comprising the following steps:

S3.1, constructing a state space, whose expression is

s_m^t = ( T_m, T_{y_m^0, y_m^1}^{tr}, T_{m, y_m^0}^{exe}, T_{m, y_m^0}^{wait}, o_m^t, p_m ),

where T_m denotes the execution time of the detection task of camera d_m's video stream; T_{y_m^0, y_m^1}^{tr} denotes the transmission delay of camera d_m's task between the horizontal execution node y_m^0 and the vertical execution node y_m^1; T_{m, y_m^0}^{exe} denotes the execution time of camera d_m's detection task on the horizontal execution node y_m^0; T_{m, y_m^0}^{wait} denotes the waiting time of camera d_m's detection task on the horizontal execution node y_m^0; o_m^t denotes the state of the base stations and the cloud under the influence of the agents other than the agent of camera d_m, including the number of tasks waiting to be executed, the sizes of the tasks, i.e. the task amounts, and the predicted completion time of the task currently executed in the base stations and the cloud; p_m denotes the network corresponding to camera d_m's video stream; s_m^t denotes the state of camera d_m at time t; y_m^0, y_m^1 ∈ ε ∪ {c}; and d_m ∈ 𝒟.

The execution time T_m of the detection task of camera d_m's video stream is calculated as

T_m = T_{m, y_m^0}^{tr} + T_{m, y_m^0}^{wait} + T_{m, y_m^0}^{exe} + T_{y_m^0, y_m^1}^{tr} + T_{m, y_m^1}^{exe},

where T_{m, y_m^1}^{exe} denotes the execution time of camera d_m's detection task on the vertical execution node y_m^1, and T_{m, y_m^0}^{tr} denotes the transmission delay from camera d_m to the horizontal execution node y_m^0.

Waiting delays may occur when multiple cameras select the same execution node for computation. All transmitted tasks are stored in the task queue of that execution node and executed in sequence according to the First-In-First-Out (FIFO) principle. The waiting time T_{m, y_m^0}^{wait} of camera d_m's detection task on the horizontal execution node y_m^0 is therefore determined by the detection tasks of the other cameras d_{m′} that are dispatched to the same node and queued ahead of it, where T_{m′, y_m^0}^{tr} denotes the transmission delay of camera d_{m′}'s detection task to the horizontal execution node y_m^0, and 𝟙[·] is an indicator function whose value is 1 when its argument is true and 0 otherwise.

The transmission delay T_{m, y_m^0}^{tr} from camera d_m to the horizontal execution node y_m^0 is calculated as

T_{m, y_m^0}^{tr} = K_m / b_{y_m^0, m},

where b_{y_m^0, m} denotes the bandwidth between camera d_m and the horizontal execution node y_m^0.
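The FIFO waiting time described above can be estimated as in the sketch below; since the exact waiting-time formula appears only as an image in the source, the queue simulation here is one straightforward FIFO interpretation and should be read as an assumption.

```python
def fifo_waiting_time(arrival, tasks_ahead):
    """Waiting time of a detection task that reaches a shared execution node at
    time `arrival`, given the (arrival_time, execution_time) pairs of the tasks
    queued ahead of it.  Tasks are served strictly in arrival order (FIFO)."""
    node_free_at = 0.0
    for a, e in sorted(tasks_ahead):       # earlier arrivals are served first
        start = max(node_free_at, a)
        node_free_at = start + e
    return max(0.0, node_free_at - arrival)

# Two earlier tasks occupy the node; our task arrives 0.3 s after the first one.
print(fifo_waiting_time(0.3, [(0.0, 0.5), (0.1, 0.4)]))  # -> 0.6
```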
A regression function is used to construct the relationship between the network parameters, the input and the output of the transverse execution network G_{p_m}^0 divided at the transverse division point p_m of the network corresponding to camera d_m's detection task. The execution time T_{m, y_m^0}^{exe} of camera d_m's detection task on the horizontal execution node y_m^0 is obtained from a regression fit whose inputs are: the actual amount of video-stream data K_m transmitted by camera d_m per second; the network parameters ω_{p_m}^0 of the transverse execution network G_{p_m}^0; the output parameters φ_{p_m}^0 produced by G_{p_m}^0 after execution on the horizontal execution node y_m^0; and the computing power C_{y_m^0} of the horizontal execution node y_m^0. The fit uses a regression coefficient of camera d_m with respect to the execution time, a regression coefficient of the output parameters φ_{p_m}^0 with respect to the execution time, a regression coefficient of the transverse execution network G_{p_m}^0 with respect to the execution time, regression coefficients of G_{p_m}^0 on the node y_m^0, a constant term of the time-regression fit, and a regression fitting function for the computing power required to compute the task.
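The regression fit of execution time against data volume, network parameters, intermediate output size and node computing power can be set up as an ordinary least-squares problem, as sketched below; the linear-in-features form and the synthetic profiling data are assumptions for illustration, not the patent's exact regression.

```python
import numpy as np

# Feature rows: [K_m, omega0 (network params), phi0 (intermediate output size), 1 / C_node]
# Target: measured execution time on the horizontal execution node (seconds, assumed).
X = np.array([
    [1200.0, 3.0e6, 2.0e5, 1 / 8.0],
    [1800.0, 3.0e6, 2.0e5, 1 / 4.0],
    [1200.0, 5.0e6, 3.5e5, 1 / 8.0],
    [2400.0, 5.0e6, 3.5e5, 1 / 2.0],
])
y = np.array([0.09, 0.17, 0.14, 0.55])          # assumed profiling measurements

A = np.hstack([X, np.ones((len(X), 1))])        # add a constant term to the fit
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

def predict_exec_time(K_m, omega0, phi0, C_node):
    """Predict T_exec for a new (data volume, params, output size, compute) tuple."""
    return float(np.dot(coef, [K_m, omega0, phi0, 1 / C_node, 1.0]))

print(predict_exec_time(1500.0, 4.0e6, 2.5e5, 6.0))
```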
The transmission delay T_{y_m^0, y_m^1}^{tr} of camera d_m's detection task between the horizontal execution node y_m^0 and the vertical execution node y_m^1 is calculated as

T_{y_m^0, y_m^1}^{tr} = φ_{p_m}^0 / b_{y_m^0, y_m^1},

where b_{y_m^0, y_m^1} denotes the bandwidth between the horizontal execution node y_m^0 and the vertical execution node y_m^1, and φ_{p_m}^0 denotes the output parameters of the transverse execution network G_{p_m}^0, divided at the transverse division point p_m of the network corresponding to camera d_m's detection task, after execution on the horizontal execution node y_m^0.
S3.2, constructing an action space, whose expression is

a_m^t = ( p_m, y_m^0, y_m^1 ),

where a_m^t denotes the action, also called the decision, corresponding to camera d_m's video stream at time t, i.e. the transverse division point decision together with the transverse and longitudinal execution node decisions.

S3.3, constructing a reward function r_m^t(s_0^t, …, s_{M-1}^t, a_0^t, …, a_{M-1}^t), where μ denotes a penalty weight, ρ denotes the tolerance of the maximum time difference for all cameras to complete their detection tasks at time t, Z denotes the maximum time tolerance for a single camera to complete its detection task, a_0^t and a_{M-1}^t denote the actions of cameras d_0 and d_{M-1} at time t, s_0^t and s_{M-1}^t denote the states of cameras d_0 and d_{M-1} at time t, r_m^t denotes the reward of camera d_m at time t, and T^max and T^min denote the maximum and minimum execution times of the detection tasks of the cameras' video streams. The reward reflects the difference between the completion times of the detection tasks generated by all cameras at time t.
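Because the reward in S3.3 is reproduced only as an image in the source, its closed form is not recoverable; the sketch below encodes one plausible shaping consistent with the stated ingredients (the max–min completion-time gap, the penalty weight μ, the cross-camera tolerance ρ and the per-camera limit Z) and is an assumption rather than the patent's exact formula.

```python
def cooperative_reward(completion_times, mu=10.0, rho=0.2, Z=1.0):
    """One plausible reward consistent with the description: penalise the spread
    of completion times across cameras at time t, with extra penalties when the
    spread exceeds rho or any single camera exceeds its time tolerance Z
    (all numeric defaults are assumed)."""
    gap = max(completion_times) - min(completion_times)
    reward = -gap
    if gap > rho:                     # cameras no longer "as synchronised as possible"
        reward -= mu
    if max(completion_times) > Z:     # a single camera exceeded its time budget
        reward -= mu
    return reward

# Three cameras finishing their detection tasks at slightly different times (seconds).
print(cooperative_reward([0.42, 0.55, 0.48]))
```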
S4, the cloud constructs a division point and execution device decision model by using a Deep Reinforcement Learning (DRL) algorithm based on the DDQN (Double Deep Q-Network), taking the minimized task processing time difference as the objective function, thereby solving the Markov decision process;
The expression of the objective function is

min_{P, Y_0, Y_1} T_{Total},

where T_{Total} represents the maximum processing time difference of the video analysis tasks periodically generated by the camera cluster, P represents the set of transverse division points of the DNN model, Y_0 represents the set of transverse execution nodes of the transverse execution networks divided by the transverse division points in P, and Y_1 represents the set of longitudinal execution nodes of the longitudinal execution networks divided by the transverse division points in P.
The method for constructing the segmentation point execution equipment decision model comprises the following steps:
a, setting the total number of training rounds and initializing the training period τ to 1;
b, continuously generating video streams by each camera at the time t, and obtaining the action of each camera according to the current DDQN network;
c, distributing the video stream to different horizontal execution nodes according to the action;
d, after each transverse execution node finishes the transverse division task, transmitting the intermediate parameters to the longitudinal execution node, and finally finishing the execution of the detection task;
for cross-camera tracking tasks, all cameras should work in concert, i.e. the simultaneously generated detection tasks should be "as synchronized as possible" within a certain threshold.
e, all the longitudinal execution nodes transmit respective target recognition results to the cloud, and the cloud finishes cross-camera multi-target tracking and obtains a final detection result;
f, the cloud simultaneously obtains the reward r_m^t of each agent under the current action and the new state s_m^{t+1}, and stores the transition (s_m^t, a_m^t, r_m^t, s_m^{t+1}) into the experience pool;
g, the cloud continuously samples transitions (s_m^t, a_m^t, r_m^t, s_m^{t+1}) from the experience pool through experience replay, trains and updates the multi-agent DDQN model, and makes the decision for the next moment t + 1;
As shown in fig. 5 and fig. 6, the DDQN has an evaluation network and a target network, and it eliminates the over-estimation problem of DQN by decoupling action selection from the evaluation of the Q value of the next action; the target network periodically copies its weights from the evaluation network. That is, the DDQN selects the best action according to the evaluation network with parameters θ_t and obtains the Q value from the target network with parameters θ_t^-. In distributed execution, the DDQN weights obtained after centralized training are shared with all agents. Each agent has a homogeneous state, action and reward space, which reduces the number of weights that must be trained.

In this application, each camera is modeled as an agent. The agent corresponding to each camera d_m ∈ 𝒟 selects an action a_m^t ∈ A according to the policy π and the current state s_m^t ∈ S, where S denotes the set of all states and A denotes the set of all actions. After the action a_m^t interacts with the environment, the agent of camera d_m obtains the reward r_m^t and transitions to the new state s_m^{t+1}, and learns the policy π through the continuing reward R so as to maximize the cumulative reward G_t. The goal of each agent is to maximize G_t, and the DDQN target value is computed as

R_{t+1} + γ · Q( S_{t+1}, argmax_a Q(S_{t+1}, a; θ_t); θ_t^- ),

where γ is the discount coefficient, R_{t+1} is the reward obtained at time t + 1, Q(S_{t+1}) denotes the Q value obtained at time t + 1, argmax_a Q(S_{t+1}, a; θ_t) denotes the action that maximizes the state-action value function, and θ_t^- denotes the parameters of the target network. Each agent observes its state s_m^t at time t. The base stations and the cloud follow the FIFO (first in, first out) principle and need to cooperate to complete the resource allocation decision of the multi-agent system, namely the transverse division point decision and the computation offloading decision, i.e. the transverse and longitudinal execution node decisions. Finding an optimal policy π* for real-time load balancing of multiple continuous video analysis tasks in a distributed scenario is challenging. The goal of the EI-ISS is to minimize the processing time difference of the video analysis tasks periodically generated by the camera cluster: it not only considers minimizing the processing time of each camera's video task, but also ensures that the time difference between cameras completing tasks generated at the same moment is minimized.
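The decoupled DDQN target described above (action selected by the evaluation network with parameters θ_t, value taken from the target network with parameters θ_t^-) can be written compactly as in the following PyTorch sketch; the small fully connected Q-network is an assumed stand-in for the agents' actual networks.

```python
import torch
import torch.nn as nn

class QNet(nn.Module):
    """Generic Q-network (assumed architecture, not the patent's)."""
    def __init__(self, state_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                                 nn.Linear(64, n_actions))
    def forward(self, s):
        return self.net(s)

def ddqn_target(reward, next_state, gamma, eval_net, target_net):
    """y = r + gamma * Q_target(s', argmax_a Q_eval(s', a)): the DDQN target,
    with action selection and value estimation decoupled across the two networks."""
    with torch.no_grad():
        best_action = eval_net(next_state).argmax(dim=1, keepdim=True)
        next_q = target_net(next_state).gather(1, best_action).squeeze(1)
    return reward + gamma * next_q

# Example: a batch of 2 transitions with a 6-dimensional state and 4 discrete actions.
eval_net, target_net = QNet(6, 4), QNet(6, 4)
target_net.load_state_dict(eval_net.state_dict())   # periodic weight copy
y = ddqn_target(torch.tensor([0.5, -0.1]), torch.randn(2, 6), 0.99, eval_net, target_net)
print(y.shape)  # torch.Size([2])
```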
S5, each monitoring terminal respectively inputs the generated video stream into a dividing point executing equipment decision model, uploads the video stream to a transverse executing node according to a transverse executing node decision generated by the dividing point executing equipment decision model, the transverse executing node completes transverse division and execution of the model by using the transverse dividing model, and sends network parameters after transverse execution to a longitudinal executing node, and the method comprises the following steps:
s5.1, each monitoring terminal inputs the generated video stream into a decision model of a segmentation point execution device, and the decision model of the segmentation point execution device generates a transverse segmentation point decision, a transverse execution node decision and a longitudinal execution node decision;
s5.2, the monitoring terminals respectively transmit the respective video streams to the corresponding transverse execution nodes according to the transverse execution node decision;
s5.3, the transverse execution node utilizes the transverse division model to divide the network model corresponding to the video stream according to the transverse division point decision, and transverse execution of the transverse execution network is completed;
and S5.4, the transverse execution node transmits the network parameters after transverse execution to the longitudinal execution node corresponding to the longitudinal execution node decision.
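As a reading aid for S5, the per-camera dispatch flow can be sketched as follows; the decision_model callable and the node objects with run_transverse/run_vertical methods are hypothetical placeholders for the decision model and the node-side execution interface, which the patent does not specify.

```python
def dispatch_camera_stream(camera_id, video_chunk, decision_model, nodes):
    """Sketch of S5.1-S5.4: query the division-point/execution-device decision
    model, upload the stream to the chosen horizontal node, and have that node
    run the transverse part before forwarding intermediates to the vertical node."""
    # S5.1: decision model returns (division point, horizontal node, vertical node)
    p_m, y0, y1 = decision_model(camera_id, video_chunk)

    # S5.2: upload the video stream to the chosen horizontal execution node
    horizontal_node = nodes[y0]

    # S5.3: the horizontal node divides the model at p_m and runs the transverse part
    intermediates = horizontal_node.run_transverse(video_chunk, division_point=p_m)

    # S5.4: forward the intermediate network parameters to the vertical execution node
    return nodes[y1].run_vertical(intermediates, division_point=p_m)
```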
S6, the vertical execution nodes respectively use the vertical division algorithm to vertically divide and execute their vertical execution networks, i.e. to complete the target identification of the MTMCT, comprising the following steps:

S6.1, dividing the neural network of the longitudinal execution network into l × l fusion areas according to an l × l grid method;

S6.2, constructing the relationship between the network parameters, the input and the output of the vertical execution network G_{p_m}^1 by using a regression function. The execution time T_{m, y_m^1}^{exe} of camera d_m's detection task on the vertical execution node y_m^1 is obtained from a regression fit whose inputs are: the intermediate parameters φ_{p_m}^0 output by the transverse execution network G_{p_m}^0 after execution on the horizontal execution node y_m^0; the network parameters ω_{p_m}^1 of the vertical execution network G_{p_m}^1 divided at the transverse division point p_m of the network corresponding to camera d_m's detection task; the output parameters φ_{p_m}^1 of G_{p_m}^1 after execution on the vertical execution node y_m^1; and the computing power C_{y_m^1} of the vertical execution node y_m^1. The fit uses a regression coefficient relating the intermediate parameters output by G_{p_m}^0 on node y_m^0 to the execution time of the longitudinal network, a regression coefficient of the output parameters of G_{p_m}^1 on node y_m^1, a regression coefficient of the parameters of G_{p_m}^1, a regression-fit constant term on node y_m^1, and a regression fitting function for the computing power required to compute the task on node y_m^1.

S6.3, each vertical execution node y_m^1 performs parallel computation on the received parameters output by the horizontal execution nodes, and the computed results are merged and output by a fully connected layer.
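S6.1 and S6.3 amount to splitting the input of the vertical execution network into an l × l grid of fusion areas, computing the tiles in parallel, and merging the results; the NumPy sketch below shows the tiling and merging only. A real FTP-style partition would also add per-tile overlap determined by the receptive field, which is omitted here because the text does not spell it out.

```python
import numpy as np

def tile_feature_map(fmap, l):
    """Split an H x W x C feature map into an l x l grid of tiles (fusion areas)."""
    h_splits = np.array_split(np.arange(fmap.shape[0]), l)
    w_splits = np.array_split(np.arange(fmap.shape[1]), l)
    return [[fmap[np.ix_(hs, ws)] for ws in w_splits] for hs in h_splits]

def process_tile(tile):
    """Placeholder for running the vertical execution network on one tile."""
    return tile * 2.0          # stands in for the real per-tile inference

def merge_tiles(tiles_2d):
    """Merge the l x l grid of processed tiles back into one feature map."""
    rows = [np.concatenate(row, axis=1) for row in tiles_2d]   # stitch along width
    return np.concatenate(rows, axis=0)                        # stitch along height

fmap = np.random.rand(8, 8, 16)
tiles = tile_feature_map(fmap, l=2)
processed = [[process_tile(t) for t in row] for row in tiles]
print(merge_tiles(processed).shape)  # (8, 8, 16)
```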
And S7, the cloud receives the target identification results from each longitudinal execution node, cross-camera track matching is completed by adopting a Deepsort track matching algorithm, and finally a cross-camera multi-target tracking result is obtained.
In this application, regression and parameter fitting are performed on the computing resources of the cameras, the edge devices and the cloud, respectively, to obtain the corresponding dynamic environment representation. Then, in the online decision-making stage, the multi-agent system makes division point decisions according to the real dynamic network environment and the user requirements, and finally obtains the division result. The necessity of performing cross-camera inference with the division point and execution device decision model is set forth below:
as shown in fig. 1, a plurality of cameras are on and continuously generating a video stream. The camera requests to offload the video tracking task to the base station or the cloud node. And according to the decision result of the MADRL, the camera unloads the first part of the task to a transverse execution node, the node stores a pre-configuration file of the model, transversely divides the model and executes a transverse processing task. After that, the horizontal execution node needs to transmit the intermediate parameters to the node performing the vertical segmentation, and the vertical execution node accepts the network intermediate parameters and performs the vertical model segmentation and the parallel computation of the network according to the pre-configuration file. And uploading and storing the final target recognition result to a cloud center, and firstly performing camera internal tracking recognition on a single camera by the cloud center by using a DeepSort track matching algorithm. After the target recognition tasks are completed by the cameras, the cloud center completes track analysis and recognition among the cameras, and finally a tracking result is obtained. It can be seen that, in this process, the communication, calculation, and cache resources of the base station, that is, the edge and cloud systems, are all invoked, and therefore, in this process, the present application needs to be used to implement the joint allocation of the three-dimensional resources.
To demonstrate the superiority of the present application in a collaborative multitask flow scenario, a greedy algorithm and a random algorithm are compared with the present application, as shown in fig. 7. The epsilon-greedy algorithm makes locally optimal decisions based on the current tasks, resource availability and delay constraints; its discount factor and minimum greedy rate are 0.0001 and 0.3, respectively. The random algorithm makes random decisions and does not consider the cooperative relationship between the edge devices. The MTMCT system requires coordinated cross-camera tracking, and large differences in task processing time between cameras can result in loss and duplication of tracked objects. Therefore, based on a user-defined time threshold ρ, a metric is devised to measure the success rate of collaboration between tasks. As can be seen from the figure, the cooperation rate of the DDQN gradually increases and converges, finally meeting the cooperation time requirement set by the user, while the cooperation rate of the greedy algorithm only fluctuates around 0.4, which demonstrates the cooperative superiority of the cross-camera tracking cooperation system.
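The collaboration success-rate metric can be computed as in the short sketch below, under the assumption that a decision round counts as cooperative when the spread between the slowest and fastest camera completion times does not exceed the user-defined threshold ρ; the sample completion times are illustrative only:

# Cooperation rate: fraction of rounds whose max-min completion-time spread <= rho.
from typing import List

def cooperation_rate(rounds: List[List[float]], rho: float) -> float:
    """rounds: per-round list of task completion times, one entry per camera."""
    ok = sum(1 for times in rounds if max(times) - min(times) <= rho)
    return ok / len(rounds)

completion_times = [
    [0.42, 0.45, 0.40],   # round 1: spread 0.05 s -> cooperative for rho = 0.1
    [0.38, 0.61, 0.44],   # round 2: spread 0.23 s -> not cooperative
    [0.50, 0.52, 0.55],   # round 3: spread 0.05 s -> cooperative
]
print(cooperation_rate(completion_times, rho=0.1))   # 2 of 3 rounds cooperate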
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (10)

1. A method for establishing a distributed real-time intelligent monitoring system based on intelligent collaborative reasoning is characterized by comprising the following steps:
S1, constructing a video monitoring system comprising a monitoring terminal, a base station and a cloud;
S2, establishing a horizontal segmentation model based on a deep neural network by using a horizontal segmentation algorithm and a vertical segmentation algorithm, wherein the output of the horizontal segmentation model comprises a horizontal execution network and a vertical execution network;
S3, constructing the transverse division point decision, the transverse execution node decision and the longitudinal execution node decision of the transverse division model as a Markov decision process;
S4, the cloud constructs a division point execution equipment decision model by using the DDQN multi-agent deep reinforcement learning algorithm, taking the minimized task processing time difference as the objective function;
S5, each monitoring terminal inputs its generated video stream into the division point execution equipment decision model and uploads the video stream to the corresponding transverse execution node according to the transverse execution node decision generated by the decision model; the transverse execution node completes the transverse division and execution of the model by using the transverse division model and sends the network parameters after transverse execution to the corresponding longitudinal execution node;
S6, the longitudinal execution nodes respectively use the longitudinal division algorithm to longitudinally divide and execute their respective longitudinal execution networks according to the longitudinal execution node decision generated by the division point execution equipment decision model;
S7, the cloud receives the execution results from the longitudinal execution nodes and completes cross-camera track matching by using a track matching algorithm.
2. The method for establishing a distributed real-time intelligent monitoring system based on intelligent cooperative reasoning according to claim 1, wherein the step S2 comprises the following steps:
S2.1, constructing the deep neural network model as a directed acyclic graph;
S2.2, obtaining the longest path from the input layer to the output layer in the deep neural network model by using a depth-first algorithm;
S2.3, dividing the longest path obtained in step S2.2 into a transverse execution network and a vertical execution network at a transverse division point p h by using the transverse division algorithm; the network parameters of the two networks are indexed by n, where n is 0 or 1: when n is 0 they are the network parameters of the transverse execution network divided at the transverse division point p h, and when n is 1 they are the network parameters of the vertical execution network divided at p h; they are expressed in terms of the number ratio of Resunit network structures that the transverse division point p h places in the divided network and the parameter ω Resunit of the Resunit network structure, the number ratio of CBL (conv+bn+leakyRelu) network structures in the divided network and the parameter ω CBL of the CBL network structure, and the number ratio of CBM (conv+bn+mish) network structures in the divided network and the parameter ω CBM of the CBM network structure.
3. The method for establishing a distributed real-time intelligent monitoring system based on intelligent cooperative reasoning according to claim 1, wherein the step S3 comprises the following steps:
S3.1, constructing a state space; the state of camera d m at time t comprises: the execution time of the detection task of the video stream of camera d m; the transmission delay of camera d m between its two execution nodes; the execution time of the detection task of the video stream of camera d m on the execution node; the waiting time of the detection task of the video stream of camera d m on the execution node; the state of the base stations and the cloud at time t, apart from the agent of camera d m, comprising the number of tasks waiting to be executed, the task amount and the predicted completion time of the currently executing tasks in the base stations and the cloud; and p m, the network corresponding to the video stream of camera d m; here ε denotes the set of base stations, the execution nodes are drawn from the base stations and the cloud, and camera d m belongs to the set of monitoring terminals;
S3.2, constructing an action space, whose elements are the actions of camera d m at time t;
S3.3, constructing a reward function, in which μ denotes a penalty weight, ρ denotes the tolerance of the maximum time difference with which all cameras complete the detection task at time t, and Z denotes the maximum time tolerance for a single camera to complete the detection task; the reward function of camera d m at time t is defined over the actions of cameras d 0 through d M-1 at time t in the action space and their states at time t in the state space, and is expressed in terms of the maximum execution time and the shortest execution time of the detection task of the video stream of camera d m.
4. The method for establishing a distributed real-time intelligent monitoring system based on intelligent cooperative reasoning according to claim 3, wherein in step S3.1 the execution time of the detection task of the video stream of camera d m is calculated from the execution time of the detection task of the video stream of camera d m on the vertical execution node and the transmission delay from camera d m to the horizontal execution node.
5. The method for establishing a distributed real-time intelligent monitoring system based on intelligent cooperative reasoning according to claim 4, wherein the waiting time of the detection task of the video stream of camera d m on the horizontal execution node is calculated from the delays with which the other cameras d m' transmit the detection tasks of their video streams to the transverse execution node, together with an indicator function.
6. The method for establishing a distributed real-time intelligent monitoring system based on intelligent cooperative reasoning according to claim 4, wherein the transmission delay from camera d m to the horizontal execution node is calculated from the bandwidth speed between camera d m and the horizontal execution node and from K m, the actual amount of video stream data transmitted by camera d m per second.
7. The method for establishing a distributed real-time intelligent monitoring system based on intelligent cooperative reasoning according to claim 3, wherein the execution time of the detection task of the video stream of camera d m on the horizontal execution node is calculated from: the regression parameter of camera d m corresponding to the execution time; the output parameters, after execution on the horizontal execution node, of the transverse execution network divided at the transverse division point p m of the network corresponding to the detection task of the video stream of camera d m, together with their execution-time regression parameters; the execution-time regression parameter of the horizontal execution network and its regression parameters on the horizontal execution node; the constant term of the execution-time regression fit; K m, the actual amount of video stream data transmitted by camera d m per second; the computing power of the horizontal execution node and the regression fitting function of the computing power required for the calculation; and the network parameters of the transverse execution network divided at the transverse division point p m.
8. The method for establishing a distributed real-time intelligent monitoring system based on intelligent cooperative reasoning according to claim 3, wherein the transmission delay of the detection task of the video stream of camera d m between the horizontal execution node and the vertical execution node is calculated from the bandwidth speed between the horizontal execution node and the vertical execution node and from the output parameters produced on the horizontal execution node by the transverse execution network divided at the transverse division point p m of the network corresponding to the detection task of the video stream of camera d m.
9. The method for establishing a distributed real-time intelligent monitoring system based on intelligent cooperative reasoning according to claim 1, wherein in step S4 the objective function minimizes T Total, the maximum processing time difference of the video detection tasks periodically generated by the camera cluster, where P denotes the set of transverse division points of the deep neural network model, Y 0 denotes the set of transverse execution nodes of the transverse execution networks divided by the transverse division points in P, Y 1 denotes the set of longitudinal execution nodes of the longitudinal execution networks divided by the transverse division points in P, ρ denotes the tolerance of the maximum time difference with which all cameras complete the detection task at time t, and T Total is expressed in terms of the maximum execution time and the shortest execution time of the detection task of the video stream of each camera d m in the set of monitoring terminals.
10. The method for establishing a distributed real-time intelligent monitoring system based on intelligent cooperative reasoning according to claim 1, wherein the step S6 comprises the following steps:
S6.1, dividing the longitudinal execution network into l×l fusion areas according to an l×l grid method;
S6.2, establishing, by means of a regression function, the relation among the network parameters, the input and the output of the longitudinal execution network; the quantities entering this relation are: the regression-fit parameters relating the intermediate parameters output by the transverse execution network after execution on the transverse execution node to the execution time of the longitudinal execution network; the regression-fit parameters of the output parameters of the longitudinal execution network after execution on the longitudinal execution node; the regression-fit parameters of the parameters of the longitudinal execution network; the output parameters, after execution on the longitudinal execution node, of the longitudinal execution network divided at the transverse division point p m of the network corresponding to the detection task of the video stream of camera d m; the regression-fit constant term of the longitudinal execution network on the longitudinal execution node; the computing power of the longitudinal execution node; the regression fitting function of the computing power required for the calculation, together with its regression parameters; the network parameters of the longitudinal execution network divided at the transverse division point p m; the execution time of the detection task of the video stream of camera d m on the longitudinal execution node; and the output parameters, after execution on the transverse execution node, of the transverse execution network divided at the transverse division point p m;
S6.3, each longitudinal execution node carries out parallel computation on the received longitudinal execution network, and the computed results are merged and output.
CN202210576950.3A 2022-05-25 2022-05-25 Method for establishing distributed real-time intelligent monitoring system based on intelligent cooperative reasoning Pending CN114815755A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210576950.3A CN114815755A (en) 2022-05-25 2022-05-25 Method for establishing distributed real-time intelligent monitoring system based on intelligent cooperative reasoning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210576950.3A CN114815755A (en) 2022-05-25 2022-05-25 Method for establishing distributed real-time intelligent monitoring system based on intelligent cooperative reasoning

Publications (1)

Publication Number Publication Date
CN114815755A true CN114815755A (en) 2022-07-29

Family

ID=82517695

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210576950.3A Pending CN114815755A (en) 2022-05-25 2022-05-25 Method for establishing distributed real-time intelligent monitoring system based on intelligent cooperative reasoning

Country Status (1)

Country Link
CN (1) CN114815755A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117834643A (en) * 2024-03-05 2024-04-05 Nanjing University of Posts and Telecommunications Deep neural network collaborative reasoning method for industrial Internet of things
CN117834643B (en) * 2024-03-05 2024-05-03 Nanjing University of Posts and Telecommunications Deep neural network collaborative reasoning method for industrial Internet of things

Similar Documents

Publication Publication Date Title
Baccour et al. Pervasive AI for IoT applications: A survey on resource-efficient distributed artificial intelligence
CN109818786B (en) Method for optimally selecting distributed multi-resource combined path capable of sensing application of cloud data center
Zhang et al. Towards green metaverse networking: Technologies, advancements and future directions
Djigal et al. Machine and deep learning for resource allocation in multi-access edge computing: A survey
Hou et al. Distredge: Speeding up convolutional neural network inference on distributed edge devices
CN113760511B (en) Vehicle edge calculation task unloading method based on depth certainty strategy
Zhou et al. Edge computation offloading with content caching in 6G-enabled IoV
Qi et al. Vehicular edge computing via deep reinforcement learning
WO2023175335A1 (en) A time-triggered federated learning algorithm
Xiao et al. Toward collaborative occlusion-free perception in connected autonomous vehicles
CN113626104A (en) Multi-objective optimization unloading strategy based on deep reinforcement learning under edge cloud architecture
CN116669111A (en) Mobile edge computing task unloading method based on blockchain
CN114815755A (en) Method for establishing distributed real-time intelligent monitoring system based on intelligent cooperative reasoning
CN116263681A (en) Mobile edge computing task unloading method, device, equipment and storage medium
Cui et al. Multi-Agent Reinforcement Learning Based Cooperative Multitype Task Offloading Strategy for Internet of Vehicles in B5G/6G Network
Peng et al. Dynamic visual SLAM and MEC technologies for B5G: a comprehensive review
Henna et al. Distributed and collaborative high-speed inference deep learning for mobile edge with topological dependencies
Ju et al. eDeepSave: Saving DNN inference using early exit during handovers in mobile edge environment
Wu et al. Adaptive client and communication optimizations in Federated Learning
CN115208892B (en) Vehicle-road collaborative online task scheduling method and system based on dynamic resource demand
CN113157344B (en) DRL-based energy consumption perception task unloading method in mobile edge computing environment
Huai et al. Towards deep learning on resource-constrained robots: A crowdsourcing approach with model partition
CN114916013A (en) Method, system and medium for optimizing unloading time delay of edge task based on vehicle track prediction
Li et al. ESMO: Joint frame scheduling and model caching for edge video analytics
CN114022731A (en) Federal learning node selection method based on DRL

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination