CN113507605B - Dangerous area monitoring video compression storage method and system based on artificial intelligence - Google Patents

Dangerous area monitoring video compression storage method and system based on artificial intelligence Download PDF

Info

Publication number
CN113507605B
CN113507605B (Application CN202111058685.1A)
Authority
CN
China
Prior art keywords
loss
monitoring video
compression
dangerous area
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111058685.1A
Other languages
Chinese (zh)
Other versions
CN113507605A (en)
Inventor
管英
宋伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NANTONGYOUYUAN ART DESIGN Co.,Ltd.
Original Assignee
Nantong Haiteng Copper Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nantong Haiteng Copper Co ltd filed Critical Nantong Haiteng Copper Co ltd
Priority to CN202111058685.1A priority Critical patent/CN113507605B/en
Publication of CN113507605A publication Critical patent/CN113507605A/en
Application granted granted Critical
Publication of CN113507605B publication Critical patent/CN113507605B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146Data rate or code amount at the encoder output
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/18Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Alarm Systems (AREA)
  • Closed-Circuit Television Systems (AREA)

Abstract

The invention discloses a dangerous area monitoring video compression storage method and system based on artificial intelligence. The method comprises the following steps: stacking the monitoring video image sequence of the dangerous area and inputting it into a compression network to obtain a feature map of the monitoring video information as the compressed data. Under the supervision of the compression network loss function, a compression network is trained that pays close attention to the position information and trajectory information of personnel in the dangerous area monitoring video sequence. Compared with the prior art, the compression storage method achieves a satisfactory compression storage effect: important personnel information is preserved as far as possible, and a high degree of restoration of the stored features after compression is ensured.

Description

Dangerous area monitoring video compression storage method and system based on artificial intelligence
Technical Field
The invention relates to the field of artificial intelligence and video compression, in particular to a dangerous area monitoring video compression and storage method and system based on artificial intelligence.
Background
In a large chemical enterprise, the production environment contains many production or storage devices, such as reaction devices in which chemicals react, storage devices for hazardous chemicals, and devices that provide the high temperature and high pressure required for chemical reactions. These devices inevitably fail or behave abnormally during operation, which can cause dangerous accidents and, in severe cases, large-scale casualties, so the whole production area is a dangerous area. Production staff need to operate, patrol and overhaul the equipment; they are important participants in maintaining the safety of the area and in discovering and handling dangers in time, and their actions directly affect the production safety of the enterprise. In order to acquire the movement trajectories and behavior of personnel in the production area, an enterprise needs to install cameras in the production area to acquire monitoring video images. These video images can be used to evaluate the trajectory risk and working state of personnel and to give early warning of safety risks, so that the enterprise can grasp the details of the production process in time and ensure that production is carried out safely. During production, a large amount of surveillance video is generated, and this video occupies a large amount of storage space. Although many methods for compressing video data exist in the prior art, on the one hand there is no method for compressing specific features or information in video data, and in particular no compression method aimed at the field of personnel safety; on the other hand, the compression ratio is low, so that the compressed data still contains much useless information.
Disclosure of Invention
In order to solve the above technical problems, an object of the present invention is to provide a method and a system for compressing and storing a surveillance video of a dangerous area based on artificial intelligence, which further compress the surveillance video on the basis of existing general video compression methods, retain important personnel information and personnel trajectory information, and remove unimportant background information. The adopted technical scheme is as follows:
in a first aspect, an embodiment of the present invention provides a dangerous area monitoring video compression storage method based on artificial intelligence, including:
stacking the monitoring video image sequences of the dangerous areas and inputting the stacked monitoring video image sequences into a compression network to obtain a characteristic diagram of monitoring video information as compression data, wherein the loss of the compression network comprises the following steps: correcting the weight mask of each image of the monitoring video image sequence by using the minimum distance between personnel in each image and equipment in the dangerous area to obtain a first correction weight mask, correcting the difference of pixel values of each image corresponding to the monitoring video image sequence and the decompressed image sequence by using the first correction weight mask, and summing the correction values to obtain a first loss; acquiring the personnel position information difference of each image corresponding to the monitoring video image sequence and the decompressed image sequence and the distance difference between any two personnel positions; for the distance difference between any two person positions: obtaining a second weight according to the minimum distance between two persons in the monitoring video image and equipment in the dangerous area, and correcting the distance difference between the positions of the two persons by using the second weight to obtain a correction loss; summing the personnel position information difference and the correction loss to obtain a second loss; the first loss is summed with the second loss to obtain a loss of the compressed network.
Preferably, the loss of the compressed network further comprises: and correcting the difference of the pixel value of each image corresponding to the monitoring video image sequence and the decompressed image sequence by using the uncorrected weight mask of the monitoring video image sequence of the dangerous area, and summing the corrected values to obtain a third loss.
Preferably, the loss of the compressed network is the sum of the first loss, the second loss and the third loss.
Preferably, the compression network comprises a primary compression sub-network; the sequence of monitoring video images of the dangerous area is input into the primary compression sub-network, and a single-channel feature map of each image, which pays more attention to the personnel information in the dangerous area, is output under the supervision of a loss function of the primary compression sub-network.
Preferably, the loss of the preliminary compression network comprises: and correcting the pixel value difference of each image corresponding to the monitoring video image sequence and the decompressed image sequence by using the uncorrected weight mask of the monitoring video image sequence in the dangerous area to obtain the loss of the preliminary compression network.
Preferably, the compression network further comprises a first encoder: and stacking the single-channel feature maps together to form a multi-channel feature map, inputting the multi-channel feature map into a first encoder, and outputting the feature map of the monitoring video information.
In a second aspect, another embodiment of the present invention provides an artificial intelligence-based dangerous area surveillance video compression storage system:
the compression network module is used for analyzing the input of the monitoring video image sequence in the dangerous area after being stacked to obtain a characteristic diagram of monitoring video information as compressed data; the loss of the compressed network module comprises: correcting the weight mask of each image of the monitoring video image sequence by using the minimum distance between personnel in each image and equipment in the dangerous area to obtain a first correction weight mask, correcting the difference of pixel values of each image corresponding to the monitoring video image sequence and the decompressed image sequence by using the first correction weight mask, and summing the correction values to obtain a first loss; acquiring the personnel position information difference of each image corresponding to the monitoring video image sequence and the decompressed image sequence and the distance difference between any two personnel positions; for the distance difference between any two person positions: obtaining a second weight according to the minimum distance between two persons in the monitoring video image and equipment in the dangerous area, and correcting the distance difference between the positions of the two persons by using the second weight to obtain a correction loss; summing the personnel position information difference and the correction loss to obtain a second loss; the first loss is summed with the second loss to obtain a loss of the compression network module.
Preferably, the loss of the compressed network module further comprises: and correcting the difference of the pixel value of each image corresponding to the monitoring video image sequence and the decompressed image sequence by using the uncorrected weight mask of the monitoring video image sequence of the dangerous area, and summing the corrected values to obtain a third loss.
Preferably, the loss of the compression network module is the sum of the first loss, the second loss and the third loss.
Preferably, the compression network module comprises a primary compression sub-network module, the sequence of the monitoring video images of the dangerous area is input into the primary compression sub-network module, and a single-channel feature map of each image which focuses more on the personnel information in the dangerous area is output under the supervision of a loss function of the primary compression sub-network module.
The invention has the following beneficial effects:
the method has the advantages that compression and decompression of monitoring video data are realized by designing a compression network, a decompression network and a loss function, compared with the existing video image compression method, the method mainly focuses on characteristic information of personnel in the video, the video compression ratio is greatly increased, in addition, the loss function is constructed by analyzing position information of the personnel and equipment, so that a compression result focuses on track information of the personnel far away from the equipment and focuses on self characteristic information of the personnel near the equipment, the video compression ratio is further greatly increased, the related information of the personnel can be accurately restored, and the decompressed video can analyze the activity track of the personnel in a dangerous area and the self information of the personnel such as behaviors, clothes, identities, working states and the like when the personnel are near the equipment; the compression degree is improved, meanwhile, useful information is greatly reserved, the storage space is saved, the analysis of the activity track and the working behavior and state of personnel in the enterprise monitoring dangerous area is facilitated, the enterprise is ensured to master details in the production process in time, and the production is ensured to be carried out safely.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative effort.
Fig. 1 is a flowchart of a dangerous area surveillance video compression storage method based on artificial intelligence according to an embodiment of the present invention;
fig. 2 is a preliminary compression flowchart of a dangerous area surveillance video compression and storage method based on artificial intelligence according to an embodiment of the present invention.
Detailed Description
To further illustrate the technical means and effects of the present invention adopted to achieve the predetermined objects, the following detailed description will be given to a method and a system for compressing and storing a surveillance video in a dangerous area based on artificial intelligence according to the present invention, with reference to the accompanying drawings and preferred embodiments.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
In order to solve the problem of compressing and storing monitoring video images of dangerous areas without losing important personnel information, the method incorporates the feature information of personnel in the video image sequence into the compression network, which greatly increases the video compression ratio. A loss function is constructed by analyzing the position information of personnel and equipment, so that the compression result focuses on the trajectory information of personnel far from the equipment and on the personal feature information of personnel near the equipment. This not only greatly increases the video compression ratio but also accurately preserves the relevant personnel information, so that the decompressed video allows analysis of the movement trajectory of personnel in the dangerous area and of personal information such as behavior, clothing, identity and working state when personnel are near the equipment. The following describes a specific scheme of the dangerous area monitoring video compression storage method and system based on artificial intelligence in detail with reference to the accompanying drawings.
Specific example 1:
the embodiment provides a dangerous area monitoring video compression storage method based on artificial intelligence.
The specific scenario addressed by the invention is as follows: historical video data that has been stored in the database for some time is compressed. Only the trajectories of personnel in the video and the behavior and working state of personnel near the equipment are of interest; no specific production scene is of interest. Because data such as the operating parameters of the equipment cannot be read from the monitoring video, video image information such as equipment operating data does not need to be considered. A camera is installed in the production area of a chemical enterprise with its viewing angle tilted obliquely downward to collect video data of personnel in the production area; the size of a video image collected by the camera is 2048 × 2048.
Referring to fig. 1, a flowchart of a dangerous area surveillance video compression storage method based on artificial intelligence according to an embodiment of the present invention is shown. The dangerous area monitoring video compression storage method based on artificial intelligence comprises the following steps:
stacking the monitoring video image sequences of the dangerous areas and inputting the stacked monitoring video image sequences into a compression network to obtain a characteristic diagram of monitoring video information as compression data, wherein the loss of the compression network comprises the following steps: correcting the weight mask of each image of the monitoring video image sequence by using the minimum distance between personnel in each image and equipment in the dangerous area to obtain a first correction weight mask, correcting the difference of pixel values of each image corresponding to the monitoring video image sequence and the decompressed image sequence by using the first correction weight mask, and summing the correction values to obtain a first loss; acquiring the personnel position information difference of each image corresponding to the monitoring video image sequence and the decompressed image sequence and the distance difference between any two personnel; distance difference for any two people: obtaining a second weight according to the minimum distance between two persons in the monitoring video image and equipment in the dangerous area, and correcting the distance difference between the two persons by using the second weight to obtain a correction loss; summing the personnel position information difference and the correction loss to obtain a second loss; the first loss is summed with the second loss to obtain a loss of the compressed network.
The specific implementation steps are as follows:
First, the surveillance video sequence of the dangerous area is obtained, and for each frame of the sequence the all-person attention distribution map CAM and the individual attention distribution map of each person are obtained. The acquisition steps are as follows:
(1) The historical monitoring video data stored in the database is obtained. The information contained in each frame of the video includes personnel, equipment, background and the like. Each frame of the monitoring video is converted into a gray-scale image, which saves storage space, reduces the amount of computation and facilitates the subsequent description of the invention.
(2) A frame of the monitoring video sequence is acquired, the frame is input into a DNN network, and the CAM image output by the DNN network is acquired. The invention takes the CAM image as the person attention distribution map of that frame. The person attention distribution map of each frame is a single-channel gray-scale map CAM: a position with a large CAM gray value indicates that important personnel feature information is distributed there, and such positions need more attention when the video image is compressed; a position with a small gray value, or even 0, indicates that no significant personnel feature information is distributed there, and the information at that position is not of interest when compressing the video image.
(3) Since a frame of the surveillance video sequence may contain several persons, the CAM image corresponding to one frame contains the attention distribution of all of them, while the following method requires the attention distribution map of each individual person. The attention distribution map of each person therefore needs to be separated from the all-person attention distribution map CAM. The specific method is as follows: the frame is input into a Mask-RCNN semantic segmentation network to obtain the semantic region of each person, and the attention distribution map of each person is segmented out according to that person's semantic region.
At this point, the all-person attention distribution map CAM and the individual attention distribution map of each person are obtained for each frame of the surveillance video sequence of the dangerous area.
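For illustration, a minimal sketch of this step is given below. The patent specifies a DNN that outputs a CAM plus a Mask-RCNN segmentation used to split the CAM per person; the sketch instead uses torchvision's pretrained Mask R-CNN and its per-person masks, weighted by detection confidence, as a stand-in for a trained CAM network. The function name `person_attention_maps`, the score threshold and this substitution are assumptions, not the patent's concrete implementation.

```python
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

def person_attention_maps(frame_rgb: torch.Tensor, score_thresh: float = 0.5):
    """frame_rgb: float tensor of shape (3, H, W) with values in [0, 1].
    Returns a frame-level attention map (stand-in for CAM) and one map per person."""
    model = maskrcnn_resnet50_fpn(weights="DEFAULT").eval()
    with torch.no_grad():
        out = model([frame_rgb])[0]

    h, w = frame_rgb.shape[-2:]
    cam_all = torch.zeros(h, w)      # stand-in for the all-person attention map CAM
    per_person = []                  # one attention map per detected person

    for label, score, mask in zip(out["labels"], out["scores"], out["masks"]):
        if label.item() != 1 or score.item() < score_thresh:  # COCO label 1 = person
            continue
        m = (mask[0] > 0.5).float() * score.item()  # binary mask weighted by confidence
        per_person.append(m)
        cam_all = torch.maximum(cam_all, m)

    return cam_all, per_person
```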
Secondly, training a preliminary compression network model: extracting a single-channel feature map of each image in a monitoring video image sequence of the dangerous area by using a primary compression sub-network: and inputting the monitoring video image sequence of the dangerous area into a primary compression sub-network, and outputting a single-channel characteristic diagram of each image which focuses more on the personnel information in the dangerous area under the supervision of a loss function of the primary compression sub-network. Wherein the loss of the preliminary compression sub-network comprises: and correcting the difference of the pixel values of each image corresponding to the monitoring video image sequence and the image sequence after the initial decompression by using a weight mask of the monitoring video image sequence in the dangerous area to obtain the loss of the initial compression sub-network.
Preferably, the weight mask of the monitoring video image sequence of the dangerous area in the invention uses the person attention distribution map CAM.
Specifically, K consecutive frames are acquired from the monitoring video sequence of the dangerous area and denoted I_1, I_2, …, I_K. The images I_1, …, I_K are respectively input into the preliminary compression encoder, which outputs K single-channel feature maps F_1, …, F_K under the supervision of the loss function of the preliminary compression sub-network. Preferably, K = 7 in the invention.
In particular, under the supervision of the loss function of the preliminary compression sub-network, the sub-network outputs K single-channel feature maps that pay more attention to the personnel information in the dangerous area. The loss function of the preliminary compression sub-network is:

loss_pre = Σ_{k=1}^{K} Σ_x CAM_k(x) · (I_k(x) - I'_k(x))^2

wherein x is a pixel position of the image I_k; I_k(x) represents the pixel value of the image I_k at position x; I'_k(x) represents the pixel value of the image I'_k at position x, where I'_k is the image corresponding to I_k after preliminary decompression; CAM_k represents the person attention distribution map obtained on the image I_k, and CAM_k(x) is the pixel value of CAM_k at position x. To prevent elements equal to 0, let CAM_k(x) ← CAM_k(x) + ε, where ← is the assignment symbol and ε is a small positive constant.

A smaller loss_pre indicates that the pixel values of the input original image I_k and the preliminarily decompressed image I'_k are closer at corresponding pixels, with the degree of closeness being greater at positions where the person attention is strong; therefore, after loss_pre converges, the image I'_k preliminarily decompressed from the single-channel feature map restores the personnel information as far as possible.

The purpose and benefit of introducing the person attention distribution map CAM is that, after I_k is compressed into F_k, F_k contains more and more adequate personnel features rather than too much equipment or background information, so that F_k can be made as small as possible while ensuring that, after F_k is preliminarily decompressed into I'_k, the personnel information can be clearly and accurately restored. The training method of the network is conventional, the data set does not need to be labeled, and the training process is not repeated in the invention.
At this point, a preliminary compression network model is trained.
And finally, stacking the monitoring video image sequences of the dangerous areas and inputting the stacked monitoring video image sequences into a compression network to obtain a characteristic diagram of the monitoring video information as compressed data.
Specifically, the single-channel feature maps are stacked together to form a multi-channel feature map, the multi-channel feature map is input into a first encoder, and the feature map of the monitoring video information is output under the supervision of a compression network loss function.
Specifically, the compression process of the monitoring video of the dangerous area by the compression network is as follows:
(1) K consecutive frames of the monitoring video are acquired and passed through the trained preliminary compression sub-network, which outputs the K single-channel feature maps F_1, …, F_K.
(2) The K single-channel feature maps F_1, …, F_K are stacked together to form a K-channel feature map FM.
(3) The feature map FM is input into the first encoder, which outputs a feature map F; F is a single-channel image of the same size as each F_k. F is input into the first decoder, which outputs a K-channel image whose channels are respectively denoted D_1, …, D_K. The process of obtaining F from I_1, …, I_K is called the compression process of the video sequence, the network used is called the compression network, and F is the compression result. The process of obtaining D_1, …, D_K from F is the decompression process, the network used is called the decompression network, and D_1, …, D_K is the decompression result.
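One way to realize the stacking, the first encoder and the first decoder described above is a small convolutional encoder-decoder. The description states that F has the same size as each single-channel feature map F_k and elsewhere that F is 128 × 128 for 2048 × 2048 video, so the decoder below upsamples by a factor of 16 back to the frame size; the layer counts, channel widths and kernel choices are illustrative assumptions, as the patent does not fix the architecture.

```python
import torch
import torch.nn as nn

class FirstEncoder(nn.Module):
    """K stacked single-channel feature maps (K, 128, 128) -> single-channel F (1, 128, 128)."""
    def __init__(self, k: int = 7):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(k, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )

    def forward(self, fm):          # fm: (B, K, 128, 128)
        return self.net(fm)         # F:  (B, 1, 128, 128)

class FirstDecoder(nn.Module):
    """Compressed map F (1, 128, 128) -> K decompressed frames D_1..D_K (K, 2048, 2048)."""
    def __init__(self, k: int = 7):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(1, 32, 4, stride=4), nn.ReLU(),   # 128 -> 512
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),  # 512 -> 1024
            nn.ConvTranspose2d(16, k, 2, stride=2),              # 1024 -> 2048
        )

    def forward(self, f):           # f: (B, 1, 128, 128)
        return self.net(f)          # (B, K, 2048, 2048)
```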
The loss of the compressed network includes: correcting the weight mask of each image of the monitoring video image sequence by using the minimum distance between personnel in each image and equipment in the dangerous area to obtain a first correction weight mask, correcting the difference of pixel values of each image corresponding to the monitoring video image sequence and the decompressed image sequence by using the first correction weight mask, and summing the correction values to obtain a first loss; acquiring the personnel position information difference of each image corresponding to the monitoring video image sequence and the decompressed image sequence and the distance difference between any two personnel; distance difference for any two people: obtaining a second weight according to the minimum distance between two persons in the monitoring video image and equipment in the dangerous area, and correcting the distance difference between the two persons by using the second weight to obtain a correction loss; and summing the personnel position information difference and the correction loss to obtain a second loss. The loss of the compression network is the sum of the first loss and the second loss.
In addition, in order to improve the recovery effect of the decompressed data, the loss of the compression network further includes: and correcting the difference of the pixel value of each image corresponding to the monitoring video image sequence and the decompressed image sequence by using the weight mask of the monitoring video image sequence in the dangerous area, and summing the corrected values to obtain a third loss. The loss of the compression network is the sum of the first loss, the second loss and the third loss.
The invention expects that accurate personal information is stored when a person is close to the equipment, whereas when a person is far from the equipment only the person's trajectory or position information needs to be retained and the detailed personal information can be ignored. Judging the positional relationship between personnel and equipment requires obtaining the positions of both from the input image. However, if the position features of the equipment are extracted from the input image I_k, then F_k will also contain equipment position information; moreover, extracting equipment information from I_k usually relies on the shape, size, texture and other properties of the whole device, which would likewise be stored in F_k, so that a significant portion of the information contained in the feature map F would come from the equipment. The invention does not pay attention to such equipment information and does not want the feature map F to contain too much of it; instead, F should store more personnel information, so as to improve both the degree of video compression and the quality of decompression. To prevent the feature map F from containing too much equipment information, the invention only uses the position information of the equipment as an input of the network. The specific method is as follows: the position of each device on each frame of image is acquired before video compression; since a camera has a fixed viewing angle and the equipment does not move, the position of the equipment in the image is considered fixed, and the position information of each device in the image can be annotated manually or obtained by a key point detection network.
Specifically, the loss of the compression network is:

Loss = loss_1 + loss_2 + loss_3

wherein loss_1 is referred to as the first loss, loss_2 as the second loss and loss_3 as the third loss. The compression network is trained under the supervision of the loss function Loss, and training is finished when Loss converges.
The first loss loss_1 is:

loss_1 = Σ_{k=1}^{K} Σ_n Σ_x (CAM_{k,n}(x))^{γ_{k,n}} · (I_k(x) - D_k(x))^2

wherein CAM_{k,n} represents the person attention distribution map of the person n in the image I_k, and CAM_{k,n}(x) is its pixel value at position x; (CAM_{k,n}(x))^{γ_{k,n}} is a scaling variation or gamma correction of CAM_{k,n}(x), i.e. the first correction weight mask; γ_{k,n} is a scaling factor that grows with the distance between the person n in the image I_k and the nearest device. The larger this factor is, the closer the values of CAM_{k,n}(x) approach zero, i.e. the smaller pixel values on the person attention distribution map become even smaller when the person is further away from the equipment, so that no more attention is paid to the detailed information of persons far from the equipment.
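A minimal sketch of the first loss is given below, assuming the "scaling variation or gamma correction" is an exponent on the per-person attention map that grows with the person's minimum distance to the devices, so that attention on far-away persons is suppressed. The mapping from distance to exponent (the factor `alpha`) and the squared difference are assumptions not fixed by the patent.

```python
import torch

def first_loss(frames, decoded, person_cams, person_device_dists, alpha: float = 0.01):
    """
    frames, decoded:      (K, H, W) original frames I_k and decompressed frames D_k
    person_cams:          list over frames; each entry is (N_k, H, W) per-person attention maps
    person_device_dists:  list over frames; each entry is (N_k,) min distance of each person to a device
    alpha:                assumed scale turning a pixel distance into a gamma exponent
    """
    loss = frames.new_zeros(())
    for k, (cams_k, dists_k) in enumerate(zip(person_cams, person_device_dists)):
        diff2 = (frames[k] - decoded[k]) ** 2
        for cam_n, d_n in zip(cams_k, dists_k):
            gamma = 1.0 + alpha * d_n             # larger distance -> larger exponent
            weight = cam_n.clamp(0, 1) ** gamma   # first correction weight mask
            loss = loss + (weight * diff2).sum()
    return loss
```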
The second loss loss_2 is:

loss_2 = Σ_{k=1}^{K} [ Σ_{n∈S_k} d(p_{k,n}, p'_{k,n}) + Σ_{a,b∈S_k, a≠b} max(δ_{k,a}, δ_{k,b}) · | d(p_{k,a}, p_{k,b}) - d(p'_{k,a}, p'_{k,b}) | ]

wherein S_k denotes the set of persons on the image I_k, and a, b are any two different persons in S_k; p_{k,n} and p'_{k,n} are the positions of the person n in the image I_k and in the decompressed image D_k respectively; d(·,·) denotes the distance between two positions; and δ_{k,n} denotes the minimum distance between the person n and the equipment in the image I_k. The first term, the distance between the position of the person n in the image I_k and the corresponding position in the image D_k, is expected to be as small as possible, so that the persons on the decompressed image can still be located by the key point detection network. The term | d(p_{k,a}, p_{k,b}) - d(p'_{k,a}, p'_{k,b}) | represents the difference between the distance of the persons a and b in the image I_k and their distance in the image D_k. The weight max(δ_{k,a}, δ_{k,b}) is the maximum of the two persons' distances from their respective closest devices, i.e. the second weight; multiplying by this weight introduces more attention or a stronger constraint, so that the position of a person far from the equipment keeps an accurate relative relationship with the other persons and positions are located more accurately. In any case, the smaller loss_2 is, the better the accuracy of the trajectories or positions of persons far from the equipment is guaranteed.
The third loss loss_3 is:

loss_3 = Σ_{k=1}^{K} Σ_x CAM_k(x) · (I_k(x) - D_k(x))^2

wherein x is a pixel position of the image I_k; I_k(x) represents the pixel value of the image I_k at position x; D_k(x) represents the pixel value of the decompressed image D_k at position x; CAM_k represents the person attention distribution map obtained on the image I_k, and CAM_k(x) is its pixel value at position x. To prevent elements equal to 0, let CAM_k(x) ← CAM_k(x) + ε, where ← is the assignment symbol and ε is a small positive constant. A smaller loss_3 indicates that the pixel values of the input original image I_k and the decompressed image D_k are closer at corresponding pixels, with the degree of closeness being greater at positions where the person attention is strong; therefore, after the loss converges, the decompressed image D_k restores the personnel information as far as possible.
So far, the compression network, the decompression network and the compressed and stored feature map F of the monitoring video sequence of the dangerous area are obtained. The invention carries out the above compression storage step for every K frames of images, i.e. K frames of monitoring video image data I_1, …, I_K are compressed into a feature map F, and F is stored, so that the storage capacity is greatly reduced. The video size of the invention is 2048 × 2048, and the size of F is set to 128 × 128.
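With the stated sizes (K = 7 frames of 2048 × 2048 compressed to one 128 × 128 feature map F), the nominal reduction in stored pixel values works out as follows; this back-of-the-envelope figure ignores any further entropy coding of F and is an illustration only.

```python
K, frame_side, f_side = 7, 2048, 128
stored_before = K * frame_side ** 2   # 29,360,128 pixel values per group of K frames
stored_after = f_side ** 2            #     16,384 pixel values in F
print(stored_before / stored_after)   # 1792.0, roughly a 1792:1 reduction per group
```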
Specific example 2:
the embodiment provides a dangerous area monitoring video compression storage system based on artificial intelligence.
The specific scenario addressed by the invention is as follows: historical video data that has been stored in the database for some time is compressed. Only the trajectories of personnel in the video and the behavior and working state of personnel near the equipment are of interest; no specific production scene is of interest. Because data such as the operating parameters of the equipment cannot be read from the monitoring video, video image information such as equipment operating data does not need to be considered. A camera is installed in the production area of a chemical enterprise with its viewing angle tilted obliquely downward to collect video data of personnel in the production area; the size of a video image collected by the camera is 2048 × 2048.
The system comprises a compression network module. The monitoring video image sequence of the dangerous area is stacked and input into the compression network module to obtain a characteristic diagram of monitoring video information as compression data, wherein the loss of the compression network module comprises the following steps: correcting the weight mask of each image of the monitoring video image sequence by using the minimum distance between personnel in each image and equipment in the dangerous area to obtain a first correction weight mask, correcting the difference of pixel values of each image corresponding to the monitoring video image sequence and the decompressed image sequence by using the first correction weight mask, and summing the correction values to obtain a first loss;
acquiring the personnel position information difference of each image corresponding to the monitoring video image sequence and the decompressed image sequence and the distance difference between any two personnel; distance difference for any two people: obtaining a second weight according to the minimum distance between two persons in the monitoring video image and equipment in the dangerous area, and correcting the distance difference between the two persons by using the second weight to obtain a correction loss; summing the personnel position information difference and the correction loss to obtain a second loss; and correcting the difference of the pixel value of each image corresponding to the monitoring video image sequence and the decompressed image sequence by using the weight mask of the monitoring video image sequence in the dangerous area, and summing the corrected values to obtain a third loss.
And summing the first loss, the second loss and the third loss to obtain the loss of the compression network module.
It should be noted that: the precedence order of the above embodiments of the present invention is only for description, and does not represent the merits of the embodiments. And specific embodiments thereof have been described above. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments.

Claims (10)

1. A dangerous area monitoring video compression storage method based on artificial intelligence is characterized by comprising the following steps:
stacking the monitoring video image sequences of the dangerous areas and inputting the stacked monitoring video image sequences into a compression network to obtain a characteristic diagram of monitoring video information as compression data, wherein the loss of the compression network comprises the following steps: correcting the weight mask of each image of the monitoring video image sequence by using the minimum distance between personnel in each image and equipment in the dangerous area to obtain a first correction weight mask, correcting the difference of pixel values of each image corresponding to the monitoring video image sequence and the decompressed image sequence by using the first correction weight mask, and summing the correction values to obtain a first loss;
acquiring the personnel position information difference of each image corresponding to the monitoring video image sequence and the decompressed image sequence and the distance difference between any two personnel positions; for the distance difference between any two person positions: obtaining a second weight according to the minimum distance between two persons in the monitoring video image and equipment in the dangerous area, and correcting the distance difference between the positions of the two persons by using the second weight to obtain a correction loss; summing the personnel position information difference and the correction loss to obtain a second loss;
the first loss is summed with the second loss to obtain a loss of the compressed network.
2. The method as claimed in claim 1, wherein the step of compressing and storing the dangerous area surveillance video based on artificial intelligence further comprises: and correcting the difference of the pixel value of each image corresponding to the monitoring video image sequence and the decompressed image sequence by using the uncorrected weight mask of the monitoring video image sequence of the dangerous area, and summing the corrected values to obtain a third loss.
3. The method as claimed in claim 1, wherein the loss of the compressed network is the sum of the first loss, the second loss and the third loss.
4. The method as claimed in claim 1, wherein the compression network includes a primary compression sub-network, the sequence of monitoring video images of the dangerous area is input into the primary compression sub-network, and a single-channel feature map of each image with greater concern about personnel information in the dangerous area is output under the supervision of a loss function of the primary compression sub-network.
5. The method as claimed in claim 4, wherein the loss of the primary compression sub-network comprises:
and correcting the pixel value difference of each image corresponding to the monitoring video image sequence and the image sequence after the initial decompression by using the uncorrected weight mask of the monitoring video image sequence in the dangerous area to obtain the loss of the initial compression sub-network.
6. The method as claimed in claim 4, wherein the compressed network further comprises a first encoder:
and stacking the single-channel feature maps together to form a multi-channel feature map, inputting the multi-channel feature map into a first encoder, and outputting the feature map of the monitoring video information.
7. A dangerous area monitoring video compression storage system based on artificial intelligence is characterized by comprising:
the compression network module is used for analyzing the input of the monitoring video image sequence in the dangerous area after being stacked to obtain a characteristic diagram of monitoring video information as compressed data;
the loss of the compressed network module comprises: correcting the weight mask of each image of the monitoring video image sequence by using the minimum distance between personnel in each image and equipment in the dangerous area to obtain a first correction weight mask, correcting the difference of pixel values of each image corresponding to the monitoring video image sequence and the decompressed image sequence by using the first correction weight mask, and summing the correction values to obtain a first loss;
acquiring the personnel position information difference of each image corresponding to the monitoring video image sequence and the decompressed image sequence and the distance difference between any two personnel positions; for the distance difference between any two person positions: obtaining a second weight according to the minimum distance between two persons in the monitoring video image and equipment in the dangerous area, and correcting the distance difference between the positions of the two persons by using the second weight to obtain a correction loss; summing the personnel position information difference and the correction loss to obtain a second loss;
the first loss is summed with the second loss to obtain a loss of the compression network module.
8. The system of claim 7, wherein the loss of the compressed network module further comprises: and correcting the difference of the pixel value of each image corresponding to the monitoring video image sequence and the decompressed image sequence by using the uncorrected weight mask of the monitoring video image sequence of the dangerous area, and summing the corrected values to obtain a third loss.
9. The system according to claim 7, wherein the loss of the compressed network module is a sum of the first loss, the second loss and the third loss.
10. The artificial intelligence based dangerous area monitoring video compression storage system according to claim 7, wherein the compression network module comprises a primary compression sub-network module, the sequence of monitoring video images of the dangerous area is input into the primary compression sub-network module, and a single-channel feature map of each image with greater concern about personnel information in the dangerous area is output under the supervision of a loss function of the primary compression sub-network module.
CN202111058685.1A 2021-09-10 2021-09-10 Dangerous area monitoring video compression storage method and system based on artificial intelligence Active CN113507605B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111058685.1A CN113507605B (en) 2021-09-10 2021-09-10 Dangerous area monitoring video compression storage method and system based on artificial intelligence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111058685.1A CN113507605B (en) 2021-09-10 2021-09-10 Dangerous area monitoring video compression storage method and system based on artificial intelligence

Publications (2)

Publication Number Publication Date
CN113507605A CN113507605A (en) 2021-10-15
CN113507605B true CN113507605B (en) 2021-12-07

Family

ID=78016563

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111058685.1A Active CN113507605B (en) 2021-09-10 2021-09-10 Dangerous area monitoring video compression storage method and system based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN113507605B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115866251B (en) * 2023-02-22 2023-06-02 浙江鼎立实业有限公司 Image information rapid transmission method based on semantic segmentation
CN116260976B (en) * 2023-05-15 2023-07-18 深圳比特耐特信息技术股份有限公司 Video data processing application system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102833549A (en) * 2012-08-24 2012-12-19 西安空间无线电技术研究所 JPEG (joint photographic experts group)-LS (layer style) image compression control method based on image source features
CN110163370A (en) * 2019-05-24 2019-08-23 上海肇观电子科技有限公司 Compression method, chip, electronic equipment and the medium of deep neural network

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11593632B2 (en) * 2016-12-15 2023-02-28 WaveOne Inc. Deep learning based on image encoding and decoding
CN113163203B (en) * 2021-04-29 2022-09-13 上海大学 Deep learning feature compression and decompression method, system and terminal

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102833549A (en) * 2012-08-24 2012-12-19 西安空间无线电技术研究所 JPEG (joint photographic experts group)-LS (layer style) image compression control method based on image source features
CN110163370A (en) * 2019-05-24 2019-08-23 上海肇观电子科技有限公司 Compression method, chip, electronic equipment and the medium of deep neural network

Also Published As

Publication number Publication date
CN113507605A (en) 2021-10-15

Similar Documents

Publication Publication Date Title
CN113507605B (en) Dangerous area monitoring video compression storage method and system based on artificial intelligence
CN112379231B (en) Equipment detection method and device based on multispectral image
CN110109015B (en) Asynchronous motor fault monitoring and diagnosing method based on deep learning
CN112381784A (en) Equipment detecting system based on multispectral image
CN112396635B (en) Multi-target detection method based on multiple devices in complex environment
CN111242128A (en) Target detection method, target detection device, computer-readable storage medium and computer equipment
CN117615088B (en) Efficient video data storage method for safety monitoring
CN113888514A (en) Method and device for detecting defects of ground wire, edge computing equipment and storage medium
CN115311740A (en) Method and system for recognizing abnormal human body behaviors in power grid infrastructure site
CN114463788A (en) Fall detection method, system, computer equipment and storage medium
CN115410132A (en) Elevator maintenance supervision data identification method and system
CN116977292A (en) Method for detecting cold joint of solar cell
JP2009545223A (en) Event detection method and video surveillance system using the method
CN113963162A (en) Helmet wearing identification method and device, computer equipment and storage medium
CN116542956B (en) Automatic detection method and system for fabric components and readable storage medium
CN115293303B (en) High-voltage transmission line network monitoring method, system, equipment and medium
JP2006202209A (en) Image compression method and image compression device
CN116543333A (en) Target recognition method, training method, device, equipment and medium of power system
CN114372556A (en) Driving danger scene identification method based on lightweight multi-modal neural network
CN111402223B (en) Transformer substation defect problem detection method using transformer substation video image
CN115690907A (en) Lightweight violent behavior identification method under monitoring scene
Vinogradov et al. Digital processing of video information of objects by the method of contour-structured lines
CN116958707B (en) Image classification method, device and related medium based on spherical machine monitoring equipment
CN116883414B (en) Multi-system data selection method and system suitable for operation and maintenance of power transmission line
CN115278255B (en) Data storage system for safety management of strength instrument

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220309

Address after: 226100 Fuxing Road, Yuet Lai Town, Haimen, Nantong, Jiangsu, 20

Patentee after: NANTONGYOUYUAN ART DESIGN Co.,Ltd.

Address before: 226100 No. 700, Shengchang Middle Road, Yuelai Town, Haimen District, Nantong City, Jiangsu Province

Patentee before: Nantong Haiteng Copper Co.,Ltd.

TR01 Transfer of patent right