CN115601678A - Remote video conference safety guarantee method - Google Patents


Info

Publication number
CN115601678A
CN115601678A (application CN202211306722.0A)
Authority
CN
China
Prior art keywords
conference
video
probe
frame
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211306722.0A
Other languages
Chinese (zh)
Inventor
卢晓彦
梅灿
于鹏
黄永振
徐宁
李喆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jinan Tongyu Network Security Technology Co ltd
Yantai Tongyu Network Security Technology Co ltd
Shandong Tongyu Network Security Technology Co ltd
Original Assignee
Jinan Tongyu Network Security Technology Co ltd
Yantai Tongyu Network Security Technology Co ltd
Shandong Tongyu Network Security Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jinan Tongyu Network Security Technology Co ltd, Yantai Tongyu Network Security Technology Co ltd, Shandong Tongyu Network Security Technology Co ltd filed Critical Jinan Tongyu Network Security Technology Co ltd
Priority to CN202211306722.0A priority Critical patent/CN115601678A/en
Publication of CN115601678A publication Critical patent/CN115601678A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Abstract

The invention relates to a remote video conference safety guarantee method, which solves the technical problem that existing remote video conferences lack control over abnormal behavior. The method comprises the following steps: establishing a conference, adding a probe, and enabling abnormal behavior detection in the system; the probe collects the original video and audio and uploads them to a conference video analysis server; the conference video analysis server analyzes the video and feeds the result back to the probe; the probe uploads the abnormal behavior record to the system; and, for detected abnormal behavior, the video conference security probe applies masking control to the in-conference video, so that any covert photograph is of little use and sensitive content is not leaked. The invention can be widely applied to the safety guarantee of remote video conferences.

Description

Remote video conference safety guarantee method
Technical Field
The invention relates to the field of video conferencing, and in particular to a remote video conference safety guarantee method.
Background
With the continuous improvement of informatization, working modes such as video conferencing, remote consultation, and remote diagnosis have greatly improved office efficiency, and video conferencing is used more and more widely. However, when sensitive meetings are held, various security risks arise, involving sensitive words, covert photography of the conference screen, leakage of the conference video, and the like; these pose great challenges to the use of remote video and also limit it. At present, classified video conferences require a private network; a VPN can establish a private network over the public network for encrypted communication, and this approach is widely used in the networks of enterprises that handle classified material. A VPN gateway enables remote access through encryption of data packets and translation of the packet destination address, but it cannot control abnormal in-conference behavior such as covert photography. It is therefore necessary to manage abnormal behaviors such as covert photography in remote video conferences.
Disclosure of Invention
To solve the technical problem that abnormal behavior in existing remote video conferences goes uncontrolled, the invention provides a remote video conference safety guarantee method that effectively controls covert-photography (candid shooting) behavior.
The invention provides a remote video conference safety guarantee method, which comprises the following steps:
step 1, establishing a conference, adding a probe and starting abnormal behavior detection in a system;
step 2, the probe acquires an original video and uploads the original video to a conference video analysis server;
step 3, the conference video analysis server detects the video and feeds the result back to the probe;
step 4, uploading the abnormal behavior record to a system by the probe;
step 5, for detected abnormal behavior, the video conference security probe applies masking control to the video content in the conference, so that any covert photograph is of little use and sensitive content is not leaked.
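The patent does not specify how the masking in step 5 is implemented. As a minimal, hypothetical sketch, a rectangular region of a grayscale frame can be degraded with a block mosaic so that a covert photograph of the masked area carries little information; the function and parameter names here are illustrative, not part of the disclosure.

```python
def mask_region(frame, x0, y0, x1, y1, block=4):
    """Apply a mosaic mask to frame rows y0..y1, columns x0..x1.

    frame is a grayscale image as a list of rows of int pixel values.
    Each block x block tile inside the region is replaced by its mean,
    which destroys the fine detail a covert shot would capture.
    """
    out = [row[:] for row in frame]  # leave the input frame untouched
    for by in range(y0, y1, block):
        for bx in range(x0, x1, block):
            tile = [out[y][x]
                    for y in range(by, min(by + block, y1))
                    for x in range(bx, min(bx + block, x1))]
            avg = sum(tile) // len(tile)
            for y in range(by, min(by + block, y1)):
                for x in range(bx, min(bx + block, x1)):
                    out[y][x] = avg
    return out
```

In a real deployment this would run on the decoded video frames before re-encoding, with the masked rectangle taken from the detector's bounding box.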
Preferably, the specific steps of step 1 are:
step 11, establishing a conference and setting a conference watermark;
step 12, adding a probe in device management and performing device entry, the device number being consistent with the communication number of the device.
Preferably, the specific steps of step 3 are:
step 31, producing the conference behavior data set pictures and selecting 60%-70% of them as the training data set, then labeling the specific targets in the pictures; the data set pictures contain person targets exhibiting covert-photography behavior;
step 32, performing model training with the conference behavior data set, using an improved YOLOv4 algorithm to train the model parameters and obtain the optimal training parameters of the model; the improved YOLOv4 algorithm replaces the backbone network with the lighter-weight MobileNet and uses the channel-and-spatial attention mechanism CBAM to enhance feature extraction;
step 33, reading the teleconference video, loading the detection model and trained parameters, and detecting the images of the video sequence frame by frame; if abnormal covert-photography behavior is detected, saving the detection result and returning it to the probe interface for subsequent processing;
step 34, linking the conference video analysis server and the probe through a shared folder: when a conference is initiated, the probe stores photo files in the directory specified by the NFS client, with the file naming format IP.serialNo.jpg; the analysis component continuously scans the directory shared by the NFS server and runs abnormal behavior analysis on any file it finds; after analysis, if the picture is abnormal, it is written to the abnormal-picture folder annotated with its label and probability value and the original picture is deleted, while if it is normal, the original picture is deleted directly; the device identifier of the probe is obtained from the folder in which the picture is located, and the detection result is fed back to the probe;
step 35, the conference video analysis server obtains the reporting probe's IP address by parsing and splitting the file name, and feeds the analysis result back to the probe.
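Assuming the naming convention IP.serialNo.jpg described in step 34, the server-side filename split of step 35 can be sketched as follows; because the IP address itself contains dots, the split must come from the right. The function name is hypothetical.

```python
import os

def parse_probe_filename(filename):
    """Split a probe upload named '<IP>.<serialNo>.jpg' into (IP, serialNo).

    The extension is stripped first; the rightmost remaining dot then
    separates the serial number from the dotted IP address.
    """
    stem, ext = os.path.splitext(os.path.basename(filename))
    if ext.lower() != ".jpg":
        raise ValueError(f"unexpected extension: {ext}")
    ip, _, serial = stem.rpartition(".")
    if not ip or not serial:
        raise ValueError(f"malformed probe filename: {filename}")
    return ip, serial
```

The analysis server would call this on each file found in the NFS-shared directory to decide which probe to notify.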
Preferably, the specific steps of step 31 are:
step 311, collecting pictures of covert-photography behavior by persons under different camera angles and light intensities; the pictures must show a person target raising a mobile phone and aiming it at the conference video;
step 312, labeling the pictures with the LabelImg image annotation tool, the labeling comprising the target category and the target position of each target in the image; the labeled pictures form the conference behavior data set. The labeled pictures are scaled to a size of 960 × 540 and model training is performed, yielding the model training weights.
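When the labeled pictures are scaled to 960 × 540 as in step 312, the annotated bounding-box coordinates must be rescaled with them. A minimal sketch, where the function name and the (x_min, y_min, x_max, y_max) box format are assumptions:

```python
def scale_annotation(box, src_w, src_h, dst_w=960, dst_h=540):
    """Rescale a labeled box (x_min, y_min, x_max, y_max) from an image of
    size src_w x src_h to the training size dst_w x dst_h."""
    sx, sy = dst_w / src_w, dst_h / src_h
    x0, y0, x1, y1 = box
    return (round(x0 * sx), round(y0 * sy), round(x1 * sx), round(y1 * sy))
```

For a 1920 × 1080 source picture both scale factors are 0.5, so every labeled coordinate is simply halved.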
Preferably, the specific steps of step 32 are:
step 321, training the YOLOv4 algorithm model on the conference behavior data set obtained in step 31, iterating over the input sample set tens of thousands of times and adjusting the pre-trained model parameters with the designed loss function, thereby obtaining the optimal model weight parameter file;
step 322, loading the video of the conference scene as input into the detection model to obtain the detection result for abnormal behavior targets in each frame of the video sequence;
the video sequence frames Frame_t are read in a loop, where t is the index of the video frame, ranging from 1 to n;
I_t denotes the pixel information of the image of the t-th frame, including the picture's Width, Height, and Size and the value of every pixel, providing the data basis for extracting target features;
DB_t denotes the target detection result for the t-th frame image, DB_t = {BB_i | i = 1, 2, …, n}, where BB_i is the information of the i-th detected target in the t-th frame;
PB_i denotes the predicted position of BB_i in Frame_{t+1};
BB_i contains the following target information: the midpoint coordinates Cent = (x, y) of the target detection bounding box; the Width, Height, and Size of the bounding box; the pixel information Roi of the target inside the bounding box in the current I_t; the confidence P; the Class; the frame index t; and the predicted position PB in Frame_{t+1}, where PB has the same attributes as BB_i.
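The per-frame records DB_t and their entries BB_i described above can be sketched as plain data structures. Field names follow the attributes the text lists (Cent, Width, Height, Roi, P, Class, t, PB); the class names, and the reading of Size as the box area, are assumptions:

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class Detection:
    """One detected target BB_i in frame t."""
    cent: Tuple[float, float]   # midpoint Cent = (x, y) of the bounding box
    width: float                # Width of the bounding box
    height: float               # Height of the bounding box
    roi: list                   # pixel information Roi inside the box in I_t
    confidence: float           # confidence P
    cls: str                    # Class, e.g. "covert_photography"
    frame: int                  # frame index t
    predicted: Optional["Detection"] = None  # PB: position in Frame_{t+1}

    @property
    def size(self) -> float:
        """Size of the bounding box, taken here as its area."""
        return self.width * self.height

@dataclass
class FrameResult:
    """DB_t: all detections BB_i of frame t."""
    frame: int
    detections: List[Detection] = field(default_factory=list)
```

Each pass of the frame loop would append one FrameResult, and the tracker would fill in `predicted` for the next frame.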
The invention has the beneficial effects that:
the invention can detect abnormal behaviors such as candid photograph of the meeting place in real time only according to the video image and the audio of the meeting place under the environment without other sensors. And reading the video of the conference site through a probe interface, loading the trained model hyper-parameters and managing the system, and immediately cutting off the real-time input of the video if abnormal candid shooting exists, thereby playing a role in keeping the conference site secret.
Drawings
FIG. 1 is an exemplary image of a teleconference of the present invention;
FIG. 2 is an image of a model training process of the present invention;
FIG. 3 is an image of the test results of the present invention;
FIG. 4 is a schematic diagram of the improved YOLOv4 model.
Detailed Description
The present invention is further described below in conjunction with the drawings and examples to enable those skilled in the art to practice the present invention.
Example 1: as shown in fig. 1-3, the steps of the present invention are:
(1) Maintain basic conference data such as the conference name, watermark, and conference start and end times, and maintain the correspondence between the conference and the probe devices.
(2) This embodiment uses the conference video of a particular conference room. The video contains the specific target behavior to be detected, namely participants covertly photographing the presenter with camera devices such as mobile phones, and is loaded into the model as input for detection. 5000 pictures of covert-photography behavior by persons, under different camera angles and light intensities, were collected.
(3) Select 60%-70% of the sample images, about 3000, as training samples; perform model training and save the hyper-parameters after 100 rounds of iterative training.
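The 60%-70% split in step (3) can be sketched as a seeded shuffle-and-cut; with the 5000 collected pictures and a 0.6 fraction this yields the roughly 3000 training samples mentioned above. The function name and seed are assumptions:

```python
import random

def split_dataset(paths, train_frac=0.6, seed=0):
    """Shuffle image paths reproducibly and cut them into a training set
    (train_frac of the data) and a held-out test set (the remainder)."""
    rng = random.Random(seed)
    shuffled = paths[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]
```

Fixing the seed keeps the split stable across training runs, so the reported test precision always refers to the same held-out pictures.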
(4) Test on the remaining 30%-40% of the picture data: load the trained hyper-parameters and run the detection model of this scheme to obtain the detection result shown in FIG. 3, which includes the covert-photography behavior and its corresponding detection probability value. Using formula (1), the detection precision of the model on this data set is calculated to reach 95%.
Precision = TP / (TP + FP)    (1)
where TP is the number of images the classifier detects as positive samples that actually are positive samples, and FP is the number of images the classifier detects as positive samples that actually are not.
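Formula (1), with the TP and FP definitions above, is the standard precision metric; a minimal sketch (the zero-division guard is an assumption, for the case of no positive detections at all):

```python
def precision(tp, fp):
    """Formula (1): fraction of detections flagged as covert photography
    that really are covert photography."""
    if tp + fp == 0:
        return 0.0  # no positive detections: define precision as 0
    return tp / (tp + fp)
```

The 95% figure in the embodiment corresponds, for example, to 95 true positives against 5 false positives on the test pictures.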
(5) Analyze the video conference content by loading the trained model hyper-parameters.
(6) For abnormal covert-photography behavior, immediately cut off the real-time video feed and apply masking control to the sensitive content.
The method trains the model on the conference behavior data set, using the improved YOLOv4 algorithm to train the model parameters and obtain the optimal training parameters. The improvement strategy is to replace the backbone network with the lighter-weight MobileNet and to use the channel-and-spatial attention mechanism CBAM to enhance feature extraction, as shown in FIG. 4.
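CBAM is defined in the literature rather than in this patent; purely as an illustration of the channel-attention half it uses, here is a dependency-free sketch. Real implementations use a deep-learning framework, the spatial-attention half is omitted, and all names and weight shapes here are assumptions:

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def channel_attention(fmap, w1, w2):
    """CBAM-style channel attention over fmap[C][H][W] (nested lists).

    Per-channel average- and max-pooled descriptors pass through a shared
    two-layer MLP (w1: hidden x C with ReLU, w2: C x hidden); the sigmoid
    of the summed outputs gates each channel.
    """
    C = len(fmap)
    avg = [sum(v for row in ch for v in row) / (len(ch) * len(ch[0]))
           for ch in fmap]
    mx = [max(v for row in ch for v in row) for ch in fmap]

    def mlp(desc):
        hidden = [max(0.0, sum(w1[h][c] * desc[c] for c in range(C)))
                  for h in range(len(w1))]
        return [sum(w2[c][h] * hidden[h] for h in range(len(hidden)))
                for c in range(C)]

    gate = [sigmoid(a + m) for a, m in zip(mlp(avg), mlp(mx))]
    # scale every pixel of channel c by its attention weight gate[c]
    return [[[gate[c] * v for v in row] for row in fmap[c]] for c in range(C)]
```

The gate values lie in (0, 1), so informative channels are preserved while uninformative ones are attenuated before the detection head.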
By implementing this technical scheme, control of abnormal behavior is added to the remote video conference through a remote video conference safety guarantee system, which greatly reduces the risk of sensitive information leaking from the conference; after a leakage event, the invention also makes it possible to trace a leaked video back to the outlet through which it escaped.
The above description is only for the purpose of illustrating preferred embodiments of the present invention and is not to be construed as limiting the present invention, and it is apparent to those skilled in the art that various modifications and variations can be made in the present invention. All modifications, equivalents, improvements and the like which come within the scope of the invention as defined by the claims should be understood as falling within the scope of the invention.

Claims (5)

1. A remote video conference safety guarantee method is characterized by comprising the following steps:
step 1, establishing a conference, adding a probe and starting abnormal behavior detection in a system;
step 2, the probe acquires an original video and uploads the original video to a conference video analysis server;
step 3, the conference video analysis server detects the video and feeds back the result to the probe;
step 4, uploading the abnormal behavior record to a system by the probe;
step 5, for detected abnormal behavior, the video conference security probe applies masking control to the video content in the conference, so that any covert photograph is of little use and sensitive content is not leaked.
2. The remote video conference security assurance method according to claim 1, wherein the specific steps of the step 1 are as follows:
step 11, establishing a conference and setting a conference watermark;
step 12, adding a probe in device management and performing device entry, the device number being consistent with the communication number of the device.
3. The remote video conference security assurance method according to claim 1, wherein the specific steps of the step 3 are as follows:
step 31, producing the conference behavior data set pictures and selecting 60%-70% of them as the training data set, then labeling the specific targets in the pictures; the data set pictures contain person targets exhibiting covert-photography behavior;
step 32, performing model training with the conference behavior data set, using an improved YOLOv4 algorithm to train the model parameters and obtain the optimal training parameters of the model; the improved YOLOv4 algorithm replaces the backbone network with the lighter-weight MobileNet and uses the channel-and-spatial attention mechanism CBAM to enhance feature extraction;
step 33, reading the teleconference video, loading the detection model and trained parameters, and detecting the images of the video sequence frame by frame; if abnormal covert-photography behavior is detected, saving the detection result and returning it to the probe interface for subsequent processing;
step 34, linking the conference video analysis server and the probe through a shared folder: when a conference is initiated, the probe stores photo files in the directory specified by the NFS client, with the file naming format IP.serialNo.jpg; the analysis component continuously scans the directory shared by the NFS server and runs abnormal behavior analysis on any file it finds; after analysis, if the picture is abnormal, it is written to the abnormal-picture folder annotated with its label and probability value and the original picture is deleted, while if it is normal, the original picture is deleted directly; the device identifier of the probe is obtained from the folder in which the picture is located, and the detection result is fed back to the probe;
step 35, the conference video analysis server obtains the reporting probe's IP address by parsing and splitting the file name, and feeds the analysis result back to the probe.
4. The remote video conference security assurance method according to claim 3, wherein the specific steps of the step 31 are as follows:
step 311, collecting pictures of covert-photography behavior by persons under different camera angles and light intensities; the pictures must show a person target raising a mobile phone and aiming it at the conference video;
step 312, labeling the pictures with the LabelImg image annotation tool, the labeling comprising the target category and the target position of each target in the image; the labeled pictures form the conference behavior data set. The labeled pictures are scaled to a size of 960 × 540 and model training is performed, yielding the model training weights.
5. The remote video conference security assurance method according to claim 3, wherein the specific steps of the step 32 are as follows:
step 321, training the YOLOv4 algorithm model on the conference behavior data set obtained in step 31, iterating over the input sample set tens of thousands of times and adjusting the pre-trained model parameters with the designed loss function, thereby obtaining the optimal model weight parameter file;
step 322, loading the video of the conference scene as input into the detection model to obtain the detection result for abnormal behavior targets in each frame of the video sequence;
the video sequence frames Frame_t are read in a loop, where t is the index of the video frame, ranging from 1 to n;
I_t denotes the pixel information of the image of the t-th frame, including the picture's Width, Height, and Size and the value of every pixel, providing the data basis for extracting target features;
DB_t denotes the target detection result for the t-th frame image, DB_t = {BB_i | i = 1, 2, …, n}, where BB_i is the information of the i-th detected target in the t-th frame;
PB_i denotes the predicted position of BB_i in Frame_{t+1};
BB_i contains the following target information: the midpoint coordinates Cent = (x, y) of the target detection bounding box; the Width, Height, and Size of the bounding box; the pixel information Roi of the target inside the bounding box in the current I_t; the confidence P; the Class; the frame index t; and the predicted position PB in Frame_{t+1}, where PB has the same attributes as BB_i.
CN202211306722.0A 2022-10-25 2022-10-25 Remote video conference safety guarantee method Pending CN115601678A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211306722.0A CN115601678A (en) 2022-10-25 2022-10-25 Remote video conference safety guarantee method


Publications (1)

Publication Number Publication Date
CN115601678A true CN115601678A (en) 2023-01-13

Family

ID=84848925

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211306722.0A Pending CN115601678A (en) 2022-10-25 2022-10-25 Remote video conference safety guarantee method

Country Status (1)

Country Link
CN (1) CN115601678A (en)

Similar Documents

Publication Publication Date Title
JP5612310B2 (en) User interface for face recognition
EP3063731B1 (en) Image cache for replacing portions of images
KR101810578B1 (en) Automatic media sharing via shutter click
WO2021029648A1 (en) Image capturing apparatus and auxiliary photographing method therefor
US20150356345A1 (en) Systems and methods for detecting, identifying and tracking objects and events over time
US8406571B2 (en) Automatic super-resolution transformation for images
US20120086792A1 (en) Image identification and sharing on mobile devices
CN110543811B (en) Deep learning-based non-cooperative examination personnel management method and system
US20210258584A1 (en) Static video recognition
MX2011009714A (en) Method and apparatus for video authentication of user.
CN110765134A (en) File establishing method, equipment and storage medium
CN112601022B (en) On-site monitoring system and method based on network camera
US20130343618A1 (en) Searching for Events by Attendants
CN112004046A (en) Image processing method and device based on video conference
CN114128255A (en) Video conferencing using hybrid edge/cloud reasoning in conjunction with machine learning systems
CN105072478A (en) Life recording system and method based on wearable equipment
US8300256B2 (en) Methods, systems, and computer program products for associating an image with a communication characteristic
CN115601678A (en) Remote video conference safety guarantee method
US20230231973A1 (en) Streaming data processing for hybrid online meetings
US20060257003A1 (en) Method for the automatic identification of entities in a digital image
CN114143429B (en) Image shooting method, device, electronic equipment and computer readable storage medium
CN113329137B (en) Picture transmission method, device, computer equipment and computer readable storage medium
CN112199547A (en) Image processing method and device, storage medium and electronic equipment
KR102460292B1 (en) COLLECTIVE BUILDING VoC(Voice of Customer) PROCESSING SYSTEM
CN111858992B (en) Hydraulic engineering photo management method and system based on GPS and tag information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination