CN110719444A

CN110719444A - Multi-sensor fusion omnidirectional monitoring and intelligent camera method and system

Info

Publication number: CN110719444A
Application number: CN201911078804.2A
Authority: CN
Inventors: 刘通; 程江华; 杨明胜; 罗笑冰; 杜湘瑜; 张亮; 王洋
Original assignee: National University of Defense Technology
Current assignee: National University of Defense Technology
Priority date: 2019-11-07
Filing date: 2019-11-07
Publication date: 2020-01-21

Abstract

The invention relates to a multi-sensor fusion omnidirectional monitoring and intelligent camera method and system. The system includes a 360° fixed camera module, a 360° scanning camera module, an intelligent processing and control module and a comprehensive information display terminal. It integrates the perception information of four fixed-focus cameras, a zoom camera and a radar sensor to realize the all-round monitoring of "seeing the whole", "seeing clearly" and "understanding" and the monitoring target of intelligent camera. It can be widely used in important places such as prisons, gun depots, oil depots, etc., to achieve all-round monitoring and intelligent camera in sensitive areas, while supporting the capture and early warning of objects of interest, and improving the intelligence level of security monitoring.

Description

Multi-sensor fusion omnidirectional monitoring and intelligent camera method and system

技术领域technical field

本发明涉及多传感器融合的全方位监控与智能摄像方法与系统。The invention relates to a multi-sensor fusion omnidirectional monitoring and intelligent camera method and system.

背景技术Background technique

随着科技的进步，视频监控被广泛应用到各个领域，例如银行的安监，交通违规的记录，公安破案侦察的天眼系统，考场上的监控，电力系统的检测等。可以说，视频监控融入到我们日常生活的方方面面，因此促成了视频监控当前所拥有的巨大地市场空间。With the advancement of science and technology, video surveillance has been widely used in various fields, such as bank safety supervision, traffic violation records, the sky eye system for public security investigation and investigation, monitoring in the examination room, and power system detection. It can be said that video surveillance is integrated into all aspects of our daily life, thus contributing to the huge market space currently owned by video surveillance.

目前市面上销售的监控设备大致分为两类。一类监控设备是由定焦摄像头组成的监控单元，该类设备结构简单成本较低但仅仅适合小范围区域监控，并且由于焦距固定无法对画面进行一个推近与拉远，因此该类型监控设备无法获得细节信息，例如该设备能监控到有人进入监控区域，但无法对其实现一个清晰的放大拉近处理，以获得目标的细节信息。第二类监控设备是由可变焦摄像头组成的监控单元，该类设备成本较高可变焦距范围较广，市场应用最大可达50倍光学变焦，但该类监控设备变焦均需手动操作控制，如在上述所提情况下需要人手动调整焦距对目标进行锁定与画面拉近即无法实现智能的全自动监控，耗费人力资源。The monitoring equipment currently on the market can be roughly divided into two categories. A type of monitoring equipment is a monitoring unit composed of fixed-focus cameras. This type of equipment has a simple structure and low cost, but is only suitable for monitoring a small area, and because the focal length is fixed, the screen cannot be zoomed in and out. Therefore, this type of monitoring equipment Detailed information cannot be obtained. For example, the device can monitor that someone enters the monitoring area, but it cannot achieve a clear zoom-in and zoom-in process to obtain detailed information of the target. The second type of monitoring equipment is a monitoring unit composed of a variable-focus camera. This type of equipment has a high cost and a wide range of variable focal lengths. The market application can reach a maximum of 50 times optical zoom, but the zoom of this type of monitoring equipment requires manual operation and control. For example, in the above-mentioned situation, it is necessary to manually adjust the focal length to lock the target and zoom in, which means that intelligent automatic monitoring cannot be realized, which consumes human resources.

在安防监控领域，传统的定焦摄像机存在无法获得距离较远目标的细节信息、监控范围有限等问题，可变焦摄像机需要人工控制方位和焦距，或者按照一定规则自动巡航，不能对目标进行主动摄像，也无法实现全方位监控。In the field of security monitoring, traditional fixed-focus cameras have problems such as being unable to obtain detailed information of distant targets and limited monitoring range. Variable-focus cameras need to manually control the orientation and focal length, or automatically cruise according to certain rules, and cannot actively record the target. , and can not achieve comprehensive monitoring.

发明内容SUMMARY OF THE INVENTION

针对上述问题，本发明特提出一种多传感器融合的全方位监控与智能摄像方法与系统。In view of the above problems, the present invention proposes a multi-sensor fusion omnidirectional monitoring and intelligent camera method and system.

本发明提出的多传感器融合的全方位监控与智能摄像系统包括360°固定摄像模组、360°扫描摄像模组、智能处理及控制模组和综合信息显示终端；The multi-sensor fusion omnidirectional monitoring and intelligent camera system proposed by the present invention includes a 360° fixed camera module, a 360° scanning camera module, an intelligent processing and control module, and a comprehensive information display terminal;

所述360°固定摄像模组包括4个定焦摄像机，实现全方位监控显示，使得监控场景看得更全面；The 360° fixed camera module includes 4 fixed-focus cameras to realize all-round monitoring and display, so that the monitoring scene can be seen more comprehensively;

所述360°扫描摄像模组包括变焦摄像机、雷达、云台，由云台实现360°扫描，由雷达感知目标距离，自动控制变焦摄像机调整焦距，实现不同距离目标的光学放大，使得监控目标看得更清楚；The 360° scanning camera module includes a zoom camera, a radar, and a pan/tilt. The pan/tilt realizes 360° scanning. The radar senses the target distance, automatically controls the zoom camera to adjust the focal length, and realizes optical amplification of targets at different distances, so that the monitoring target can see the target. more clearly;

所述智能处理及控制模组部署目标检测与识别算法，智能识别监控场景中的感兴趣目标，使得监控场景看得更易懂；The intelligent processing and control module deploys a target detection and identification algorithm to intelligently identify the target of interest in the monitoring scene, so that the monitoring scene is easier to understand;

智能处理及控制模组通过PWM脉冲频率为50Hz占空比可调的矩形波发生器实现云台控制，不同占空比对应云台标定零度角旋转固定角度，矩形波占空比为5%-95%，对应的云台旋转角度0°-360°，控制云台旋转；The intelligent processing and control module realizes the gimbal control through a rectangular wave generator with a PWM pulse frequency of 50Hz and an adjustable duty cycle. Different duty cycles correspond to the zero-degree rotation of the gimbal and a fixed angle. The duty cycle of the rectangular wave is 5%- 95%, the corresponding gimbal rotation angle is 0°-360°, which controls the gimbal rotation;

智能处理及控制模组通过USB转TTL接口接激光测距信号，将其换算成目标的距离信息；变焦相机采用预置位方式调焦，共有6个预置位分别对应6个距离段，通过距离信息得到预置位，然后通过USB串口控制变焦相机调整焦距，获得清晰图像；The intelligent processing and control module connects the laser ranging signal through the USB to TTL interface, and converts it into the distance information of the target; the zoom camera uses the preset position to adjust the focus, and there are 6 preset positions corresponding to 6 distance segments respectively. The distance information is preset, and then the zoom camera is controlled to adjust the focus through the USB serial port to obtain a clear image;

智能处理及控制模组接收到激光信号并控制变焦相机调焦之后，对变焦相机抓拍采集的图像进行智能识别，具体是在ARM上部署多模型融合的目标检测侧算法，对算法检测结果为行人的图像进行抓拍；After the intelligent processing and control module receives the laser signal and controls the zoom camera to focus, it intelligently recognizes the images captured by the zoom camera. Specifically, a multi-model fusion target detection side algorithm is deployed on the ARM, and the detection result of the algorithm is pedestrians. to capture the image;

所述综合信息显示终端，显示全方位监控信息并进行智能摄像，满足高安全等级的视频监控需求。The comprehensive information display terminal displays all-round monitoring information and performs intelligent photography, so as to meet the video monitoring requirements of high security levels.

本发明融合四个定焦摄像机、一个可变焦摄像机和一个雷达传感器的感知信息，实现“看得全”、“看得清”、“看得懂”的全方位监控和智能摄像的监控目标。The invention integrates the perception information of four fixed-focus cameras, one variable-focus camera and one radar sensor, and realizes the all-round monitoring of "seeing the whole", "seeing clearly" and "understanding" and the monitoring target of intelligent camera.

附图说明Description of drawings

图1 360°固定摄像模组示意图（俯视），Figure 1 Schematic diagram of the 360° fixed camera module (top view),

图2 360°扫描摄像模组示意图，Figure 2 Schematic diagram of 360° scanning camera module,

图3 VGG网络结构，Figure 3 VGG network structure,

图4 ResNet50网络结构，Figure 4 ResNet50 network structure,

图5 多模型融合架构图，Figure 5 Multi-model fusion architecture diagram,

图6综合信息显示终端系统界面。Figure 6 Comprehensive information display terminal system interface.

具体实施方式Detailed ways

本发明提出一种多传感器融合的全方位监控与智能摄像方法与系统，有益效果是融合四个定焦摄像机、一个可变焦摄像机和一个雷达传感器的感知信息，实现“看得全”、“看得清”、“看得懂”的全方位监控和智能摄像的监控目标，可广泛应用于监狱、枪械库、油库等重要场所，实现敏感区域的全方位监控和智能摄像，同时支持对感兴趣目标进行抓拍和预警，提高安防监控的智能化水平。The invention provides a multi-sensor fusion omnidirectional monitoring and intelligent camera method and system. It can be widely used in important places such as prisons, gun depots, oil depots and other important places to realize all-round monitoring and intelligent camera in sensitive areas. The target is captured and early-warned to improve the intelligence level of security monitoring.

（1）360°固定摄像模组(1) 360° fixed camera module

360°固定摄像模组负责拍摄水平方向360°全景画面，由4台定焦摄像机组装而成，4台摄像机相邻之间的夹角是90°，摄像机镜头的焦距是3.6mm，监控范围为85°～105°，监控距离是0～10m。各台摄像机的俯仰角一致，都是水平方向向下倾斜30°。这样，4台摄像机水平方向监控角度大于360°，可以实现水平方向360°全方位监控，达到“看的全”的监控目标。相邻2台摄像机水平方向的重合角度大于10°，便于实现相邻摄像机拍摄图像的精确拼接。360°固定摄像模组组装示意图如图1所示。The 360° fixed camera module is responsible for shooting 360° panorama images in the horizontal direction. It is assembled by 4 fixed-focus cameras. The angle between the adjacent 4 cameras is 90°, the focal length of the camera lens is 3.6mm, and the monitoring range is 85°～105°, the monitoring distance is 0～10m. The pitch angles of each camera are the same, and they are all inclined downward by 30° in the horizontal direction. In this way, the horizontal monitoring angle of the four cameras is greater than 360°, which can realize 360° all-round monitoring in the horizontal direction, and achieve the monitoring goal of "seeing the whole". The overlapping angle of the two adjacent cameras in the horizontal direction is greater than 10°, which facilitates accurate stitching of images captured by adjacent cameras. The schematic diagram of 360° fixed camera module assembly is shown in Figure 1.

（2）360°扫描摄像模组(2) 360° scanning camera module

360°扫描摄像模组负责拍摄感兴趣目标的高清图像，由云台、激光测距传感器和变焦摄像头组成，如图2所示。其中，云台可以360°旋转，激光测距传感器和变焦摄像头固定在同一支撑柱上，水平方向方位相同，垂直方向倾斜角度相同，都是向下倾斜30°。中心服务器通过串口获取到激光测距传感器测量的距离信息，并进行分析，计算出变焦摄像头的焦距段，将控制指令发送至控制模组，实现对变焦摄像头焦距的调整，获得清晰的图像，达到“看得清”的监控目的。The 360° scanning camera module is responsible for capturing high-definition images of the target of interest, and consists of a PTZ, a laser ranging sensor and a zoom camera, as shown in Figure 2. Among them, the gimbal can be rotated 360°, the laser ranging sensor and the zoom camera are fixed on the same support column, the horizontal orientation is the same, and the vertical tilt angle is the same, both of which are inclined downward by 30°. The central server obtains the distance information measured by the laser ranging sensor through the serial port, analyzes it, calculates the focal length of the zoom camera, and sends the control command to the control module to adjust the focal length of the zoom camera, obtain clear images, and achieve "See clearly" monitoring purpose.

（3）智能处理及控制模组(3) Intelligent processing and control module

智能处理及控制模组的处理器是ARM板卡，主要实现三个功能：The processor of the intelligent processing and control module is an ARM board, which mainly realizes three functions:

（3.1）云台控制(3.1) PTZ control

智能处理及控制模组通过PWM脉冲频率为50Hz占空比可调的矩形波发生器实现云台电机的控制，不同占空比对应电机标定零度角旋转固定角度，矩形波占空比为5%-95%，对应的云台电机旋转角度0°-360°，控制云台旋转。The intelligent processing and control module realizes the control of the gimbal motor through a rectangular wave generator with a PWM pulse frequency of 50Hz and an adjustable duty cycle. Different duty cycles correspond to a fixed angle of zero-degree rotation of the motor, and the duty cycle of the rectangular wave is 5%. -95%, the corresponding gimbal motor rotation angle is 0°-360°, which controls the gimbal rotation.

（3.2）激光测距信号处理及变焦相机控制(3.2) Laser ranging signal processing and zoom camera control

智能处理及控制模组通过USB转TTL接口接激光测距信号，将其换算成目标的距离信息；变焦相机采用预置位方式调焦，共有6个预置位分别对应6个距离段，通过距离信息得到预置位，然后通过USB串口控制变焦相机调整焦距，获得清晰图像。The intelligent processing and control module connects the laser ranging signal through the USB to TTL interface, and converts it into the distance information of the target; the zoom camera uses the preset position to adjust the focus, and there are 6 preset positions corresponding to 6 distance segments respectively. The distance information is preset, and then the zoom camera is controlled to adjust the focus through the USB serial port to obtain a clear image.

（3.3）智能识别及抓拍(3.3) Intelligent recognition and snapshot

智能处理及控制模组接收到激光信号并控制变焦相机调焦之后，对变焦相机抓拍采集的图像进行智能识别，具体是在ARM上部署多模型融合的目标检测侧算法，对算法检测结果为行人的图像进行抓拍。多模型融合的目标检测侧算法利用迁移学习的方法，将官方网站预训练好用于目标检测的模型迁移到用于到特定目标识别上来，冻结深度卷积神经网络中全连接层中的网络，让这些被冻结的网络层次中的参数在模型的训练过程中不进行梯度更新，能够被优化的参数仅仅是没有被冻结的特征提取器的全部参数，我们称作特征提取器，得到特征图。然后，将特征图进行连接，重新构造承担整个模型输出分类工作的全连接层，采用3层全连接层，用随机失活的优化方法，进行参数优化，提高模型的鲁棒性。具体地，本发明将VGG16网络与ResNet50网络进行融合。VGG16特征提取器指的是VGG16网络的卷积块结构，ResNet50特征提取器指的是ResNet50的残差块结构；After the intelligent processing and control module receives the laser signal and controls the zoom camera to focus, it intelligently recognizes the images captured by the zoom camera. Specifically, a multi-model fusion target detection side algorithm is deployed on the ARM, and the detection result of the algorithm is pedestrians. image is captured. The target detection side algorithm of multi-model fusion uses the transfer learning method to transfer the model pre-trained for target detection on the official website to the specific target recognition, freezing the network in the fully connected layer of the deep convolutional neural network, The parameters in these frozen network layers are not updated by gradient during the training process of the model, and the parameters that can be optimized are only all the parameters of the feature extractor that is not frozen, which we call the feature extractor to obtain the feature map. Then, the feature maps are connected, and the fully connected layer responsible for the output classification of the entire model is reconstructed. Three fully connected layers are used, and the optimization method of random deactivation is used to optimize the parameters and improve the robustness of the model. Specifically, the present invention integrates the VGG16 network and the ResNet50 network. The VGG16 feature extractor refers to the convolution block structure of the VGG16 network, and the ResNet50 feature extractor refers to the residual block structure of ResNet50;

VGG16网络由13个卷积层和3个全连接层组成，网络结构如图3所示，该最大的特点是通过3×3滤波器的组合与堆叠来提取图像特征。对于特定目标而言，这种方式提取丰富的细节特征，增强特征对感兴趣区域和非感兴趣区域的区分能力。本发明使用的VGG16网络特征提取器为图3中虚线框的卷积块结构部分。The VGG16 network consists of 13 convolutional layers and 3 fully connected layers. The network structure is shown in Figure 3. The biggest feature is the combination and stacking of 3×3 filters to extract image features. For a specific target, this method extracts rich detailed features and enhances the feature's ability to distinguish between regions of interest and regions of non-interest. The VGG16 network feature extractor used in the present invention is the convolution block structure part of the dotted box in FIG. 3 .

ResNet50网络包含了49个卷积层和1个全连接层，网络结构如图4所示。由于该网络加入了恒等映射层，直接连接浅层网络与深层网络，采用这种方式的连接方法其作用是随着网络深度的增加而不退化，并且收敛效果好。利用这一特性，可以解决VGG16网络特征丢失和欠拟合的问题。本发明使用的ResNet50网络特征提取器为图4中虚线框的残差块结构部分。The ResNet50 network contains 49 convolutional layers and 1 fully connected layer, and the network structure is shown in Figure 4. Since the identity mapping layer is added to the network, the shallow network and the deep network are directly connected. The function of the connection method in this way is that it does not degenerate with the increase of the network depth, and the convergence effect is good. Using this feature, the problem of missing and under-fitting VGG16 network features can be solved. The ResNet50 network feature extractor used in the present invention is the residual block structure part of the dashed box in FIG. 4 .

VGG16与ResNet50特征提取器融合后的模型图如5所示。The model diagram after fusion of VGG16 and ResNet50 feature extractor is shown in Fig. 5.

传统机器学习框架需要大量标定训练数据，这将会耗费大量的人力与物力。而没有大量的标注数据，会使得很多与学习相关研究与应用无法开展。传统的机器学习假设训练数据与测试数据服从相同的数据分布，而在许多情况下，这种同分布假设并不满足。从另一个方面看，如果我们有了大量的、在不同分布下的训练数据，完全丢弃这些数据也是非常浪费的。如何合理的利用这些数据就是迁移学习主要解决的问题。迁移学习可以从现有的数据中迁移知识，用来帮助将来的学习。迁移学习的目标是将从一个环境中学到的知识用来帮助新环境中的学习任务。因此，迁移学习不会像传统机器学习那样作同分布假设。目前迁移学习方面的工作可以分为以下三个部分：同构空间下基于实例的迁移学习，同构空间下基于特征的迁移学习与异构空间下的迁移学习。本发明采用的是同构空间下基于特征的迁移学习，寻找源领域和目标领域特征空间中共同的特征表示，缩小两个领域之间差异，提高模型在目标领域的检测性能。由于本发明所选取的特定目标为人和车，与相对完善的目标检测领域数据集相比，高层网络特征层在形状、纹理等方面存在着普适性，利用目标检测领域中含有的大量标注数据进行迁移学习，可以很大程度上减少训练模型的时间，并可以取得良好的效果。不同模型的特征提取层提取到的特征在空间的分布是不同的，采用迁移学习的方法能够将不同模型的特征提取能力进行迁移，搭建新的全连接层，在此基础上在新的目标领域重新进行训练，更好将不同模型拟合。本发明采用Keras库自带的VGG16、ResNet50模型，并将来自于ImageNet数据集训练好的预训练权重用于迁移学习。Traditional machine learning frameworks require a large amount of calibration training data, which will consume a lot of manpower and material resources. Without a large amount of labeled data, many studies and applications related to learning cannot be carried out. Traditional machine learning assumes that the training data and test data follow the same data distribution, and in many cases, this same distribution assumption is not satisfied. On the other hand, if we have a large amount of training data under different distributions, it is also very wasteful to discard this data completely. How to use these data reasonably is the main problem solved by transfer learning. Transfer learning can transfer knowledge from existing data to aid future learning. The goal of transfer learning is to use the knowledge learned from one environment to help the learning task in the new environment. Therefore, transfer learning does not make the same distribution assumption as traditional machine learning. The current work on transfer learning can be divided into the following three parts: instance-based transfer learning in homogeneous space, feature-based transfer learning in homogeneous space and transfer learning in heterogeneous space. The invention adopts feature-based transfer learning in isomorphic space, finds common feature representation in the feature space of source domain and target domain, narrows the difference between the two domains, and improves the detection performance of the model in the target domain. Since the specific targets selected by the present invention are people and vehicles, compared with the relatively complete data sets in the target detection field, the high-level network feature layer has universality in terms of shape, texture, etc., using a large amount of labeled data contained in the target detection field. Performing transfer learning can greatly reduce the time to train the model and achieve good results. The spatial distribution of the features extracted by the feature extraction layers of different models is different. The transfer learning method can transfer the feature extraction capabilities of different models and build a new fully connected layer. Retrain to better fit different models. The present invention adopts the VGG16 and ResNet50 models provided by the Keras library, and uses the pre-training weights trained from the ImageNet data set for migration learning.

基于迁移学习的方法，将VGG16与ResNet50特征提取器从目标检测领域迁移到行人检测领域，将两个模型进行融合，相比于单一模型，多模型融合网络结构共计提取到2560张待检测图像7*7的特征图，其中VGG16特征提取器提取到512张特征图，ResNet50特征提取器提取到2048张，由于ResNet50网络模型采用了跳跃连接机制，将特定目标的图像信息传递到了网络更深层，比如纹理、边缘等信息，使得提取到的行人图像信息更丰富。本发明将迁移后的模型中特征提取器进行融合，重新搭建全连接层，冻结两个模型的特征提取器，采用随机初始化的全连接层参数方式，进行融合后模型的训练。Based on the transfer learning method, the VGG16 and ResNet50 feature extractors are transferred from the target detection field to the pedestrian detection field, and the two models are fused. Compared with a single model, the multi-model fusion network structure extracts a total of 2560 images to be detected7 *7 feature maps, of which the VGG16 feature extractor extracted 512 feature maps, and the ResNet50 feature extractor extracted 2048. Since the ResNet50 network model adopts the skip connection mechanism, the image information of the specific target is passed to the deeper layers of the network, such as The information such as texture and edge makes the extracted pedestrian image information richer. The invention fuses the feature extractors in the migrated model, re-builds the fully connected layer, freezes the feature extractors of the two models, and adopts the method of randomly initializing the fully connected layer parameters to train the model after fusion.

多模型融合的目标检测侧算法利用迁移学习的方法加快融合后模型的训练速度，采用多模型融合策略提高行人检测可靠性。The target detection side algorithm of multi-model fusion uses the transfer learning method to speed up the training speed of the fusion model, and adopts the multi-model fusion strategy to improve the reliability of pedestrian detection.

（4）综合信息显示终端(4) Integrated information display terminal

综合信息显示终端显示360°固定摄像模组、360°扫描摄像模组的监控数据以及智能处理及控制模组的智能分析与抓拍结果。如图6所示，系统界面分为“看得全”、“看得清”和“看得懂”三个区域，如图所示。“看得全”区域显示4路定焦摄像头获得的全方位监控信息；“看得清”区域显示变焦摄像头获取的清晰图像，能够自动进行焦距的调整；“看得懂”区域显示感兴趣的行人目标的智能抓拍图像。The comprehensive information display terminal displays the monitoring data of the 360° fixed camera module, the 360° scanning camera module, and the intelligent analysis and snapshot results of the intelligent processing and control module. As shown in Figure 6, the system interface is divided into three areas: "see all", "see clearly" and "understand", as shown in the figure. The "see full" area displays the all-round monitoring information obtained by the 4-way fixed-focus cameras; the "clear" area displays the clear images obtained by the zoom camera, which can automatically adjust the focal length; the "understand" area displays interesting Smart capture images of pedestrian targets.

Claims

1. The omni-directional monitoring and intelligent camera system of multi-sensor fusion, including 360° fixed camera module, 360° scanning camera module, intelligent processing and control module and comprehensive information display terminal, is characterized in that,

The 360° fixed camera module includes 4 fixed-focus cameras to realize all-round monitoring and display, so that the monitoring scene can be seen comprehensively;

The included angle between the four adjacent cameras is 90°, the focal length of the camera lens is 3.6mm, the monitoring range is 85°～105°, and the monitoring distance is 0～10m;

The 360° scanning camera module includes a zoom camera, a radar, and a pan/tilt. The pan/tilt realizes 360° scanning. The radar senses the target distance, automatically controls the zoom camera to adjust the focal length, and realizes optical amplification of targets at different distances, so that the monitoring target can see the target. clear;

The pan/tilt rotates 360°, the laser ranging sensor and the zoom camera are fixed on the same support column, the horizontal orientation is the same, and the vertical tilt angle is the same, and both are inclined downward by 30°;

The central server obtains the distance information measured by the laser ranging sensor through the serial port, analyzes it, calculates the focal length of the zoom camera, and sends the control command to the control module to adjust the focal length of the zoom camera, obtain clear images, and achieve "Clear" monitoring purposes;

The intelligent processing and control module deploys a target detection and identification algorithm to intelligently identify the target of interest in the monitoring scene;

The intelligent processing and control module realizes the gimbal control through a rectangular wave generator with a PWM pulse frequency of 50Hz and an adjustable duty cycle. Different duty cycles correspond to the zero-degree rotation of the gimbal and a fixed angle. The duty cycle of the rectangular wave is 5%- 95%, the corresponding gimbal rotation angle is 0°-360°, which controls the gimbal rotation;

The intelligent processing and control module connects the laser ranging signal through the USB to TTL interface, and converts it into the distance information of the target; the zoom camera uses the preset position to adjust the focus, and there are 6 preset positions corresponding to 6 distance segments respectively. The distance information is preset, and then the zoom camera is controlled to adjust the focus through the USB serial port to obtain a clear image;

After the intelligent processing and control module receives the laser signal and controls the zoom camera to focus, it intelligently recognizes the images captured by the zoom camera. Specifically, a multi-model fusion target detection side algorithm is deployed on the ARM, and the detection result of the algorithm is pedestrians. to capture the image;

The comprehensive information display terminal displays all-round monitoring information and performs intelligent photography, so as to meet the video monitoring requirements of high security levels.

2. The omnidirectional monitoring and intelligent camera system of multi-sensor fusion according to claim 1, is characterized in that, the pitch angles of described 4 cameras are the same, and are all horizontally inclined downward by 30°, and 4 cameras are horizontally oriented. The monitoring angle is greater than 360°, enabling 360° all-round monitoring in the horizontal direction, and the overlapping angle of two adjacent cameras in the horizontal direction is greater than 10°, enabling accurate stitching of images captured by adjacent cameras.

3. The omni-directional monitoring and intelligent camera system of multi-sensor fusion according to claim 1, wherein the target detection side algorithm migrates the pre-trained model for target detection to the specific target recognition , freeze the network in the fully connected layer of the deep convolutional neural network, so that the parameters in these frozen network layers do not undergo gradient updates during the training process of the model, and the parameters that can be optimized are only the feature extractors that are not frozen All the parameters of , we call the feature extractor, and get the feature map; then, connect the feature maps to reconstruct the fully connected layer that undertakes the classification work of the entire model output, using 3 layers of fully connected layers, using the optimization method of random deactivation , perform parameter optimization, specifically, fuse the VGG16 network with the ResNet50 network, the VGG16 feature extractor refers to the convolution block structure of the VGG16 network, and the ResNet50 feature extractor refers to the ResNet50 residual block structure;

The VGG16 network consists of 13 convolutional layers and 3 fully connected layers. Image features are extracted by combining and stacking 3×3 filters.

The ResNet50 network contains 49 convolutional layers and 1 fully connected layer. The identity mapping layer is added to the network to directly connect the shallow network and the deep network to solve the problems of feature loss and underfitting of the VGG16 network.

4. The omni-directional monitoring and intelligent camera system of multi-sensor fusion according to claim 3, wherein the transfer learning adopts feature-based transfer learning in a homogeneous space, and finds the feature space of the source domain and the target domain To reduce the differences between the two fields, the VGG16 and ResNet50 models that come with the Keras library are used, and the pre-trained weights trained from the ImageNet dataset are used for transfer learning.