CN116866522A - Remote monitoring method - Google Patents

Remote monitoring method

Info

Publication number
CN116866522A
CN116866522A (application CN202310846783.4A)
Authority
CN
China
Prior art keywords: image, camera, feature, cameras, matching
Legal status: Pending (assumed by Google Patents; not a legal conclusion)
Application number
CN202310846783.4A
Other languages
Chinese (zh)
Inventor
周晓洪
胡慧
Current Assignee (the listed assignee may be inaccurate)
Guangzhou Tuwei Information Technology Service Co ltd
Original Assignee
Guangzhou Tuwei Information Technology Service Co ltd
Application filed by Guangzhou Tuwei Information Technology Service Co., Ltd.
Priority to CN202310846783.4A
Publication of CN116866522A

Classifications

    • H (Electricity) → H04 (Electric communication technique) → H04N (Pictorial communication, e.g. television)
    • H04N7/181 — Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast, for receiving images from a plurality of remote sources
    • H04N13/194 — Transmission of stereoscopic or multi-view image signals
    • H04N13/243 — Image signal generators using stereoscopic image cameras with three or more 2D image sensors
    • H04N13/246 — Calibration of cameras
    • H04N13/296 — Synchronisation or control of image signal generators
    • H04N23/90 — Arrangement of cameras or camera modules, e.g. multiple cameras in TV studios or sports stadiums

Abstract

The invention discloses a remote monitoring method comprising: selecting a target area and constructing triangular camera groups, each consisting of three cameras; transmitting the image data collected by each camera group to a monitoring center, with the individual cameras in each group performing data synchronization; synthesizing the received image data at the monitoring center into stereoscopic image data; and transmitting the synthesized stereoscopic image data to a display device for decoding and display, forming a remote stereoscopic monitoring image. By adopting stereoscopic image monitoring, the method avoids blind spots or dead zones in the monitored area, acquires the motion and posture of a target object in three-dimensional space more accurately, and can still capture complete information about a target object even when it is occluded by other objects or structures, thereby ensuring the accuracy of monitoring results.

Description

Remote monitoring method
Technical Field
The invention relates to a remote monitoring method.
Background
Remote monitoring refers to observing and controlling a particular location, device or environment in real time through a network or other means of communication. It provides users with real-time monitoring: wherever they are, users can monitor through remote devices as long as a network connection is available.
Existing remote monitoring methods mainly rely on planar (two-dimensional) images, usually captured by a single camera or a small number of cameras. The viewing angle is therefore limited, blind spots or dead zones may exist in the monitored area, and the target area cannot be fully covered. Planar image monitoring also cannot accurately acquire the motion and posture of a target object in three-dimensional space, which limits the ability to analyze the object's behavior and judge its state. Moreover, when the target object is occluded by other objects or structures, planar image monitoring cannot easily acquire complete information about it, leading to inaccurate monitoring results.
Disclosure of Invention
In view of the defects in the prior art, the invention aims to provide a remote monitoring method that adopts stereoscopic image monitoring, avoids blind spots or dead zones in the monitored area, acquires the motion and posture of a target object in three-dimensional space more accurately, and can still capture complete information about a target object even when it is occluded by other objects or structures, thereby ensuring the accuracy of monitoring results.
The technical scheme adopted for solving the technical problems is as follows:
a method for remote monitoring of a person in need of such a process,
selecting a target area and constructing a camera group which is distributed in a triangular manner by taking every three cameras as a group;
each camera group transmits the collected image data to a monitoring center, and independent cameras in each group of cameras perform data synchronization;
the method comprises the steps that a monitoring center synthesizes three-dimensional images of received image data to obtain three-dimensional image data;
and transmitting the synthesized stereo image data to display equipment for decoding and displaying to form a remote stereo monitoring image.
The independent cameras in the camera group face the same direction, and form an overlapping area capable of capturing a common scene.
And carrying out data synchronization on the independent cameras in each group of cameras in a time stamp mode so as to form synchronous image data.
The method for synthesizing the stereo image of the received image data by the monitoring center comprises the following steps:
calibrating images of three cameras by capturing a calibration plate and using a camera calibration method;
extracting feature points or feature descriptors from the images of each camera, and matching each image with other two images by using a feature matching algorithm;
according to the result of feature matching, calculating parallax between different cameras of each feature point;
calculating the depth value of each pixel point according to the principle of triangulation by utilizing parallax information to obtain a depth map;
the generation of the stereoscopic image is performed using the depth map and the original image data.
The method for calibrating the images of the three cameras by capturing the calibration plate and using the camera calibration method comprises the following steps:
selecting a checkerboard plate with distinctive markers and alternating black and white squares as the calibration plate, and placing it in the target area;
shooting a plurality of images containing the calibration plate by each camera at different positions and angles;
extracting the corner coordinates of the calibration plate from each image using a corner detection algorithm, where a corner is a point at the boundary between black and white squares;
for all images, finding out corresponding corner pairs by matching corner points;
calibrating the camera by using the known geometric relationship between the corner pairs and the camera;
and calculating correction parameters according to the camera calibration result, and carrying out distortion correction on the camera image by using the correction parameters so as to remove distortion in the image and obtain a calibration image.
The method for extracting the feature points or the feature descriptors from the image of each camera comprises the following steps:
importing the cv2 module to use the SIFT algorithm from the OpenCV library;
reading an image file using the cv2.imread() function;
creating a SIFT feature extractor object using cv2.xfeatures2d.SIFT_create();
detecting key points and computing feature descriptors simultaneously using the detectAndCompute() method of the SIFT feature extractor;
keypoints,descriptors=sift.detectAndCompute(image,None)
where image is the input image data, keypoints is the list of key points, and descriptors holds the corresponding feature descriptors;
and repeating the steps for the image of each camera, and respectively extracting key points and feature descriptors.
The method for matching each image with the other two images by using the feature matching algorithm comprises the following steps:
importing a cv2 module matching algorithm library;
loading key points and feature descriptors of each camera in the extracted key points and feature descriptor files;
creating a matcher object using cv2.BFMatcher_create();
performing feature matching by using a match () method of a Brute-Force matcher;
matches=matcher.match(descriptors1,descriptors2)
wherein, descriptors1 is the feature descriptor of the first image, descriptors2 is the feature descriptor of the second image, and matches is the matching result list;
screening matches based on a distance threshold, removing matching pairs whose distance is too large, and retaining only the N matching pairs with the shortest distances;
and repeating the steps for each image, and respectively performing feature matching with the other two images.
The method for calculating the parallax of each feature point between different cameras comprises the following steps:
obtaining matching point pairs from the feature matching result, wherein each matching point pair comprises coordinates of corresponding feature points in the two images;
converting the pixel coordinates of each image point into normalized coordinates, i.e. transforming them from the image plane into the camera coordinate system, and computing the three-dimensional coordinates of each matched point with a triangulation algorithm from the normalized coordinates of the two image points and the camera projection matrices;
for the three-dimensional coordinates of each matching point pair, the horizontal displacement of the feature points in the two views is calculated by converting the three-dimensional coordinates into the coordinate system of the adjacent cameras, and the horizontal displacement is converted into a parallax value in pixel units according to the camera calibration parameters.
The method for calculating the depth value of each pixel point and obtaining the depth map comprises the following steps:
setting the intrinsic matrix and baseline length of the cameras, where the intrinsic matrix contains parameters such as the focal length and principal point coordinates of the camera, and the baseline length is the distance between two cameras;
according to the pixel coordinates and the disparity value, calculating the coordinates of the feature points under a camera coordinate system;
converting the feature points from the camera coordinate system to the world coordinate system using the intrinsic matrix of the camera;
calculating the depth values of the feature points from the baseline length between cameras according to the triangulation principle;
for each pixel point, constructing a depth image according to the calculated depth value, wherein the depth image is an image with the same size as the original image, and the value of each pixel point represents the depth value of the corresponding feature point.
The method for generating the stereoscopic image comprises the following steps:
setting the intrinsic matrix and baseline length of the camera;
ensuring, through camera calibration, that the depth image and the original image share the same coordinate system and resolution, and aligning each pixel in the depth image with the corresponding pixel in the original image;
traversing each pixel point in the depth image;
for each pixel point, calculating a parallax value according to the depth value, and mapping the depth value into a proper parallax range by using a scaling or mapping relation;
finding out corresponding pixels in the original image according to the parallax value and the corresponding pixel positions in the original image, copying the pixels to the corresponding positions of the stereoscopic image, and processing the non-integer parallax value by using an interpolation method;
repeating the steps until all the pixel points in the depth image are traversed.
The beneficial effects of the invention are as follows:
stereoscopic feeling: by constructing the camera groups distributed in a triangular manner, the stereoscopic image data of the target area can be acquired. Therefore, the monitoring center and the display device can provide more realistic stereoscopic images, and the real feeling of a user on a monitored scene is enhanced.
Omnibearing coverage: the camera group constructed by taking every three cameras as a group enables the monitoring area to be more comprehensively covered. Thus, no matter where the target object is located, the target object can be captured by a certain camera in the camera group, and the probability of missing the target is reduced.
Data synchronization: in each group of cameras, each independent camera performs data synchronization to ensure that acquired image data are consistent in time. Thus, the subsequent image synthesis processing can more accurately fuse the images with multiple visual angles to obtain a high-quality stereoscopic image.
Remote monitoring capability: the collected image data is transmitted to a monitoring center and display equipment, so that remote monitoring is realized. Therefore, monitoring personnel can remotely view the stereoscopic image of the target area at any time, and can acquire comprehensive monitoring information without being on the scene.
Improving recognition and analysis accuracy: the stereo image data provides more spatial information, and can enhance the recognition and analysis accuracy of the target object.
Detailed Description
The principles and features of the present invention are described below in connection with the following examples which are provided for the purpose of illustrating the invention and are not intended to limit the scope of the invention. The invention is more specifically described by way of example in the following paragraphs. Advantages and features of the invention will become more apparent from the following description and from the claims.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein in the description of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
Examples
A remote monitoring method, comprising the following steps:
selecting a target area and constructing a camera group which is distributed in a triangular manner by taking every three cameras as a group;
each camera group transmits the collected image data to a monitoring center, and independent cameras in each group of cameras perform data synchronization;
the method comprises the steps that a monitoring center synthesizes three-dimensional images of received image data to obtain three-dimensional image data;
and transmitting the synthesized stereo image data to display equipment for decoding and displaying to form a remote stereo monitoring image.
The individual cameras in a camera group face the same direction and form an overlapping region that captures a common scene. Because each camera observes the same scene from a different angle, image data from multiple viewpoints is obtained. The image data of the overlapping region allows the complete scene of the target area to be restored more faithfully, and image stitching and fusion algorithms can combine the data collected by the cameras into a wider, more complete monitoring picture.
The individual cameras in each group are data-synchronized by means of timestamps to produce synchronized image data. Time synchronization allows the image data collected by each camera to be precisely aligned and combined: the timestamps are used to find image frames recorded at the same moment, and those frames are then registered, fused and otherwise processed to obtain a higher-quality composite image.
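The timestamp-based synchronization described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the frame records, the 20 ms tolerance and the function names are assumptions.

```python
# Hypothetical sketch of timestamp-based frame synchronization for a
# three-camera group; tolerance and data layout are assumptions.
from bisect import bisect_left

def nearest(timestamps, t):
    """Return the index of the timestamp closest to t (list must be sorted)."""
    i = bisect_left(timestamps, t)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(timestamps)]
    return min(candidates, key=lambda j: abs(timestamps[j] - t))

def synchronize(cam_a, cam_b, cam_c, tol=0.02):
    """Group frames from three cameras whose timestamps agree within tol seconds.

    Each cam_* argument is a sorted list of capture timestamps (seconds).
    Returns a list of (ia, ib, ic) index triples, one per synchronized set.
    """
    triples = []
    for ia, t in enumerate(cam_a):
        ib = nearest(cam_b, t)
        ic = nearest(cam_c, t)
        if abs(cam_b[ib] - t) <= tol and abs(cam_c[ic] - t) <= tol:
            triples.append((ia, ib, ic))
    return triples
```

For 30 fps streams with small clock jitter, e.g. `synchronize([0.000, 0.033], [0.001, 0.034], [0.002, 0.032])` pairs up the matching frames; a frame with no counterpart within the tolerance is simply dropped from the composite.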
The method for synthesizing the stereo image of the received image data by the monitoring center comprises the following steps:
calibrating images of three cameras by capturing a calibration plate and using a camera calibration method;
extracting feature points or feature descriptors from the images of each camera, and matching each image with other two images by using a feature matching algorithm;
according to the result of feature matching, calculating parallax between different cameras of each feature point;
calculating the depth value of each pixel point according to the principle of triangulation by utilizing parallax information to obtain a depth map;
the generation of the stereoscopic image is performed using the depth map and the original image data.
By capturing the calibration plate and using the camera calibration method, the internal and external parameters of each camera can be accurately calibrated. Therefore, distortion in the image can be eliminated, accurate relative position and posture information of the camera can be obtained, and an accurate basis is provided for subsequent image processing and depth calculation.
By extracting feature points or feature descriptors from the images of each camera and matching each image with other two images by using a feature matching algorithm, feature points shared under different camera angles can be accurately found. Thus, reliable parallax information can be provided, and accurate data can be provided for depth map calculation.
By calculating the parallax between different cameras for each feature point and using the principle of triangulation, the depth value for each pixel point can be calculated. In this way, an accurate depth map can be generated, providing distance information for each point in the scene, making subsequent depth analysis and stereoscopic perception feasible.
The generation of stereoscopic images can be performed by the depth map and the original image data. And the depth information is overlapped with the original image, so that the image effect with more reality and strong stereoscopic impression can be obtained. This is useful in applications requiring three-dimensional shape analysis, virtual reality applications, and the like.
Through calibration and depth map generation, images of a plurality of cameras can be fused, and more comprehensive and accurate visual information is provided. This may support a wider range of applications such as object tracking, behavioral analysis, three-dimensional reconstruction, etc.
The method for calibrating the images of the three cameras by capturing the calibration plate and using the camera calibration method comprises the following steps:
selecting a checkerboard plate with distinctive markers and alternating black and white squares as the calibration plate, and placing it in the target area;
shooting a plurality of images containing the calibration plate by each camera at different positions and angles;
extracting the corner coordinates of the calibration plate from each image using a corner detection algorithm, where a corner is a point at the boundary between black and white squares;
for all images, finding out corresponding corner pairs by matching corner points;
calibrating the camera by using the known geometric relationship between the corner pairs and the camera;
and calculating correction parameters according to the camera calibration result, and carrying out distortion correction on the camera image by using the correction parameters so as to remove distortion in the image and obtain a calibration image.
The camera calibration can be accurately performed by shooting an image containing the calibration plate and extracting the angular point coordinates of the calibration plate. The calibration result can provide the internal and external parameter information of the camera, including focal length, distortion coefficient and the like, thereby providing an accurate basis for subsequent image processing and calculation.
And calculating correction parameters according to the calibration result of the camera. By applying the parameters to carry out distortion correction on the camera image, the distortion in the image can be removed, so that the straight line in the image becomes more visual and accurate, and the image quality is improved.
After correction, an accurate calibration image can be obtained. The images have more real colors, contrast and shapes, and can provide more reliable data for subsequent tasks such as image analysis, visual identification and the like.
Through camera calibration, the geometric relationship between cameras, including position and attitude, can be obtained. Thus, the consistency of images among different cameras is ensured, and a reliable basis is provided for the subsequent multi-view fusion, three-dimensional reconstruction and other applications.
In this scheme, only a checkerboard plate with alternating black and white squares and a camera are needed for image capture; the calibration result is then obtained through computation and correction. The whole process is relatively simple, easy to implement, and requires no complex equipment or software.
The method for extracting the feature points or the feature descriptors from the image of each camera comprises the following steps:
importing the cv2 module to use the SIFT algorithm from the OpenCV library;
reading an image file using the cv2.imread() function;
creating a SIFT feature extractor object using cv2.xfeatures2d.SIFT_create();
detecting key points and computing feature descriptors simultaneously using the detectAndCompute() method of the SIFT feature extractor;
keypoints,descriptors=sift.detectAndCompute(image,None)
where image is the input image data, keypoints is the list of key points, and descriptors holds the corresponding feature descriptors;
and repeating the steps for the image of each camera, and respectively extracting key points and feature descriptors.
SIFT is a widely used feature extraction algorithm with good scale and rotation invariance. It can reliably detect stable key points in an image and generate the corresponding feature descriptors; the detectAndCompute() method of the SIFT feature extractor yields the list of key points and their descriptors in a single call. The key points can be used for subsequent tasks such as feature matching and object recognition, while the descriptors can be used to compute similarity and perform image matching.
The steps can be repeated for the image of each camera, and key points and feature descriptors are extracted, so that feature information under a plurality of camera angles can be obtained for subsequent applications such as stereo matching, three-dimensional reconstruction, multi-view analysis and the like.
OpenCV is a powerful open-source computer vision library containing a rich set of image processing and computational algorithms. By importing the cv2 module, the SIFT algorithm and other functions provided by OpenCV can be used for convenient and rapid image processing and analysis, and suitable parameters and configuration can be chosen flexibly. OpenCV also provides rich image processing functions and tools, making the whole process relatively simple and easy to implement.
The method for matching each image with the other two images by using the feature matching algorithm comprises the following steps:
importing a cv2 module matching algorithm library;
loading key points and feature descriptors of each camera in the extracted key points and feature descriptor files;
creating a matcher object using cv2.BFMatcher_create();
performing feature matching by using a match () method of a Brute-Force matcher;
matches=matcher.match(descriptors1,descriptors2)
wherein, descriptors1 is the feature descriptor of the first image, descriptors2 is the feature descriptor of the second image, and matches is the matching result list;
screening matches based on a distance threshold, removing matching pairs whose distance is too large, and retaining only the N matching pairs with the shortest distances;
and repeating the steps for each image, and respectively performing feature matching with the other two images.
Matching feature descriptors with the Brute-Force matcher finds corresponding feature points across multiple camera images, so that features of the same scene or object seen by different cameras can be accurately identified and matched. The match() method returns a list of matches for each image pair; the results indicate how well feature points correspond between the two images and can be used for subsequent applications such as image registration, target tracking and stereoscopic vision. Screening the results with a distance threshold removes matching pairs whose distance is too large, eliminating unreliable matches and keeping only the N pairs with the shortest distances. The threshold and the value of N can be adjusted flexibly according to specific requirements.
Repeating these steps for each image yields feature matches against the other two images, giving the matching results between every image and the others and further improving matching accuracy and reliability. Importing the cv2 module makes the matching algorithms and tools provided by OpenCV available: creating a matcher object with cv2.BFMatcher_create() and calling its match() method quickly realizes the feature matching function.
The method for calculating the parallax of each feature point between different cameras comprises the following steps:
obtaining matching point pairs from the feature matching result, wherein each matching point pair comprises coordinates of corresponding feature points in the two images;
converting the pixel coordinates of each image point into normalized coordinates, i.e. transforming them from the image plane into the camera coordinate system, and computing the three-dimensional coordinates of each matched point with a triangulation algorithm from the normalized coordinates of the two image points and the camera projection matrices;
for the three-dimensional coordinates of each matching point pair, the horizontal displacement of the feature points in the two views is calculated by converting the three-dimensional coordinates into the coordinate system of the adjacent cameras, and the horizontal displacement is converted into a parallax value in pixel units according to the camera calibration parameters.
By converting the pixel coordinates of the feature points into normalized coordinates and using the projection matrices of the two cameras, a triangulation algorithm can compute the three-dimensional coordinates of each matching point pair, giving the position of the object or scene in three-dimensional space. Using the calibration parameters, these three-dimensional coordinates can be transformed into the coordinate system of the adjacent camera, from which the horizontal displacement of the feature points between the two views is obtained. Combining this horizontal displacement with the camera calibration parameters converts it into a disparity value in pixel units. The disparity value is an important measure of depth information and can be used for stereoscopic vision, distance estimation, three-dimensional reconstruction and similar applications.
Three-dimensional coordinates of the matching point pairs can be obtained through feature matching and triangulation algorithms, and the parallax value is further calculated. The method is suitable for feature matching and three-dimensional reconstruction among a plurality of cameras, and can be used in fields of multi-view stereoscopic vision, three-dimensional reconstruction, object tracking and the like.
The method for calculating the depth value of each pixel point and obtaining the depth map comprises the following steps:
setting the intrinsic matrix and baseline length of the cameras, where the intrinsic matrix contains parameters such as the focal length and principal point coordinates of the camera, and the baseline length is the distance between two cameras;
according to the pixel coordinates and the disparity value, calculating the coordinates of the feature points under a camera coordinate system;
converting the feature points from the camera coordinate system to the world coordinate system using the intrinsic matrix of the camera;
calculating the depth values of the feature points from the baseline length between cameras according to the triangulation principle;
for each pixel point, constructing a depth image according to the calculated depth value, wherein the depth image is an image with the same size as the original image, and the value of each pixel point represents the depth value of the corresponding feature point.
By setting the internal reference matrix of the camera, including parameters such as focal length, principal point coordinates and the like, the imaging characteristics of the camera can be accurately described. This can improve the accuracy and reliability of the depth calculation, with the baseline length referring to the distance between the two cameras, a very important parameter in triangulation. By properly setting the base line length, the depth value of the feature point can be calculated more accurately. This helps to achieve a more accurate three-dimensional reconstruction and depth estimation.
From the pixel coordinates and disparity values, the coordinates of the feature points in the camera coordinate system can be calculated, converting them from the image plane to the camera coordinate system in preparation for the subsequent transformation to the world coordinate system. Using the internal reference matrix of the camera, the feature points are then converted from the camera coordinate system to the world coordinate system, so that their positions can be expressed as true three-dimensional coordinates relative to the world coordinate system, providing accurate spatial positioning for subsequent applications.
A depth image is then constructed from the calculated depth values of the feature points. The depth image has the same size as the original image, and the value of each pixel represents the depth of the corresponding feature point. As an intuitive visualization, the depth image can be used for three-dimensional reconstruction, remote-sensing analysis, and other applications.
The method for generating the stereoscopic image comprises the following steps:
setting an internal reference matrix and a base line length of the camera;
ensuring, by calibrating the camera, that the depth image and the original image have the same coordinate system and resolution, and aligning each pixel point in the depth image with the corresponding pixel point in the original image;
traversing each pixel point in the depth image;
for each pixel point, calculating a parallax value according to the depth value, and mapping the depth value into a proper parallax range by using a scaling or mapping relation;
finding the corresponding pixel in the original image according to the parallax value and the corresponding pixel position, copying it to the corresponding position of the stereoscopic image, and handling non-integer parallax values by interpolation;
repeating the steps until all the pixel points in the depth image are traversed.
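The per-pixel traversal described above can be sketched as a simple forward mapping; the function name and the nearest-integer rounding (standing in for the interpolation the patent mentions) are illustrative assumptions:

```python
import numpy as np

def synthesize_second_view(image, depth, focal_px, baseline):
    """Generate a second view by shifting each pixel horizontally by its
    disparity d = f * B / Z (forward mapping with rounding).

    A production implementation would interpolate non-integer disparities
    and fill disocclusions; this sketch only shows the traversal.
    """
    h, w = depth.shape
    out = np.zeros_like(image)
    for y in range(h):
        for x in range(w):
            z = depth[y, x]
            if z <= 0:          # skip invalid depth
                continue
            d = int(round(focal_px * baseline / z))
            if 0 <= x - d < w:  # copy pixel to its shifted position
                out[y, x - d] = image[y, x]
    return out
```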
By accurately setting the internal reference matrix and baseline length of the camera, the accuracy and reliability of the depth image can be improved: these parameters are essential inputs to the depth calculation, and setting them accurately yields more precise depth values. Calibrating the camera ensures that the depth image and the original image share the same coordinate system and resolution, so that each pixel in the depth image corresponds to the same position in the original image. Depth information and visual information can thus be conveniently associated, producing a better stereoscopic perception effect.
By computing a disparity value from each depth value and mapping the depth values into a suitable disparity range, the depth information can be better expressed. The disparity value is the horizontal displacement between corresponding feature points in the stereoscopic image and is directly related to depth; an appropriate scaling or mapping relation makes the depth information easier to understand and process. Based on the disparity value and the pixel position, the corresponding pixel is found in the original image and copied to the corresponding position in the stereoscopic image. For non-integer disparity values, interpolation can be used to maintain image continuity and smoothness, yielding a more realistic stereoscopic effect.
This scheme traverses every pixel in the depth image and processes the parallax and depth information of each pixel. The algorithm is highly scalable, can be applied to large-scale image-processing tasks, and its efficiency can be further improved through algorithmic optimization, parallel computing, and similar means.
The beneficial effects of the invention are as follows:
Stereoscopic perception: by constructing camera groups in a triangular arrangement, stereoscopic image data of the target area can be acquired. The monitoring center and display device can therefore present more realistic stereoscopic images, enhancing the user's sense of realism in the monitored scene.
Omnidirectional coverage: the camera groups constructed by taking every three cameras as a group cover the monitored area more comprehensively. Wherever the target object is located, it can be captured by at least one camera in a group, reducing the probability of missing the target.
Data synchronization: in each group of cameras, each independent camera performs data synchronization to ensure that acquired image data are consistent in time. Thus, the subsequent image synthesis processing can more accurately fuse the images with multiple visual angles to obtain a high-quality stereoscopic image.
Remote monitoring capability: the collected image data is transmitted to a monitoring center and display equipment, so that remote monitoring is realized. Therefore, monitoring personnel can remotely view the stereoscopic image of the target area at any time, and can acquire comprehensive monitoring information without being on the scene.
Improving recognition and analysis accuracy: the stereo image data provides more spatial information, and can enhance the recognition and analysis accuracy of the target object.
The above-mentioned embodiments do not limit the scope of the present invention. All modifications, substitutions, or alterations made to the above structures according to common knowledge and conventional means of the art, without departing from the basic technical ideas of the present invention, shall fall within the scope of the present invention.

Claims (10)

1. A remote monitoring method is characterized in that,
selecting a target area and constructing a camera group which is distributed in a triangular manner by taking every three cameras as a group;
each camera group transmits the collected image data to a monitoring center, and independent cameras in each group of cameras perform data synchronization;
the method comprises the steps that a monitoring center synthesizes three-dimensional images of received image data to obtain three-dimensional image data;
and transmitting the synthesized stereo image data to display equipment for decoding and displaying to form a remote stereo monitoring image.
2. The method of claim 1, wherein the individual cameras in each camera group are oriented in the same direction and form an overlapping region that captures a common scene.
3. The remote monitoring method of claim 1, wherein the individual cameras in each group of cameras are data synchronized by means of a time stamp to form synchronized image data.
4. The remote monitoring method according to claim 1, wherein the method for stereoscopic image composition of the received image data at the monitoring center is:
calibrating images of three cameras by capturing a calibration plate and using a camera calibration method;
extracting feature points or feature descriptors from the images of each camera, and matching each image with other two images by using a feature matching algorithm;
according to the result of feature matching, calculating parallax between different cameras of each feature point;
calculating the depth value of each pixel point according to the principle of triangulation by utilizing parallax information to obtain a depth map;
the generation of the stereoscopic image is performed using the depth map and the original image data.
5. The remote monitoring method according to claim 4, wherein the method for calibrating the images of the three cameras by capturing the calibration plate and using the camera calibration method is as follows:
selecting a square grid plate with specific marks and black and white intervals as a calibration plate, and placing the calibration plate in a target area;
shooting a plurality of images containing the calibration plate by each camera at different positions and angles;
extracting the corner coordinates of the calibration plate from each image by using a corner detection algorithm, wherein the corner is a point at the black-white juncture of the calibration plate;
for all images, finding out corresponding corner pairs by matching corner points;
calibrating the camera by using the known geometric relationship between the corner pairs and the camera;
and calculating correction parameters according to the camera calibration result, and carrying out distortion correction on the camera image by using the correction parameters so as to remove distortion in the image and obtain a calibration image.
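The distortion correction at the end of claim 5 conventionally uses the radial model x_d = x_u·(1 + k1·r² + k2·r⁴). An illustrative sketch of inverting it per point follows; the fixed-point iteration and parameter names are assumptions for illustration (OpenCV's cv2.undistortPoints performs the same job):

```python
import numpy as np

def undistort_points(pts, K, k1, k2, iters=20):
    """Invert the radial distortion model by fixed-point iteration.

    pts    : (N, 2) distorted pixel coordinates.
    K      : 3x3 internal reference (intrinsic) matrix.
    k1, k2 : radial distortion coefficients from camera calibration.
    """
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    xd = (pts[:, 0] - cx) / fx          # normalized distorted coordinates
    yd = (pts[:, 1] - cy) / fy
    xu, yu = xd.copy(), yd.copy()       # initial guess: no distortion
    for _ in range(iters):
        r2 = xu * xu + yu * yu
        factor = 1.0 + k1 * r2 + k2 * r2 * r2
        xu, yu = xd / factor, yd / factor
    return np.stack([xu * fx + cx, yu * fy + cy], axis=1)
```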
6. The remote monitoring method according to claim 5, wherein the method for extracting feature points or feature descriptors from the image of each camera is as follows:
importing the cv2 module to use the SIFT algorithm of the OpenCV library;
reading an image file using the cv2.imread() function;
creating a SIFT feature extractor object using cv2.xfeatures2d.SIFT_create();
detecting key points and computing feature descriptors simultaneously using the detectAndCompute() method of the SIFT feature extractor;
keypoints, descriptors = sift.detectAndCompute(image, None)
wherein, image is the input image data, keypoints are key point list, and descriptors are corresponding feature descriptors;
and repeating the steps for the image of each camera, and respectively extracting key points and feature descriptors.
7. The remote monitoring method according to claim 6, wherein the method for matching each image with the other two images using a feature matching algorithm is:
importing a cv2 module matching algorithm library;
loading key points and feature descriptors of each camera in the extracted key points and feature descriptor files;
creating a matcher object using cv2.BFMatcher_create();
performing feature matching by using a match () method of a Brute-Force matcher;
matches = matcher.match(descriptors1, descriptors2)
wherein, descriptors1 is the feature descriptor of the first image, descriptors2 is the feature descriptor of the second image, and matches is the matching result list;
screening matches based on a distance threshold, removing matching pairs whose distance is too large, and retaining only the N matching pairs with the shortest distances;
and repeating the steps for each image, and respectively performing feature matching with the other two images.
8. The remote monitoring method according to claim 7, wherein the method of calculating the parallax between different cameras for each feature point is:
obtaining matching point pairs from the feature matching result, wherein each matching point pair comprises coordinates of corresponding feature points in the two images;
converting the pixel coordinates of each image point into normalized coordinates, i.e. from the image plane to the camera coordinate system, and calculating the three-dimensional coordinates of the two image points by a triangulation algorithm from their normalized coordinates and the camera projection matrices;
for the three-dimensional coordinates of each matching point pair, the horizontal displacement of the feature points in the two views is calculated by converting the three-dimensional coordinates into the coordinate system of the adjacent cameras, and the horizontal displacement is converted into a parallax value in pixel units according to the camera calibration parameters.
9. The remote monitoring method according to claim 8, wherein the depth value of each pixel is calculated, and the depth map is obtained by:
setting an internal reference matrix and a baseline length of the cameras, wherein the internal reference matrix comprises parameters such as the focal length and principal point coordinates of the cameras, and the baseline length refers to the distance between the two cameras;
according to the pixel coordinates and the disparity value, calculating the coordinates of the feature points under a camera coordinate system;
converting the characteristic points from a camera coordinate system to a world coordinate system by using an internal reference matrix of the camera;
according to the length of the base lines between cameras, calculating the depth value of the feature points according to the triangulation principle;
for each pixel point, constructing a depth image according to the calculated depth value, wherein the depth image is an image with the same size as the original image, and the value of each pixel point represents the depth value of the corresponding feature point.
10. The remote monitoring method according to claim 9, wherein the method for generating the stereoscopic image comprises:
setting an internal reference matrix and a base line length of the camera;
ensuring, by calibrating the camera, that the depth image and the original image have the same coordinate system and resolution, and aligning each pixel point in the depth image with the corresponding pixel point in the original image;
traversing each pixel point in the depth image;
for each pixel point, calculating a parallax value according to the depth value, and mapping the depth value into a proper parallax range by using a scaling or mapping relation;
finding the corresponding pixel in the original image according to the parallax value and the corresponding pixel position, copying it to the corresponding position of the stereoscopic image, and handling non-integer parallax values by interpolation;
repeating the steps until all the pixel points in the depth image are traversed.
CN202310846783.4A 2023-07-11 2023-07-11 Remote monitoring method Pending CN116866522A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310846783.4A CN116866522A (en) 2023-07-11 2023-07-11 Remote monitoring method


Publications (1)

Publication Number Publication Date
CN116866522A true CN116866522A (en) 2023-10-10

Family

ID=88226435

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310846783.4A Pending CN116866522A (en) 2023-07-11 2023-07-11 Remote monitoring method

Country Status (1)

Country Link
CN (1) CN116866522A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102521816A (en) * 2011-11-25 2012-06-27 浪潮电子信息产业股份有限公司 Real-time wide-scene monitoring synthesis method for cloud data center room
CN102939763A (en) * 2010-06-14 2013-02-20 高通股份有限公司 Calculating disparity for three-dimensional images
CN103096032A (en) * 2012-04-17 2013-05-08 北京明科全讯技术有限公司 Panorama monitoring system and method thereof
CN105979203A (en) * 2016-04-29 2016-09-28 中国石油大学(北京) Multi-camera cooperative monitoring method and device
DE102018115176B3 (en) * 2018-06-25 2019-08-01 Sick Ag Stereo camera and alignment method
CN115272271A (en) * 2022-08-09 2022-11-01 郑州芯视道机器人技术有限公司 Pipeline defect detecting and positioning ranging system based on binocular stereo vision
CN115880344A (en) * 2022-11-18 2023-03-31 浙江大学 Binocular stereo matching data set parallax truth value acquisition method


Similar Documents

Publication Publication Date Title
KR101761751B1 (en) Hmd calibration with direct geometric modeling
TWI555379B (en) An image calibrating, composing and depth rebuilding method of a panoramic fish-eye camera and a system thereof
US7825948B2 (en) 3D video conferencing
CN109348119B (en) Panoramic monitoring system
CN103345736A (en) Virtual viewpoint rendering method
CN103337094A (en) Method for realizing three-dimensional reconstruction of movement by using binocular camera
Zou et al. A method of stereo vision matching based on OpenCV
JP2010109783A (en) Electronic camera
JP2008140271A (en) Interactive device and method thereof
WO2020063987A1 (en) Three-dimensional scanning method and apparatus and storage medium and processor
CN105262949A (en) Multifunctional panorama video real-time splicing method
CN106023307B (en) Quick reconstruction model method based on site environment and system
CN113965721B (en) Alignment method for image and depth transmission monitoring system
US20180020203A1 (en) Information processing apparatus, method for panoramic image display, and non-transitory computer-readable storage medium
CN111009030A (en) Multi-view high-resolution texture image and binocular three-dimensional point cloud mapping method
WO2010061860A1 (en) Stereo matching process system, stereo matching process method, and recording medium
JPH05303629A (en) Method for synthesizing shape
CN117196955A (en) Panoramic image stitching method and terminal
CN110705487B (en) Palm print acquisition equipment and method and image acquisition device thereof
CN112215749A (en) Image splicing method, system and equipment based on cylindrical projection and storage medium
CN107103620B (en) Depth extraction method of multi-optical coding camera based on spatial sampling under independent camera view angle
CN116866522A (en) Remote monitoring method
CN104463958A (en) Three-dimensional super-resolution method based on disparity map fusing
CN113421286B (en) Motion capturing system and method
KR20110133677A (en) Method and apparatus for processing 3d image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination