WO2022121024A1

WO2022121024A1 - Unmanned aerial vehicle positioning method and system based on screen optical communication

Info

Publication number: WO2022121024A1
Application number: PCT/CN2020/140729
Authority: WO
Inventors: 文考; 赵毓斌; 须成忠; 刘敦歌
Original assignee: 中国科学院深圳先进技术研究院
Priority date: 2020-12-10
Filing date: 2020-12-29
Publication date: 2022-06-16
Also published as: CN114627398A

Abstract

An unmanned aerial vehicle positioning method and system based on screen optical communication. The method comprises: selecting an electronic screen, and setting marker images on the electronic screen to be symmetrical; arranging an image collector, comprising an unmanned aerial vehicle, an image acquisition module disposed on the unmanned aerial vehicle, and a fixing bracket for adjusting the height of the unmanned aerial vehicle; a camera being arranged on the image acquisition module, and being configured to acquire the marker images on the electronic screen to obtain a video stream comprising marker pictures; constructing a deep learning model, and identifying the marker pictures to obtain a predicted position of the marker images in the pictures; performing image processing to obtain coordinates corresponding to a marker pattern; performing distance measurement to obtain the linear distance between the center of the marker pattern and the camera; and performing angle measurement, and obtaining the actual distance between the camera and the electronic screen, so as to complete the positioning of the unmanned aerial vehicle. The system can improve the positioning accuracy and the effect range of the camera.

Description

A UAV positioning method and system based on screen optical communication

technical field

The invention relates to the technical field of unmanned aerial vehicle positioning, and more particularly, to a method and system for unmanned aerial vehicle positioning based on screen optical communication.

Background technique

Nowadays, the use of drones is becoming more and more popular. There are many applications of drones in outdoor scenes, such as: drones in the agricultural field for seeding and fertilization; in the logistics field, drones are used for automatic delivery; the photography industry uses drones. Aerial photography, etc. In the indoor environment, there is also a large demand for the application of UAV, such as: indoor security, indoor logistics distribution, indoor survey and so on. A major difficulty of UAVs in performing tasks is that the positioning of UAVs is not accurate enough. In some complex scenarios, the position of UAVs needs to be manually adjusted manually, which greatly limits the scope of UAVs and hinders the movement of UAVs. further development of drones.

At present, in outdoor environments, UAVs usually use GPS and autonomous inertial navigation for positioning, which can roughly meet the needs of positioning. However, because GPS signals cannot be used indoors, and because the indoor environment is more complex than outdoor, the positioning of drones indoors is still a major difficulty. In the indoor environment, the UAV mainly adopts the following positioning methods:

1. The positioning method of ultrasonic and optical flow sensors; the UAV is equipped with ultrasonic and optical flow sensors, and the ultrasonic transmitter is used to transmit ultrasonic waves to the surroundings, and the ultrasonic waves reflected from surrounding objects are used to calculate the distance between obstacles and their current position. The optical flow sensor uses the "instantaneous speed" of the pixel motion of the object moving in space on the imaging plane to calculate the horizontal speed of the drone.

The disadvantage of using ultrasonic positioning is that in some complex environments, that is, because the indoor environment is denser and more complex than the outdoor environment, the ultrasonic signal is more likely to be distorted by the multipath effect indoors, and the ultrasonic wave is easily absorbed by the environment. not good. At the same time, since the propagation speed of the sound wave signal is also affected by the temperature of the environment, a certain error is inevitably generated when calculating the distance, that is, the positioning accuracy of the ultrasonic wave itself is not high, and there are problems such as the inability to survey slope obstacles. The optical flow sensor calculates the speed of the object by using the change of the pixel position of the same object on two adjacent frames when the camera shoots a moving object. However, because the camera has a certain shooting distortion, generally speaking, the degree of distortion at the center of the image is small, and the degree of distortion at the edge of the image is large. Therefore, when the same object is in different positions, the pixel itself has a certain offset error. In addition, the image matching algorithm also has calculation errors when calculating the feature points. The speed measurement accuracy of the optical flow sensor is affected by the image processing algorithm, and the speed measurement accuracy is not high. All of these have led to the decline of the positioning accuracy of ultrasonic and optical flow sensors, and it is difficult to meet the high-precision positioning requirements of UAVs in the market.

Generally speaking, the positioning accuracy of this method is low, and it is easily affected by the environment, which cannot meet the high-precision positioning requirements of UAVs.

2. Laser SLAM positioning method; the UAV is equipped with a laser radar. When positioning, the laser radar emits laser light, and the position of the obstacle is determined by calculating the time difference between the time difference between transmitting and receiving the laser beam by using the characteristic of the laser beam to meet the surrounding obstacles and bouncing. In this way, using lidar to determine the position of all surrounding obstacles, SLAM technology can be used to construct a map of surrounding obstacles, so as to determine the relative position of the UAV in the map.

The positioning accuracy of this method is high, which can reach the centimeter level, and the position of the UAV can be located more accurately. However, due to the high cost of lidar, carrying lidar will greatly increase the cost of UAV. In addition, when the reflective surface is rough, the ranging accuracy of lidar will be reduced, so it is not suitable for popularization in the market.

3. The UWB positioning system is adopted; the UWB signal source is mounted on the UAV, and the UBW signal source continuously transmits UWB positioning signals to the surrounding during positioning, and the UWB sensors pre-arranged in different positions of the positioning area receive the signals and use the algorithm (TDOA algorithm) , AOA algorithm) can calculate the relative position of the UAV in the area. The positioning accuracy of this method is high, but due to the high cost of UWB hardware equipment, it is also not suitable for widespread popularization in the market.

4. Visual SLAM method. The positioning is done using the camera mounted on the drone. This positioning method can be divided into two types: monocular camera and multi-eye camera. In the case of a monocular camera, after the camera collects a picture of the surrounding environment, it uses a deep learning model to identify obstacles in the picture, and uses the depth information of the pixels in the picture or the size information of the pixels to calculate the distance of the obstacles. In the case of a binocular camera, the distance calculation can also be performed using the principle of binocular parallax. That is, when the distance between the cameras is known, multiple cameras shoot the same object, and the object has a certain deviation in multiple images. The actual distance between the object and the camera is calculated through this deviation to achieve the purpose of positioning. After obtaining the distance of surrounding obstacles, use SLAM technology to build an environment map, so as to determine the relative position of the UAV in the map.

The advantage of using the visual method is that the cost of the camera is low, and most drones have their own cameras, which will not increase the burden of the drone, that is, it will not affect the weight capacity and cost of the drone. The disadvantage of using this method is that the positioning accuracy is low, and the monocular camera mainly relies on the deep learning model to identify the surrounding obstacles. In indoor environments, the obstacles are complex and changeable, and it is difficult for a model trained in one environment to meet applicable standards in other indoor environments. At the same time, the positioning accuracy of visual SLAM is also limited by the performance of the image matching algorithm, the accuracy of ranging is not high, and this method is also easily disturbed by ambient light, the positioning distance is short, and the ranging range is relatively small. Due to the above defects, this positioning method cannot be well adapted to market demands.

To sum up, the defects of the first and third positioning methods are that the positioning accuracy is not high and depends on the performance of the image matching algorithm. The disadvantage of the second and third positioning methods is that the cost of hardware is high and cannot be widely used in the market. The defects of the current positioning system itself affect the user experience, and at the same time restrict the maturity and popularization of UAVs.

SUMMARY OF THE INVENTION

The purpose of the present invention is to provide a UAV positioning method and system based on screen optical communication, which can improve the positioning accuracy and the scope of action of the camera, aiming at the technical problems existing in the prior art.

In order to solve the problem proposed above, the technical scheme adopted in the present invention is:

The present invention provides a UAV positioning method based on screen optical communication, the positioning method specifically includes:

Select the electronic screen, and set the marker image on the electronic screen to make it symmetrical;

An image collector is provided, including a drone, an image acquisition module arranged on the drone, and a fixed bracket for adjusting the height of the drone; a camera is set on the image acquisition module for collecting markers on the electronic screen The image gets a video stream containing the image of the marker;

constructing a deep learning model, extracting the video stream transmitted by the image collector, and identifying the marker image to obtain the predicted position of the marker image in the image;

According to the predicted position of the marker image, image processing is performed to obtain the coordinates corresponding to the marker pattern therein;

According to the coordinates corresponding to the marker pattern, distance measurement is performed to obtain the straight line distance from the center of the marker pattern to the connection of the camera;

The angle measurement is performed, and the actual distance from the camera to the electronic screen is obtained in combination with the straight-line distance to complete the positioning of the UAV.

Further, the SSD300 model is used to complete the identification of the marker images. The specific identification process is as follows:

In the SSD300 model, input a frame of picture of the video stream transmitted by the image collector;

Preliminarily extract the features of the picture through the VGG-16 network to obtain a feature map;

The locator is used to generate the prediction candidate frame, and the area selected by the candidate frame is used as the feature map to be identified;

The feature map to be identified is processed and transformed to obtain a transformed feature map;

According to the transformed feature map, the fully connected classifier is used to output the estimated value of the position and similarity of the marker image in the picture;

Select the candidate frame whose similarity estimation value is greater than the preset value, and use its pixel position as the final prediction position.

Further, the loss function for the training of the model is defined as the weighted sum of the position error and the confidence error, namely:

Among them, N is the number of candidate frames generated by the locator; x is the indicator parameter; c is the category confidence prediction value; l is the position coordinate of the candidate frame generated by the locator; g is the position coordinate of the manually marked landmark image; α is the weight coefficient, set to 1;

The position error is defined as follows:

The confidence error is defined as follows:

Further, the image processing process includes:

The target area corresponding to the predicted position is enlarged and adjusted so that it can cover the entire marker image to obtain the target area picture;

Binarize the image of the target area, and convert it into a black and white image;

Perform boundary suppression on the marker image according to the black-and-white image, so that it is separated from the background image to obtain a picture of the area with holes;

Filling the perforated area into a whole white pixel area, calculating the center positions and regional positions of all perforated areas, and performing shape detection on the marker patterns therein to obtain the specific positions of the marker patterns;

According to the specific positions of the marker patterns, the marker patterns are sorted to obtain their corresponding coordinates.

Further, the threshold value T of the image conversion is calculated by using the OSTU algorithm, specifically:

For the image I(x, y), the segmentation threshold of its foreground and background is T, the proportion of foreground pixels in the whole image is ω ₀ , and its average gray level μ ₀ ; the proportion of background pixels in the whole image is ω ₁ , its average gray level is μ ₀ ; the total average gray level is μ, and the inter-class variance is g; the size of the image is M×N, and the number of pixels in the image whose gray value is less than the threshold T is N ₀ , which is greater than The number of pixels of the threshold T is N ₁ ;

The following formula is obtained:

ω ₀ =N ₀ /M×N

ω ₁ =N ₁ /M×N

N ₀ +N ₁ =M×N

μ ₀ +μ ₁ =1

μ=ω ₀ ×μ ₀ +ω ₁ ×μ ₁

g=ω ₀ (μ ₀ -μ) ² +ω ₁ (μ ₁ -μ) ²

Obtained from the above formula:

g=ω ₀ ω ₁ (μ ₀ -μ ₁ ) ²

The traversal method is used to take the value of T, and the threshold T when the inter-class variance g is maximized is obtained.

Further, the shape detection process includes the following:

Draw a vertical tangent line from the centroid position of all areas with holes, and record the number of pixel points with pixel values 0 and 1 cut from top to bottom;

The white color block corresponding to the pixel value 1 in the area with the hole is regarded as the peak, and the black color block corresponding to the pixel value 0 is regarded as the trough;

Check the number of peaks and troughs in all areas with holes, and remove image areas that obviously do not conform to the marker pattern;

According to the image area where the number of crests and troughs meet the requirements, the proportional similarity of crests and troughs is calculated by using Euclidean distance;

Traverse each white area, find all the areas that meet the proportional similarity, and use it as the specific location of the marker pattern.

Further, the distance measurement is performed using the parallax principle, specifically:

Consider the two points P _l1 and P _r1 on the electronic screen as a camera, point A on the image collector as an object, and the projections of point A on the imaging planes of P _l1 and P _r1 are denoted as P _l2 and P _l3 , then The following formula is obtained:

According to the similarity principle, ΔAP _l1 P _r1 and ΔAP _l2 P _r2 are similar, and the following equation is obtained:

Among them, the resolution of the camera on the image collector is PPI, the length of the unit pixel on the image collected by the image collector is PXM; the real distance between P _l1 and P _r1 on the electronic screen is B, and the distance on the picture is Z; The focal length of the camera on the image collector is F; the distance between the image collector and the electronic screen is D;

Assume that the measured parallax of the landmark image is Z ₁ when the distance D ₁ is known; for any distance D ₂ , the measured parallax is Z ₂ . According to the known D ₁ and Z ₁ , the measured parallax is obtained:

The obtained D ₂ is the straight-line distance D from the center point of the marker pattern to the connection line of the camera.

Further, angle measurement is performed, including:

When there is no need to deflect the camera to collect the marker image, the angle of the camera obtained is:

Among them, the horizontal distance from the camera to the center of the electronic screen is DX, and the vertical distance is DY;

When the camera needs to be deflected to collect the marker image, the rewritten Music algorithm is used to calculate the offset angle of the camera, specifically:

Four equidistant vertical tangent lines are made on the marker pattern of the obtained specific position, and the distances in the horizontal and vertical directions are X1-X4 and Y1-Y4 respectively, and the distance between the tangent lines in the vertical direction is set as d;

Constructing the incident signal

Among them, Z1, Z2, Z3, and Z4 are:

Z1=0

According to the covariance matrix of the incident signal and decompose it to get:

R(i)=AR _X A ^H +σ ² I

Among them, A is the directional response vector extracted from the incident signal X(i), H is the conjugate transpose of the covariance matrix, σ ² is the noise power, and I is the identity matrix;

The eigenvector corresponding to the eigenvalue γ obtained from the above formula is v(θ), and it is sorted according to the size of the eigenvalue. The eigenvector corresponding to the maximum value is regarded as the signal part space, and the remaining 3 eigenvalues and features The vector is regarded as the noise part space, and the noise matrix _En is obtained, namely:

A ^H υ _i (θ)=0, i=2,3,4

E _n = [υ ₂ (θ), υ ₃ (θ), υ ₄ (θ)]

The offset angle P of the camera in the horizontal direction is:

Among them, a is the signal vector extracted from the incident signal X(i).

Further, when the marker picture obtained by the camera offset angle is deformed, the deformation degree of the upper segment of the picture is converted into a distance, which specifically includes:

Suppose the transformation matrix captured by the camera is:

K=[α _-N ,α _1-N ,α _2-N ,...,α ₀ ,...α _N-2 ,α _N-1 ,α _N ];

Since the degree of distortion of the camera is symmetrical about the center when collecting images, we get:

α _-N =α _N >α _1-N =α _N-1 >...>α ₀

Assuming that the two line segments used to calculate the angle are at p and q on the marker image, the corresponding calculated distances are D _p and D _q , the pixel sizes corresponding to the two line segments are P _p and P _q , and the overall pattern is The side length is L and the camera focal length is F, then the following formula is obtained:

From the above formula, the pixel difference between the two line segments is W, that is:

W=P _p α _P -P _q α _q

Among them, the pixel sizes corresponding to the two line segments are P _p and P _q , the distortion degrees corresponding to the two line segments are α _P and α _q , and the values of p and q are 0-N;

When q=0 and p=N, the pixel difference W of the line segments at both ends is the largest, and the error of the Music algorithm is the smallest.

The present invention also provides a system for a UAV positioning method based on screen optical communication, the system comprising:

Electronic screen selection module: used to select the electronic screen, and set the marker image on the electronic screen to make it symmetrical;

Collector setting module: used to set the image collector, including the drone, the image acquisition module set on the drone, and the fixed bracket for adjusting the height of the drone;

Model building module: used to extract the video stream transmitted by the image collector, identify the marker image, and obtain the predicted position of the marker image in the image;

Image processing module: for performing image processing according to the predicted position of the marker image to obtain the coordinates corresponding to the marker pattern therein;

Distance measurement module: used to measure the distance according to the coordinates corresponding to the marker pattern, and obtain the straight-line distance from the center of the marker pattern to the connection of the camera;

Angle measurement and positioning module: used for angle measurement, and combined with the straight-line distance to obtain the actual distance from the camera to the electronic screen to complete the UAV positioning.

Compared with the prior art, the beneficial effects of the present invention are:

In the UAV positioning method and system provided by the present invention, the marker image with symmetry on the electronic screen is adopted, the marker image is collected by the image collector, the deep learning model is constructed, and the image processing method is used to complete the marker. Accurate identification of object patterns, distance measurement and angle measurement of the camera after identification, to obtain the relative distance between the camera and the electronic screen, and complete the precise positioning of the camera, which is simple, reliable and easy to implement, and greatly improves the positioning accuracy, Solved the problem that the marker image cannot be recognized due to the too small image of the marker at a long distance from the camera. The range of the camera has been improved.

Description of drawings

In order to illustrate the solutions in the present invention more clearly, the following will briefly introduce the accompanying drawings used in the description of the embodiments. Obviously, the accompanying drawings in the following description are some embodiments of the present invention, which are common in the art. As far as technical personnel are concerned, other drawings can also be obtained based on these drawings without any creative effort. in:

Fig. 1 is the flow chart of the UAV positioning method based on screen optical communication of the present invention;

FIG. 2 is a schematic structural diagram of a marker image of the present invention;

Fig. 3 is the schematic diagram of the server connection of the present invention;

4 is a schematic diagram of the SSD300 model for image recognition in the present invention;

5 is a flowchart of an image processing process in the present invention;

6 is a schematic diagram of an image after boundary suppression in the present invention;

Fig. 7 is the image schematic diagram after retaining the area with holes in the present invention;

8 is a schematic diagram of an image in which all areas with holes are filled with white areas in the present invention;

Fig. 9 is the flow chart of shape detection in the present invention;

10 is a schematic diagram of the peak and trough positions of the marker pattern in the present invention;

11 is a schematic diagram of the principle of camera ranging in the present invention;

12 is a schematic diagram of angle measurement in the present invention;

Figure 13 is a schematic diagram of selecting 4 tangents in the present invention;

FIG. 14 is a flow chart of the system of the UAV positioning method based on screen optical communication of the present invention.

Detailed ways

In order to facilitate understanding of the present invention, the present invention will be described more fully hereinafter with reference to the related drawings. Preferred embodiments of the invention are shown in the accompanying drawings. However, the present invention may be embodied in many different forms and is not limited to the embodiments described herein. Rather, these embodiments are provided so that a thorough and complete understanding of the present disclosure is provided.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terms used herein in the description of the present invention are for the purpose of describing specific embodiments only, and are not intended to limit the present invention.

Referring to Figure 1, the present invention provides a UAV positioning method based on screen optical communication, the positioning method specifically includes:

Step S1: selecting an electronic screen, and setting a marker image on the electronic screen; the marker image has symmetry.

In the embodiment of the present invention, the electronic screen adopts a smart electronic screen, and the marker image adopts four square patterns, which are arranged in a square, as shown in FIG. The characteristic is that the marker image has the characteristics of up-down, left-right symmetry. The interval between the four marker patterns is 1/7 of the side length of the marker image (the marker pattern is a square), and parameters such as side length and arrangement spacing of the marker images are all known values. Specifically, the smart electronic screen adopts a 55-inch large-scale display, and the display can display high-definition color patterns.

Step S2: Setting up an image acquisition device, including an image acquisition module with a monocular camera, a height-adjustable fixed bracket, and a drone, wherein the drone is provided with an image acquisition module, which is arranged on the fixed bracket, and is fixed on the drone. The bracket adjusts the height of the drone; the image acquisition module collects the marker image on the electronic screen to obtain the marker picture.

In the embodiment of the present invention, the image collector can realize camera distortion calibration, video stream transmission, and focal length locking. Its range of action is 0-20 meters, and the angle is ±60°. ) to locate. Specifically, the pixels of the camera, the format of the captured image, and the horizontal and vertical resolutions are all set in advance.

Step S3 : constructing a deep learning model, extracting the video stream transmitted by the image collector, and recognizing the marker image therein, to obtain the predicted position of the marker image in the image.

In the embodiment of the present invention, the server is connected to the electronic screen through the network, and controls the display and refresh of the marker pattern on the electronic screen. At the same time, the video stream containing the marker picture transmitted by the image collector is received, as shown in FIG. 3 .

Step S4: Perform image processing according to the predicted position of the marker image to obtain the coordinates corresponding to the marker pattern in the marker image.

Step S5: According to the coordinates corresponding to the marker pattern, distance measurement is performed to obtain a straight line distance from the center point of the marker pattern to the connection line of the camera.

Step S6: Measure the angle, and obtain the actual distance from the camera to the electronic screen in combination with the straight-line distance, so as to complete the positioning of the drone.

In the embodiment of the present invention, an electronic screen is selected, a marker image with symmetry is set, an image collector for collecting the marker image is set, a deep learning model is constructed, and the image processing method is used to complete the accurate marker pattern. Identify and obtain the relative distance between the camera and the electronic screen to complete the precise positioning of the camera. The entire positioning method is simple, reliable and easy to implement, and the positioning accuracy is also high.

In the embodiment of the present invention, in the step S3, the deep learning model adopts the SSD300 model to complete the identification of the marker image, and obtains the predicted position of the marker image in the picture, as shown in FIG. 4, the specific identification process is as follows :

Step S31: Input a frame of picture of the video stream transmitted by the image collector in the SSD300 model. Specifically, the picture size is: 4032×3024×3.

Step S32: Preliminarily extract the features of the picture through the VGG-16 network to obtain a feature map. Specifically, the size of the feature map is: 38×38×512.

Step S33 : using the locator to generate a certain number of prediction candidate frames, and using the region of the feature map selected by the candidate frame as the feature map to be identified.

Step S34: Process and transform the feature map to be identified to obtain a transformed feature map.

Step S35: According to the transformed feature map, output the estimated value of the position and the similarity degree of the marker image in the picture through the fully connected classifier. Specifically, the fully connected classifier adopts a 256×2 fully connected classifier.

Step S36: Select a candidate frame whose similarity degree estimate value is greater than a preset value (set to 0.6), and use its pixel position as the final prediction position. The pixel position of the candidate frame is the pixel position of the identified marker image in the picture collected by the image collector.

In the embodiment of the present invention, the image collected by the image collector is identified by using the SSD300 model, and the predicted position of the marker image is obtained through processing and transformation, and the final positioning calculation of the UAV position is completed based on the obtained predicted position. It is simple, reliable and easy to implement, and also ensures the effectiveness and accuracy of the positioning method.

Further, in the step S34, the processing conversion process is as follows:

Step S341 : Convert the feature map to be identified after being processed by the Conv6 module to obtain a first feature map of 19×19×1024.

Step S342: Convert the first feature map after being processed by the Conv7 module to obtain a second feature map of 19×19×1024.

Step S343: Convert the second feature map after being processed by the Conv8 module to obtain a third feature map of 10×10×512.

Step S344: Convert the third feature map after being processed by the Conv9 module to obtain a fourth feature map of 5×5×256.

Step S345: Convert the fourth feature map after being processed by the Conv10 module to obtain a fifth feature map of 3×3×256.

Step S346: Convert the fifth feature map after being processed by the Conv11 module to obtain a sixth feature map of 1×1×256.

In the embodiment of the present invention, the input picture in the SSD300 model displays a digital matrix, and the model outputs the predicted position and scoring result (similarity degree estimate) of the marker image. The scoring result is a specific value used to judge the similarity between the marker image at the predicted position and the actual marker image, and the value is between 0 and 1. Through the above processing and transformation process, the digital matrix is converted into a specific value between 0 and 1 step by step, which is used for the similarity between the image and the marker in the specific area of the model output picture, so as to ensure the reliability and accuracy of positioning. sex.

Further, when recognizing the marker pictures, it is also necessary to train the constructed deep learning model. The specific training process includes:

The image samples are obtained by collecting marker images at different distances, angles and light intensities through an image collector.

By means of manual marking, the real pixel area of the marker image in each picture sample is marked, and the pixel matrix in the area is used as the correct answer for training. Specifically, the correct answer is the specific position of the marker image obtained by manual marking, which corresponds to the pixel value in an area in the embodiment of the present invention, and the pixel value in this area is given in the form of coordinates, such as (5 , 5, 10, 10), this coordinate represents all the pixels in the square with the starting point at (5, 5) and the end point at (10, 10).

In the embodiment of the present invention, the SSD300 model needs to be trained before it is used for identification. The input marker picture and the corresponding specific position (that is, the correct answer) model can predict a result (this result is also a position) according to the input. The predicted result is compared with the actual specific position, and then the model parameters are adjusted by comparing the predicted value and the actual value, and finally the model can better predict the value close to the correct answer, which ensures the accuracy of the predicted position of the model. and ensure the validity and reliability of the entire positioning method. In the embodiment of the present invention, the initial value of the VGG-16 network in the SSD300 model loads the open-source pre-training parameters on the external Git-hub, and the parameters of the Conv6-Conv11 module are randomly initialized parameters. Since model training takes a long time, it is more efficient to let the model copy the parameters trained by the external model, and then further train on this basis, so that you can quickly train your own model.

Further, the loss function trained by the deep learning model is defined as the weighted sum of the position error and the confidence error, namely:

Among them, N is the number of candidate boxes generated by the locator;

x is an indicator parameter, indicating whether the pixel area corresponding to the current candidate frame matches the artificially marked pixel area, where x _i ∈ {0,1}, x _i =1 indicates that the i-th candidate frame pixel area matches the marker pixel, If it is 0, it does not match;

c is the category confidence prediction value, which is the result of the current candidate frame predicted by the model;

l is the position coordinate of the candidate frame generated by the locator, and four values are used to represent the vertex coordinates (c _x1 , c _y1 ) of the upper left corner, the length w ₁ of the candidate frame and the height h ₁ of the candidate frame;

g is the position coordinate of the artificially marked marker image, and four values are used to represent the vertex coordinates (c _x2 , c _y2 ) of the upper left corner of the area, the length w ₂ of the area and the height h ₂ of the area.

α is the weight coefficient, which is set to 1.

Specifically, the position error is defined as follows:

Specifically, the definition of the confidence error is as follows:

In the embodiment of the present invention, the training target is to minimize the position error L _loc , and the model can be used for the identification of marker pictures after being trained. The video stream transmitted by the image collector is passed to the SSD300 model for detection. If there is a marker picture in the video image, the model finally outputs the coordinate position l and predicted value c of the candidate frame; if there is no marker picture in the video image, the model finally outputs. Do not output any results.

Further, in the step S4, there is a certain error between the predicted position of the marker image (that is, the target area position) obtained through the SSD300 model and the real pixel area of the marker image, so it cannot be directly used for positioning calculation, and it needs to be calculated. Perform image processing and further corrections. Referring to Figure 5, the image processing process includes:

Step S41: Adjusting the image area, that is, obtaining the target area corresponding to the predicted position according to the SSD300 model. Since the data of the predicted position is inaccurate, it is necessary to expand and adjust the scope of the target area, so that the target area can cover the entire marker image, and obtain: Image of the target area.

Specifically, the expansion method of the target area is confirmed according to the distribution of pixel values. The black pixel ratio of the marker image itself is 58.67%. After binarizing the target area output by the SSD300 model, it is determined whether the black pixel ratio in the target area is 58.67%. % nearby to judge. Further, the adopted judgment threshold is 45.67%, that is, if the black pixel ratio is lower than 38.67, the edge of the target area will be expanded outward by 20 pixels; if the black pixel ratio increases, the target area will continue to expand outward until the area The inner black pixel ratio decreases before it stops expanding outward.

Step S42: target segmentation, that is, performing binarization processing on the adjusted image of the target area, converting the color RGB image into a black and white image, and obtaining a black and white image of the target area.

Specifically, the threshold value T of the image conversion is calculated by using the OSTU algorithm, and the process is as follows:

For the image I(x, y), the segmentation thresholds of the foreground (target) and the background are denoted as T, the proportion of the foreground pixels in the entire image is denoted as ω ₀ , and its average gray level μ ₀ ; the background pixels account for The scale of the whole image is ω ₁ , and its average gray level is μ ₀ . The overall average gray level of the image is denoted as μ, and the between-class variance is denoted as g.

The electronic screen adopts a light-colored image background. Since the main body of the marker image is black, the color difference between the background and the marker image is large, and the size of the image is M×N, and the gray value of the pixel in the image is The number of pixels smaller than the threshold T is denoted as N ₀ , and the number of pixels whose pixel grayscale is greater than the threshold T is denoted as N ₁ , then there is the following formula:

ω ₀ =N ₀ /M×N (4)

ω ₁ =N ₁ /M×N (5)

N ₀ +N ₁ =M×N (6)

μ ₀ +μ ₁ =1 (7)

μ=ω ₀ ×μ ₀ +ω ₁ ×μ ₁ (8)

g=ω ₀ (μ ₀ -μ) ² +ω ₁ (μ ₁ -μ) ² (9)

That is, the equivalent formula is obtained as:

g=ω ₀ ω ₁ (μ ₀ -μ ₁ ) ² (10)

The traversal method is used to take the value of T (the range is between 0 and 255), and the threshold T when the variance g between classes is maximized is obtained, which is the desired value.

Step S43: Perform boundary suppression on the marker image according to the binarized black and white image, so that the marker image and the background image are separated.

In the embodiment of the present invention, since the color difference between the preset background image and the marker image is relatively large, the boundary between the marker images is theoretically obvious. In order to further strip the marker image from the background image, it is necessary to Do another boundary suppression operation.

Further, the boundary suppression process is specifically: traverse each pixel in the marker image, and compare the surrounding 8 pixels connected to the pixel (8 directions) (the pixel at the edge of the image will be less than 8). ), if the value of the pixel in other directions except the boundary is 0, the pixel is considered to be a pixel adjacent to the boundary, and the pixel is cleared.

In the embodiment of the present invention, as shown in FIG. 6 , each marker pattern is composed of a black border, a white border and a black square spliced together. After the boundary suppression is completed, many independent pixel areas without holes will be left in the marker image, and all pixel areas without holes will be removed, and finally the area with holes as shown in Figure 7 will remain. The area contains four marker patterns.

Step S44 : perform shape detection, as shown in FIG. 8 , fill all the areas with holes in the image after boundary suppression as a whole white pixel area, that is, only the areas with holes are retained on the binarized marker image , calculate the center position (ie centroid) and area position of all the regions with holes, and perform shape detection on the marker pattern in the marker image, as shown in Figure 9, the specific process includes the following:

Step S441 : Draw a vertical tangent line from the centroid positions of all areas with holes, and record the number of pixel points with pixel values 0 and 1 cut by the tangent line from top to bottom, respectively.

Step S442: The white color block (ie the part with the pixel value of 1) in the hole area is regarded as the peak, and the black color block (ie the part with the pixel value of 0) is regarded as the trough, then the relative width of the peak and the trough is It can be measured by the number of 0s and 1s that the tangent passes through, as shown in Figure 10. Specifically, if the area is a marker pattern, then the number of wave crests and wave troughs that the tangent line passes through must be 3 and 2.

Step S443: Check the number of peaks and troughs in all areas with holes, and remove image areas that obviously do not conform to the marker pattern.

Step S444: Comparing the width ratios of the peaks and troughs according to the image area in which the number of peaks and troughs meets the requirements, to obtain the theoretical ratio similarity between the two.

Specifically, the Euclidean distance is used to measure the calculation to obtain the similarity between the peak and trough ratio and the theoretical peak and trough ratio, specifically:

Assuming that the calculated ratio between peaks and troughs is X1:X2:X3:X4:X5, the similarity y calculated by Euclidean distance is:

Through the experimental simulation, when the value of y is less than 0.8, the effect of detecting the pattern of the marker is better. Therefore, the value of y is less than 0.8 as the criterion for judging that the region is the pattern of the marker.

Step S445: Traverse each white area, find all areas that meet the proportional similarity, and use it as the specific position of the marker pattern.

Step S45: According to the specific positions of the four marker patterns obtained, the bubble sorting algorithm is used to sort the four marker patterns to obtain their corresponding coordinates.

In the embodiment of the present invention, since the arrangement of the marker patterns is known in advance, according to the relative size of the horizontal and vertical coordinates, it can be known that the positions of the four coordinates correspond to the specific four marker patterns respectively. According to the obtained coordinates corresponding to the four marker patterns, it can be used to further obtain the relative position of the image collector to the smart electronic screen as follows.

In the embodiment of the present invention, the recognition of the marker pattern is completed by constructing a deep learning model and combining with the image processing method, which solves the problem that the marker cannot be recognized due to the too small image of the marker at a long distance of the camera. Compared with other cameras The positioning technology greatly improves the scope of the camera positioning, thereby improving the positioning accuracy.

Further, in the step S5, according to the coordinates corresponding to the obtained marker pattern, since the marker pattern is designed as a set of geometric figures with symmetrical characteristics, the distance measurement is performed by using the parallax principle.

In the embodiment of the present invention, the principle of parallax means that there is a certain difference between the positions of the same object in the images captured by two cameras with a fixed distance, and the distance between the two cameras can be calculated more accurately by using this difference. the distance between.

Specifically, two points P _l1 and P _r1 on the electronic screen are regarded as cameras, and point A on the image collector is regarded as an object. P _l2 and P _l3 in the picture collected by the image collector can be regarded as the projection of point A on the imaging planes of P _l1 and P _r1 , so the line segment P _l2 and P _r2 is the point A on the two sides of P _l1 and P _r1 . The parallax under the imaging plane of each "camera" is shown in Figure 11.

Further, let the real distance between P _l1 and P _r1 on the electronic screen be B, in millimeters; the distance between P _l1 and P _r1 on the picture is Z, in pixels; the focal length of the camera on the image collector is F, The unit is mm; the distance between the image collector and the electronic screen is D, the unit is mm; the resolution of the camera on the image collector is recorded as PPI; the length of the unit pixel on the image collected by the image collector is PXM, the unit is pixel/mm ; that is, the following formula is obtained:

The distance D can be obtained from equations (11) and (12).

Further, in the actual operation process, because the focal length of the lens marked by the image collector is not equal to the focal length of the actual shooting, and some processing is performed on the picture after shooting, the parameters of the lens focal length F and resolution PPI of the camera are different from the actual ones. There is a certain deviation, so the calibration method is used to solve the problem caused by inaccurate camera parameters, specifically:

Assume that the measured parallax of the landmark image is Z ₁ when the distance D ₁ is known; for any distance D ₂ , the measured parallax is Z ₂ , and formula (12) is rewritten as follows:

Then, when a set of real values D ₁ and Z ₁ are measured in advance, for any D ₂ there are:

Therefore, the measured D ₂ is the straight-line distance from the center point of the marker pattern to the connection line of the camera.

In the embodiment of the present invention, since the set marker pattern has the characteristic of collective symmetry, the marker pattern with geometric symmetry is used for positioning calculation, so that the monocular camera can also use the parallax principle to perform distance measurement, so that the positioning method does not depend on image matching. The effect of the algorithm can greatly improve the positioning accuracy, and can achieve the ranging accuracy of dual cameras, and also reduce the material cost.

Further, in the step S6, angle measurement is performed, and the actual distance from the camera to the electronic screen is obtained, which specifically includes:

Step S61: When the image collector is near the center axis of the electronic screen, the image of the marker on the electronic screen can be collected without deflecting the camera of the image collector.

Specifically, as shown in FIG. 12, it is assumed that the horizontal distance from the camera to the center of the electronic screen is DX, and the vertical distance is DY; The horizontal pixel difference is X, and the vertical pixel difference is Y; the width of a single marker pattern on the picture is PX pixels, and the height is PY pixels; the actual side length of a single marker pattern is L, that is, the actual horizontal distance is obtained DX and vertical distance DY are:

Assuming that the previously measured distance is D, the formula for calculating the angle of the camera is as follows:

Step S62: When the image collector is far away from the center of the electronic screen, the camera of the image collector needs to be deflected to collect the image of the marker on the electronic screen.

Specifically, the rewritten Music algorithm is used to calculate the offset angle of the camera, and then the actual offset distances DX and DY from the camera to the electronic screen are derived. The calculation process is as follows:

After locating the areas of the four marker patterns, that is, the specific positions, draw four equidistant vertical tangents on the four marker pattern areas, as shown in Figure 13, and measure the four tangents by the method of distance measurement. The distances are X1, X2, X3, and X4. At the same time, four equidistant vertical tangent lines are made on the overall pattern, and the distances measured by these four tangent lines through the distance measurement method are Y1, Y2, Y3, and Y4. When the camera is deflected horizontally, the line segment in the vertical direction is deformed greatly, which is suitable for angle estimation. Therefore, the data on Y1~Y4 are used to measure the horizontal declination angle. Similarly, X1~X4 data are used to estimate the declination angle in the vertical direction.

Further, taking the estimation of the declination angle in the horizontal direction as an example, set the tangent distance in the vertical direction to be d, then construct the incident signal

Among them, Z1, Z2, Z3, and Z4 are:

z1 = 0;

The covariance matrix of the incident signal is as follows:

Rx(i)= _X (i) ^XH (i) (19)

where H represents the conjugate transpose of the covariance matrix.

The eigendecomposition of the covariance matrix obtained by formula (19) can be obtained:

R(i)=AR _x A ^H +σ ² I (20)

Among them, A is the directional response vector extracted from the incident signal X(i), σ ² is the noise power, and I is the identity matrix.

According to formula (20), the eigenvector corresponding to the eigenvalue γ is v(θ), and it is sorted according to the size of the eigenvalue. The eigenvector corresponding to the largest eigenvalue is regarded as the signal part space, and the remaining three eigenvalues and The eigenvectors are regarded as the noise part space, and the noise matrix _En is obtained, namely:

A ^H v _i (θ) = 0, i = 2, 3, 4 (21)

_En = [v ₂ (θ), v ₃ (θ), v ₄ (θ)] (22)

Finally, the offset angle P is estimated, and a is the signal vector extracted from the incident signal X(i), namely:

Step S63: When the camera is deflected by a certain angle, the image of the marker collected by the image collector has a certain degree of deformation, and when the camera is deflected at different angles, the degree of deformation on the picture is often different. Therefore, the degree of deformation on the picture actually contains the information of the offset angle of the camera, and the Music algorithm converts the degree of deformation of the upper segment of the picture into a distance as the input value of the algorithm, thereby estimating the deflection angle of the camera and making the Angle error is minimal.

In the embodiment of the present invention, since the deformation degrees of the marker pictures at different positions are different, the errors calculated by the Music algorithm are also different. When the difference between the deformation degrees of the two line segments on the marker image is the largest, the error calculated by the Music algorithm is the smallest. Therefore, in the process of rotating the camera, there must be an angle of rotation, so that the error estimated by the Music algorithm is the smallest. The specific calculation formula is as follows:

Suppose the transformation matrix captured by the camera is:

K=[α _-N ,α _1-N ,α _2-N ,...,α ₀ ,...α _N-2 ,α _N-1 ,α _N ] (24)

Since the degree of distortion of the camera is symmetrical about the center when collecting images, there are:

α _-N =α _N >α _1-N =α _N-1 >...>α ₀ (25)

Further, it is assumed that the two line segments used for calculating the angle are at p and q on the marker image respectively, the corresponding calculated distances are D _p and D _q , and the pixel sizes corresponding to the two line segments are P _p and P _q , The side length of the overall pattern is L, and the focal length of the camera is F, which is obtained according to formula (9):

Further, for calculating the pixel difference between two line segments as W, then:

W=P _p α _P -P _q α _q (30)

When q=0, q=N, that is, when p is at the farthest distance from q, the pixel difference W of the two line segments is the largest, and the error estimated by the Music algorithm is the smallest.

Therefore, in the actual shooting process, one side of the overall pattern should be close to the center of the image, and the other side should be far from the center of the image, so that the angle error estimated by the music algorithm is the smallest.

In the embodiment of the present invention, by rewriting the input value of the Music algorithm, the classical Music algorithm can be used to estimate the declination angle of the camera, that is, the distance between the different points on the marker pattern and the camera is used to estimate the marker pattern and the camera. The declination angle between the camera and the electronic screen greatly improves the estimation accuracy of the declination angle between the camera and the electronic screen, thereby ensuring the reliability of the positioning method.

Referring to FIG. 14 , an embodiment of the present invention further provides a system for a method for positioning an unmanned aerial vehicle based on screen optical communication. The system includes:

Specifically, the system provided in the embodiment of the present invention is specifically used to execute the foregoing method embodiment, which is not repeated in this embodiment of the present invention.

The UAV positioning method and system provided in the embodiment of the present invention adopts the marker image with symmetry on the electronic screen, and uses the deep learning model and image processing method to complete the identification of the marker pattern, which solves the problem of the long distance of the camera. The image of the lower marker is too small to recognize the problem of the marker. The classical Music algorithm is used to estimate the declination angle of the camera, and then the relative distance from the camera to the electronic screen is obtained. Compared with other camera positioning technologies, the camera positioning is greatly improved. range of action, thereby improving the positioning accuracy of the camera.

In the embodiment of the present invention, while ensuring the equipment cost is acceptable, the positioning accuracy of the UAV within a distance of 20 meters can be guaranteed, and at the same time, it has good performance in complex and diverse scenarios, so that the positioning of the UAV can be ensured. Technology can be widely used.

The above-mentioned embodiments are preferred embodiments of the present invention, but the embodiments of the present invention are not limited by the above-mentioned embodiments, and any other changes, modifications, substitutions, combinations, The simplification should be equivalent replacement manners, which are all included in the protection scope of the present invention.

Claims

A UAV positioning method based on screen optical communication, characterized in that: the positioning method specifically includes:

Select the electronic screen, and set the marker image on the electronic screen to make it symmetrical;

An image collector is provided, including a drone, an image acquisition module arranged on the drone, and a fixed bracket for adjusting the height of the drone; a camera is set on the image acquisition module for collecting markers on the electronic screen The image gets a video stream containing a picture of the marker;

constructing a deep learning model, extracting the video stream transmitted by the image collector, and identifying the marker image to obtain the predicted position of the marker image in the image;

According to the predicted position of the marker image, image processing is performed to obtain the coordinates corresponding to the marker pattern therein;

According to the coordinates corresponding to the marker pattern, distance measurement is performed to obtain the straight line distance from the center of the marker pattern to the connection of the camera;

The angle measurement is performed, and the actual distance from the camera to the electronic screen is obtained in combination with the straight-line distance to complete the positioning of the UAV.
The UAV positioning method based on screen optical communication according to claim 1, is characterized in that: adopts SSD300 model to complete the identification of marker pictures, and the specific identification process is as follows:

In the SSD300 model, input a frame of picture of the video stream transmitted by the image collector;

Preliminarily extract the features of the picture through the VGG-16 network to obtain a feature map;

The locator is used to generate the prediction candidate frame, and the area selected by the candidate frame is used as the feature map to be identified;

The feature map to be identified is processed and transformed to obtain a transformed feature map;

According to the transformed feature map, the fully connected classifier is used to output the estimated value of the position and similarity of the marker image in the picture;

Select the candidate frame whose similarity estimation value is greater than the preset value, and use its pixel position as the final prediction position.
The UAV positioning method based on screen optical communication according to claim 2, wherein the loss function for training the model is defined as the weighted sum of the position error and the confidence error, that is:

Among them, N is the number of candidate frames generated by the locator; x is the indicator parameter; c is the category confidence prediction value; l is the position coordinate of the candidate frame generated by the locator; g is the position coordinate of the manually marked landmark image; α is the weight coefficient, set to 1;

The position error is defined as follows:

The confidence error is defined as follows:
The UAV positioning method based on screen optical communication according to claim 2, wherein the image processing process comprises:

Enlarging and adjusting the target area corresponding to the predicted position so that it can cover the entire marker image to obtain a picture of the target area;

Binarize the image of the target area, and convert it into a black and white image;

Perform boundary suppression on the marker image according to the black-and-white image, so that it is separated from the background image to obtain a picture of the area with holes;

Filling the perforated area into a whole white pixel area, calculating the center position and regional position of all perforated areas, and carrying out shape detection to the marker pattern therein to obtain the specific position of the marker pattern;

According to the specific positions of the marker patterns, the marker patterns are sorted to obtain their corresponding coordinates.
The UAV positioning method based on screen optical communication according to claim 4, characterized in that: the threshold value T of the image conversion is calculated by using the OSTU algorithm, specifically:

For the image I(x, y), the segmentation threshold of its foreground and background is T, the proportion of foreground pixels in the whole image is ω 0 , and its average gray level μ 0 ; the proportion of background pixels in the whole image is ω 1 , its average gray level is μ 0 ; the total average gray level is μ, and the inter-class variance is g; the size of the image is M×N, and the number of pixels in the image whose gray value is less than the threshold T is N 0 , which is greater than The number of pixels of the threshold T is N 1 ; then the following formula is obtained:

ω 0 =N 0 /M×N

ω 1 =N 1 /M×N

N 0 +N 1 =M×N

μ 0 +μ 1 =1

μ=ω 0 ×μ 0 +ω 1 ×μ 1

g=ω 0 (μ 0 -μ) 2 +ω 1 (μ 1 -μ) 2

Obtained from the above formula:

g=ω 0 ω 1 (μ 0 -μ 1 ) 2

The traversal method is used to take the value of T, and the threshold T when the inter-class variance g is maximized is obtained.
The UAV positioning method based on screen optical communication according to claim 4, wherein the process of the shape detection comprises the following steps:

Draw a vertical tangent line from the centroid position of all areas with holes, and record the number of pixel points with pixel values 0 and 1 cut from top to bottom;

The white color block corresponding to pixel value 1 in the area with holes is regarded as a peak, and the black color block corresponding to pixel value 0 is regarded as a trough;

Check the number of peaks and troughs in all areas with holes, and remove image areas that obviously do not conform to the marker pattern;

According to the image area where the number of crests and troughs meet the requirements, the proportional similarity of crests and troughs is calculated by using Euclidean distance;

Traverse each white area, find all the areas that meet the proportional similarity, and use it as the specific location of the marker pattern.
The UAV positioning method based on screen optical communication according to claim 6, is characterized in that: adopting the principle of parallax to carry out distance measurement, specifically:

Consider the two points P l1 and P r1 on the electronic screen as a camera, point A on the image collector as an object, and the projections of point A on the imaging planes of P l1 and P r1 are denoted as P l2 and P l3 , then The following formula is obtained:

According to the similarity principle, ΔAP l1 P r1 and ΔAP l2 P r2 are similar, and the following equation is obtained:

Among them, the resolution of the camera on the image collector is PPI, the length of the unit pixel on the image collected by the image collector is PXM; the real distance between P l1 and P r1 on the electronic screen is B, and the distance on the picture is Z; The focal length of the camera on the image collector is F; the distance between the image collector and the electronic screen is D;

Assume that the measured parallax of the landmark image is Z 1 when the distance D 1 is known; for any distance D 2 , the measured parallax is Z 2 . According to the known D 1 and Z 1 , the measured parallax is obtained:

The obtained D 2 is the straight-line distance D from the center point of the marker pattern to the connection line of the camera.
The UAV positioning method based on screen optical communication according to claim 1, is characterized in that: carrying out angle measurement, specifically comprises:

When there is no need to deflect the camera to collect the marker image, the angle of the camera obtained is:

Among them, the horizontal distance from the camera to the center of the electronic screen is DX, and the vertical distance is DY;

When the camera needs to be deflected to collect the marker image, the rewritten Music algorithm is used to calculate the offset angle of the camera, specifically:

Four equidistant vertical tangent lines are made on the marker pattern of the obtained specific position, and the distances in the horizontal and vertical directions are X1-X4 and Y1-Y4 respectively, and the distance between the tangent lines in the vertical direction is set as d;

Constructing the incident signal
Among them, Z1, Z2, Z3, and Z4 are:

Z1=0

According to the covariance matrix of the incident signal and decompose it to get:

R(i)=AR X A H +σ 2 I

Among them, A is the directional response vector extracted from the incident signal X(i), H is the conjugate transpose of the covariance matrix, σ 2 is the noise power, and I is the identity matrix;

The eigenvector corresponding to the eigenvalue γ obtained from the above formula is v(θ), and it is sorted according to the size of the eigenvalue. The eigenvector corresponding to the maximum value is regarded as the signal part space, and the remaining 3 eigenvalues and features The vector is regarded as the noise part space, and the noise matrix En is obtained, namely:

A H υ i (θ)=0, i=2,3,4

E n = [υ 2 (θ), υ 3 (θ), υ 4 (θ)]

The offset angle P of the camera in the horizontal direction is:

Among them, a is the signal vector extracted from the incident signal X(i).
The UAV positioning method based on screen optical communication according to claim 8, characterized in that: when the marker picture obtained by the camera offset angle collection is deformed, the deformation degree of the upper segment of the picture is converted into a distance, which specifically includes: :

Suppose the transformation matrix captured by the camera is:

K=[α -N ,α 1-N ,α 2-N ,...,α 0 ,...α N-2 ,α N-1 ,α N ];

Since the degree of distortion of the camera is symmetrical about the center when collecting images, we get:

α -N =α N >α 1-N =α N-1 >...>α 0

Assuming that the two line segments used to calculate the angle are at p and q on the marker image, the corresponding calculated distances are D p and D q , the pixel sizes corresponding to the two line segments are P p and P q , and the overall pattern is The side length is L and the camera focal length is F, then the following formula is obtained:

From the above formula, the pixel difference between the two line segments is W, that is:

W=P p α P -P q α q

Among them, the pixel sizes corresponding to the two line segments are P p and P q , the distortion degrees corresponding to the two line segments are α P and α q , and the values of p and q are 0-N;

When q=0 and p=N, the pixel difference W of the line segments at both ends is the largest, and the error of the Music algorithm is the smallest.
A system based on the UAV positioning method based on screen optical communication according to any one of claims 1-9, characterized in that: the system comprises:

Electronic screen selection module: used to select the electronic screen, and set the marker image on the electronic screen to make it symmetrical;

Collector setting module: used to set the image collector, including the drone, the image acquisition module set on the drone, and the fixed bracket for adjusting the height of the drone;

Model building module: used to extract the video stream transmitted by the image collector, identify the marker image, and obtain the predicted position of the marker image in the image;

Image processing module: for performing image processing according to the predicted position of the marker image to obtain the coordinates corresponding to the marker pattern therein;

Distance measurement module: used to measure the distance according to the coordinates corresponding to the marker pattern to obtain the straight-line distance from the center of the marker pattern to the connection of the camera;

Angle measurement and positioning module: used for angle measurement, and combined with the straight-line distance to obtain the actual distance from the camera to the electronic screen, to complete the UAV positioning.