WO2022121024A1 - Unmanned aerial vehicle positioning method and system based on screen optical communication - Google Patents

Unmanned aerial vehicle positioning method and system based on screen optical communication

Info

Publication number
WO2022121024A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
marker
camera
distance
electronic screen
Prior art date
Application number
PCT/CN2020/140729
Other languages
French (fr)
Chinese (zh)
Inventor
文考
赵毓斌
须成忠
刘敦歌
Original Assignee
中国科学院深圳先进技术研究院
Priority date
Filing date
Publication date
Application filed by 中国科学院深圳先进技术研究院 filed Critical 中国科学院深圳先进技术研究院
Publication of WO2022121024A1 publication Critical patent/WO2022121024A1/en

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01CMEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C11/00Photogrammetry or videogrammetry, e.g. stereogrammetry; Photographic surveying
    • G01C11/36Videogrammetry, i.e. electronic processing of video signals from a single source or from different sources to give parallax or range information
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S5/00Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
    • G01S5/16Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using electromagnetic waves other than radio waves
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot
    • G05D1/10Simultaneous control of position or course in three dimensions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content

Definitions

  • the invention relates to the technical field of unmanned aerial vehicle positioning, and more particularly, to a method and system for unmanned aerial vehicle positioning based on screen optical communication.
  • in outdoor environments, UAVs usually use GPS and autonomous inertial navigation for positioning, which roughly meets positioning needs.
  • however, GPS signals cannot be used indoors, and because the indoor environment is more complex than the outdoor one, indoor positioning of drones remains a major difficulty.
  • the UAV mainly adopts the following positioning methods:
  • in the positioning method using ultrasonic and optical flow sensors, the UAV is equipped with both sensors; the ultrasonic transmitter emits ultrasonic waves to the surroundings, and the waves reflected from surrounding objects are used to calculate the distance between obstacles and the current position.
  • the optical flow sensor uses the "instantaneous speed" of the pixel motion of the object moving in space on the imaging plane to calculate the horizontal speed of the drone.
  • the disadvantage of ultrasonic positioning is that in some complex scenes, because the indoor environment is denser and more complex than the outdoor one, the ultrasonic signal is more easily distorted indoors by multipath effects and is readily absorbed by the environment, so the ranging performance is poor.
  • since the propagation speed of the acoustic signal is also affected by the ambient temperature, some error is inevitable when calculating distance; the positioning accuracy of ultrasound itself is not high, and there are problems such as the inability to survey sloped obstacles.
  • the optical flow sensor calculates the speed of the object by using the change of the pixel position of the same object on two adjacent frames when the camera shoots a moving object.
  • the camera has a certain shooting distortion: generally, the distortion is small at the image center and large at the image edges, so when the same object appears at different positions, the pixels themselves carry a certain offset error.
  • the image matching algorithm also has calculation errors when calculating the feature points.
  • the speed measurement accuracy of the optical flow sensor is affected by the image processing algorithm, and the speed measurement accuracy is not high. All of these have led to the decline of the positioning accuracy of ultrasonic and optical flow sensors, and it is difficult to meet the high-precision positioning requirements of UAVs in the market.
  • the positioning accuracy of this method is low, and it is easily affected by the environment, which cannot meet the high-precision positioning requirements of UAVs.
  • in the laser SLAM positioning method, the UAV is equipped with a lidar.
  • during positioning, the lidar emits laser pulses;
  • exploiting the fact that the laser beam bounces off surrounding obstacles, the position of an obstacle is determined by calculating the time difference between transmitting and receiving the beam.
  • once the lidar has determined the positions of all surrounding obstacles, SLAM technology can construct a map of them and thus determine the UAV's relative position in the map.
  • the positioning accuracy of this method is high, which can reach the centimeter level, and the position of the UAV can be located more accurately.
  • however, because lidar is expensive, carrying it greatly increases the UAV's cost.
  • moreover, when the reflective surface is rough, the lidar's ranging accuracy drops, so this method is not suitable for popularization in the market.
  • in the UWB positioning system, a UWB signal source is mounted on the UAV; during positioning, the UWB source continuously transmits positioning signals to the surroundings, and UWB sensors pre-arranged at different positions in the positioning area receive the signals and use algorithms (TDOA, AOA) to calculate the UAV's relative position within the area.
  • the positioning accuracy of this method is high, but due to the high cost of UWB hardware equipment, it is also not suitable for widespread popularization in the market.
  • the positioning is done using the camera mounted on the drone.
  • This positioning method can be divided into two types: monocular camera and multi-camera.
  • with a monocular camera, after the camera collects a picture of the surrounding environment, a deep learning model identifies obstacles in the picture, and the depth or size information of the pixels is used to calculate obstacle distances.
  • with a binocular camera, the distance can also be calculated using the principle of binocular parallax: when the spacing between the cameras is known, multiple cameras photograph the same object, which appears with a certain offset across the images; the actual distance between the object and the cameras is calculated from this offset to achieve positioning.
  • after obtaining the distances of surrounding obstacles, SLAM technology is used to build an environment map, so as to determine the relative position of the UAV in the map.
  • the advantage of using the visual method is that the cost of the camera is low, and most drones have their own cameras, which will not increase the burden of the drone, that is, it will not affect the weight capacity and cost of the drone.
  • the disadvantage of using this method is that the positioning accuracy is low, and the monocular camera mainly relies on the deep learning model to identify the surrounding obstacles. In indoor environments, the obstacles are complex and changeable, and it is difficult for a model trained in one environment to meet applicable standards in other indoor environments.
  • the positioning accuracy of visual SLAM is also limited by the performance of the image matching algorithm, the accuracy of ranging is not high, and this method is also easily disturbed by ambient light, the positioning distance is short, and the ranging range is relatively small. Due to the above defects, this positioning method cannot be well adapted to market demands.
  • the defects of the first and fourth positioning methods are that positioning accuracy is not high and depends on the performance of the image matching algorithm.
  • the disadvantage of the second and third positioning methods is that the cost of hardware is high and cannot be widely used in the market.
  • the defects of the current positioning system itself affect the user experience, and at the same time restrict the maturity and popularization of UAVs.
  • aiming at the technical problems in the prior art, the purpose of the present invention is to provide a UAV positioning method and system based on screen optical communication that can improve positioning accuracy and the camera's range of action.
  • the present invention provides a UAV positioning method based on screen optical communication, the positioning method specifically includes:
  • An image collector is provided, including a drone, an image acquisition module arranged on the drone, and a fixed bracket for adjusting the drone's height; a camera on the image acquisition module collects the marker image on the electronic screen to obtain a video stream containing marker pictures;
  • image processing is performed to obtain the coordinates corresponding to the marker pattern therein;
  • distance measurement is performed to obtain the straight-line distance from the center of the marker pattern to the camera;
  • the angle measurement is performed, and the actual distance from the camera to the electronic screen is obtained in combination with the straight-line distance to complete the positioning of the UAV.
  • the SSD300 model is used to complete the identification of the marker images.
  • the specific identification process is as follows:
  • the locator is used to generate the prediction candidate frame, and the area selected by the candidate frame is used as the feature map to be identified;
  • the feature map to be identified is processed and transformed to obtain a transformed feature map
  • the fully connected classifier is used to output the estimated value of the position and similarity of the marker image in the picture
  • the loss function for the training of the model is defined as the weighted sum of the position error and the confidence error, namely: L(x, c, l, g) = (1/N)[L_conf(x, c) + α·L_loc(x, l, g)], where:
  • N is the number of candidate frames generated by the locator;
  • x is the indicator parameter;
  • c is the category confidence prediction value;
  • l is the position coordinate of the candidate frame generated by the locator;
  • g is the position coordinate of the manually marked marker image;
  • α is the weight coefficient, set to 1;
  • the position error is defined, in the standard SSD form, as the Smooth-L1 loss between predicted and marked boxes: L_loc(x, l, g) = Σ_{i∈Pos} Σ_{m∈{cx,cy,w,h}} x_ij · smooth_L1(l_i^m − ĝ_j^m);
  • the confidence error is defined, in the standard SSD form, as the softmax cross-entropy over the category confidences: L_conf(x, c) = −Σ_{i∈Pos} x_ij log(ĉ_i) − Σ_{i∈Neg} log(ĉ_i^0);
  • the image processing process includes:
  • the target area corresponding to the predicted position is enlarged and adjusted so that it can cover the entire marker image to obtain the target area picture;
  • the marker patterns are sorted to obtain their corresponding coordinates.
  • the threshold value T for the image conversion is calculated using the Otsu (OTSU) algorithm, specifically:
  • the segmentation threshold of its foreground and background is T
  • the proportion of foreground pixels in the whole image is ω0, and their average gray level is μ0;
  • the proportion of background pixels in the whole image is ω1, and their average gray level is μ1;
  • the total average gray level is μ;
  • the inter-class variance is g
  • the size of the image is M × N;
  • the number of pixels in the image whose gray value is less than the threshold T is N0, and the number of pixels whose gray value is greater than the threshold T is N1;
  • N0 + N1 = M × N
  • the traversal method is used to take the value of T, and the threshold T when the inter-class variance g is maximized is obtained.
  • the shape detection process includes the following:
  • the white color block corresponding to the pixel value 1 in the area with the hole is regarded as the peak, and the black color block corresponding to the pixel value 0 is regarded as the trough;
  • the proportional similarity of crests and troughs is calculated by using Euclidean distance
  • the distance measurement is performed using the parallax principle, specifically:
  • the resolution of the camera on the image collector is PPI
  • the length of the unit pixel on the image collected by the image collector is PXM
  • the real distance between P l1 and P r1 on the electronic screen is B
  • the distance on the picture is Z
  • the focal length of the camera on the image collector is F
  • the distance between the image collector and the electronic screen is D;
  • at a known distance D 1 , the measured parallax of the marker image is Z 1 ; for any distance D 2 , the measured parallax is Z 2 . From the known D 1 and Z 1 , the unknown distance is obtained as D 2 = D 1 · Z 1 / Z 2 .
  • the obtained D 2 is the straight-line distance D from the center point of the marker pattern to the camera.
  • angle measurement is performed, including:
  • the angle of the camera obtained is:
  • the horizontal distance from the camera to the center of the electronic screen is DX, and the vertical distance is DY;
  • the modified MUSIC algorithm is used to calculate the offset angle of the camera, specifically:
  • Z1, Z2, Z3, and Z4 are:
  • A is the directional response vector extracted from the incident signal X(i)
  • H is the conjugate transpose of the covariance matrix
  • σ² is the noise power
  • I is the identity matrix
  • the eigenvector corresponding to each eigenvalue λ obtained from the above formula is v(λ), and the eigenvalues are sorted by size.
  • the eigenvector corresponding to the largest eigenvalue is regarded as the signal subspace, and the remaining 3 eigenvalues and eigenvectors are regarded as the noise subspace, from which the noise matrix En is obtained, namely:
  • the offset angle P of the camera in the horizontal direction is found from the MUSIC pseudospectrum (standard form): P(θ) = 1 / (a^H · En · En^H · a);
  • a is the signal vector extracted from the incident signal X(i).
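For orientation, here is a minimal sketch of the classical MUSIC estimator that the steps above describe, assuming a 4-element uniform linear array with half-wavelength spacing and synthetic snapshots; the patent instead feeds in the quantities Z1-Z4 derived from the marker's line-segment deformation, so the array geometry and function names below are illustrative assumptions, not the patent's implementation.

```python
import numpy as np

def music_spectrum(X, n_sources, angles_deg):
    """Classical MUSIC pseudospectrum. X is an (n_sensors, n_snapshots)
    complex snapshot matrix, assumed here to come from a uniform linear
    array with half-wavelength element spacing."""
    n_sensors = X.shape[0]
    R = X @ X.conj().T / X.shape[1]            # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(R)       # eigenvalues in ascending order
    En = eigvecs[:, :n_sensors - n_sources]    # noise subspace (3 smallest of 4)
    P = []
    for theta in np.deg2rad(angles_deg):
        a = np.exp(-1j * np.pi * np.arange(n_sensors) * np.sin(theta))
        # the pseudospectrum peaks where a(theta) is orthogonal to the noise space
        P.append(1.0 / np.abs(a.conj() @ En @ En.conj().T @ a))
    return np.array(P)

# the estimated offset angle is the angle that maximizes the pseudospectrum:
# angles = np.linspace(-60, 60, 241)
# est = angles[np.argmax(music_spectrum(X, 1, angles))]
```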
  • the deformation degree of the line segments on the picture is converted into a distance, which specifically includes:
  • K = [θ -N , θ 1-N , θ 2-N , ..., θ 0 , ..., θ N-2 , θ N-1 , θ N ];
  • the pixel difference between the two line segments is W, that is:
  • the pixel sizes corresponding to the two line segments are P p and P q
  • the distortion degrees corresponding to the two line segments are θ p and θ q
  • the values of p and q are 0-N;
  • the present invention also provides a system for a UAV positioning method based on screen optical communication, the system comprising:
  • Electronic screen selection module used to select the electronic screen, and set the marker image on the electronic screen to make it symmetrical;
  • Collector setting module used to set the image collector, including the drone, the image acquisition module set on the drone, and the fixed bracket for adjusting the height of the drone;
  • Model building module used to extract the video stream transmitted by the image collector, identify the marker image, and obtain the predicted position of the marker image in the image;
  • Image processing module for performing image processing according to the predicted position of the marker image to obtain the coordinates corresponding to the marker pattern therein;
  • Distance measurement module used to measure the distance according to the coordinates corresponding to the marker pattern, and obtain the straight-line distance from the center of the marker pattern to the camera;
  • Angle measurement and positioning module used for angle measurement, and combined with the straight-line distance to obtain the actual distance from the camera to the electronic screen to complete the UAV positioning.
  • a marker image with symmetry on the electronic screen is adopted and collected by the image collector; a deep learning model is constructed and, combined with image processing methods, completes the accurate identification of the marker pattern.
  • after identification, distance measurement and angle measurement of the camera yield the relative distance between the camera and the electronic screen, completing the precise positioning of the camera. The method is simple, reliable and easy to implement, greatly improves positioning accuracy, solves the problem that the marker image cannot be recognized because it is too small at long distances from the camera, and extends the camera's range of action.
  • Fig. 1 is the flow chart of the UAV positioning method based on screen optical communication of the present invention
  • FIG. 2 is a schematic structural diagram of a marker image of the present invention
  • Fig. 3 is the schematic diagram of the server connection of the present invention.
  • FIG. 6 is a schematic diagram of an image after boundary suppression in the present invention.
  • Fig. 7 is the image schematic diagram after retaining the area with holes in the present invention.
  • FIG. 8 is a schematic diagram of an image in which all areas with holes are filled with white areas in the present invention.
  • Fig. 9 is the flow chart of shape detection in the present invention.
  • FIG. 10 is a schematic diagram of the peak and trough positions of the marker pattern in the present invention.
  • FIG. 11 is a schematic diagram of the principle of camera ranging in the present invention.
  • Figure 13 is a schematic diagram of selecting 4 tangents in the present invention.
  • FIG. 14 is a flow chart of the system of the UAV positioning method based on screen optical communication of the present invention.
  • the present invention provides a UAV positioning method based on screen optical communication, the positioning method specifically includes:
  • Step S1 selecting an electronic screen, and setting a marker image on the electronic screen; the marker image has symmetry.
  • the electronic screen adopts a smart electronic screen
  • the marker image adopts four square patterns arranged in a square, as shown in FIG. 2.
  • its characteristic is that the marker image is symmetrical both up-down and left-right.
  • the interval between the four marker patterns is 1/7 of the side length of the marker image (the marker pattern is a square), and parameters such as side length and arrangement spacing of the marker images are all known values.
  • the smart electronic screen adopts a 55-inch large-scale display, and the display can display high-definition color patterns.
  • Step S2 Set up the image collector, including an image acquisition module with a monocular camera, a height-adjustable fixed bracket, and a drone; the image acquisition module is mounted on the drone, the drone is arranged on the fixed bracket, and the bracket adjusts the height of the drone. The image acquisition module collects the marker image on the electronic screen to obtain marker pictures.
  • the image collector can realize camera distortion calibration, video stream transmission, and focal length locking; its range of action is 0-20 meters, and its angle is ±60°. Specifically, the camera's pixel count, the format of the captured image, and the horizontal and vertical resolutions are all set in advance.
  • Step S3 constructing a deep learning model, extracting the video stream transmitted by the image collector, and recognizing the marker image therein, to obtain the predicted position of the marker image in the image.
  • the server is connected to the electronic screen through the network, and controls the display and refresh of the marker pattern on the electronic screen.
  • the video stream containing the marker picture transmitted by the image collector is received, as shown in FIG. 3 .
  • Step S4 Perform image processing according to the predicted position of the marker image to obtain the coordinates corresponding to the marker pattern in the marker image.
  • Step S5 According to the coordinates corresponding to the marker pattern, distance measurement is performed to obtain the straight-line distance from the center point of the marker pattern to the camera.
  • Step S6 Measure the angle, and obtain the actual distance from the camera to the electronic screen in combination with the straight-line distance, so as to complete the positioning of the drone.
  • an electronic screen is selected, a marker image with symmetry is set, an image collector for collecting the marker image is arranged, a deep learning model is constructed, and image processing methods are used to complete the accurate identification of the marker pattern and obtain the relative distance between the camera and the electronic screen, completing the precise positioning of the camera.
  • the entire positioning method is simple, reliable and easy to implement, and the positioning accuracy is also high.
  • the deep learning model adopts the SSD300 model to complete the identification of the marker image and obtain the predicted position of the marker image in the picture; as shown in FIG. 4, the specific identification process is as follows:
  • Step S31 Input one frame of the video stream transmitted by the image collector into the SSD300 model. Specifically, the picture size is 4032 × 3024 × 3.
  • Step S32 Preliminarily extract the features of the picture through the VGG-16 network to obtain a feature map. Specifically, the size of the feature map is 38 × 38 × 512.
  • Step S33 Use the locator to generate a certain number of prediction candidate frames, and take the region of the feature map selected by each candidate frame as the feature map to be identified.
  • Step S34 Process and transform the feature map to be identified to obtain a transformed feature map.
  • Step S35 According to the transformed feature map, output the estimated value of the position and the similarity degree of the marker image in the picture through the fully connected classifier.
  • the fully connected classifier adopts a 256 × 2 fully connected classifier.
  • Step S36 Select a candidate frame whose similarity degree estimate value is greater than a preset value (set to 0.6), and use its pixel position as the final prediction position.
  • the pixel position of the candidate frame is the pixel position of the identified marker image in the picture collected by the image collector.
  • the image collected by the image collector is identified by using the SSD300 model, and the predicted position of the marker image is obtained through processing and transformation, and the final positioning calculation of the UAV position is completed based on the obtained predicted position. It is simple, reliable and easy to implement, and also ensures the effectiveness and accuracy of the positioning method.
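As a concrete illustration of this detection pipeline, the sketch below uses torchvision's off-the-shelf SSD300/VGG-16 detector as a stand-in for the patent's model; the two-class setup and the 0.6 similarity threshold come from the text, while everything else (library choice, tensor shapes) is an assumption.

```python
import torch
from torchvision.models.detection import ssd300_vgg16

# stand-in for the patent's detector: SSD300 with a VGG-16 backbone,
# two classes (background + marker), untrained weights as in Steps S31-S36
model = ssd300_vgg16(weights=None, num_classes=2)
model.eval()

frame = torch.rand(3, 300, 300)            # one frame from the video stream
with torch.no_grad():
    out = model([frame])[0]                # dict with "boxes", "labels", "scores"

keep = out["scores"] > 0.6                 # similarity estimate > preset value
predicted_positions = out["boxes"][keep]   # pixel positions of marker candidates
```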
  • in step S34, the processing and conversion process is as follows:
  • Step S341 Convert the feature map to be identified through the Conv6 module to obtain a first feature map of 19 × 19 × 1024.
  • Step S342 Convert the first feature map through the Conv7 module to obtain a second feature map of 19 × 19 × 1024.
  • Step S343 Convert the second feature map through the Conv8 module to obtain a third feature map of 10 × 10 × 512.
  • Step S344 Convert the third feature map through the Conv9 module to obtain a fourth feature map of 5 × 5 × 256.
  • Step S345 Convert the fourth feature map through the Conv10 module to obtain a fifth feature map of 3 × 3 × 256.
  • Step S346 Convert the fifth feature map through the Conv11 module to obtain a sixth feature map of 1 × 1 × 256.
  • the input picture to the SSD300 model is represented as a numeric matrix, and the model outputs the predicted position and scoring result (similarity degree estimate) of the marker image.
  • the scoring result is a specific value used to judge the similarity between the marker image at the predicted position and the actual marker image, and the value is between 0 and 1.
  • the numeric matrix is converted step by step into a specific value between 0 and 1, which represents the similarity between the image in a specific area of the model's output picture and the marker, ensuring the reliability and accuracy of positioning.
  • the specific training process includes:
  • the image samples are obtained by collecting marker images at different distances, angles and light intensities through an image collector.
  • the correct answer is the specific position of the marker image obtained by manual marking, which in the embodiment of the present invention corresponds to the pixel values in an area; the pixel values in this area are given in the form of coordinates, such as (5, 5, 10, 10), which represents all the pixels in the square with the starting point at (5, 5) and the end point at (10, 10).
  • the SSD300 model needs to be trained before it is used for identification.
  • given an input marker picture and the corresponding specific position (that is, the correct answer), the model predicts a result (also a position) from the input.
  • the predicted result is compared with the actual specific position, and the model parameters are adjusted by comparing the predicted and actual values; finally the model predicts values close to the correct answer, which ensures the accuracy of the model's predicted position and the validity and reliability of the entire positioning method.
  • the initial values of the VGG-16 network in the SSD300 model are loaded from open-source pre-trained parameters on GitHub, and the parameters of the Conv6-Conv11 modules are randomly initialized. Since model training takes a long time, it is more efficient to copy parameters trained by an external model and train further on that basis, quickly producing one's own model.
  • the loss function for training the deep learning model is defined as the weighted sum of the position error and the confidence error, namely: L(x, c, l, g) = (1/N)[L_conf(x, c) + α·L_loc(x, l, g)], where:
  • N is the number of candidate boxes generated by the locator
  • c is the category confidence prediction value, which is the result of the current candidate frame predicted by the model
  • l is the position coordinate of the candidate frame generated by the locator, and four values are used to represent the vertex coordinates (c x1 , c y1 ) of the upper left corner, the length w 1 of the candidate frame and the height h 1 of the candidate frame;
  • g is the position coordinate of the artificially marked marker image, and four values are used to represent the vertex coordinates (c x2 , c y2 ) of the upper left corner of the area, the length w 2 of the area and the height h 2 of the area.
  • α is the weight coefficient, which is set to 1.
  • the position error is defined, in the standard SSD form, as: L_loc(x, l, g) = Σ_{i∈Pos} Σ_{m∈{cx,cy,w,h}} x_ij · smooth_L1(l_i^m − ĝ_j^m);
  • the confidence error is defined, in the standard SSD form, as: L_conf(x, c) = −Σ_{i∈Pos} x_ij log(ĉ_i) − Σ_{i∈Neg} log(ĉ_i^0).
  • the training objective is to minimize this loss, that is, the weighted sum of the position error L loc and the confidence error L conf.
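A minimal sketch of such a weighted loss, assuming the standard SSD pairing of a Smooth-L1 position error and a cross-entropy confidence error averaged over the N matched candidate frames (stock SSD's hard-negative mining is omitted):

```python
import torch
import torch.nn.functional as F

def detection_loss(loc_pred, loc_gt, conf_pred, conf_gt, num_matched, alpha=1.0):
    # position error: Smooth-L1 between predicted and marked box coordinates
    L_loc = F.smooth_l1_loss(loc_pred, loc_gt, reduction="sum")
    # confidence error: cross-entropy over the category confidences
    L_conf = F.cross_entropy(conf_pred, conf_gt, reduction="sum")
    # weighted sum, averaged over the N candidate frames (alpha set to 1)
    return (L_conf + alpha * L_loc) / num_matched
```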
  • the model can be used for the identification of marker pictures after being trained.
  • the video stream transmitted by the image collector is passed to the SSD300 model for detection. If there is a marker picture in the video image, the model outputs the coordinate position l and predicted value c of the candidate frame; if there is no marker picture in the video image, the model outputs no result.
  • the image processing process includes:
  • Step S41 Adjust the image area, that is, obtain the target area corresponding to the position predicted by the SSD300 model. Since the predicted position data is inexact, the scope of the target area must be expanded and adjusted so that it covers the entire marker image, yielding the target-area picture.
  • the expansion method of the target area is confirmed according to the distribution of pixel values.
  • the black pixel ratio of the marker image itself is 58.67%.
  • Step S42 target segmentation, that is, performing binarization processing on the adjusted image of the target area, converting the color RGB image into a black and white image, and obtaining a black and white image of the target area.
  • the threshold value T for the image conversion is calculated using the Otsu (OTSU) algorithm; the process is as follows:
  • the segmentation thresholds of the foreground (target) and the background are denoted as T
  • the proportion of the foreground pixels in the entire image is denoted as ω 0 , and their average gray level as μ 0 ;
  • the proportion of the background pixels in the entire image is denoted as ω 1 , and their average gray level as μ 1 ;
  • the overall average gray level of the image is denoted as ⁇
  • the between-class variance is denoted as g.
  • the electronic screen adopts a light-colored image background; since the main body of the marker image is black, the color difference between the background and the marker image is large. The size of the image is M × N; the number of pixels whose gray value is smaller than the threshold T is denoted N 0 , and the number of pixels whose gray value is greater than the threshold T is denoted N 1 ; then the following formula holds:
  • N 0 + N 1 = M × N (6)
  • the traversal method is used to take values of T (ranging from 0 to 255), and the threshold T at which the between-class variance g is maximized is the desired value; a sketch of this traversal follows below.
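A minimal sketch, using the standard between-class variance g = ω0·ω1·(μ0 − μ1)² implied by the definitions above; the histogram-based formulation is an implementation choice, not the patent's code:

```python
import numpy as np

def otsu_threshold(gray):
    """Return the threshold T in 0..255 that maximizes the between-class
    variance g of a grayscale image (Otsu's method by exhaustive traversal)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    total = gray.size
    levels = np.arange(256)
    best_T, best_g = 0, 0.0
    for T in range(255):
        w0 = hist[:T + 1].sum() / total               # foreground proportion w0
        w1 = 1.0 - w0                                 # background proportion w1
        if w0 == 0.0 or w1 == 0.0:
            continue
        mu0 = (levels[:T + 1] * hist[:T + 1]).sum() / (w0 * total)  # mean mu0
        mu1 = (levels[T + 1:] * hist[T + 1:]).sum() / (w1 * total)  # mean mu1
        g = w0 * w1 * (mu0 - mu1) ** 2                # between-class variance
        if g > best_g:
            best_g, best_T = g, T
    return best_T
```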
  • Step S43 Perform boundary suppression on the marker image according to the binarized black and white image, so that the marker image and the background image are separated.
  • the boundary between the marker image and the background is theoretically obvious; to further strip the marker image from the background image, another boundary suppression operation is performed.
  • the boundary suppression process is as follows: traverse each pixel in the marker image and compare it with the 8 connected pixels surrounding it (pixels at the image edge have fewer than 8); if any of those neighbouring pixels has the value 0, the pixel is considered adjacent to the boundary and is cleared, as sketched below.
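A sketch of the operation as described, clearing every foreground pixel that has a background pixel among its 8 neighbours; note this is equivalent to one step of morphological erosion of the foreground:

```python
import numpy as np

def suppress_boundary(bw):
    """bw: binary image with values 0/1. Clears each 1-pixel that has a
    0-valued neighbour among its 8 connected directions; pixels on the
    image edge simply have fewer than 8 neighbours to check."""
    out = bw.copy()
    h, w = bw.shape
    for y in range(h):
        for x in range(w):
            if bw[y, x] != 1:
                continue
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if (dy or dx) and 0 <= ny < h and 0 <= nx < w and bw[ny, nx] == 0:
                        out[y, x] = 0   # adjacent to the boundary: clear it
    return out
```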
  • each marker pattern is composed of a black border, a white border and a black square spliced together. After the boundary suppression is completed, many independent pixel areas without holes will be left in the marker image, and all pixel areas without holes will be removed, and finally the area with holes as shown in Figure 7 will remain.
  • the area contains four marker patterns.
  • Step S44 Perform shape detection. As shown in FIG. 8, fill each area with holes in the boundary-suppressed image into a whole white pixel area (only the areas with holes are retained on the binarized marker image), calculate the center position (i.e., centroid) and extent of all areas with holes, and perform shape detection on the marker patterns in the marker image; as shown in Figure 9, the specific process is as follows:
  • Step S441 Draw a vertical tangent line from the centroid positions of all areas with holes, and record the number of pixel points with pixel values 0 and 1 cut by the tangent line from top to bottom, respectively.
  • Step S442 The white color blocks in the hole area (parts with pixel value 1) are regarded as crests, and the black color blocks (parts with pixel value 0) as troughs; the relative widths of the crests and troughs can then be measured by the number of 0s and 1s the tangent passes through, as shown in Figure 10. Specifically, if the area is a marker pattern, the tangent line must pass through exactly 3 crests and 2 troughs.
  • Step S443 Check the number of peaks and troughs in all areas with holes, and remove image areas that obviously do not conform to the marker pattern.
  • Step S444 For the image areas in which the number of peaks and troughs meets the requirement, compare the measured width ratio of the peaks and troughs with the theoretical ratio to obtain the similarity between the two.
  • the Euclidean distance is used to calculate the similarity y between the measured peak-trough ratio and the theoretical peak-trough ratio, specifically:
  • when the value of y is less than 0.8, the marker pattern is detected reliably; therefore, y < 0.8 is taken as the criterion for judging that the region is a marker pattern.
  • Step S445 Traverse each white area, find all areas that meet the proportional similarity criterion, and take them as the specific positions of the marker patterns (see the sketch below).
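A sketch of steps S441 to S445 for one candidate region, assuming the tangent column has already been trimmed to the region's bounding box; the theoretical crest/trough width ratio is left as a parameter because the extract does not reproduce its exact value:

```python
import numpy as np

def run_lengths(column):
    """Lengths of the alternating 0/1 runs along the vertical tangent
    (read top to bottom through the region's centroid)."""
    vals, runs = [], []
    for v in column:
        if vals and v == vals[-1]:
            runs[-1] += 1
        else:
            vals.append(int(v))
            runs.append(1)
    return vals, runs

def is_marker_region(column, theoretical_ratio, y_max=0.8):
    vals, runs = run_lengths(column)
    crests  = [r for v, r in zip(vals, runs) if v == 1]   # white blocks
    troughs = [r for v, r in zip(vals, runs) if v == 0]   # black blocks
    if len(crests) != 3 or len(troughs) != 2:             # must be 3 crests, 2 troughs
        return False
    measured = np.array(crests + troughs, float)
    measured /= measured.sum()                            # relative widths
    theory = np.asarray(theoretical_ratio, float)
    theory /= theory.sum()
    y = np.linalg.norm(measured - theory)                 # Euclidean distance
    return y < y_max                                      # criterion: y < 0.8
```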
  • Step S45 According to the specific positions of the four marker patterns obtained, the bubble sorting algorithm is used to sort the four marker patterns to obtain their corresponding coordinates.
  • since the arrangement of the marker patterns is known in advance, the relative sizes of the horizontal and vertical coordinates indicate which of the four coordinates corresponds to which marker pattern. The coordinates obtained for the four marker patterns are then used, as follows, to determine the relative position of the image collector with respect to the smart electronic screen; a sorting sketch is given below.
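Because the arrangement is known in advance, any stable sort recovers the correspondence; the patent uses bubble sort, but only the resulting order matters. A sketch:

```python
def order_marker_patterns(centroids):
    """centroids: four (x, y) centre points of the detected marker patterns.
    Returns them ordered top-left, top-right, bottom-left, bottom-right."""
    by_y = sorted(centroids, key=lambda p: p[1])   # split top/bottom rows
    top = sorted(by_y[:2], key=lambda p: p[0])     # left before right
    bottom = sorted(by_y[2:], key=lambda p: p[0])
    return top + bottom
```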
  • the recognition of the marker pattern is completed by constructing a deep learning model combined with image processing methods, which solves the problem that the marker cannot be recognized because its image is too small at long camera distances.
  • the positioning technology greatly improves the scope of the camera positioning, thereby improving the positioning accuracy.
  • the distance measurement is performed by using the parallax principle.
  • the principle of parallax means that the same object appears at slightly different positions in images captured by two cameras a fixed distance apart, and this difference can be used to calculate the distance between the cameras and the object fairly accurately.
  • two points P l1 and P r1 on the electronic screen are regarded as cameras, and point A on the image collector is regarded as an object.
  • P l2 and P r2 in the picture collected by the image collector can be regarded as the projections of point A on the imaging planes of P l1 and P r1 , so the line segment P l2 P r2 is the parallax of point A under the imaging planes of the two "cameras", as shown in Figure 11.
  • let the real distance between P l1 and P r1 on the electronic screen be B, in mm; the distance between P l1 and P r1 on the picture is Z, in pixels; the focal length of the camera on the image collector is F, in mm; the distance between the image collector and the electronic screen is D, in mm; the resolution of the camera on the image collector is recorded as PPI; and the length of the unit pixel on the image collected by the image collector is PXM, in pixel/mm; the following formulas are then obtained:
  • the distance D can be obtained from equations (11) and (12).
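Equations (11) and (12) are not reproduced in this extract; from the definitions above (B, D, F in mm, Z in pixels, PXM in pixel/mm), a plausible reconstruction via similar triangles is:

```latex
\mathrm{PXM} = \frac{\mathrm{PPI}}{25.4} \quad (11), \qquad
\frac{B}{D} = \frac{Z / \mathrm{PXM}}{F}
\;\Longrightarrow\;
D = \frac{B \cdot F \cdot \mathrm{PXM}}{Z} \quad (12)
```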
  • because the lens focal length marked by the image collector is not equal to the actual shooting focal length, and the picture undergoes some processing after shooting, the camera's nominal lens focal length F and resolution PPI deviate somewhat from the actual values; a calibration method is therefore used to overcome the problem of inaccurate camera parameters, specifically:
  • at a known distance D 1 , the measured parallax is Z 1 ; for any distance D 2 , the measured parallax is Z 2 , so that D 2 = D 1 · Z 1 / Z 2 ; the measured D 2 is the straight-line distance from the center point of the marker pattern to the camera (see the sketch below).
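Since D is inversely proportional to the measured parallax Z, the calibration reduces to a single proportionality; a minimal sketch:

```python
def calibrated_distance(D1_mm, Z1_px, Z2_px):
    """One-time calibration: at a known distance D1 the marker spacing spans
    Z1 pixels. Because D * Z is constant (D is inversely proportional to the
    measured parallax Z), an unknown distance D2 follows from its span Z2."""
    return D1_mm * Z1_px / Z2_px
```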
  • the marker pattern with geometric symmetry is used for the positioning calculation, so that a monocular camera can also use the parallax principle for distance measurement and the positioning method does not depend on the performance of image matching algorithms.
  • this greatly improves the positioning accuracy, achieves the ranging accuracy of dual cameras, and also reduces the material cost.
  • in step S6, angle measurement is performed and the actual distance from the camera to the electronic screen is obtained, which specifically includes:
  • Step S61 When the image collector is near the center axis of the electronic screen, the image of the marker on the electronic screen can be collected without deflecting the camera of the image collector.
  • the horizontal distance from the camera to the center of the electronic screen is DX, and the vertical distance is DY;
  • the horizontal pixel difference is X, and the vertical pixel difference is Y;
  • the width of a single marker pattern on the picture is PX pixels, and the height is PY pixels;
  • the actual side length of a single marker pattern is L; the actual horizontal distance DX and vertical distance DY are then obtained as:
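The corresponding formulas are not reproduced in this extract; assuming the straightforward reading that one pixel of the marker spans L/PX mm horizontally and L/PY mm vertically, a sketch would be:

```python
def screen_offset(X_px, Y_px, PX_px, PY_px, L_mm):
    """A single marker pattern of real side length L_mm spans PX x PY pixels,
    so the pixel differences X and Y convert directly to millimetre offsets
    (assumed reconstruction, not the patent's numbered equations)."""
    DX = X_px * L_mm / PX_px   # horizontal distance to the screen centre
    DY = Y_px * L_mm / PY_px   # vertical distance
    return DX, DY
```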
  • Step S62 When the image collector is far away from the center of the electronic screen, the camera of the image collector needs to be deflected to collect the image of the marker on the electronic screen.
  • the modified MUSIC algorithm is used to calculate the offset angle of the camera, from which the actual offset distances DX and DY from the camera to the electronic screen are derived.
  • the calculation process is as follows:
  • Z1, Z2, Z3, and Z4 are:
  • the covariance matrix of the incident signal is as follows (in the standard MUSIC form, with R s the signal covariance): R = E[X(i)·X(i)^H] = A·R s ·A^H + σ²·I, where:
  • H represents the conjugate transpose of the covariance matrix
  • A is the directional response vector extracted from the incident signal X(i)
  • σ² is the noise power
  • I is the identity matrix.
  • the eigenvector corresponding to each eigenvalue λ is v(λ), and the eigenvalues are sorted by size.
  • the eigenvector corresponding to the largest eigenvalue is regarded as the signal subspace, and the remaining three eigenvalues and eigenvectors are regarded as the noise subspace, from which the noise matrix En is obtained, namely:
  • Step S63 When the camera is deflected by an angle, the marker image collected by the image collector deforms to a degree that varies with the deflection angle; the deformation on the picture therefore encodes the camera's offset angle. The MUSIC algorithm converts the deformation degree of the line segments on the picture into distances as its input values, thereby estimating the camera's deflection angle while minimizing the angle error.
  • for different deflection angles, the errors calculated by the MUSIC algorithm also differ.
  • when the difference between the deformation degrees of the two line segments on the marker image is largest, the error calculated by the MUSIC algorithm is smallest; therefore, in the process of rotating the camera, there must be a rotation angle at which the error estimated by the MUSIC algorithm is minimal.
  • the specific calculation formula is as follows:
  • one side of the overall pattern should be close to the center of the image and the other side far from it, so that the angle error estimated by the MUSIC algorithm is smallest.
  • the classical MUSIC algorithm can thus be used to estimate the declination angle of the camera: the distances between different points on the marker pattern and the camera are used to estimate the declination angle between the camera and the electronic screen.
  • this greatly improves the estimation accuracy of the declination angle between the camera and the electronic screen, thereby ensuring the reliability of the positioning method.
  • an embodiment of the present invention further provides a system for a method for positioning an unmanned aerial vehicle based on screen optical communication.
  • the system includes:
  • Electronic screen selection module used to select the electronic screen, and set the marker image on the electronic screen to make it symmetrical;
  • Collector setting module used to set the image collector, including the drone, the image acquisition module set on the drone, and the fixed bracket for adjusting the height of the drone;
  • Model building module used to extract the video stream transmitted by the image collector, identify the marker image, and obtain the predicted position of the marker image in the image;
  • Image processing module for performing image processing according to the predicted position of the marker image to obtain the coordinates corresponding to the marker pattern therein;
  • Distance measurement module used to measure the distance according to the coordinates corresponding to the marker pattern, and obtain the straight-line distance from the center of the marker pattern to the camera;
  • Angle measurement and positioning module used for angle measurement, and combined with the straight-line distance to obtain the actual distance from the camera to the electronic screen to complete the UAV positioning.
  • the system provided in the embodiment of the present invention is specifically used to execute the foregoing method embodiment, which is not repeated in this embodiment of the present invention.
  • the UAV positioning method and system provided in the embodiments of the present invention adopt a marker image with symmetry on the electronic screen and use a deep learning model and image processing methods to complete the identification of the marker pattern, solving the problem that the marker image is too small to recognize at long distances from the camera.
  • the classical MUSIC algorithm is used to estimate the declination angle of the camera, and the relative distance from the camera to the electronic screen is then obtained. Compared with other camera positioning technologies, the camera's range of action is greatly extended, thereby improving the camera's positioning accuracy.
  • the positioning accuracy of the UAV within a distance of 20 meters can be guaranteed, and the method performs well in complex and diverse scenarios, ensuring that the UAV positioning technology can be widely applied.

Abstract

An unmanned aerial vehicle positioning method and system based on screen optical communication. The method comprises: selecting an electronic screen, and setting marker images on the electronic screen to be symmetrical; arranging an image collector, comprising an unmanned aerial vehicle, an image acquisition module disposed on the unmanned aerial vehicle, and a fixing bracket for adjusting the height of the unmanned aerial vehicle, a camera being arranged on the image acquisition module and configured to acquire the marker images on the electronic screen to obtain a video stream comprising marker pictures; constructing a deep learning model, and identifying the marker pictures to obtain a predicted position of the marker images in the pictures; performing image processing to obtain coordinates corresponding to a marker pattern; performing distance measurement to obtain the straight-line distance between the center of the marker pattern and the camera; and performing angle measurement, and obtaining the actual distance between the camera and the electronic screen, so as to complete the positioning of the unmanned aerial vehicle. The system can improve the positioning accuracy and the effective range of the camera.

Description

A UAV positioning method and system based on screen optical communication
Technical Field
The invention relates to the technical field of unmanned aerial vehicle positioning, and more particularly to a method and system for unmanned aerial vehicle positioning based on screen optical communication.
Background Art
Nowadays, the use of drones is becoming more and more popular. Drones have many outdoor applications, such as seeding and fertilization in agriculture, automatic delivery in logistics, and aerial photography in the photography industry. In indoor environments there is likewise great demand for drone applications, such as indoor security, indoor logistics distribution, and indoor surveying. A major difficulty for drones performing tasks is that their positioning is not accurate enough; in some complex scenarios the drone's position must be adjusted manually, which greatly limits the drone's range of activity and hinders its further development.
At present, in outdoor environments, UAVs usually use GPS and autonomous inertial navigation for positioning, which roughly meets positioning needs. However, GPS signals cannot be used indoors, and because the indoor environment is more complex than the outdoor one, indoor positioning of drones remains a major difficulty. In indoor environments, UAVs mainly adopt the following positioning methods:
1. Positioning with ultrasonic and optical flow sensors. The UAV is equipped with ultrasonic and optical flow sensors; the ultrasonic transmitter emits ultrasonic waves to the surroundings, and the waves reflected from surrounding objects are used to calculate the distance between obstacles and the current position. The optical flow sensor uses the "instantaneous speed" of the pixel motion, on the imaging plane, of objects moving in space to calculate the drone's horizontal speed.
The disadvantage of ultrasonic positioning is that in some complex scenes, because the indoor environment is denser and more complex than the outdoor one, the ultrasonic signal is more easily distorted indoors by multipath effects and is readily absorbed by the environment, so the ranging performance is poor. At the same time, since the propagation speed of the acoustic signal is also affected by the ambient temperature, some error is inevitable when calculating distance; the positioning accuracy of ultrasound itself is not high, and problems such as the inability to survey sloped obstacles remain. The optical flow sensor calculates an object's speed from the change in pixel position of the same object across two adjacent frames as the camera films a moving object. Because the camera has a certain shooting distortion (generally small at the image center and large at the edges), the same object at different positions carries a certain pixel offset error. In addition, the image matching algorithm introduces computational errors when calculating feature points, and the speed measurement accuracy of the optical flow sensor, which is affected by the image processing algorithm, is not high. All of this degrades the positioning accuracy of the ultrasonic and optical flow approach, making it difficult to meet the market's high-precision positioning requirements for UAVs.
Generally speaking, the positioning accuracy of this method is low and easily affected by the environment, so it cannot meet the high-precision positioning requirements of UAVs.
2. Laser SLAM positioning. The UAV is equipped with a lidar. During positioning, the lidar emits laser pulses and, exploiting the fact that the beam bounces off surrounding obstacles, determines an obstacle's position by calculating the time difference between transmitting and receiving the beam. With the positions of all surrounding obstacles determined by the lidar, SLAM technology can construct a map of the obstacles and thus determine the UAV's relative position in the map.
The positioning accuracy of this method is high, reaching the centimeter level, and the UAV's position can be located quite accurately. However, because lidar is expensive, carrying it greatly increases the UAV's cost. In addition, when the reflective surface is rough, the lidar's ranging accuracy drops, so this method is not suitable for popularization in the market.
3. UWB positioning system. A UWB signal source is mounted on the UAV; during positioning, the UWB source continuously transmits positioning signals to the surroundings, and UWB sensors pre-arranged at different positions in the positioning area receive the signals and use algorithms (TDOA, AOA) to calculate the UAV's relative position in the area. The positioning accuracy of this method is high, but because UWB hardware is expensive, it is likewise not suitable for widespread market adoption.
4. Visual SLAM. Positioning is done using the camera mounted on the drone. This method can be divided into monocular and multi-camera variants. With a monocular camera, after the camera collects a picture of the surrounding environment, a deep learning model identifies obstacles in the picture, and the depth or size information of the pixels is used to calculate obstacle distances. With a binocular camera, distance can also be calculated using the principle of binocular parallax: when the spacing between the cameras is known, multiple cameras photograph the same object, which appears with a certain offset across the images, and the actual distance between the object and the cameras is calculated from this offset to achieve positioning. After obtaining the distances of surrounding obstacles, SLAM technology builds an environment map to determine the UAV's relative position in it.
The advantage of the visual approach is that cameras are cheap, and most drones already carry one, so no extra burden is added; that is, the drone's payload capacity and cost are unaffected. The drawback is low positioning accuracy: the monocular camera relies mainly on a deep learning model to identify surrounding obstacles, and in indoor environments obstacles are complex and changeable, so a model trained in one environment rarely meets applicable standards in other indoor environments. Meanwhile, the positioning accuracy of visual SLAM is also limited by the performance of the image matching algorithm, ranging accuracy is not high, the method is easily disturbed by ambient light, the positioning distance is short, and the ranging range is relatively small. Because of these defects, this positioning method cannot adapt well to market demands.
To sum up, the defects of the first and fourth positioning methods are that positioning accuracy is not high and depends on the performance of the image matching algorithm. The defects of the second and third methods are that hardware cost is high, preventing wide market adoption. The defects of current positioning systems affect the user experience and restrict the maturation and popularization of UAVs.
SUMMARY OF THE INVENTION
Aiming at the technical problems in the prior art, the purpose of the present invention is to provide a UAV positioning method and system based on screen optical communication that can improve positioning accuracy and the camera's range of action.
To solve the problems raised above, the technical scheme adopted in the present invention is as follows:
The present invention provides a UAV positioning method based on screen optical communication, which specifically includes:
selecting an electronic screen, and setting the marker image on the electronic screen so that it is symmetrical;
setting up an image collector, including a drone, an image acquisition module arranged on the drone, and a fixed bracket for adjusting the drone's height, where a camera on the image acquisition module collects the marker image on the electronic screen to obtain a video stream containing marker pictures;
constructing a deep learning model, extracting the video stream transmitted by the image collector, and identifying the marker pictures to obtain the predicted position of the marker image in the picture;
performing image processing according to the predicted position of the marker image to obtain the coordinates corresponding to the marker pattern;
performing distance measurement according to the coordinates corresponding to the marker pattern to obtain the straight-line distance from the center of the marker pattern to the camera;
performing angle measurement and, combined with the straight-line distance, obtaining the actual distance from the camera to the electronic screen to complete UAV positioning.
Further, an SSD300 model is used to recognize the marker pictures; the recognition process is as follows:
inputting one frame of the video stream transmitted by the image collector into the SSD300 model;
passing the picture through a VGG-16 network for preliminary feature extraction to obtain a feature map;
generating prediction candidate boxes with a localizer, and taking the regions selected by the candidate boxes as the feature maps to be recognized;
processing and transforming the feature maps to be recognized to obtain transformed feature maps;
outputting, from the transformed feature maps through a fully connected classifier, the position of the marker image in the picture and an estimate of its similarity;
selecting the candidate boxes whose similarity estimate exceeds a preset value and taking their pixel positions as the final predicted position.
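As an illustration of this selection step, the following is a minimal Python sketch of the inference loop, assuming torchvision's off-the-shelf SSD300/VGG-16 detector stands in for the trained model described here; the untrained weights, the two-class setup, and the helper name `detect_marker` are assumptions, not part of the original disclosure.

```python
# Sketch only: torchvision's SSD300/VGG-16 as a stand-in for the trained model.
import torch
from torchvision.models.detection import ssd300_vgg16

model = ssd300_vgg16(weights=None, num_classes=2)  # background + marker (assumed)
model.eval()

def detect_marker(frame_rgb: torch.Tensor, score_threshold: float = 0.6):
    """frame_rgb: float tensor (3, H, W) in [0, 1]. Returns the best candidate
    box and its score when the similarity estimate exceeds the threshold, else None."""
    with torch.no_grad():
        out = model([frame_rgb])[0]            # dict with boxes, labels, scores
    keep = out["scores"] > score_threshold     # keep candidates scoring above 0.6
    if not keep.any():
        return None                            # no marker found in this frame
    best = out["scores"][keep].argmax()
    return out["boxes"][keep][best], out["scores"][keep][best]
```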
Further, the loss function for training the model is defined as the weighted sum of the position error and the confidence error:

$$L(x,c,l,g)=\frac{1}{N}\bigl(L_{conf}(x,c)+\alpha L_{loc}(x,l,g)\bigr)$$

where N is the number of candidate boxes generated by the localizer; x is an indicator parameter; c is the predicted class confidence; l is the position coordinates of a candidate box generated by the localizer; g is the position coordinates of the manually labelled marker picture; and α is a weight coefficient, set to 1.
The position error is defined, following the standard SSD formulation, as:

$$L_{loc}(x,l,g)=\sum_{i\in Pos}^{N}\sum_{m\in\{cx,cy,w,h\}}x_{ij}\,\mathrm{smooth}_{L1}\bigl(l_i^{m}-g_j^{m}\bigr)$$

$$\mathrm{smooth}_{L1}(t)=\begin{cases}0.5\,t^{2}, & |t|<1\\ |t|-0.5, & \text{otherwise}\end{cases}$$

The confidence error is defined as the softmax cross-entropy over the positive and negative candidates:

$$L_{conf}(x,c)=-\sum_{i\in Pos}^{N}x_{ij}\log\hat c_i-\sum_{i\in Neg}\log\hat c_i^{0},\qquad \hat c_i=\frac{\exp(c_i)}{\sum_{p}\exp(c_i^{p})}$$
Further, the image processing procedure comprises:
enlarging and adjusting the target region corresponding to the predicted position so that it covers the entire marker image, obtaining a target-region picture;
binarizing the target-region picture to convert it into a black-and-white picture;
performing boundary suppression on the marker image according to the black-and-white picture so that it is separated from the background, obtaining a picture of regions with holes;
filling each holed region into a solid white pixel block, computing the centre position and region position of all holed regions, and performing shape detection on the marker patterns therein to obtain the specific positions of the marker patterns;
sorting the marker patterns according to their specific positions to obtain their corresponding coordinates.
Further, the threshold T for the image conversion is computed with the OTSU algorithm, specifically:
For an image I(x, y), let T be the segmentation threshold between foreground and background; let ω₀ be the proportion of foreground pixels in the whole image, with mean grey level μ₀; let ω₁ be the proportion of background pixels, with mean grey level μ₁; let μ be the overall mean grey level and g the between-class variance. The image size is M×N; the number of pixels with grey value below the threshold T is N₀, and the number with grey value above T is N₁.
The following formulas are then obtained:
ω₀ = N₀/(M×N)
ω₁ = N₁/(M×N)
N₀ + N₁ = M×N
ω₀ + ω₁ = 1
μ = ω₀×μ₀ + ω₁×μ₁
g = ω₀(μ₀ − μ)² + ω₁(μ₁ − μ)²
which yield the equivalent form:
g = ω₀ω₁(μ₀ − μ₁)²
The value of T is found by traversal, taking the threshold T that maximizes the between-class variance g.
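For illustration, the following is a small numpy sketch of this traversal, a direct transcription of the formulas above using the equivalent form g = ω₀ω₁(μ₀ − μ₁)²; the helper name `otsu_threshold` is hypothetical and the input is assumed to be an 8-bit greyscale image.

```python
import numpy as np

def otsu_threshold(gray: np.ndarray) -> int:
    """Traverse T = 0..255 and return the threshold maximizing the
    between-class variance g = w0*w1*(mu0 - mu1)^2. gray: uint8 image."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    total = gray.size
    best_t, best_g = 0, -1.0
    for t in range(256):
        w0 = hist[:t + 1].sum() / total            # foreground proportion
        w1 = 1.0 - w0                              # background proportion
        if w0 == 0.0 or w1 == 0.0:
            continue                               # one class empty, skip
        mu0 = (np.arange(t + 1) * hist[:t + 1]).sum() / hist[:t + 1].sum()
        mu1 = (np.arange(t + 1, 256) * hist[t + 1:]).sum() / hist[t + 1:].sum()
        g = w0 * w1 * (mu0 - mu1) ** 2             # between-class variance
        if g > best_g:
            best_g, best_t = g, t
    return best_t
```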
Further, the shape detection process comprises:
drawing a vertical tangent line through the centroid of each holed region and recording, from top to bottom, the numbers of pixels with values 0 and 1 that the line crosses;
treating the white blocks (pixel value 1) in the holed region as peaks and the black blocks (pixel value 0) as troughs;
checking the number of peaks and troughs of every holed region and discarding image regions that clearly do not match the marker pattern;
for the image regions whose peak and trough counts meet the requirement, computing the proportional similarity of the peaks and troughs using the Euclidean distance;
traversing every white region, finding all regions that satisfy the proportional similarity, and taking them as the specific positions of the marker patterns.
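A possible numpy sketch of this tangent-line check follows, assuming a binarized 0/1 region; the theoretical run-length ratio is left as a parameter because it is fixed by the marker geometry rather than stated here, and both function names are hypothetical.

```python
import numpy as np

def tangent_runs(region: np.ndarray, cx: int):
    """Cut a 0/1 region vertically at column cx and return the run lengths of
    1s (peaks) and 0s (troughs) from top to bottom, ignoring the background
    above and below the pattern."""
    col = np.trim_zeros(region[:, cx])
    runs, values = [], []
    for v in col:
        if values and values[-1] == int(v):
            runs[-1] += 1
        else:
            values.append(int(v))
            runs.append(1)
    peaks = [r for r, v in zip(runs, values) if v == 1]
    troughs = [r for r, v in zip(runs, values) if v == 0]
    return peaks, troughs

def is_marker(peaks, troughs, theoretical_ratio, y_max=0.8):
    """Require 3 peaks and 2 troughs, then accept when the Euclidean distance
    y between the measured and theoretical run-length ratios is below y_max."""
    if len(peaks) != 3 or len(troughs) != 2:
        return False
    measured = np.array([peaks[0], troughs[0], peaks[1], troughs[1], peaks[2]], float)
    measured /= measured.sum()
    theo = np.asarray(theoretical_ratio, float)
    theo = theo / theo.sum()
    return float(np.sqrt(((measured - theo) ** 2).sum())) < y_max
```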
Further, distance measurement is performed using the parallax principle, specifically:
Two points P_l1 and P_r1 on the electronic screen are regarded as cameras and point A on the image collector as the object; the projections of A on the imaging planes of P_l1 and P_r1 are denoted P_l2 and P_r2. With PPI the resolution of the camera on the image collector, PXM the length of a unit pixel in the captured image, B the real distance between P_l1 and P_r1 on the electronic screen, Z their spacing in the picture, F the focal length of the camera on the image collector, and D the distance from the image collector to the electronic screen, the following holds:
PXM = PPI/25.4 (the per-inch resolution converted to pixels per millimetre)
By the similarity principle, ΔAP_l1P_r1 and ΔAP_l2P_r2 are similar triangles, giving:
(B − Z/PXM)/B = (D − F)/D, i.e. D = B·F·PXM/Z
Let the measured parallax of the marker image at a known distance D₁ be Z₁; for an arbitrary distance D₂, the measured parallax is Z₂. From the known D₁ and Z₁, the relation is converted into:
D·Z = B·F·PXM = D₁Z₁
D₂ = D₁Z₁/Z₂
The resulting D₂ is the straight-line distance D from the centre point of the marker pattern to the camera.
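Since D·Z stays constant, the calibration reduces to a one-line computation; the sketch below illustrates it (the function name and the example numbers are illustrative only).

```python
def calibrated_distance(z2_px: float, d1_mm: float, z1_px: float) -> float:
    """Since D * Z = B * F * PXM is constant, one measurement (D1, Z1) at a
    known distance calibrates away the imprecise F and PPI: D2 = D1*Z1/Z2."""
    return d1_mm * z1_px / z2_px

# e.g. calibrated at 2000 mm with a 150 px parallax, a later 75 px parallax
# corresponds to about 4000 mm
print(calibrated_distance(75.0, 2000.0, 150.0))
```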
Further, angle measurement is performed, specifically comprising:
When the marker image can be captured without deflecting the camera, the camera angles are obtained as:
θ_X = arcsin(DX/D)
θ_Y = arcsin(DY/D)
where DX is the horizontal distance and DY the vertical distance from the camera to the centre of the electronic screen, and D is the straight-line distance measured above.
When the camera must be deflected to capture the marker image, a rewritten Music algorithm is used to compute the offset angle of the camera, specifically:
Four equidistant vertical tangent lines are drawn on the marker patterns whose specific positions have been obtained; the distances measured in the horizontal and vertical directions are X1-X4 and Y1-Y4 respectively, and the tangent spacing in the vertical direction is d.
An incident signal X(i) is constructed from four phase terms Z1-Z4, where Z1 = 0 and Z2, Z3, Z4 are built from the measured distances and the spacing d (the construction of X(i) and the expressions for Z2-Z4 are given in the original only as equation images).
The covariance matrix of the incident signal is decomposed as:
R(i) = A·R_X·A^H + σ²I
where A is the direction response vector extracted from the incident signal X(i), H denotes the conjugate transpose, σ² is the noise power, and I is the identity matrix.
From the above formula, the eigenvector corresponding to eigenvalue γ is v(θ); the eigenvalues are sorted by magnitude, the eigenvector of the largest eigenvalue is taken as the signal subspace, and the remaining 3 eigenvalues and eigenvectors are taken as the noise subspace, giving the noise matrix E_n:
A^H υ_i(θ) = 0, i = 2, 3, 4
E_n = [υ₂(θ), υ₃(θ), υ₄(θ)]
The offset angle of the camera in the horizontal direction is found from the Music pseudospectrum:
P(θ) = 1 / (a^H(θ)·E_n·E_n^H·a(θ))
where a is the signal (steering) vector extracted from the incident signal X(i).
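The following numpy sketch illustrates this spectral search under the assumption of a classic uniform-linear-array steering vector a(θ); since the patent's exact Z2-Z4 construction is given only as images, the phase model and the wavelength-normalized spacing below are assumptions.

```python
import numpy as np

def music_offset_angle(x_signal: np.ndarray, d: float) -> float:
    """One-source MUSIC search over the 4-element 'array' formed by the four
    tangent lines with spacing d (d expressed in carrier-wavelength units, an
    assumption). x_signal is the constructed 4-element incident signal X(i)."""
    x = x_signal.reshape(4, 1).astype(complex)
    r = x @ x.conj().T                        # covariance R = X X^H
    _, v = np.linalg.eigh(r)                  # eigenvectors, ascending eigenvalues
    en = v[:, :3]                             # 3 smallest -> noise subspace E_n
    best_theta, best_p = 0.0, -np.inf
    for th in np.deg2rad(np.arange(-60.0, 60.0, 0.1)):
        a = np.exp(-1j * 2 * np.pi * d * np.arange(4) * np.sin(th))  # steering a(theta)
        p = 1.0 / abs(a.conj() @ en @ en.conj().T @ a)               # MUSIC pseudospectrum
        if p > best_p:
            best_p, best_theta = p, th
    return float(np.degrees(best_theta))
```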
Further, when the marker picture captured at a deflected camera angle is deformed, the degree of deformation of the line segments in the picture is converted into distances, specifically comprising:
Assume the transformation matrix of the camera is:
K = [α₋N, α₁₋N, α₂₋N, …, α₀, …, αN₋₂, αN₋₁, αN]
Since the distortion of the camera when capturing images is left-right symmetric about the centre:
α₋N = αN > α₁₋N = αN₋₁ > … > α₀
Let the two line segments used for angle computation lie at positions p and q on the marker picture, with corresponding computed distances D_p and D_q and corresponding pixel sizes P_p and P_q; the side length of the overall pattern is L and the camera focal length is F. Four relations linking D_p, D_q, P_p, P_q, L and F then follow (given in the original as equation images), from which the pixel difference W between the two line segments is obtained:
W = P_p·α_p − P_q·α_q
where P_p and P_q are the pixel sizes of the two line segments, α_p and α_q their distortion factors, and p, q take values in 0-N.
When q = 0 and p = N, the pixel difference W of the two end segments is largest, and the error of the Music algorithm is smallest.
The present invention further provides a system for the UAV positioning method based on screen optical communication, the system comprising:
an electronic screen selection module for selecting the electronic screen and setting the marker image on it such that it is symmetric;
a collector setup module for setting up the image collector, comprising the UAV, the image acquisition module arranged on the UAV, and the fixed bracket for adjusting the height of the UAV;
a model construction module for extracting the video stream transmitted by the image collector and recognizing the marker pictures to obtain the predicted position of the marker image within the picture;
an image processing module for performing image processing according to the predicted position of the marker image to obtain the coordinates of the marker patterns;
a distance measurement module for performing distance measurement according to the coordinates of the marker patterns to obtain the straight-line distance from the centre of the marker pattern to the camera;
an angle measurement and positioning module for performing angle measurement and, combined with the straight-line distance, obtaining the actual distance from the camera to the electronic screen to complete the positioning of the UAV.
Compared with the prior art, the beneficial effects of the present invention are as follows:
In the UAV positioning method and system provided by the present invention, a symmetric marker image on an electronic screen is captured by an image collector; a deep learning model combined with image processing accurately recognizes the marker patterns; distance and angle measurement of the camera then yield the relative distance between the camera and the electronic screen, completing precise positioning of the camera. The method is simple, reliable and easy to implement, greatly improves positioning accuracy, solves the problem that the marker cannot be recognized because its image is too small at long camera range, and extends the effective range of the camera.
BRIEF DESCRIPTION OF THE DRAWINGS
In order to explain the solutions of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from them without creative effort. In the drawings:
Fig. 1 is a flow chart of the UAV positioning method based on screen optical communication of the present invention;
Fig. 2 is a schematic structural diagram of the marker image of the present invention;
Fig. 3 is a schematic diagram of the server connection of the present invention;
Fig. 4 is a schematic diagram of picture recognition by the SSD300 model in the present invention;
Fig. 5 is a flow chart of the image processing procedure in the present invention;
Fig. 6 is a schematic diagram of the image after boundary suppression in the present invention;
Fig. 7 is a schematic diagram of the image after retaining the holed regions in the present invention;
Fig. 8 is a schematic diagram of the image in which all holed regions are filled as white regions in the present invention;
Fig. 9 is a flow chart of shape detection in the present invention;
Fig. 10 is a schematic diagram of the peak and trough positions of the marker pattern in the present invention;
Fig. 11 is a schematic diagram of the camera ranging principle in the present invention;
Fig. 12 is a schematic diagram of angle measurement in the present invention;
Fig. 13 is a schematic diagram of selecting 4 tangent lines in the present invention;
Fig. 14 is a flow chart of the system of the UAV positioning method based on screen optical communication of the present invention.
DETAILED DESCRIPTION
To facilitate understanding, the present invention is described more fully below with reference to the related drawings, in which preferred embodiments are shown. The invention may, however, be embodied in many different forms and is not limited to the embodiments described herein; rather, these embodiments are provided so that the disclosure will be thorough and complete.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terms used in the description are for the purpose of describing specific embodiments only and are not intended to limit the invention.
Referring to Fig. 1, the present invention provides a UAV positioning method based on screen optical communication, specifically comprising:
Step S1: selecting an electronic screen and setting the marker image on it; the marker image is symmetric.
In this embodiment, the electronic screen is a smart electronic screen, and the marker image consists of four square patterns arranged in a square, as shown in Fig. 2; the combined pattern after arrangement is likewise symmetric up-down and left-right. The interval between the four marker patterns is 1/7 of the side length of a marker square (each marker pattern is a square), and parameters such as the side length and arrangement spacing of the marker image are known values. Specifically, the smart electronic screen is a 55-inch large display capable of showing high-definition colour patterns.
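For illustration, the following numpy sketch renders such a 2x2 marker arrangement with the 1/7 gap described above; the individual border widths inside each square are placeholders chosen for symmetry, since the text fixes only the overall symmetry and the gap.

```python
import numpy as np

def make_marker_screen(side_px: int = 140) -> np.ndarray:
    """Render the 2x2 marker arrangement: four identical squares whose mutual
    gap is side/7, on a light background (1 = light, 0 = black)."""
    s = side_px
    gap = s // 7                                  # spacing = 1/7 of side length
    b = s // 7                                    # assumed frame width
    square = np.zeros((s, s), np.uint8)           # black outer frame
    square[b:-b, b:-b] = 1                        # white inner frame
    square[2 * b:-2 * b, 2 * b:-2 * b] = 0        # black core
    canvas = np.ones((2 * s + gap, 2 * s + gap), np.uint8)
    for r in (0, s + gap):
        for c in (0, s + gap):
            canvas[r:r + s, c:c + s] = square     # place the four squares
    return canvas
```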
Step S2: setting up the image collector, comprising an image acquisition module with a monocular camera, a height-adjustable fixed bracket, and a UAV; the image acquisition module is arranged on the UAV, which is mounted on the fixed bracket, and the height of the UAV is adjusted through the bracket. The image acquisition module captures the marker image on the electronic screen to obtain marker pictures.
In this embodiment, the image collector supports camera distortion calibration, video-stream transmission and focal-length locking; its working range is 0-20 metres at angles of ±60°, and positioning is performed at different positions within the region (including its boundary). Specifically, the camera's pixel count, captured image format, and horizontal and vertical resolutions are all set in advance.
Step S3: constructing a deep learning model, extracting the video stream transmitted by the image collector, and recognizing the marker pictures in it to obtain the predicted position of the marker image within the picture.
In this embodiment, the server is connected to the electronic screen through a network and controls the display and refreshing of the marker patterns on the screen; at the same time, it receives the video stream containing the marker pictures transmitted by the image collector, as shown in Fig. 3.
Step S4: performing image processing according to the predicted position of the marker image to obtain the coordinates of the marker patterns within the marker image.
Step S5: performing distance measurement according to the coordinates of the marker patterns to obtain the straight-line distance from the centre point of the marker pattern to the camera.
Step S6: performing angle measurement and, combined with the straight-line distance, obtaining the actual distance from the camera to the electronic screen to complete the positioning of the UAV.
In this embodiment, by selecting an electronic screen, setting a symmetric marker image, providing an image collector for capturing it, and constructing a deep learning model combined with image processing, accurate recognition of the marker patterns is achieved and the relative distance between the camera and the electronic screen is obtained, completing precise positioning of the camera; the whole positioning method is simple, reliable, easy to implement, and highly accurate.
In this embodiment, in step S3 the deep learning model adopts the SSD300 model to recognize the marker pictures and obtain the predicted position of the marker image in the picture; referring to Fig. 4, the recognition process is as follows:
Step S31: inputting one frame of the video stream transmitted by the image collector into the SSD300 model. Specifically, the picture size is 4032×3024×3.
Step S32: passing the picture through the VGG-16 network for preliminary feature extraction to obtain a feature map. Specifically, the feature map size is 38×38×512.
Step S33: using the localizer to generate a certain number of prediction candidate boxes and taking the feature-map regions selected by the candidate boxes as the feature maps to be recognized.
Step S34: processing and transforming the feature maps to be recognized to obtain transformed feature maps.
Step S35: outputting, from the transformed feature maps through a fully connected classifier, the position of the marker image in the picture and an estimate of its similarity. Specifically, a 256×2 fully connected classifier is used.
Step S36: selecting the candidate boxes whose similarity estimate exceeds a preset value (set to 0.6) and taking their pixel positions as the final predicted position, where the pixel position of a candidate box is the pixel position of the recognized marker image in the picture captured by the image collector.
In this embodiment, the SSD300 model recognizes the pictures captured by the image collector and, after processing and transformation, yields the predicted position of the marker image; the final positioning computation of the UAV is based on this predicted position. This is simple, reliable and easy to implement, and guarantees the effectiveness and accuracy of the positioning method.
Further, in step S34 the processing and transformation proceed as follows:
Step S341: transforming the feature map to be recognized through the Conv6 module to obtain a first feature map of 19×19×1024.
Step S342: transforming the first feature map through the Conv7 module to obtain a second feature map of 19×19×1024.
Step S343: transforming the second feature map through the Conv8 module to obtain a third feature map of 10×10×512.
Step S344: transforming the third feature map through the Conv9 module to obtain a fourth feature map of 5×5×256.
Step S345: transforming the fourth feature map through the Conv10 module to obtain a fifth feature map of 3×3×256.
Step S346: transforming the fifth feature map through the Conv11 module to obtain a sixth feature map of 1×1×256.
In this embodiment, the input to the SSD300 model is a numeric matrix, while the model outputs the predicted position of the marker image and a scoring result (the similarity estimate). The score is a concrete value between 0 and 1, used to judge how similar the image at the predicted position is to the actual marker image. Through the transformation process above, the numeric matrix is converted step by step into a value between 0 and 1 measuring the similarity between a specific region of the output picture and the marker, ensuring reliable and accurate positioning.
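The feature-map sizes in steps S341-S346 match the usual SSD300 extra feature layers; the PyTorch sketch below reproduces them under that assumption (the kernel sizes, strides and padding are the standard SSD choices and are not spelled out in the text).

```python
import torch
import torch.nn as nn

# Extra feature layers reproducing the sizes in S341-S346 (standard-SSD-style
# configuration; only the output shapes are fixed by the text).
extras = nn.ModuleDict({
    "conv6": nn.Sequential(nn.MaxPool2d(2),
                           nn.Conv2d(512, 1024, 3, padding=6, dilation=6)),  # 19x19x1024
    "conv7": nn.Conv2d(1024, 1024, 1),                                        # 19x19x1024
    "conv8": nn.Sequential(nn.Conv2d(1024, 256, 1),
                           nn.Conv2d(256, 512, 3, stride=2, padding=1)),      # 10x10x512
    "conv9": nn.Sequential(nn.Conv2d(512, 128, 1),
                           nn.Conv2d(128, 256, 3, stride=2, padding=1)),      # 5x5x256
    "conv10": nn.Sequential(nn.Conv2d(256, 128, 1), nn.Conv2d(128, 256, 3)),  # 3x3x256
    "conv11": nn.Sequential(nn.Conv2d(256, 128, 1), nn.Conv2d(128, 256, 3)),  # 1x1x256
})

x = torch.zeros(1, 512, 38, 38)   # the 38x38x512 map from the VGG-16 stage
for name, layer in extras.items():
    x = layer(x)
    print(name, tuple(x.shape))   # confirms the sizes listed in S341-S346
```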
Further, before marker pictures can be recognized, the constructed deep learning model must be trained; the training process is as follows:
Marker images are captured by the image collector at different distances, angles and illumination intensities to obtain picture samples.
The real pixel region of the marker image in each picture sample is marked manually, and the pixel matrix within that region serves as the correct answer for training. Specifically, the correct answer is the specific position of the marker image obtained by manual labelling; in this embodiment it corresponds to the pixel values within a region given in coordinate form, e.g. (5, 5, 10, 10), which denotes all the pixels in the square whose starting point is at (5, 5) and end point at (10, 10).
In this embodiment, the SSD300 model must be trained before being used for recognition: a marker picture and its specific position (the correct answer) are input, the model predicts a result (also a position), the prediction is compared with the actual position, and the model parameters are adjusted from the difference between the predicted and actual values, so that the model eventually predicts values close to the correct answer. This guarantees the accuracy of the predicted positions and the effectiveness and reliability of the whole positioning method. In this embodiment, the initial values of the VGG-16 network in the SSD300 model are loaded from open-source pre-trained parameters on GitHub, while the parameters of the Conv6-Conv11 modules are randomly initialized. Since training a model from scratch is time-consuming, it is more efficient to let the model first copy externally trained parameters and then train further on that basis, so that one's own model can be trained quickly.
Further, the loss function for training the deep learning model is defined as the weighted sum of the position error and the confidence error:

$$L(x,c,l,g)=\frac{1}{N}\bigl(L_{conf}(x,c)+\alpha L_{loc}(x,l,g)\bigr) \quad (1)$$

where N is the number of candidate boxes generated by the localizer;
x is an indicator parameter showing whether the pixel region of the current candidate box matches the manually labelled pixel region, with x_i ∈ {0, 1}: x_i = 1 means the pixel region of the i-th candidate box matches the marker pixels, and 0 means no match;
c is the predicted class confidence, the model's prediction for the current candidate box;
l is the position coordinates of a candidate box generated by the localizer, expressed by four values: the top-left vertex coordinates (c_{x1}, c_{y1}), the box length w₁ and the box height h₁;
g is the position coordinates of the manually labelled marker picture, expressed by four values: the top-left vertex coordinates (c_{x2}, c_{y2}), the region length w₂ and the region height h₂;
α is a weight coefficient, set to 1.
Specifically, the position error is defined, following the standard SSD formulation, as:

$$L_{loc}(x,l,g)=\sum_{i\in Pos}^{N}\sum_{m\in\{cx,cy,w,h\}}x_{ij}\,\mathrm{smooth}_{L1}\bigl(l_i^{m}-g_j^{m}\bigr) \quad (2)$$

Specifically, the confidence error is defined as:

$$L_{conf}(x,c)=-\sum_{i\in Pos}^{N}x_{ij}\log\hat c_i-\sum_{i\in Neg}\log\hat c_i^{0} \quad (3)$$

In this embodiment, the training objective is to minimize the position error L_loc; after training, the model can be used to recognize marker pictures. The video stream transmitted by the image collector is passed to the SSD300 model for detection: if a marker picture exists in the video image, the model outputs the coordinate position l of the candidate box and the predicted value c; if no marker picture exists, the model outputs nothing.
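A compact PyTorch sketch of this weighted loss for the matched candidate boxes follows, using smooth-L1 for the position term and cross-entropy for the confidence term, mirroring formulas (1)-(3); the reduction and batching details are assumptions.

```python
import torch
import torch.nn.functional as F

def weighted_ssd_loss(pred_loc, gt_loc, pred_conf, gt_label, alpha=1.0):
    """L = (1/N)(L_conf + alpha * L_loc) over the N matched candidate boxes.
    Shapes: pred_loc/gt_loc (N, 4) box coordinates, pred_conf (N, num_classes)
    raw scores, gt_label (N,) integer class indices."""
    n = max(pred_loc.shape[0], 1)
    l_loc = F.smooth_l1_loss(pred_loc, gt_loc, reduction="sum")      # position error
    l_conf = F.cross_entropy(pred_conf, gt_label, reduction="sum")   # confidence error
    return (l_conf + alpha * l_loc) / n
```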
Further, in step S4, since there is a certain error between the predicted position of the marker image obtained from the SSD300 model (the target-region position) and the real pixel region of the marker image, the prediction cannot be used directly for positioning computation and must be corrected by image processing. Referring to Fig. 5, the image processing procedure comprises:
Step S41: image-region adjustment, i.e. obtaining the target region corresponding to the predicted position from the SSD300 model; since the predicted position is inaccurate, the target region must be enlarged so that it covers the entire marker image, yielding the target-region picture.
Specifically, the manner of enlargement is confirmed from the pixel-value distribution: the black-pixel proportion of the marker image itself is 58.67%, so after binarizing the target region output by the SSD300 model, one judges whether the black-pixel proportion in the target region is near 58.67%. Further, the judgment threshold adopted is 45.67%: if the black-pixel proportion is below this threshold, the edge of the target region is expanded outward by 20 pixels; if the proportion then increases, the region keeps expanding outward, stopping only when the black-pixel proportion within the region begins to decrease.
Step S42: target segmentation, i.e. binarizing the adjusted target-region picture to convert the colour RGB image into a black-and-white image, obtaining a black-and-white picture of the target region.
Specifically, the conversion threshold T is computed with the OTSU algorithm as follows:
For an image I(x, y), the segmentation threshold between the foreground (the target) and the background is denoted T; the proportion of foreground pixels in the whole image is denoted ω₀, with mean grey level μ₀; the proportion of background pixels is ω₁, with mean grey level μ₁. The overall mean grey level of the image is denoted μ and the between-class variance g.
The electronic screen uses a light-coloured image background; since the body of the marker image is black, this makes the colour difference between the background and the marker image large. With the image of size M×N, the number of pixels whose grey value is below the threshold T is denoted N₀, and the number whose grey value is above T is denoted N₁, giving the following formulas:
ω₀ = N₀/(M×N)  (4)
ω₁ = N₁/(M×N)  (5)
N₀ + N₁ = M×N  (6)
ω₀ + ω₁ = 1  (7)
μ = ω₀×μ₀ + ω₁×μ₁  (8)
g = ω₀(μ₀ − μ)² + ω₁(μ₁ − μ)²  (9)
which is equivalent to:
g = ω₀ω₁(μ₀ − μ₁)²  (10)
T (in the range 0-255) is found by traversal, taking the value that maximizes the between-class variance g.
Step S43: performing boundary suppression on the marker image according to the binarized black-and-white picture, separating the marker image from the background.
In this embodiment, because the colour difference between the preset background and the marker image is large, the boundaries of the marker image are in theory quite distinct; to further strip the marker image from the background, an additional boundary-suppression operation is required.
Further, the boundary suppression proceeds as follows: every pixel of the marker image is traversed and compared with the 8 surrounding pixels connected to it in 8 directions (pixels at the image edge have fewer than 8 neighbours); if the pixel values in all directions other than the boundary are 0, the pixel is considered adjacent to the boundary and is cleared.
In this embodiment, as shown in Fig. 6, each marker pattern is composed of a black border, a white border and a black square. After boundary suppression, many isolated pixel regions without holes remain in the marker image; all regions without holes are removed, leaving the holed regions shown in Fig. 7, which contain the four marker patterns (a sketch of this filtering step follows below).
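The following is a scipy sketch of this "keep only holed regions and fill them" step; the connectivity choice and the helper name are assumptions.

```python
import numpy as np
from scipy import ndimage

def keep_holed_regions(binary: np.ndarray) -> np.ndarray:
    """Keep only connected white regions that contain at least one interior
    hole and return them filled into solid blocks (steps S43/S44)."""
    mask = binary.astype(bool)
    labels, n = ndimage.label(mask)                    # 4-connected components
    out = np.zeros_like(mask)
    for i in range(1, n + 1):
        region = labels == i
        filled = ndimage.binary_fill_holes(region)     # fill interior holes
        if filled.sum() > region.sum():                # a hole existed: keep it
            out |= filled
    return out
```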
Step S44: performing shape detection. As shown in Fig. 8, all holed regions in the boundary-suppressed image are filled into solid white pixel blocks, i.e. only the holed regions are kept in the binarized marker image; the centre position (centroid) and region position of every holed region are computed, and shape detection is performed on the marker patterns in the marker image. Referring to Fig. 9, the procedure is as follows:
Step S441: drawing a vertical tangent line through the centroid of each holed region and recording, from top to bottom, the numbers of pixels with values 0 and 1 that the tangent crosses.
Step S442: treating the white blocks of a holed region (pixel value 1) as peaks and the black blocks (pixel value 0) as troughs, so that the relative widths of the peaks and troughs can be measured by the counts of 0s and 1s crossed by the tangent, as shown in Fig. 10. Specifically, if the region is a marker pattern, the tangent must cross exactly 3 peaks and 2 troughs.
Step S443: checking the peak and trough counts of all holed regions and discarding image regions that clearly do not match the marker pattern.
Step S444: for the image regions whose peak and trough counts meet the requirement, comparing the width ratio of the peaks and troughs with the theoretical ratio to obtain their similarity.
Specifically, the similarity between the measured peak/trough ratio and the theoretical ratio is computed with the Euclidean distance: assuming the measured ratio between peaks and troughs is X1:X2:X3:X4:X5, the Euclidean-distance similarity y is

$$y=\sqrt{\sum_{i=1}^{5}\bigl(X_i-\hat X_i\bigr)^{2}}$$

where X̂1:...:X̂5 is the theoretical ratio of the marker pattern. Experimental simulation shows that detection works well when the value of y is below 0.8, so y < 0.8 is taken as the criterion for judging that the region is a marker pattern.
Step S445: traversing every white region, finding all regions that satisfy the proportional similarity, and taking them as the specific positions of the marker patterns.
Step S45: according to the specific positions of the 4 marker patterns obtained, sorting the 4 marker patterns with a bubble-sort algorithm to obtain their corresponding coordinates.
In this embodiment, since the arrangement of the marker patterns is known in advance, the positions of the 4 coordinates can be matched to the 4 specific marker patterns from the relative sizes of their horizontal and vertical coordinates. The coordinates of the 4 marker patterns are then used below to compute the relative position of the image collector with respect to the smart electronic screen.
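Since the arrangement is known, the bubble sort amounts to ordering the four centroids by their coordinates; a small sketch follows (any stable sort gives the same result; the function name is hypothetical).

```python
def order_marker_centers(centers):
    """Order the four detected marker centroids as (top-left, top-right,
    bottom-left, bottom-right) from their relative x/y coordinates."""
    by_y = sorted(centers, key=lambda c: c[1])         # sort by vertical position
    top, bottom = sorted(by_y[:2]), sorted(by_y[2:])   # then by horizontal position
    return top[0], top[1], bottom[0], bottom[1]

print(order_marker_centers([(105, 12), (10, 14), (102, 110), (8, 108)]))
# -> ((10, 14), (105, 12), (8, 108), (102, 110))
```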
In this embodiment, recognition of the marker patterns is accomplished by constructing a deep learning model combined with image processing, solving the problem that the marker cannot be recognized because its image is too small at long camera range; compared with other camera positioning techniques, this greatly extends the effective range of camera positioning and thereby improves positioning accuracy.
Further, in step S5, since the marker patterns are designed as a group of geometric figures with symmetry, distance measurement is performed from their coordinates using the parallax principle.
In this embodiment, the parallax principle means that the position of the same object differs by a certain amount between the images captured by two cameras a fixed distance apart; from this difference, the distance between the object and the two cameras can be computed fairly accurately.
Specifically, the two points P_l1 and P_r1 on the electronic screen are regarded as cameras and point A on the image collector as the object. P_l2 and P_r2 in the picture captured by the image collector can then be regarded as the projections of A on the imaging planes of P_l1 and P_r1, so the line segment P_l2P_r2 is the parallax of point A under the two "camera" imaging planes, as shown in Fig. 11.
Further, let the real distance between P_l1 and P_r1 on the electronic screen be B, in millimetres; their spacing in the picture Z, in pixels; the focal length of the camera on the image collector F, in millimetres; the distance from the image collector to the electronic screen D, in millimetres; the camera resolution PPI; and the length of a unit pixel in the captured image PXM, in pixels/millimetre. The following formula is then obtained:
PXM = PPI/25.4  (11)
By the similarity principle, ΔAP_l1P_r1 and ΔAP_l2P_r2 are similar triangles, giving:
(B − Z/PXM)/B = (D − F)/D  (12)
The distance D is obtained from formulas (11) and (12).
Further, in actual operation the lens focal length stated by the image collector is not equal to the actual shooting focal length, and the pictures undergo some processing after capture, so the camera's focal length F and resolution PPI deviate somewhat from their real values. A calibration method is therefore used to solve the problem of imprecise camera parameters, specifically:
Let the measured parallax of the marker image at a known distance D₁ be Z₁; for an arbitrary distance D₂, the measured parallax is Z₂. Formula (12) is rewritten as:
D·Z = B·F·PXM  (13)
Then, with one set of true values D₁ and Z₁ measured in advance, for any D₂:
D₂ = D₁Z₁/Z₂  (14)
The measured D₂ is therefore the straight-line distance from the centre point of the marker pattern to the camera.
In this embodiment, because the marker patterns have geometric symmetry, positioning computation with such patterns allows even a monocular camera to measure distance using the parallax principle, so the positioning method does not depend on the performance of an image-matching algorithm; this greatly improves positioning accuracy, reaches the ranging accuracy of a dual-camera setup, and also lowers material cost.
Further, in step S6, angle measurement is performed and the actual distance from the camera to the electronic screen is obtained, specifically comprising:
Step S61: when the image collector is near the central axis of the electronic screen, the marker image on the screen can be captured without deflecting the camera of the image collector.
Specifically, as shown in Fig. 12, let the horizontal distance from the camera to the centre of the electronic screen be DX and the vertical distance DY; let the horizontal pixel difference between the centre of the captured picture and the centre of the overall pattern (comprising the 4 marker patterns) be X and the vertical pixel difference Y; let the width of a single marker pattern in the picture be PX pixels and its height PY pixels; and let the actual side length of a single marker pattern be L. The actual horizontal distance DX and vertical distance DY are then:
DX = X·L/PX  (15)
DY = Y·L/PY  (16)
Assuming the previously measured distance is D, the camera angles are computed as:
θ_X = arcsin(DX/D)  (17)
θ_Y = arcsin(DY/D)  (18)
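A short numeric sketch of formulas (15)-(18) follows; the arcsine form of (17)-(18) is reconstructed from the geometry, since the original equations are given only as images, and the function name is hypothetical.

```python
import math

def camera_angles(x_px, y_px, px, py, l_mm, d_mm):
    """Offsets and angles of the camera relative to the screen centre in the
    undeflected case (step S61): x_px/y_px are the pixel offsets between the
    image centre and the pattern centre, px/py the pixel size of one marker
    square, l_mm its real side length, d_mm the straight-line distance D."""
    dx = x_px * l_mm / px                         # formula (15)
    dy = y_px * l_mm / py                         # formula (16)
    theta_x = math.degrees(math.asin(dx / d_mm))  # formula (17), arcsine assumed
    theta_y = math.degrees(math.asin(dy / d_mm))  # formula (18), arcsine assumed
    return dx, dy, theta_x, theta_y
```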
Step S62: when the image collector deviates far from the centre of the electronic screen, the camera of the image collector must be deflected to capture the marker image on the screen.
Specifically, the rewritten Music algorithm is used to compute the offset angle of the camera, from which the actual offset distances DX and DY from the camera to the electronic screen are derived. The computation proceeds as follows:
After the regions (specific positions) of the 4 marker patterns have been located, 4 equidistant vertical tangent lines are drawn across the 4 marker-pattern regions, as shown in Fig. 13, and the distances along these 4 tangents are measured with the distance-measurement method as X1, X2, X3, X4. Likewise, 4 equidistant vertical tangent lines are drawn across the overall pattern, and the distances measured along them are Y1, Y2, Y3, Y4. When the camera is deflected horizontally, the vertical line segments deform more strongly and are better suited to angle estimation, so the horizontal deflection is estimated from the data on Y1-Y4; similarly, the vertical deflection is estimated from the data on X1-X4.
Further, taking the horizontal deflection estimate as an example, let the tangent spacing in the vertical direction be d. An incident signal X(i) is constructed from four phase terms Z1-Z4, where Z1 = 0 and Z2, Z3, Z4 are built from the measured distances Y1-Y4 and the spacing d (the construction of X(i) and the expressions for Z2-Z4 are given in the original only as equation images).
The covariance matrix of the incident signal is:
R_x(i) = X(i)X^H(i)  (19)
where H denotes the conjugate transpose.
Eigendecomposition of the covariance matrix of formula (19) gives:
R(i) = A·R_x·A^H + σ²I  (20)
where A is the direction response vector extracted from the incident signal X(i), σ² is the noise power, and I is the identity matrix.
From formula (20), the eigenvector corresponding to eigenvalue γ is v(θ); sorting by eigenvalue magnitude, the eigenvector of the largest eigenvalue is taken as the signal subspace and the remaining 3 eigenvalues and eigenvectors as the noise subspace, yielding the noise matrix E_n:
A^H v_i(θ) = 0, i = 2, 3, 4  (21)
E_n = [v₂(θ), v₃(θ), v₄(θ)]  (22)
Finally the offset angle P is estimated, with a the signal (steering) vector extracted from the incident signal X(i):
P(θ) = 1 / (a^H(θ)·E_n·E_n^H·a(θ))  (23)
Step S63: when the camera is deflected by a certain angle, the marker picture captured by the image collector is deformed to some degree, and different deflection angles generally produce different degrees of deformation. The deformation in the picture therefore actually carries information about the camera's offset angle; the Music algorithm converts the deformation of the line segments in the picture into distances as its input values, thereby estimating the deflection angle of the camera with minimal angular error.
本发明实施例中,由于标志物图片在不同位置的形变程度不同,Music算法计算得到的误差也不一样。当标志物图片上的两条线段形变程度的差值最大时,Music算法计算得到的误差最小。因此在转动摄像头的过程中,必然存在一个转动的角度,使得Music算法估计的误差最小,具体计算公式如下:In the embodiment of the present invention, since the deformation degrees of the marker pictures at different positions are different, the errors calculated by the Music algorithm are also different. When the difference between the deformation degrees of the two line segments on the marker image is the largest, the error calculated by the Music algorithm is the smallest. Therefore, in the process of rotating the camera, there must be an angle of rotation, so that the error estimated by the Music algorithm is the smallest. The specific calculation formula is as follows:
假设摄像头拍摄的转换矩阵为:Suppose the transformation matrix captured by the camera is:
K=[α -N1-N2-N,…,α 0,…α N-2N-1N]           (24) K=[α -N1-N2-N ,...,α 0 ,...α N-2N-1N ] (24)
由于摄像头在采集图像时畸变程度关于中心左右对称,因此有:Since the degree of distortion of the camera is symmetrical about the center when collecting images, there are:
α -N=α N>α 1-N=α N-1>……>α 0               (25) α -NN1-NN-1 >...>α 0 (25)
进一步的,假设用于计算角度的两条线段分别在标志物图片上的p处和q处,对应的计算距离为D p和D q,两条线段对应的像素大小为P p和P q,整体图案的边长为L,相机焦距为F,根据公式(9)得到: Further, it is assumed that the two line segments used for calculating the angle are at p and q on the marker image respectively, the corresponding calculated distances are D p and D q , and the pixel sizes corresponding to the two line segments are P p and P q , The side length of the overall pattern is L, and the focal length of the camera is F, which is obtained according to formula (9):
Figure PCTCN2020140729-appb-000037
Figure PCTCN2020140729-appb-000037
Figure PCTCN2020140729-appb-000038
Figure PCTCN2020140729-appb-000038
Figure PCTCN2020140729-appb-000039
Figure PCTCN2020140729-appb-000039
Figure PCTCN2020140729-appb-000040
Figure PCTCN2020140729-appb-000040
进一步的,用于计算两条线段之间的像素差值为W,则:Further, for calculating the pixel difference between two line segments as W, then:
W=P pα P-P qα q                     (30) W=P p α P -P q α q (30)
当q=0、q=N时,即p处在距离q处的最远端时,两条线段的像素差值W最大,则Music算法估计的误差最小。When q=0, q=N, that is, when p is at the farthest distance from q, the pixel difference W of the two line segments is the largest, and the error estimated by the Music algorithm is the smallest.
Therefore, during actual shooting, one edge of the overall pattern should be kept near the image center and the other edge far from it; the angle error estimated by the Music algorithm is then smallest.
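To make the placement rule concrete, here is a small numpy check under an assumed symmetric distortion profile α (real profiles are camera-specific): with equal nominal pixel sizes, the pixel difference W peaks when one segment sits at the image center (q = 0) and the other at an edge (|p| = N).

```python
# A toy numpy check of the placement rule, under an assumed symmetric
# distortion profile alpha: with equal nominal pixel sizes,
# W = P_p*alpha_p - P_q*alpha_q peaks when one segment is at the
# image centre (q = 0) and the other at an edge.
import numpy as np

N = 10
alpha = 1.0 + 0.02 * np.abs(np.arange(-N, N + 1))     # alpha[-N..N], symmetric about the centre

def pixel_difference(p, q, P_p=100.0, P_q=100.0):
    return P_p * alpha[p + N] - P_q * alpha[q + N]    # W for segments at offsets p, q

pairs = ((p, q) for p in range(-N, N + 1) for q in range(-N, N + 1))
best = max(pairs, key=lambda pq: pixel_difference(*pq))
print(best)  # (-10, 0): |p| = N at the edge, q = 0 at the centre
```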
In the embodiment of the present invention, rewriting the input values of the Music algorithm allows the classical Music algorithm to estimate the camera's offset angle: the distances from different points of the marker pattern to the camera are used to estimate the offset angle between the marker pattern and the camera. This greatly improves the estimation accuracy of the offset angle between the camera and the electronic screen, thereby ensuring the reliability of the positioning method.
Referring to FIG. 14, an embodiment of the present invention further provides a system for the UAV positioning method based on screen optical communication. The system includes:
Electronic screen selection module: used to select the electronic screen and set the marker image on it so that it is symmetric;

Collector setup module: used to set up the image collector, including the UAV, the image acquisition module mounted on the UAV, and the fixed bracket for adjusting the UAV's height;

Model building module: used to extract the video stream transmitted by the image collector and recognize the marker picture, obtaining the predicted position of the marker image within the picture;

Image processing module: used to perform image processing according to the predicted position of the marker image, obtaining the coordinates of the marker pattern within it;

Distance measurement module: used to measure distance according to the coordinates of the marker pattern, obtaining the straight-line distance from the center of the marker pattern to the camera;

Angle measurement and positioning module: used to perform angle measurement and, combined with the straight-line distance, obtain the actual distance from the camera to the electronic screen, completing UAV positioning. A structural sketch of how these modules connect is given below.
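```python
# A structural sketch only: hypothetical names wiring the computational
# modules together in the order the embodiment describes; the concrete
# implementations of the four stages are assumed to exist elsewhere.
from typing import Callable

class PositioningPipeline:
    def __init__(self,
                 detect: Callable,    # model building module: frame -> marker box
                 extract: Callable,   # image processing module: (frame, box) -> pattern coords
                 distance: Callable,  # distance measurement module: coords -> straight-line D
                 angle: Callable):    # angle measurement module: (coords, D) -> actual distance
        self.detect, self.extract = detect, extract
        self.distance, self.angle = distance, angle

    def locate(self, frame):
        box = self.detect(frame)
        coords = self.extract(frame, box)
        d = self.distance(coords)
        return self.angle(coords, d)   # actual camera-to-screen distance
```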
Specifically, the system provided in this embodiment of the present invention is used to execute the foregoing method embodiments, and details are not repeated here.
The UAV positioning method and system provided in the embodiments of the present invention use a symmetric marker image on an electronic screen and complete marker-pattern recognition by means of a deep learning model and image processing, solving the problem that the marker image is too small to recognize when the camera is far away. The classical Music algorithm is used to estimate the camera's offset angle and thereby obtain the relative distance from the camera to the electronic screen. Compared with other camera positioning techniques, this greatly extends the effective range of camera positioning and thus improves positioning accuracy.
While keeping equipment cost acceptable, the embodiments of the present invention guarantee UAV positioning accuracy within a distance of 20 meters and perform well in complex and diverse scenarios, enabling wide application of the UAV positioning technique.
The above embodiments are preferred implementations of the present invention, but the implementations of the present invention are not limited thereto; any change, modification, substitution, combination, or simplification made without departing from the spirit and principles of the present invention shall be an equivalent replacement and falls within the protection scope of the present invention.

Claims (10)

1. A UAV positioning method based on screen optical communication, characterized in that the positioning method comprises:

selecting an electronic screen, and setting the marker image on the electronic screen so that it is symmetric;

setting up an image collector, comprising a UAV, an image acquisition module arranged on the UAV, and a fixed bracket for adjusting the UAV's height; a camera is arranged on the image acquisition module to capture the marker image on the electronic screen and obtain a video stream containing marker pictures;

constructing a deep learning model, extracting the video stream transmitted by the image collector, and recognizing the marker picture to obtain the predicted position of the marker image within the picture;

performing image processing according to the predicted position of the marker image to obtain the coordinates of the marker pattern within it;

performing distance measurement according to the coordinates of the marker pattern to obtain the straight-line distance from the center of the marker pattern to the camera;

performing angle measurement and, combined with the straight-line distance, obtaining the actual distance from the camera to the electronic screen to complete UAV positioning.
2. The UAV positioning method based on screen optical communication according to claim 1, characterized in that the SSD300 model is used to recognize the marker picture, the recognition process being as follows:

inputting one frame of the video stream transmitted by the image collector into the SSD300 model;

passing the picture through the VGG-16 network for preliminary feature extraction to obtain a feature map;

using a locator to generate prediction candidate boxes, the regions selected by the candidate boxes serving as feature maps to be recognized;

processing and transforming the feature maps to be recognized to obtain transformed feature maps;

outputting, from the transformed feature maps through a fully connected classifier, the position of the marker image in the picture and an estimate of its degree of similarity;

selecting the candidate boxes whose similarity estimates exceed a preset value and taking their pixel positions as the final predicted position.
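For illustration, a hedged sketch of this recognition flow follows, using torchvision's SSD300 + VGG-16 detector as a stand-in for the claimed model (a recent torchvision is assumed); the 0.5 score threshold plays the role of the preset similarity value and is an illustrative choice.

```python
# A hedged sketch of the recognition flow with torchvision's
# SSD300 + VGG-16 detector standing in for the claimed model.
import torch
import torchvision

SCORE_THRESH = 0.5
model = torchvision.models.detection.ssd300_vgg16(weights="DEFAULT")
model.eval()

def predict_marker_position(frame):
    """frame: float tensor of shape (3, H, W) with values in [0, 1]."""
    with torch.no_grad():
        out = model([frame])[0]            # dict with boxes, labels, scores
    keep = out["scores"] > SCORE_THRESH    # candidates above the preset value
    return out["boxes"][keep]              # pixel positions of accepted boxes
```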
3. The UAV positioning method based on screen optical communication according to claim 2, characterized in that the loss function used to train the model is defined as the weighted sum of the position error and the confidence error, namely:

L(x, c, l, g) = (1/N) [L_conf(x, c) + α L_loc(x, l, g)]

where N is the number of candidate boxes generated by the locator; x is an indicator parameter; c is the predicted class confidence; l is the position coordinates of the candidate boxes generated by the locator; g is the position coordinates of the manually labeled marker pictures; and α is a weight coefficient, set to 1;

the position error is defined as follows:

L_loc(x, l, g) = Σ_{i∈Pos} Σ_{m∈{cx, cy, w, h}} x_ij^k smooth_L1(l_i^m - ĝ_j^m)

smooth_L1(x) = 0.5 x² if |x| < 1, and |x| - 0.5 otherwise

the confidence error is defined as follows:

L_conf(x, c) = -Σ_{i∈Pos} x_ij^p log(ĉ_i^p) - Σ_{i∈Neg} log(ĉ_i^0), where ĉ_i^p = exp(c_i^p) / Σ_p exp(c_i^p)
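A minimal PyTorch sketch of a loss with this structure follows: Smooth L1 position error on matched (positive) boxes plus cross-entropy confidence error, weighted with α = 1 and normalized by the number of positives. Box matching, offset encoding, and hard negative mining are assumed to happen upstream and are omitted.

```python
# A minimal sketch of a MultiBox-style weighted loss; matching and
# offset encoding are assumed done upstream.
import torch
import torch.nn.functional as F

def multibox_loss(loc_pred, loc_gt, conf_pred, conf_gt, pos_mask, alpha=1.0):
    """loc_pred, loc_gt: (num_boxes, 4); conf_pred: (num_boxes, C);
    conf_gt: (num_boxes,) class ids; pos_mask: boolean positives."""
    n = pos_mask.sum().clamp(min=1).float()
    l_loc = F.smooth_l1_loss(loc_pred[pos_mask], loc_gt[pos_mask], reduction="sum")
    l_conf = F.cross_entropy(conf_pred, conf_gt, reduction="sum")
    return (l_conf + alpha * l_loc) / n
```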
4. The UAV positioning method based on screen optical communication according to claim 2, characterized in that the image processing comprises:

enlarging and adjusting the target region corresponding to the predicted position so that it covers the entire marker image, obtaining a target-region picture;

binarizing the target-region picture to convert it into a black-and-white picture;

performing border suppression on the marker image according to the black-and-white picture to peel it away from the background, obtaining a picture of hole-containing regions;

filling each hole-containing region into a solid white pixel block, computing the center position and region position of all hole-containing regions, and performing shape detection on the marker patterns therein to obtain the specific positions of the marker patterns;

sorting the marker patterns according to their specific positions to obtain their corresponding coordinates.
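An OpenCV sketch of this processing chain follows; the margin, the Otsu binarization, and the corner flood-fill seed are illustrative assumptions standing in for the claim's enlargement, binarization, and border suppression steps. An 8-bit grayscale frame is assumed.

```python
# A sketch of the processing chain: enlarge the predicted region,
# binarise it, peel off the border-connected background, and compute
# centroids of the remaining regions.
import cv2
import numpy as np

def process_region(gray, box, margin=20):
    x, y, w, h = box
    roi = gray[max(y - margin, 0): y + h + margin,
               max(x - margin, 0): x + w + margin]      # enlarged target area
    _, bw = cv2.threshold(roi, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    ff = bw.copy()                                       # border suppression:
    mask = np.zeros((ff.shape[0] + 2, ff.shape[1] + 2), np.uint8)
    cv2.floodFill(ff, mask, (0, 0), 0)                   # flood-fill from a corner seed
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(ff)
    return ff, centroids[1:], stats[1:]                  # per-region centres, skipping background
```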
5. The UAV positioning method based on screen optical communication according to claim 4, characterized in that the threshold T for the image conversion is computed with the OSTU algorithm, specifically:

for an image I(x, y), let T be the segmentation threshold between foreground and background; let the proportion of foreground pixels in the whole image be ω_0 with average gray level μ_0, and the proportion of background pixels be ω_1 with average gray level μ_1; let the total average gray level be μ and the between-class variance be g; let the image size be M×N, with N_0 pixels whose gray value is below the threshold T and N_1 pixels whose gray value is above it; then the following formulas are obtained:

ω_0 = N_0 / (M×N)

ω_1 = N_1 / (M×N)

N_0 + N_1 = M×N

ω_0 + ω_1 = 1

μ = ω_0 × μ_0 + ω_1 × μ_1

g = ω_0 (μ_0 - μ)² + ω_1 (μ_1 - μ)²

from the above formulas:

g = ω_0 ω_1 (μ_0 - μ_1)²

T is chosen by traversal, taking the value that maximizes the between-class variance g.
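Since the claim fully specifies the traversal, it translates directly into code: the following numpy sketch scans every threshold T and keeps the one maximizing g = ω_0 ω_1 (μ_0 - μ_1)².

```python
# Exhaustive threshold search maximising the between-class variance;
# 8-bit grayscale input assumed.
import numpy as np

def ostu_threshold(image):
    hist = np.bincount(image.ravel(), minlength=256).astype(float)
    total = hist.sum()
    best_t, best_g = 0, -1.0
    for t in range(1, 256):
        w0 = hist[:t].sum() / total                      # foreground proportion (gray < t)
        w1 = 1.0 - w0                                    # background proportion
        if w0 == 0.0 or w1 == 0.0:
            continue
        mu0 = (hist[:t] * np.arange(t)).sum() / (w0 * total)
        mu1 = (hist[t:] * np.arange(t, 256)).sum() / (w1 * total)
        g = w0 * w1 * (mu0 - mu1) ** 2                   # between-class variance
        if g > best_g:
            best_t, best_g = t, g
    return best_t
```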
6. The UAV positioning method based on screen optical communication according to claim 4, characterized in that the shape detection comprises:

drawing a vertical tangent line through the centroid of each hole-containing region and recording, from top to bottom, the numbers of pixels with values 0 and 1 along the cut;

treating the white blocks (pixel value 1) in a hole-containing region as peaks and the black blocks (pixel value 0) as troughs;

checking the numbers of peaks and troughs of all hole-containing regions and discarding image regions that clearly do not match the marker pattern;

for the image regions whose peak and trough counts meet the requirement, computing the proportional similarity of the peaks and troughs using the Euclidean distance;

traversing every white region, finding all regions that satisfy the proportional similarity, and taking them as the specific positions of the marker pattern.
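A small sketch of the scanline test follows: run-length encode the 0/1 pixels along a vertical cut through a region's centroid, read off peaks (runs of 1s) and troughs (runs of 0s), and compare run-length proportions by Euclidean distance. The template of expected proportions is an assumption, since the claim does not fix the marker's exact stripe ratios.

```python
# Run-length scanline test: peaks are runs of 1s, troughs runs of 0s.
import numpy as np

def runs_along_column(column):
    """column: 1-D array of 0/1 pixels; returns (peak_runs, trough_runs)."""
    runs, values = [], []
    for v in column:
        if values and int(v) == values[-1]:
            runs[-1] += 1
        else:
            runs.append(1)
            values.append(int(v))
    peaks = [r for r, v in zip(runs, values) if v == 1]
    troughs = [r for r, v in zip(runs, values) if v == 0]
    return peaks, troughs

def ratio_similarity(runs, template):
    """Euclidean distance between normalised run-length proportions."""
    if len(runs) != len(template):
        return np.inf                      # wrong peak/trough count: reject region
    a = np.asarray(runs, float) / sum(runs)
    b = np.asarray(template, float) / sum(template)
    return float(np.linalg.norm(a - b))
```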
7. The UAV positioning method based on screen optical communication according to claim 6, characterized in that the parallax principle is used for distance measurement, specifically:

treating the two points P_l1 and P_r1 on the electronic screen as cameras and point A on the image collector as the object, and denoting the projections of point A onto the imaging planes of P_l1 and P_r1 as P_l2 and P_r2, the following formula is obtained:

[equation given as an image in the original]

by the similarity principle, ΔA P_l1 P_r1 and ΔA P_l2 P_r2 are similar, giving the following equation:

[equation given as an image in the original]

where PPI is the resolution of the camera on the image collector, PXM is the length of a unit pixel in the image it captures, B is the real distance between P_l1 and P_r1 on the electronic screen and Z their spacing in the picture, F is the focal length of the camera on the image collector, and D is the distance from the image collector to the electronic screen;

letting the measured parallax of the marker image at a known distance D_1 be Z_1, and the measured parallax at an arbitrary distance D_2 be Z_2, converting with the known D_1 and Z_1 gives:

Z_1 D_1 = Z_2 D_2

D_2 = D_1 Z_1 / Z_2

the resulting D_2 is the straight-line distance D from the center point of the marker pattern to the camera.
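Because Z × D stays constant for a fixed baseline and focal length (as the conversion above implies), the ranging step reduces to one calibrated measurement; here is a sketch assuming that inverse-proportional relation.

```python
# Calibrated parallax ranging: one disparity Z1 measured at a known
# distance D1 calibrates all later measurements.
def distance_from_parallax(D1, Z1, Z2):
    """D1: known calibration distance; Z1, Z2: disparities in pixels."""
    if Z2 <= 0:
        raise ValueError("disparity must be positive")
    return D1 * Z1 / Z2                    # D2 = D1 * Z1 / Z2

# Example: calibrated at 5 m with 80 px of disparity, a later reading
# of 20 px puts the marker centre 5 * 80 / 20 = 20 m from the camera.
print(distance_from_parallax(5.0, 80.0, 20.0))  # 20.0
```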
8. The UAV positioning method based on screen optical communication according to claim 1, characterized in that performing the angle measurement specifically comprises:

when the camera does not need to be deflected to capture the marker image, the camera angles are obtained as:

[two equations given as images in the original, expressing the camera angles in terms of DX and DY]

where DX is the horizontal distance and DY the vertical distance from the camera to the center of the electronic screen;
when the camera needs to be deflected to capture the marker image, the rewritten Music algorithm is used to compute the camera's offset angle, specifically:

drawing four equidistant vertical tangent lines on the marker pattern at its determined position, with horizontal distances X1 to X4 and vertical distances Y1 to Y4, and letting the spacing of the tangent lines in the vertical direction be d;

constructing the incident signal X(i) [given as an image in the original], where Z1, Z2, Z3 and Z4 are:

Z1 = 0

[Z2, Z3 and Z4 are given as images in the original, computed from the distances X1 to X4, Y1 to Y4 and the spacing d]
decomposing the covariance matrix of the incident signal gives:

R(i) = A R_X A^H + σ² I

where A is the direction response vector extracted from the incident signal X(i), H denotes the conjugate transpose of the covariance matrix, σ² is the noise power, and I is the identity matrix;

from the above formula, the eigenvector corresponding to eigenvalue γ is v(θ); sorting by eigenvalue magnitude, the eigenvector corresponding to the largest eigenvalue is taken as the signal subspace, and the remaining three eigenvalues and eigenvectors as the noise subspace, giving the noise matrix E_n:

A^H v_i(θ) = 0, i = 2, 3, 4

E_n = [v_2(θ), v_3(θ), v_4(θ)]

the offset angle P of the camera in the horizontal direction is obtained from:

P(θ) = 1 / (a^H E_n E_n^H a)

where a is the signal vector extracted from the incident signal X(i).
9. The UAV positioning method based on screen optical communication according to claim 8, characterized in that, when the marker picture captured at a camera offset angle is deformed, the deformation degree of the line segments in the picture is converted into distances, specifically comprising:

supposing the camera's imaging transformation matrix is:

K = [α_{-N}, α_{1-N}, α_{2-N}, …, α_0, …, α_{N-2}, α_{N-1}, α_N];

since the camera's distortion is left-right symmetric about the image center when capturing images:

α_{-N} = α_N > α_{1-N} = α_{N-1} > … > α_0

letting the two line segments used to compute the angle lie at p and q in the marker picture, with corresponding calculated distances D_p and D_q, corresponding pixel sizes P_p and P_q, overall pattern side length L, and camera focal length F, the following formulas are obtained:

[four equations, given as images in the original, expressing D_p and D_q in terms of P_p, P_q, α_p, α_q, L and F]

from the above formulas, the pixel difference between the two line segments is W:

W = P_p α_p - P_q α_q

where P_p and P_q are the pixel sizes of the two line segments, α_p and α_q their distortion degrees, and p and q take values from 0 to N;

when q = 0 and p = N, the pixel difference W between the two line segments is largest, and the error of the Music algorithm is smallest.
10. A system based on the UAV positioning method based on screen optical communication according to any one of claims 1-9, characterized in that the system comprises:

an electronic screen selection module, for selecting the electronic screen and setting the marker image on it so that it is symmetric;

a collector setup module, for setting up the image collector, including the UAV, the image acquisition module arranged on the UAV, and the fixed bracket for adjusting the UAV's height;

a model building module, for extracting the video stream transmitted by the image collector and recognizing the marker picture to obtain the predicted position of the marker image within the picture;

an image processing module, for performing image processing according to the predicted position of the marker image to obtain the coordinates of the marker pattern within it;

a distance measurement module, for measuring distance according to the coordinates of the marker pattern to obtain the straight-line distance from the center of the marker pattern to the camera;

an angle measurement and positioning module, for performing angle measurement and, combined with the straight-line distance, obtaining the actual distance from the camera to the electronic screen to complete UAV positioning.
PCT/CN2020/140729 2020-12-10 2020-12-29 Unmanned aerial vehicle positioning method and system based on screen optical communication WO2022121024A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011435354.0 2020-12-10
CN202011435354.0A CN114627398A (en) 2020-12-10 2020-12-10 Unmanned aerial vehicle positioning method and system based on screen optical communication

Publications (1)

Publication Number Publication Date
WO2022121024A1

Family

ID=81895007

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/140729 WO2022121024A1 (en) 2020-12-10 2020-12-29 Unmanned aerial vehicle positioning method and system based on screen optical communication

Country Status (2)

Country Link
CN (1) CN114627398A (en)
WO (1) WO2022121024A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116052003A (en) * 2023-02-07 2023-05-02 中科星图数字地球合肥有限公司 Method and device for measuring antenna angle information and related equipment


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102023003A (en) * 2010-09-29 2011-04-20 清华大学 Unmanned helicopter three-dimensional positioning and mapping method based on laser detection and image recognition
US20180096183A1 (en) * 2016-01-22 2018-04-05 International Business Machines Corporation Optical marker for delivery drone cargo delivery
CN106153008A (en) * 2016-06-17 2016-11-23 北京理工大学 A kind of rotor wing unmanned aerial vehicle objective localization method of view-based access control model
CN106681353A (en) * 2016-11-29 2017-05-17 南京航空航天大学 Unmanned aerial vehicle (UAV) obstacle avoidance method and system based on binocular vision and optical flow fusion
US20200066142A1 (en) * 2018-08-21 2020-02-27 Here Global B.V. Method and apparatus for using drones for road and traffic monitoring
CN109360240A (en) * 2018-09-18 2019-02-19 华南理工大学 A kind of small drone localization method based on binocular vision
CN110017841A (en) * 2019-05-13 2019-07-16 大有智能科技(嘉兴)有限公司 Vision positioning method and its air navigation aid

Also Published As

Publication number Publication date
CN114627398A (en) 2022-06-14


Legal Events

121 Ep: The EPO has been informed by WIPO that EP was designated in this application. Ref document number: 20964953; country of ref document: EP; kind code of ref document: A1.

NENP: Non-entry into the national phase. Ref country code: DE.

122 Ep: PCT application non-entry in European phase. Ref document number: 20964953; country of ref document: EP; kind code of ref document: A1.