WO2022143237A1 - Target positioning method and system, and related device - Google Patents

Target positioning method and system, and related device

Info

Publication number
WO2022143237A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
image
camera
baseline
detection
Prior art date
Application number
PCT/CN2021/139421
Other languages
French (fr)
Chinese (zh)
Inventor
Tang Daolong (唐道龙)
Li Hongbo (李宏波)
Li Donghu (李冬虎)
Chang Sheng (常胜)
Shen Jianhui (沈建惠)
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Publication of WO2022143237A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/26 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
    • G01C21/34 Route searching; Route guidance
    • G01C21/36 Input/output arrangements for on-board computers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/97 Determining parameters from multiple pictures

Definitions

  • The present application relates to the field of artificial intelligence (AI), and in particular to a target positioning method, system, and related device.
  • Stereo vision algorithms are currently widely used in intelligent security, autonomous driving, industrial inspection, 3D reconstruction, virtual reality, and other fields, reflecting their strong technical competitiveness.
  • Stereo vision algorithms usually use a multi-eye camera to photograph the target and obtain multi-channel images of it, and then determine the parallax of the target from those images, where parallax refers to the directional difference produced when the same target is observed from two observation points separated by a certain distance; from the distance between the cameras (i.e., the baseline length) and the parallax, the distance between the target and the camera can be calculated.
  • When the current stereo vision algorithm determines the distance between the target and the camera, since the target is not a single point in the multi-channel images but an image area, the disparity of every pixel in that area must be determined, and the distance between the target and the camera is then computed from those disparities and the baseline length of the multi-eye camera. This process not only consumes huge computing resources but is also prone to noise and calculation errors, resulting in poor target positioning accuracy, which in turn affects subsequent applications such as 3D reconstruction, autonomous driving, and security monitoring.
  • The present application provides a target positioning method, system, and related device, which are used to solve the problems that the target positioning process consumes huge computing resources and that the target positioning accuracy is poor.
  • A first aspect provides a method for locating a target, comprising the following steps: acquiring a first image and a second image, where the first image and the second image are obtained by photographing the same target at the same time with a multi-eye camera; performing target detection and matching on the first image and the second image to obtain a first target area of the first image and a second target area of the second image, where both the first target area and the second target area include the target; performing feature point detection and matching on the first target area and the second target area to obtain a feature point matching result, where the feature point matching result includes the correspondence between the feature points in the first target area and the feature points in the second target area, and feature points with a correspondence describe the same feature of the target; and determining the position information of the target according to the feature point matching result and the parameter information of the multi-eye camera.
  • In a possible implementation, the parameter information includes at least the baseline length of the multi-eye camera and its focal length. The parallax information of the target can be obtained from the pixel differences between corresponding feature points in the feature point matching result; the parallax information includes the difference between the pixel coordinates of a feature point in the first target area and the pixel coordinates of the corresponding feature point in the second target area. The distance between the target and the camera is then determined from the parallax information of the target, the baseline length of the multi-eye camera, and the focal length of the multi-eye camera, yielding the position information of the target, as sketched below.
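  • As an illustrative sketch of this disparity-to-distance step (a minimal Python example; the function and variable names are assumptions for illustration, not taken from the patent), each matched feature pair yields a disparity d as the difference of horizontal pixel coordinates, and depth follows z = f * b / d:

      import numpy as np

      def depth_from_matches(pts_left, pts_right, baseline_m, focal_px):
          """Estimate the target distance from matched feature points.
          pts_left, pts_right: (N, 2) arrays of (x, y) pixel coordinates of
          corresponding feature points in the first and second image."""
          pts_left = np.asarray(pts_left, dtype=np.float64)
          pts_right = np.asarray(pts_right, dtype=np.float64)
          # Parallax of each feature point: difference of horizontal coordinates.
          disparity = pts_left[:, 0] - pts_right[:, 0]
          disparity = disparity[disparity > 0]        # drop degenerate matches
          depths = focal_px * baseline_m / disparity  # z = f * b / d per point
          # A robust statistic over the feature points serves as the distance.
          return float(np.median(depths))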
  • In a possible implementation, the multi-eye camera includes multiple camera groups, and each camera group includes multiple cameras. Based on this, baseline data of the multi-eye camera can be obtained, including the baseline length between the cameras in each group; a target baseline is obtained from the baseline data according to the measurement accuracy requirement of the target, and the first image and the second image are then acquired according to the target baseline, where they are captured by the camera group corresponding to the target baseline.
  • For example, if the multi-eye camera includes N cameras, where N is a positive integer and the cameras are numbered 1, 2, ..., N, every two cameras can form a binocular camera group with a corresponding baseline length: the baseline of the group composed of camera 1 and camera N is BL1, the baseline of the group composed of camera 1 and camera N-1 is BL2, and so on, giving C(N,2) = N(N-1)/2 binocular camera groups in total. It should be understood that the above examples are for illustration, and do not limit the number of cameras in the multi-eye camera or in each camera group.
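  • The pairing itself is simple combinatorics; a hypothetical sketch (the camera positions are illustrative values, not from the patent):

      from itertools import combinations

      # Hypothetical 1-D mounting positions (in meters) of N cameras on a rig.
      camera_positions = {1: 0.0, 2: 0.1, 3: 0.25, 4: 0.6}

      # Every unordered pair forms a binocular camera group whose baseline is
      # the distance between optical centers: C(N,2) = N*(N-1)/2 groups.
      baseline_data = {
          (a, b): abs(camera_positions[b] - camera_positions[a])
          for a, b in combinations(sorted(camera_positions), 2)
      }
      print(len(baseline_data))   # 6 groups for N = 4
      print(baseline_data)        # {(1, 2): 0.1, (1, 3): 0.25, ...}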
  • In a possible implementation, the target baseline can be determined according to the measurement accuracy requirement of the target. Specifically, a first accuracy index and a second accuracy index may first be determined for each camera group, where the first accuracy index is inversely proportional to the baseline length of the group and directly proportional to its common viewing area, and the second accuracy index is directly proportional to the baseline length and focal length of the group; the common viewing area is the area captured simultaneously by the cameras in the group. The weights of the first and second accuracy indices are then determined according to the measurement accuracy requirement of the target, the composite index of each camera group is obtained from the two indices and their weights, and the target baseline is determined according to the composite index of each group. A sketch of this scoring step is given below.
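  • The patent does not give the exact scoring formula, so the sketch below assumes a simple weighted sum of the two indices, with proportionality constants k1 and k2 as free parameters:

      def composite_index(baseline, focal, common_view_area, w1, w2,
                          k1=1.0, k2=1.0):
          """Score one camera group (assumed illustrative formula): the first
          index favors near targets (large common view, short baseline), the
          second favors far targets (long baseline times focal length)."""
          first = k1 * common_view_area / baseline
          second = k2 * baseline * focal
          return w1 * first + w2 * second

      def choose_target_baseline(groups, w1, w2):
          """groups: dict mapping baseline -> (focal_px, common_view_area)."""
          return max(groups, key=lambda b: composite_index(
              b, groups[b][0], groups[b][1], w1, w2))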
  • For a multi-eye camera with a fixed baseline, the ranging range within which accuracy can be guaranteed is also fixed. This is because the closer the target is to the camera, the more the common viewing area of the multi-eye camera approaches zero, where the common viewing area refers to the area that all cameras of the multi-eye camera can capture at the same time; at that point the target may have no imaging point in some of the cameras, so the parallax of the target cannot be calculated. Conversely, the farther the target is from the camera, the more blurred the target areas in the first image and the second image become, which degrades the parallax calculation. A fixed-baseline multi-eye camera therefore has a fixed ranging range.
  • In a possible implementation, the measurement accuracy requirement of the target to be measured may include the approximate distance between the target and the multi-eye camera, in other words, whether the target is a long-distance target or a short-distance target. This can be determined from the size of the image area the target occupies in the images collected by the multi-eye camera: a long-distance target occupies a very small image area, while a short-distance target occupies a very large one. Accordingly, when the image area is smaller than a first threshold, the target to be measured can be determined to be a long-distance target, and when the image area is larger than a second threshold, it can be determined to be a short-distance target. The measurement accuracy requirement may also include a measurement error threshold for the target to be measured, for example that the measurement error must not exceed 1 meter. It should be understood that the above examples are for illustration, and are not specifically limited in the present application.
  • In a possible implementation, the baseline data of the multi-eye camera includes not only the baseline length between the cameras in each camera group but also the size of the common viewing area between those cameras, where the size of the common viewing area can be determined from the shooting range of each camera in the group, and the shooting range refers to the geographic area recorded in the images captured by the camera. During specific implementation, the edge position points visible in the video picture of each channel can be identified, their pixel coordinates converted into geographic coordinates through a camera calibration algorithm, the shooting range of each channel determined from the area enclosed by those geographic coordinates, and the size of the common viewing area then obtained, as sketched below.
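  • A hedged sketch of this step, assuming a planar scene so that a calibrated pixel-to-ground homography maps each image border to a ground polygon (cv2 and shapely are illustrative library choices, not prescribed by the patent):

      import cv2
      import numpy as np
      from shapely.geometry import Polygon

      def shooting_range(homography, width, height):
          """Project the image border (edge position points) to geographic
          coordinates via a 3x3 pixel-to-ground homography."""
          corners = np.float32([[[0, 0]], [[width, 0]],
                                [[width, height]], [[0, height]]])
          ground = cv2.perspectiveTransform(corners, homography)
          return Polygon(ground.reshape(-1, 2))

      def common_view_area(h_left, h_right, width, height):
          """Common viewing area = intersection of the two shooting ranges."""
          left = shooting_range(h_left, width, height)
          right = shooting_range(h_right, width, height)
          return left.intersection(right).area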
  • In a possible implementation, a baseline adjustment request carrying the target baseline can be sent to the multi-eye camera, where the request is used to instruct the multi-eye camera to adjust the baseline length of one of its camera groups to the above target baseline, after which the first image and the second image captured by the camera group corresponding to the target baseline are received.
  • In the above solution, the target baseline is determined according to the size of the common viewing area of the multi-eye camera and the measurement accuracy requirement of the target to be measured, so measurement accuracy can be improved as much as possible while ensuring that the target is within the shooting range of the binocular camera corresponding to the target baseline. This solves the problem that a fixed-baseline multi-eye camera can only measure targets within a fixed ranging range, and expands the ranging range of the ranging and positioning system provided by the present application.
  • In a possible implementation, the multi-eye camera photographs the target to obtain a first channel video and a second channel video; after these are received, time synchronization processing is performed on them to obtain the first image and the second image at the same moment, where the first image is an image frame in the first channel video and the second image is an image frame in the second channel video.
  • Specifically, a reference frame can be obtained from the first channel video and multiple motion frames from the second channel video, where the reference frame and the motion frames all include a moving object; feature point matching is performed between the reference frame and each motion frame to obtain a synchronization frame among the motion frames, where the parallelism of the lines connecting feature points in the synchronization frame with the corresponding feature points in the reference frame satisfies a preset condition; time synchronization correction is then performed on the first and second channel videos according to the reference frame and the synchronization frame to obtain the first image and the second image at the same moment. Satisfying the preset condition may mean selecting the frame whose connecting lines have the highest parallelism as the synchronization frame.
  • For example, suppose the video collected by camera 2 runs 20 ms ahead of the video collected by camera 1. If disparity calculation is performed directly on the first and second channel videos collected by camera 1 and camera 2, the resulting disparity information will contain errors, hindering subsequent applications such as ranging and 3D reconstruction. Time-synchronizing the first channel images and the second channel images solves this problem.
  • In a possible implementation, the above reference frame and motion frames may be determined by an optical flow method, where optical flow refers to the instantaneous velocity of the pixel motion of a moving object in space on the observation imaging plane; over a small time interval, the optical flow is also equivalent to the displacement of the moving object.
  • During specific implementation, the flow of determining the reference frame and the motion frames can be as follows: first, perform detection of the synchronization target on each frame of the first channel video and the second channel video, obtaining one or more synchronization targets in each frame of image. When performing this detection, the detected synchronization target should be a target that may move, not a stationary target such as a building; the synchronization target may therefore be the target to be measured described above, or another target, which is not specifically limited in this application. For example, if the target to be measured is a pedestrian, the synchronization target used to achieve time synchronization can be a pedestrian or a vehicle; if the target to be measured is vehicle A, the synchronization target can be vehicles and pedestrians. The above examples are for illustration and are not limited in this application.
  • The target detection algorithm in this embodiment of the present application may use any existing, well-performing neural network model for target detection, for example the You Only Look Once (YOLO) model, the Single Shot multibox Detector (SSD) model, the Region-based Convolutional Neural Network (RCNN) model, or the Fast Region-based Convolutional Neural Network (Fast-RCNN) model, which is not specifically limited in this application.
  • Similarly, the optical flow method in the embodiments of the present application can be any industry-proven optical flow algorithm, such as the Lucas-Kanade (LK) optical flow method.
  • After the optical flow of each object in each frame (that is, the object's instantaneous velocity) is obtained, whether the object is a moving object can be determined by checking whether its velocity has a component along the image row direction. Specifically, since the cameras of the multi-eye camera (such as the one shown in FIG. 1) are fixed at the same height, an object moving along the row direction changes its row coordinate; so if the row coordinate of object X in motion frame Tn differs from the row coordinate of the same object X in the previous frame Tn-1 (or the next frame Tn+1), the object can be determined to be a moving object. It can be understood that an object moving vertically moves only in the column direction: it has no velocity component along the image row direction and therefore contributes nothing to the disparity calculation. Such vertically moving objects are treated as non-moving objects and do not participate in the parallax calculation, which reduces the amount of computation and improves the accuracy and efficiency of the parallax calculation.
  • Specifically, feature point matching can be performed between the moving object in the reference frame and the moving object in each motion frame, and the difference Δs of the row coordinates of each pair of matched feature points calculated. If Δs is non-zero, the two frames are not synchronized, so an accurate synchronization offset time Δt can be calculated from Δs, and Δt can then be used as the compensation for the row coordinates of each subsequent frame, yielding the synchronized first and second channel videos and, from them, the first image and the second image at each moment (see the sketch below).
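  • A hedged sketch of this synchronization step (the patent does not spell out how Δt is recovered from Δs; here Δt is assumed to be Δs divided by the object's pixel velocity along the row direction, estimated with LK optical flow):

      import cv2
      import numpy as np

      def row_velocity(prev_gray, cur_gray, pts, frame_dt):
          """Pixel velocity of tracked feature points along the image row
          direction via Lucas-Kanade optical flow; pts is an (N, 1, 2)
          float32 array. Objects with ~0 row velocity are treated as
          non-moving and excluded from the parallax calculation."""
          nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, cur_gray, pts, None)
          ok = status.ravel() == 1
          good_prev = pts[ok].reshape(-1, 2)
          good_next = nxt[ok].reshape(-1, 2)
          return float(np.median(good_next[:, 0] - good_prev[:, 0]) / frame_dt)

      def sync_offset_time(delta_s, v_row):
          """Assumed recovery formula Δt = Δs / v_row, where Δs is the
          residual row-coordinate difference of matched feature points
          between the reference frame and the synchronization frame."""
          return delta_s / v_row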
  • In a possible implementation, stereo rectification may also be performed on the first and second channel videos. The formulas used for parallax calculation are usually derived under the assumption that the multi-eye camera is in an ideal state, so before using the multi-eye camera for ranging and positioning, the actual camera should be rectified to that ideal state, in which the image planes of the left and right cameras are parallel, the optical axes are perpendicular to the image planes, and the epipoles lie at infinity. The embodiment of the present application may adopt any industry-proven stereo rectification method, such as the Bouguet epipolar rectification method, which is not specifically limited in the present application; a minimal rectification sketch follows.
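  • OpenCV's cv2.stereoRectify implements Bouguet's rectification; a minimal sketch (the calibration values below are placeholders, not from the patent):

      import cv2
      import numpy as np

      # Placeholder calibration: intrinsics K, distortion D for each camera,
      # and rotation R / translation T of the right camera w.r.t. the left.
      K1 = K2 = np.array([[1000., 0., 640.], [0., 1000., 360.], [0., 0., 1.]])
      D1 = D2 = np.zeros(5)
      R = np.eye(3)
      T = np.array([0.12, 0., 0.])    # 12 cm baseline along x
      size = (1280, 720)

      R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K1, D1, K2, D2, size, R, T)
      map1x, map1y = cv2.initUndistortRectifyMap(K1, D1, R1, P1, size, cv2.CV_32FC1)
      map2x, map2y = cv2.initUndistortRectifyMap(K2, D2, R2, P2, size, cv2.CV_32FC1)
      # rectified_left = cv2.remap(frame_left, map1x, map1y, cv2.INTER_LINEAR)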
  • In the above solution, time synchronization processing is performed on the first and second channel videos to obtain the first image and the second image, and the position information of the target is then determined from them, which improves the accuracy of the position information and hence of subsequent applications such as AR, VR, and 3D reconstruction.
  • In a possible implementation, the first image may be input into a detection and matching model to obtain a first detection and matching result of the first image, and the second image may be input into the detection and matching model to obtain a second detection and matching result of the second image, where each result includes a target frame and a label: the target frame indicates the area of the target in the image, and the same target carries the same label. The first target area is obtained from the first detection and matching result, and the second target area is obtained from the second detection and matching result.
  • During specific implementation, the target frame in a detection and matching result may be a rectangular frame, a circular frame, an oval frame, and so on, which is not specifically limited in this application. It should be understood that if there are multiple targets to be measured, the detection and matching results may include multiple target frames for the multiple targets. In the results, the same target is identified by the same label and different targets by different labels; in this way, when disparity is computed for the target, the same target can be identified across different video frames by its label, so that feature point matching of the same target in the first image and the second image at the same moment can be achieved and the parallax of the target obtained (see the sketch below).
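  • A hedged sketch of matching feature points inside two same-label target areas (ORB with brute-force Hamming matching is an illustrative choice; the patent does not mandate a specific feature detector):

      import cv2

      def match_target_features(img1, img2, box1, box2):
          """Match feature points between the same-label target frames of the
          first and second image; each box is (x, y, w, h) in pixels."""
          x1, y1, w1, h1 = box1
          x2, y2, w2, h2 = box2
          crop1 = img1[y1:y1 + h1, x1:x1 + w1]
          crop2 = img2[y2:y2 + h2, x2:x2 + w2]

          orb = cv2.ORB_create()
          kp1, des1 = orb.detectAndCompute(crop1, None)
          kp2, des2 = orb.detectAndCompute(crop2, None)

          matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
          matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

          # Return full-image pixel coordinates of corresponding feature points;
          # these pairs feed the disparity-to-distance step sketched earlier.
          pts1 = [(kp1[m.queryIdx].pt[0] + x1, kp1[m.queryIdx].pt[1] + y1)
                  for m in matches]
          pts2 = [(kp2[m.trainIdx].pt[0] + x2, kp2[m.trainIdx].pt[1] + y2)
                  for m in matches]
          return pts1, pts2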
  • During specific implementation, the detection and matching model may include a feature extraction module and a detection and matching module, where the feature extraction module extracts features from the input first and second images and generates high-dimensional feature vectors, and the detection and matching module generates, from those feature vectors, a detection and matching result containing the target frame and label.
  • Before the detection and matching model is used, it may be trained with a sample set comprising first image samples, second image samples, and sample ground truths, where each ground truth includes a target detection ground truth and a target matching ground truth: the target detection ground truth includes the target frames of the target in the first and second image samples, and the target matching ground truth includes the labels of the target in the first and second image samples. During training, a detection and matching loss used for backpropagation is determined from the gap between the model output and the sample ground truth, and the parameters of the detection and matching model are adjusted according to this loss until the loss reaches a threshold, after which the trained detection and matching model is obtained.
  • The feature extraction module can be a neural network backbone such as VGG or ResNet for extracting image features, and the detection and matching module can be a target detection network such as a YOLO network, an SSD network, or an RCNN, which is not specifically limited in this application; a minimal sketch follows.
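  • A minimal PyTorch sketch of such a two-module model (the architecture details are assumptions for illustration, not the patent's design): a ResNet backbone acts as the feature extraction module, and a toy head predicts one target frame plus an embedding whose nearest neighbor across the two images determines the shared label:

      import torch
      import torch.nn as nn
      import torchvision

      class DetectionMatchingModel(nn.Module):
          def __init__(self, embed_dim=64):
              super().__init__()
              backbone = torchvision.models.resnet18(weights=None)
              # Feature extraction module: everything except the classifier.
              self.features = nn.Sequential(*list(backbone.children())[:-1])
              self.box_head = nn.Linear(512, 4)            # target frame (x, y, w, h)
              self.embed_head = nn.Linear(512, embed_dim)  # matching embedding

          def forward(self, x):
              f = self.features(x).flatten(1)  # high-dimensional feature vector
              return self.box_head(f), self.embed_head(f)

      model = DetectionMatchingModel()
      boxes, embeddings = model(torch.randn(2, 3, 224, 224))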
  • In the above solution, the first target area and the second target area can be determined simply by checking whether the labels are the same, rather than by performing image recognition on the target areas to determine the same target in the first and second images, which reduces computational complexity, improves the efficiency of obtaining the first and second target areas, and further improves the efficiency of ranging and positioning.
  • In a second aspect, a target positioning system includes: an acquisition unit, configured to acquire a first image and a second image, obtained by photographing the same target at the same time with a multi-eye camera; a detection and matching unit, configured to perform target detection and matching on the first image and the second image to obtain a first target area of the first image and a second target area of the second image, where both target areas include the target; the detection and matching unit is further configured to perform feature point detection and matching on the first and second target areas to obtain a feature point matching result, where the result includes the correspondence between feature points in the first target area and feature points in the second target area, and feature points with a correspondence describe the same feature of the target; and a position determination unit, configured to determine the position information of the target according to the feature point matching result and the parameter information of the multi-eye camera.
  • In summary, a target baseline can be determined according to the target to be measured, the camera group with the target baseline collects the target to obtain a first image and a second image, target detection and matching on the two images yield the first and second target areas where the target is located, and feature point detection and matching on those areas yield the feature point matching result, from which the disparity information of each feature point, and thus the position information of the target, is determined.
  • In this way, the system can flexibly select a camera group with the target baseline for data collection according to the target to be measured, avoiding the limited ranging range of fixed-baseline multi-eye cameras and expanding the ranging range of the target positioning system.
  • Moreover, the system determines the location information of the target from the parallax information of the feature points and does not need to perform matching and parallax calculation for every pixel in the first and second image areas, thereby reducing the computing resources required for positioning and ranging, avoiding problems such as background interference and noise, and improving the accuracy of ranging and positioning.
  • In a possible implementation, the parameter information includes at least the baseline length and the focal length of the multi-eye camera; the position determination unit is configured to obtain the disparity information of the target from the pixel differences between feature points with a correspondence in the feature point matching result, where the disparity information includes the difference between the pixel coordinates of a feature point in the first target area and the pixel coordinates of the corresponding feature point in the second target area; and the position determination unit is configured to determine the distance between the target and the camera from the disparity information of the target, the baseline length of the multi-eye camera, and the focal length of the multi-eye camera, obtaining the position information of the target.
  • In a possible implementation, the multi-eye camera includes multiple camera groups, each of which includes multiple cameras, and the system further includes a baseline determination unit configured to acquire the baseline data of the multi-eye camera, where the baseline data includes the baseline length between the cameras in each group; the baseline determination unit is configured to obtain the target baseline from the baseline data according to the measurement accuracy requirement of the target; and the acquisition unit is configured to acquire the first image and the second image according to the target baseline, where they are captured by the camera group corresponding to the target baseline.
  • the baseline determination unit is configured to send a baseline adjustment request carrying a target baseline to the multi-camera camera, where the baseline adjustment request is used to instruct the multi-camera camera to adjust the baseline length of the camera group it includes to the target baseline; an acquisition unit configured to receive the first image and the second image captured by the camera group corresponding to the target baseline.
  • In a possible implementation, the baseline determination unit is configured to determine a first accuracy index and a second accuracy index of each camera group, where the first accuracy index is inversely proportional to the baseline length of the group and directly proportional to its common viewing area (the area captured simultaneously by the cameras in the group), and the second accuracy index is directly proportional to the baseline length and focal length of the group; the baseline determination unit is configured to determine the weights of the first and second accuracy indices according to the measurement accuracy requirement of the target; the baseline determination unit is configured to obtain the composite index of each camera group from the first accuracy index, the second accuracy index, and the weights; and the baseline determination unit is configured to determine the target baseline according to the composite index of each group.
  • In a possible implementation, the system further includes a synchronization unit configured to receive the first channel video and the second channel video obtained by the multi-eye camera shooting the target, and to perform time synchronization processing on the two videos to obtain the first image and the second image at the same moment, where the first image is an image frame in the first channel video and the second image is an image frame in the second channel video.
  • In a possible implementation, the synchronization unit is configured to obtain a reference frame from the first channel video and multiple motion frames from the second channel video, where the reference frame and the motion frames include a moving object; the synchronization unit is configured to perform feature point matching between the reference frame and the motion frames to obtain a synchronization frame among the motion frames, where the parallelism of the lines connecting feature points in the synchronization frame with the corresponding feature points in the reference frame satisfies a preset condition; and the synchronization unit is configured to perform time synchronization correction on the first and second channel videos according to the reference frame and the synchronization frame to obtain the first image and the second image at the same moment.
  • In a possible implementation, the detection and matching unit is configured to input the first image into the detection and matching model to obtain a first detection and matching result of the first image, and to input the second image into the detection and matching model to obtain a second detection and matching result of the second image, where each result includes a target frame and a label, the target frame indicates the area of the target in the image, and the same target carries the same label; the detection and matching unit is configured to obtain the first target area from the first detection and matching result and the second target area from the second detection and matching result.
  • a computer program product comprising a computer program that, when read and executed by a computing device, implements the method described in the first aspect.
  • a computer-readable storage medium comprising instructions that, when executed on a computing device, cause the computing device to implement the method as described in the first aspect.
  • a computing device including a processor and a memory, where the processor executes code in the memory to implement the method described in the first aspect.
  • FIG. 1 is a schematic diagram of an imaging structure based on a binocular camera;
  • FIG. 2 is an architecture diagram of a stereo vision system provided by the present application;
  • FIG. 3 is a schematic diagram of the deployment of a target positioning system provided by the present application;
  • FIG. 5 is a schematic diagram of the imaging of a binocular camera with an excessively long baseline when shooting a target point;
  • FIG. 6 is a schematic flowchart of the steps of a target positioning method provided by the present application;
  • FIG. 7 is a flowchart of the steps of performing time synchronization on the first channel video and the second channel video provided by the present application;
  • FIG. 8 is an example diagram of a target detection and matching result provided by the present application;
  • FIG. 9 is a schematic structural diagram of a target detection model provided by the present application;
  • FIG. 11 is a schematic diagram of a feature point matching result of a target positioning method provided by the present application in an application scenario;
  • FIG. 12 is a schematic diagram of a textureless object provided by the present application;
  • FIG. 13 is a schematic flowchart of the steps for determining the first target area and the second target area in an occlusion scenario provided by the present application;
  • FIG. 14 is a schematic structural diagram of a target positioning system provided by the present application;
  • FIG. 15 is a schematic structural diagram of a computing device provided by the present application.
  • Stereoscopic positioning refers to determining the position, in three-dimensional world space, of objects appearing in the video or image information obtained by an image sensor. By analyzing the video information collected by the image sensor, people can realize target coordinate positioning, target ranging, 3D reconstruction, and so on, and feed the results back to a terminal or cloud processor to serve richer applications such as intelligent security, autonomous driving, industrial inspection, intelligent transportation, AR, VR, ADAS, and medicine.
  • Stereo vision positioning usually uses a binocular camera to photograph the target and obtain multi-channel video of it. The parallax is determined first from the multi-channel video, where parallax refers to the directional difference generated when the same target is observed from two observation points separated by a certain distance; from the distance between the cameras (i.e., the baseline length) and the parallax, the distance between the target and the camera can be calculated, giving the accurate position of the target in three-dimensional world space.
  • During specific implementation, the parallax may be the difference in the pixel coordinates of the target between images captured by different cameras. For example, if a binocular camera includes a left camera and a right camera, the pixel coordinates of target X in the picture captured by the left camera are (x, y), and its pixel coordinates in the picture captured by the right camera are (x+d, y), then d is the parallax of target X in the horizontal direction.
  • The method for determining the distance between the target and the camera from the baseline length and the parallax is illustrated below with reference to FIG. 1. Assume that camera parameters such as the focal length and the imaging plane length of the two cameras of the binocular camera are the same. In FIG. 1, P is the target point to be detected, O_L is the optical center of the left camera of the binocular camera, and O_R is the optical center of the right camera; the line segment AB is the imaging plane of the left camera, the line segment CD is the imaging plane of the right camera, and the line segment O_L O_R is the baseline of the binocular camera, of length b. The point P_L on imaging plane AB is the image of target point P on the left camera, and the point P_R on imaging plane CD is the image of target point P on the right camera. The distance X_L between P_L and point A at the leftmost edge of imaging plane AB is the image abscissa of target point P in the image captured by the left camera, and the distance X_R between P_R and point C at the leftmost edge of imaging plane CD is the image abscissa of target point P in the image captured by the right camera, so the parallax of the target point P to be detected is (X_L - X_R). With X_L - X_R as the parallax, b as the baseline length, and f as the focal length of the camera, the distance z between the target and the camera can be obtained from the parallax, baseline length, and focal length of the multi-eye camera, as reconstructed below. It should be understood that the derivation shown in FIG. 1 is used for illustration, and the present application does not limit the specific algorithm for determining the distance between the target and the camera from the parallax.
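  • By similar triangles in FIG. 1, the depth relation can be reconstructed as follows (a standard derivation consistent with the quantities defined above; whether this is exactly the patent's "formula 3" is not confirmed here):

      \frac{b - (X_L - X_R)}{b} = \frac{z - f}{z}
      \quad\Longrightarrow\quad
      z = \frac{b \cdot f}{X_L - X_R}

    For example, with b = 0.12 m, f = 1000 pixels, and a parallax of 4 pixels, z = 0.12 * 1000 / 4 = 30 m.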
  • When the current stereo vision algorithm determines the distance between the target and the camera, since the target is not a single point in the multi-channel images but an image area, the disparity of every pixel in that area must be determined, and the distance between the target and the camera is computed from those disparities and the baseline length of the multi-eye camera. This process not only consumes huge computing resources, but the large number of calculations is also prone to noise, calculation errors, and background interference, so the accuracy of ranging and positioning cannot be guaranteed, which in turn affects subsequent applications such as 3D reconstruction, autonomous driving, and security monitoring.
  • In order to solve the problems of large computing resource consumption and poor accuracy in ranging and positioning, the present application provides a target positioning system that can flexibly set the baseline length of the multi-eye camera according to the target to be measured, thereby solving the problem of a small ranging range. By performing target detection and matching on the first image and the second image captured by the multi-eye camera, the target areas where the target is located in the two images are determined; feature point detection and matching are then performed on those target areas, and the disparity information of the target is determined from the pixel differences between feature points in one target area and the corresponding feature points in the other, without matching every pixel in the target area. This reduces the computing resources required for ranging and positioning, avoids interference from the background image in the target parallax calculation, and improves the accuracy of the parallax calculation and hence of ranging and positioning.
  • FIG. 2 is a system architecture diagram of an embodiment of the present application. As shown in FIG. 2, the system architecture for stereo vision positioning provided by the present application may include a target positioning system 110, a multi-eye camera 120, and an application server 130, where the target positioning system 110 and the multi-eye camera 120 are connected through a network, as are the application server 130 and the target positioning system 110. The network may be a wireless network or a wired network, which is not specifically limited in this application.
  • The multi-eye camera 120 includes multiple camera groups, and each camera group includes multiple cameras. For example, if the multi-eye camera 120 includes N cameras, where N is a positive integer and the cameras are numbered 1, 2, ..., N, every two cameras can form a binocular camera group with a corresponding baseline length: the baseline of the group composed of camera 1 and camera N is BL1, the baseline of the group composed of camera 1 and camera N-1 is BL2, and so on, giving N(N-1)/2 binocular camera groups. It should be understood that the above examples are for illustration, and are not specifically limited in the present application.
  • The multi-eye camera 120 is used to send baseline data to the target positioning system 110, where the baseline data includes the baseline length between the cameras in each camera group. Still taking the above example, since the multi-eye camera 120 includes N(N-1)/2 binocular camera groups, the baseline data can include the baseline lengths of those N(N-1)/2 binocular camera groups.
  • the multi-eye camera 120 is further configured to receive the target baseline sent by the target positioning system 110 , and send the first image and the second image collected by the camera group corresponding to the target baseline to the target positioning system 110 .
  • the first image and the second image are images obtained by photographing the same target at the same time.
  • The target baseline is determined by the target positioning system 110 according to the measurement accuracy requirement of the target.
  • During specific implementation, the multi-eye camera 120 can receive a baseline adjustment request sent by the target positioning system 110, where the request carries the above target baseline; according to the request, the multi-eye camera 120 uses the camera group corresponding to the target baseline to capture the first image and the second image, which are then sent to the target positioning system 110.
  • In the above solution, the target positioning system determines the baseline length of the multi-eye camera 120 according to the measurement accuracy requirement of the target: for example, a binocular camera with a longer baseline can be used when ranging a long-distance target, and a binocular camera with a shorter baseline when ranging and positioning a short-distance target, which expands the range of ranging and positioning and solves the problem of a small ranging range.
  • The multi-eye camera 120 is further configured to use the camera group corresponding to the target baseline to shoot the target and obtain multi-channel videos, such as a first channel video and a second channel video, and then send them to the target positioning system 110, where the two videos include the above first image and second image; the target positioning system 110 can perform time synchronization processing on the first and second channel videos to obtain the above first image and second image.
  • It should be noted that the first channel video and the second channel video may be real-time video collected by the multi-eye camera 120, or cached historical video. For example, if the multi-eye camera 120 includes 10 cameras located at the gate of a community, each camera can collect the surveillance video of the gate from 8:00 a.m. to 8:00 p.m. and transmit it to the target positioning system 110 at 9:00 p.m. as the first and second channel videos for processing, or each camera can collect the surveillance video of the gate in real time and transmit it to the target positioning system 110 in real time through the network for processing, which is not specifically limited in this application.
  • Optionally, the cameras in the multi-eye camera 120 can also be replaced by a monocular movable camera, for example a camera system that only includes one camera mounted on a slidable support rod: by sliding along the rod, the camera can collect the first and second channel videos of the target from different angles, and the distance the camera moves along the rod is the above target baseline. Of course, the multi-eye camera 120 may also adopt other structures capable of capturing the first and second channel videos of the same target, which is not specifically limited in this application.
  • The application server 130 may be a single server or a server cluster composed of multiple servers; a server may be implemented by a general-purpose physical server, for example an ARM server or an X86 server, or by a virtual machine (VM) implemented with network functions virtualization (NFV) technology, such as a virtual machine in a data center, which is not specifically limited in this application.
  • the application server 130 is used to realize functions such as three-dimensional reconstruction, industrial detection, intelligent security, AR, VR, automatic driving, etc. according to the position information sent by the target positioning system 110 .
  • The target positioning system 110 is configured to receive the first image and the second image sent by the multi-eye camera 120, perform target detection and matching on them, and obtain the first target area of the first image and the second target area of the second image, where both target areas include the above target to be measured. Feature point detection and matching are then performed on the first and second target areas to obtain a feature point matching result, where the result includes the correspondence between the feature points in the first target area and those in the second target area, and feature points with a correspondence describe the same feature of the target. Finally, the parallax information of the target can be determined from the feature point matching result and the parameter information of the multi-eye camera, the position information of the target determined according to formula 3, and the result sent to the above application server 130, so that the application server 130 can realize functions such as 3D reconstruction, AR, VR, and automatic driving according to the position information.
  • In addition, the target positioning system 110 can also receive the above baseline data sent by the multi-eye camera 120, obtain the target baseline from the baseline data according to the measurement accuracy requirement of the target to be measured, and send it to the multi-eye camera 120, so that the multi-eye camera 120 uses the camera group corresponding to the target baseline to capture the target and obtain the first image and the second image. Optionally, when the multi-eye camera 120 collects the target with the camera group corresponding to the target baseline, it can also obtain the first channel video and the second channel video; after the target positioning system 110 receives them, time synchronization processing can be performed on the two videos to obtain the above first image and second image.
  • The target positioning system 110 provided by the present application is flexible in deployment and can be deployed in an edge environment; specifically, it can be an edge computing device in the edge environment or a software system running on one or more edge computing devices. The edge environment refers to a cluster of edge computing devices that are geographically close to the multi-eye camera 120 and provide computing, storage, and communication resources, for example edge computing all-in-one machines located on both sides of a road.
  • For example, the target positioning system 110 may be one or more edge computing devices located near an intersection, or a software system running on them, where camera 1, camera 2, and camera 3 are set up to monitor the intersection. The edge computing device can determine that the most suitable baseline for the target to be measured is BL3, and the cameras satisfying baseline BL3 are camera 1 and camera 3. The edge computing device can perform time synchronization processing on the first and second channel videos collected by camera 1 and camera 3 to obtain the first image and the second image at the same moment, then perform target detection on them to obtain the first target image and the second target image, both of which include the target to be measured. Next, feature point detection and matching are performed on the first and second target images to obtain the feature point matching result; from this result and the parameter information of camera 1 and camera 3, the parallax information of the target can be determined, the position information of the target determined in combination with formula 3, and the position information sent to the application server 130, so that the application server can realize functions such as 3D reconstruction, AR, VR, and autonomous driving according to the position information.
  • The target positioning system 110 may also be deployed in a cloud environment, which is an entity that uses basic resources to provide cloud services to users in a cloud computing mode. The cloud environment includes a cloud data center and a cloud service platform, where the cloud data center includes a large number of basic resources (computing, storage, and network resources) owned by the cloud service provider. The target positioning system 110 may be a server in the cloud data center, a virtual machine created in it, or a software system deployed on servers or virtual machines there; the software system may be distributed across multiple servers, across multiple virtual machines, or across both virtual machines and servers.
  • For example, the target positioning system 110 can also be deployed in a cloud data center far away from an intersection where camera 1, camera 2, and camera 3 are set up to monitor the intersection. The cloud data center can determine that the most suitable baseline for the target to be measured is BL3, with camera 1 and camera 3 satisfying baseline BL3; it can perform time synchronization processing on the first and second channel videos collected by camera 1 and camera 3 to obtain the first image and the second image at the same moment, determine the position information of the target from them as described above, and send the position information to the application server 130, so that the application server can implement functions such as 3D reconstruction, AR, VR, and automatic driving according to the position information.
  • Optionally, part of the target positioning system 110 may be deployed in an edge environment and part in a cloud environment. For example, the edge computing device is responsible for determining the baseline of the binocular camera according to the target to be measured, and the cloud data center determines the parallax information from the first and second channel videos collected by the binocular camera. Concretely, with camera 1, camera 2, and camera 3 monitoring the intersection, the edge computing device can determine that the most suitable baseline is BL3, satisfied by camera 1 and camera 3; the edge computing device can also perform time synchronization processing on the first and second channel videos collected by camera 1 and camera 3 to obtain the first image and the second image at the same moment, and send them to the cloud data center. The cloud data center performs target detection on the first image and the second image to obtain the first target image and the second target image, both of which include the target to be measured; it then performs feature point detection and matching on the two target images to obtain the feature point matching result, from which the parallax information of the target can be determined. The position information of the target is then determined in combination with formula 3 and sent to the application server 130, so that the application server can realize functions such as 3D reconstruction, AR, VR, and automatic driving according to the position information.
  • The unit modules inside the target positioning system 110 may be divided in multiple ways, and each module may be a software module, a hardware module, or partly software and partly hardware, which is not limited in this application; FIG. 2 shows an exemplary division.
  • As shown in FIG. 2, the target positioning system 110 may include a baseline determination unit 111, a synchronization unit 112, and a detection and matching unit 113. It should be noted that the modules of the target positioning system can be deployed on the same edge computing device, the same cloud data center, or the same physical machine, or partly on edge computing devices and partly in cloud data centers, for example with the baseline determination unit 111 deployed on an edge computing device and the synchronization unit 112 and detection and matching unit 113 deployed in a cloud data center, which is not specifically limited in this application.
  • The baseline determination unit 111 is configured to receive the baseline data sent by the multi-eye camera 120, determine the target baseline according to the measurement accuracy requirement of the target to be measured, and send it to the multi-eye camera 120. It should be understood that the baseline length of the multi-eye camera and the common viewing area both influence the measurement accuracy of the parallax, where the common viewing area refers to the area that all cameras of the multi-eye camera can photograph at the same time; the baseline determination unit 111 can therefore flexibly determine the target baseline according to the measurement accuracy requirement of the target to be measured, expanding the ranging range of the ranging and positioning system provided by the present application.
  • FIG. 4 is a schematic diagram of the ranging error of binocular cameras with different baselines for a fixed target. It can be seen from FIG. 4 that when the distance between the target and the multi-eye camera 120 is 50 meters, different baseline lengths give different ranging errors: the shorter the baseline, the greater the ranging error and the lower the ranging accuracy; the longer the baseline, the smaller the ranging error and the higher the ranging accuracy. However, a longer baseline is not always better: FIG. 5 is a schematic diagram of the imaging of a binocular camera with an excessively long baseline when shooting a target point. If the baseline of the binocular camera is too long, the target point P is not within the shooting range of the right camera; in other words, the target point P has no imaging point on the imaging plane CD of the right camera, so the position information of the target cannot be determined from the parallax.
  • During specific implementation, the baseline determination unit 111 can determine the target baseline in the following manner: first, determine the first accuracy index and the second accuracy index of each camera group, where the first accuracy index is inversely proportional to the baseline length of the group and directly proportional to its common viewing area (the area photographed simultaneously by the cameras in the group), and the second accuracy index is directly proportional to the baseline length and focal length of the group; then determine the weights of the first and second accuracy indices according to the measurement accuracy requirement of the target; then obtain the composite index of each camera group from the first accuracy index, the second accuracy index, and the weights; and finally determine the target baseline according to the composite index of each group.
• the measurement accuracy requirement may include the approximate distance between the target and the multi-camera 120, such as whether the target is a long-distance target or a short-distance target, which can be determined according to the size of the image area occupied by the target in the image captured by the camera: for example, a target whose image area is smaller than a first threshold is a long-distance target, and a target whose image area is greater than a second threshold is a short-distance target. The measurement accuracy requirement may also include the target measurement error threshold, for example, that the target measurement error must not exceed 1 meter. It should be understood that the above examples are for illustration, and are not specifically limited in the present application.
• the baseline determination unit 111 may send a baseline adjustment request carrying the target baseline to the multi-camera 120, where the baseline adjustment request is used to instruct the multi-camera 120 to adjust the baseline length of the camera group to the above-mentioned target baseline.
• the synchronization unit 112 is configured to receive the multi-channel videos, such as the first channel video and the second channel video, obtained by the multi-camera 120 capturing the target with the camera group of the target baseline, and then perform time synchronization processing on the first channel video and the second channel video to obtain a first image and a second image, wherein the first image and the second image are obtained by photographing the same target at the same moment.
• It should be understood that the two videos captured by the binocular camera may be out of sync in time; for example, the image with timestamp T1 from the left camera may describe a different world-time instant than the image with timestamp T1 from the right camera. After the synchronization unit 112 performs time synchronization on the first channel video and the second channel video, the first image and the second image at the same moment can be obtained, and using them for the subsequent disparity calculation can improve the accuracy of the final position information.
• the detection and matching unit 113 is used to detect and identify the target to be measured in the first image and the second image to obtain the first target area and the second target area, and then perform feature point detection and matching on the first target area and the second target area to obtain the feature point matching result, wherein the feature point matching result includes the correspondence between the feature points in the first target area and the feature points in the second target area, and feature points with a correspondence describe the same feature of the target. Then, the parallax information of each feature point is determined according to the feature point matching result, and the position information of the target is determined according to the parallax information and the parameter information of the multi-camera.
  • the disparity information includes the disparity of each feature point of the target, which may be the difference between the pixel coordinates of the feature points in the first target area and the pixel coordinates of the corresponding feature points in the second target area.
• For the description of the disparity, reference may be made to the embodiment in FIG. 1, which will not be repeated here.
  • the position information may include the distance between the target and the camera, which may be determined according to parallax and parameter information of the multi-camera. It can be known from the formula 3 in the embodiment of FIG. 1 that the parameter information at least includes the baseline length and the focal length of the multi-camera.
• the location information may also include the geographic coordinates of the target, where the geographic coordinates may be determined from the geographic coordinates of the multi-camera combined with the distance between the target and the camera, and may be provided according to the requirements of the application server 130. If the location information includes the geographic coordinates of the target, the parameter information of the multi-camera may include the geographic coordinates of the multi-camera.
• the detection and matching unit 113 determines the position information of the target according to the parallax information of the feature points, and does not need to perform matching and parallax calculation on each pixel in the first image area and the second image area, thereby reducing the computing resources required for positioning and ranging, while avoiding problems such as background interference and noise, and improving the accuracy of ranging and positioning.
• In summary, the target positioning system provided by the present application can determine the target baseline according to the target to be measured, use the camera group of the target baseline to capture the target to obtain the first image and the second image, perform target detection and matching on the first image and the second image to obtain the first target area and the second target area where the target is located, and then perform feature point detection and matching on the first target area and the second target area to obtain the feature point matching result. The disparity information of each feature point is determined according to the feature point matching result, thereby determining the position information of the target.
• the system can flexibly select the camera group of the target baseline for data collection according to the target to be measured, avoiding the limited ranging range caused by fixed-baseline multi-cameras and extending the ranging range of the target positioning system.
• the system determines the location information of the target according to the parallax information of the feature points, and does not need to perform matching and parallax calculation for each pixel in the first image area and the second image area, thereby reducing the computational resources required for positioning and ranging, avoiding problems such as background interference and noise, and improving the accuracy of ranging and positioning.
  • the present application provides a target positioning method, which can be applied to the architecture of the stereoscopic vision system shown in FIG. 2 .
  • the method can be executed by the aforementioned target positioning system 110 .
  • the method may include the following steps:
  • S310 Determine the target baseline according to the measurement accuracy requirement of the target to be measured.
• the above-mentioned multi-camera 120 may include N cameras, and every two cameras may form a camera group, i.e., a binocular camera. Optionally, the multi-camera 120 may send the baseline data of each camera group to the target positioning system 110 before step S310, and the target positioning system 110 can select a target baseline from the baseline data of the N(N-1)/2 kinds of binocular cameras according to the measurement accuracy requirement of the target to be measured, and send it to the multi-camera 120.
  • the target baseline can be determined according to the size of the common viewing area of the multi-camera 120 and the measurement accuracy requirements of the target to be measured.
• Specifically, the baseline data collected by the multi-camera 120 includes not only the baseline length between the multiple cameras in each camera group, but also the size of the common viewing area between the multiple cameras, where the size of the common viewing area can be determined according to the shooting range of each camera in the camera group, and the shooting range refers to the range of the geographical area recorded in the image captured by the camera. Specifically, the edge position points that can be displayed in the video picture of each channel of video can be determined, the pixel coordinates of each edge position point can be converted into geographic coordinates through a camera calibration algorithm, the shooting range of the video can be determined according to the area enclosed by these geographic coordinates, and the size of the common viewing area between multiple cameras can then be obtained.
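• As an illustrative sketch of this computation (the homographies mapping image pixels to ground-plane geographic coordinates are assumed to come from the camera calibration mentioned above; function and variable names are not from the patent):

```python
import cv2
import numpy as np

def shooting_range(h_img_to_ground, width, height):
    """Project the four image corner pixels to ground-plane (geographic)
    coordinates using a calibration-derived homography."""
    corners = np.float32([[0, 0], [width, 0],
                          [width, height], [0, height]]).reshape(-1, 1, 2)
    return cv2.perspectiveTransform(corners, h_img_to_ground).reshape(-1, 2)

def common_view_area(h1, h2, width, height):
    """Size of the common viewing area: the intersection of the two
    cameras' shooting ranges (both treated as convex quadrilaterals)."""
    area, _region = cv2.intersectConvexConvex(
        shooting_range(h1, width, height),
        shooting_range(h2, width, height))
    return area
```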
• the measurement accuracy requirement of the target to be measured may include the approximate distance between the target and the multi-camera 120; in other words, whether the target is a long-distance target or a short-distance target.
• whether the target to be measured is a long-distance target or a short-distance target can be determined according to the size of the image area where the target is located in the image collected by the multi-camera: the image area occupied by a long-distance target in the image is very small, while the image area occupied by a short-distance target is very large, so when the image area is smaller than the first threshold, it can be determined that the target to be measured is a long-distance target, and when the image area is greater than the second threshold, it can be determined that the target to be measured is a short-distance target.
• the measurement accuracy requirement may also include the measurement error threshold of the target to be measured, for example, that the measurement error must not exceed 1 meter. It should be understood that the above examples are for illustration and are not specifically limited in this application.
• Specifically, the first accuracy index p1 and the second accuracy index p2 of each group of cameras may be determined, wherein the first accuracy index p1 is inversely proportional to the baseline length of each group of cameras and proportional to the common viewing area of each group of cameras, the second accuracy index p2 is proportional to the baseline length and focal length of each group of cameras, and the common viewing area is the area captured in common by the multiple cameras in each group. Then, the weight ω of the first accuracy index p1 and the second accuracy index p2 is determined according to the measurement accuracy requirement of the target, the comprehensive index p of each group of cameras is obtained according to the first accuracy index p1, the second accuracy index p2 and the weight ω, and the target baseline is determined according to the comprehensive index of each group of cameras, as illustrated below.
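• A plausible form of the comprehensive index, consistent with the stated proportionalities and with the unit-conversion coefficients μ1 and μ2 described next (the exact weighted-sum form is an assumption, not quoted from the patent), is:

$$p = \omega\,\mu_{1}\,p_{1} + (1-\omega)\,\mu_{2}\,p_{2}, \qquad p_{1} \propto \frac{S_{\text{common}}}{b}, \qquad p_{2} \propto b\,f$$

where b is the baseline length, f the focal length, S_common the common viewing area and ω the weight.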
  • ⁇ 1 and ⁇ 2 are both unit conversion coefficients, so that the units of p 1 and p 2 are consistent, and can participate in the calculation of the comprehensive accuracy p.
• the weight ω∈(0,1), and the specific value of ω can be determined according to the above-mentioned ranging and positioning requirements. For example, when the target to be measured is a long-distance target, the baseline length has a great influence on the ranging accuracy, so the value of ω can be appropriately reduced in combination with the measurement error threshold of the target to be measured, increasing the influence of the baseline-based accuracy index p2 on the comprehensive index; when the target to be measured is a short-distance target, the size of the common viewing area has a greater impact on the ranging accuracy, so ω can be appropriately increased in combination with the measurement error threshold of the target to be measured, increasing the influence of the common-view-based accuracy index p1 on the comprehensive index.
• In this way, the comprehensive index p of each of the above N(N-1)/2 camera groups can be determined according to the target to be measured, and the baseline length corresponding to the largest comprehensive index p is then determined as the target baseline and sent to the multi-camera 120. It should be understood that the above examples are used for illustration, and the present application does not limit the specific formula of the comprehensive index.
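• A minimal sketch of this selection step, assuming the weighted-sum form of the comprehensive index above (the dictionary keys and the exact p1/p2 expressions are illustrative, since the patent only fixes their proportionalities):

```python
def pick_target_baseline(groups, omega, mu1=1.0, mu2=1.0):
    """groups: one dict per binocular camera group with 'baseline' (m),
    'focal' (px) and 'common_area' (m^2) from the reported baseline data.
    Returns the baseline of the group with the largest comprehensive index."""
    def comprehensive_index(g):
        p1 = g['common_area'] / g['baseline']  # inverse in baseline, prop. to common view
        p2 = g['baseline'] * g['focal']        # prop. to baseline and focal length
        return omega * mu1 * p1 + (1 - omega) * mu2 * p2
    return max(groups, key=comprehensive_index)['baseline']

# For N cameras there are N*(N-1)/2 such groups to evaluate.
```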
  • S320 Acquire the first image and the second image according to the target baseline.
• Specifically, the target baseline may be sent to the multi-camera 120, and then the first image and the second image captured by the camera group corresponding to the target baseline may be received.
• Optionally, a baseline adjustment request carrying the target baseline may be sent to the multi-camera, and after the multi-camera adjusts the baseline length of at least one camera group to the target baseline according to the baseline adjustment request and shoots a video or image, the first image and the second image captured by the camera group corresponding to the target baseline are received. The baseline adjustment request is used to instruct the multi-camera 120 to adjust the baseline length of the camera group to the above target baseline.
• Optionally, the first channel video and the second channel video captured by the camera group corresponding to the target baseline can also be received, and the first image and the second image are obtained after time synchronization processing is performed on the first channel video and the second channel video, wherein the first image and the second image are images at the same moment: the first image is a video frame in the first channel video, and the second image is a video frame in the second channel video.
• In a specific implementation, the video collected by camera 2 may be ahead of the video collected by camera 1 in time. If the disparity calculation is performed directly on the first video and the second video collected by camera 1 and camera 2, there will be errors in the obtained disparity information, which will hinder subsequent applications such as ranging and 3D reconstruction. Therefore, before the disparity calculation is performed, time synchronization processing may be performed on the first channel video and the second channel video at step S320, thereby improving the parallax calculation accuracy and in turn the accuracy of applications such as ranging and 3D reconstruction.
• Specifically, a reference frame can be obtained from the first channel video, and a plurality of motion frames can be obtained from the second channel video, wherein the reference frame and the plurality of motion frames include moving objects; then feature point matching is performed between the reference frame and the plurality of motion frames to obtain a synchronization frame among the multiple motion frames, wherein the parallelism of the lines between the feature points in the synchronization frame and the corresponding feature points in the reference frame satisfies a preset condition; finally, time synchronization correction is performed on the first channel video and the second channel video according to the reference frame and the synchronization frame to obtain the first image and the second image.
• Here, satisfying the preset condition may mean that the frame whose feature-point connection lines have the highest parallelism is determined as the synchronization frame.
  • the above-mentioned reference frame and motion frame may be determined by an optical flow method.
  • the optical flow refers to the instantaneous speed of the pixel motion of the space moving object on the observation imaging plane.
• when the time interval is small, the optical flow can also be regarded as equivalent to the displacement of the space moving object.
• the process of determining the reference frame and the motion frames can be as follows: first, perform target detection for the synchronization target on each frame of the first channel video and the second channel video, and obtain one or more synchronization targets in each frame of image.
• It should be noted that when target detection is performed on each frame of image, the detected synchronization target should be a target that may move, not a stationary target such as a building. Therefore, the synchronization target may be the target to be measured in the foregoing content, or may be another target, which is not specifically limited in this application.
• For example, if the target to be measured is a pedestrian, the synchronization target for time synchronization can be a pedestrian or a vehicle; if the target to be measured is vehicle A, the synchronization target for time synchronization can be vehicles and pedestrians. The above examples are used for illustration and are not limited in this application.
• the target detection algorithm in this embodiment of the present application may use any existing neural network model for target detection with better effects in the industry, for example, the one-stage unified real-time target detection (You Only Look Once: Unified, Real-Time Object Detection, YOLO) model, the Single Shot MultiBox Detector (SSD) model, the Region Convolutional Neural Network (RCNN) model or the Fast Region Convolutional Neural Network (Fast-RCNN) model, etc., which are not specifically limited in this application.
  • the optical flow method in the embodiments of the present application can also use any one of the optical flow methods that are already available in the industry for calculating optical flow with better effects, such as the Lucas-Kanade (LK) optical flow method.
• After obtaining the optical flow of each object in each frame (that is, the instantaneous speed of the object), it can be determined whether the object is a moving object by determining whether its speed has a component in the direction of the image row. Since the multi-camera (e.g., the multi-camera 120 shown in FIG. 2) is fixed, the row coordinates of a moving object will change between frames, so if the row coordinates of the same object X in the motion frame Tn and in the previous frame Tn-1 (or the next frame Tn+1) are not equal, it can be determined that the object is a moving object.
• It should be noted that a vertically moving object only moves in the column direction: this type of moving object has no velocity component in the image row direction, but only in the column direction, and therefore does not contribute to the disparity calculation. Thus the vertically moving object is also regarded as a non-moving object and does not participate in the parallax calculation, thereby reducing the amount of calculation and improving the accuracy and efficiency of the parallax calculation.
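• A sketch of this moving-object test using the LK optical flow mentioned above (the threshold and names are illustrative; "row direction" is taken here as the horizontal image axis, consistent with horizontal disparity):

```python
import cv2
import numpy as np

def row_direction_movers(prev_gray, next_gray, min_row_motion=0.5):
    """Track corners with Lucas-Kanade optical flow and keep only points
    whose displacement has a row-direction (horizontal) component; purely
    column-direction (vertical) movers are treated as non-moving because
    they do not contribute to the disparity calculation."""
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                  qualityLevel=0.01, minDistance=7)
    if pts is None:
        return np.empty((0, 2), np.float32)
    nxt, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, pts, None)
    ok = status.ravel() == 1
    flow = (nxt - pts).reshape(-1, 2)[ok]
    return flow[np.abs(flow[:, 0]) > min_row_motion]
```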
• Specifically, feature point matching can be performed between the moving object in the reference frame and the moving object in each motion frame, and the mean difference Δs of the row coordinates of the matched feature points can be calculated for each motion frame. The synchronization offset time Δt is then computed from Δs1 and Δs2, the two values with the smallest absolute value among the feature point differences, and the video frame rate fr, as sketched below. The synchronization offset time Δt can be used as the compensation for the row coordinates of each subsequent frame, so as to obtain the synchronized first channel video and second channel video, and then the first image and the second image at each moment.
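• A plausible form of the Δt formula, assuming linear interpolation between the two best-matching motion frames (this exact form is an assumption rather than the patent's own formula), is:

$$\Delta t = \frac{\lvert\Delta s_{1}\rvert}{\lvert\Delta s_{1}\rvert + \lvert\Delta s_{2}\rvert}\cdot\frac{1}{f_{r}}$$

Under this form, Δt is bounded by one frame interval 1/fr, which matches its use as a sub-frame synchronization offset.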
• For example, assume the reference frame of camera 1 is frame P1 and the motion frames of camera 2 include frame Q1, frame Q2 and frame Q3. Feature point matching is performed between motion frame Q1 of camera 2 and reference frame P1 to obtain the mean value Δs1 of the row-coordinate differences of the feature points; feature point matching is performed between motion frame Q2 of camera 2 and reference frame P1 to obtain the mean value Δs2; and feature point matching is performed between motion frame Q3 of camera 2 and reference frame P1 to obtain the mean value Δs3. If the connection lines between the feature points of motion frame Q2 and the corresponding feature points of reference frame P1 have the highest parallelism, the motion frame Q2 and the reference frame P1 are the first image and the second image at the same moment, so frame P1 of camera 1 can be aligned with frame Q2 of camera 2; that is, the video captured by camera 1 is 1 frame slower than the video captured by camera 2. In this way, camera 1 and camera 2 can be synchronized. Further, if the offset time Δt is 3ms, that is, camera 1 is 3ms faster than camera 2, then camera 2 can be adjusted to be 3ms faster to achieve synchronization with the video of camera 1.
• It should be understood that the above example is used for illustration, and is not specifically limited in the present application.
  • stereoscopic correction may also be performed on the first channel of video and the second channel of video.
• It should be understood that the formula used in calculating the parallax is often derived under the assumption that the multi-camera is in an ideal state, so before using the multi-camera for distance measurement and positioning, the actually used multi-camera 120 can be corrected to the ideal state, in which the image planes of the left and right cameras are parallel, the optical axes are perpendicular to the image planes, and the epipoles are located at infinity.
• In a specific implementation, the embodiment of the present application may adopt any existing stereo correction method with better effects in the industry, such as the Bouguet epipolar rectification method, which is not specifically limited in the present application.
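• OpenCV's stereo rectification implements Bouguet's algorithm, so a minimal sketch of this correction step can be written as follows (the intrinsics K1/K2, distortions d1/d2 and relative pose R/T are assumed to come from a prior stereo calibration):

```python
import cv2

def rectify_pair(left_img, right_img, K1, d1, K2, d2, R, T, image_size):
    """Bouguet epipolar rectification: after remapping, the two image
    planes are parallel and corresponding points share the same row."""
    R1, R2, P1, P2, Q, _roi1, _roi2 = cv2.stereoRectify(
        K1, d1, K2, d2, image_size, R, T, alpha=0)
    m1 = cv2.initUndistortRectifyMap(K1, d1, R1, P1, image_size, cv2.CV_32FC1)
    m2 = cv2.initUndistortRectifyMap(K2, d2, R2, P2, image_size, cv2.CV_32FC1)
    return (cv2.remap(left_img, m1[0], m1[1], cv2.INTER_LINEAR),
            cv2.remap(right_img, m2[0], m2[1], cv2.INTER_LINEAR), Q)
```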
• Optionally, in step S320, a multi-camera can also be used to capture the same target at the same moment to obtain the first image and the second image directly. In this case, the step of performing time synchronization processing on the first video and the second video in step S320 can be omitted, and step S330 is executed to perform the parallax calculation, which will not be repeated here.
  • S330 Perform target detection and matching on the first image and the second image to obtain a first target area of the first image and a second target area of the second image.
  • the first target area and the second target area include the above-mentioned target to be detected.
• Specifically, the first image may be input into the detection and matching model to obtain the first detection matching result of the first image, and the second image may be input into the detection and matching model to obtain the second detection matching result of the second image; the first target area is obtained according to the first detection matching result, and the second target area is obtained according to the second detection matching result.
• the first detection matching result and the second detection matching result include a target frame (bounding box) and a label, the target frame is used to indicate the area of the target to be detected in the image, and the labels of different targets are different. According to the labels in the first detection matching result and the second detection matching result, the same target in the first image and the second image can be determined, and the first target area and the second target area can then be determined in combination with the above target frames.
• the target frame in the detection matching result may be a rectangular frame, a circular frame, an oval frame, etc., which is not specifically limited in this application. It should be understood that if there are multiple targets to be measured, the detection matching results may include multiple target frames for the multiple targets. Therefore, in the detection matching results, the same target is identified by the same label and different targets are identified by different labels. In this way, when the disparity calculation is performed on a target, the same target in different video frames can be identified according to its label, so as to achieve feature point matching of the same target in the first image and the second image at the same moment, and then obtain the disparity of the target.
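• A sketch of how the two target areas can be paired purely by label (the detection dictionary layout is illustrative, not from the patent):

```python
def pair_target_areas(dets_first, dets_second):
    """Associate detections across the synchronized image pair by label:
    the detection matching model gives the same label to the same physical
    target, so the first/second target areas are the boxes sharing a label.
    Each detection is assumed to be {'label': ..., 'box': ...}."""
    second_by_label = {d['label']: d['box'] for d in dets_second}
    return {d['label']: (d['box'], second_by_label[d['label']])
            for d in dets_first if d['label'] in second_by_label}
```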
• For example, assuming frame P3 of camera 1 and frame Q4 of camera 2 are the first image and the second image at the same moment, FIG. 8 provides an example diagram of a target detection matching result in a target localization method, wherein the detection matching result is the rectangular target frame and ID label shown in FIG. 8: the tank truck framed in frame P3 and the tank truck framed in frame Q4 are the same vehicle, and the bus framed in frame P3 and the bus framed in frame Q4 are the same vehicle.
• It should be understood that FIG. 8 is used for illustration; the target frame can also take other forms such as a circular frame or an oval frame, and the ID label displayed in the detection matching result can also take other forms such as letters or numbers, which are not specifically limited in this application.
  • FIG. 9 is a schematic structural diagram of a target detection model in a target localization method provided by the present application.
• the detection and matching model may include a feature extraction module 610 and a detection and matching module 620, wherein the feature extraction module 610 is used to extract the features of the input first image and second image and generate high-dimensional feature vectors, and the detection and matching module 620 is used to generate the detection matching result including the target frame and the label according to the above feature vectors.
• Still taking the example in which frame P3 of camera 1 and frame Q4 of camera 2 are the first image and the second image at the same moment, frame P3 and frame Q4 can be input into the feature extraction module 610 to generate high-dimensional feature vectors, and the feature vectors are then input into the detection and matching module 620, which generates the detection matching result shown in FIG. 8. If the target to be measured is 001, the first target area and the second target area shown in FIG. 9 can be obtained. It should be understood that FIG. 9 is used for illustration and is not specifically limited in this application.
• Before the detection and matching model is used, a sample set may be used to train it. The sample set may include a first image sample, a second image sample and corresponding sample ground-truth values, where the sample ground-truth values include a target detection ground truth and a target matching ground truth: the target detection ground truth includes the target frames of the targets in the first image sample and the second image sample, and the target matching ground truth includes the labels of the targets in the first image sample and the second image sample. During training, the detection matching loss used for back propagation is determined according to the difference between the output value of the detection and matching module 620 and the sample ground truth, and the parameters of the detection and matching model are adjusted according to the detection matching loss until the detection matching loss reaches the threshold, thereby obtaining the trained detection and matching model.
• In a specific implementation, the feature extraction module 610 may be a neural network backbone structure for extracting image features, such as VGG or ResNet, and the above-mentioned detection and matching module 620 may be a target detection network, such as a YOLO network, an SSD network or an RCNN, which are not specifically limited in this application.
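• A minimal PyTorch-style sketch of this two-module structure (the backbone choice, head layout and output encoding are illustrative; the patent only requires a feature extractor plus a detection/matching head that emits target frames and labels):

```python
import torch.nn as nn
import torchvision

class DetectionMatchingModel(nn.Module):
    """Feature extraction module (ResNet backbone; VGG would also fit)
    followed by a placeholder detection/matching head that predicts, per
    anchor: 4 box coordinates, 1 objectness score and identity logits."""
    def __init__(self, num_ids, num_anchors=9):
        super().__init__()
        backbone = torchvision.models.resnet18(weights=None)
        self.features = nn.Sequential(*list(backbone.children())[:-2])
        self.head = nn.Conv2d(512, num_anchors * (5 + num_ids), kernel_size=1)

    def forward(self, x):
        return self.head(self.features(x))
```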
• In this way, the first target area and the second target area can be determined according to whether the labels are the same, rather than by performing image recognition on the target to determine the same target in the first image and the second image, which can reduce computational complexity, improve the acquisition efficiency of the first target area and the second target area, and further improve the efficiency of ranging and positioning.
• S340 Perform feature point detection and matching on the first target area and the second target area to obtain a feature point matching result.
• the feature point matching result includes the correspondence between the feature points in the first target area and the feature points in the second target area, and feature points with a correspondence describe the same feature of the target to be measured. For example, if the target to be measured is a pedestrian and the feature points of the pedestrian include the eyes, nose and mouth, then there is a correspondence between the eyes of the pedestrian in the first target area and the eyes of the pedestrian in the second target area.
• Specifically, a feature point detection algorithm can be used to perform feature point detection on the first target area and the second target area to obtain the feature points of the first target area and the feature points of the second target area. Since the first target area and the second target area contain the same target, the feature points of the first target area have corresponding feature points in the second target area.
• the feature point detection and matching algorithm in this embodiment of the present application may be the features from accelerated segment test (FAST) feature point extraction algorithm, the binary robust independent elementary features (BRIEF) feature point description algorithm, the oriented FAST and rotated BRIEF (ORB) algorithm combining FAST and BRIEF, the speeded up robust features (SURF) algorithm, the accelerated KAZE features (AKAZE) algorithm, etc., which are not specifically limited in this application.
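• As a concrete example with one of the listed choices (ORB with brute-force Hamming matching; the parameters are illustrative):

```python
import cv2

def match_feature_points(area1, area2, max_matches=50):
    """Detect ORB feature points in the two target areas and match their
    descriptors; each returned pair links a point in the first target
    area to its corresponding point in the second target area."""
    orb = cv2.ORB_create()
    kp1, des1 = orb.detectAndCompute(area1, None)
    kp2, des2 = orb.detectAndCompute(area2, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    return [(kp1[m.queryIdx].pt, kp2[m.trainIdx].pt)
            for m in matches[:max_matches]]
```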
• For example, assuming the target to be measured is the vehicle with ID 001 and the first target area and the second target area are as shown in FIG. 10, which is an example of a feature point matching result provided by this application: FIG. 10 shows a partial feature point matching result, each feature point detected in the first target area has a corresponding feature point in the second target area, and the feature points with a correspondence are represented by connecting lines. In a specific implementation, the feature point matching result can also represent the correspondence between the feature points in other ways; FIG. 10 is used for illustration and is not specifically limited in this application.
  • S350 Determine the position information of the target according to the feature point matching result and the parameter information of the multi-camera.
  • the parameter information of the multi-camera includes at least the baseline length and focal length of the multi-camera, and may also include geographic coordinate information of the multi-camera.
  • the location information of the target may include the distance between the target and the multi-camera, and may also include the geographic coordinates of the target, which is not specifically limited in this application.
• Specifically, the disparity information of the target can be obtained according to the pixel differences between the feature points with correspondences in the feature point matching result, where the disparity information includes the differences between the pixel coordinates of the feature points in the first target area and the pixel coordinates of the corresponding feature points in the second target area; then the distance between the target and the multi-camera can be determined according to the disparity information, the baseline length b and the focal length f, and the geographic coordinates of the target can further be determined in combination with the geographic coordinates of the multi-camera.
• In a specific implementation, some credible pixel differences may be taken as the parallax, or the average value of the pixel differences may be taken as the disparity information of the target, and the disparity information is then used to perform the distance calculation, which is not specifically limited in this application.
• For example, assuming the first target area includes feature points A1 and B1 of target X and the second target area includes feature points A2 and B2 of target X, where A1 and A2 are the same feature point and B1 and B2 are the same feature point, the average value of the pixel difference D1 between A1 and A2 and the pixel difference D2 between B1 and B2 can be determined as the parallax of the target, and the distance between the target and the binocular camera is then obtained.
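• A sketch of this averaging step, assuming rectified images so the disparity is the horizontal pixel difference and depth follows z = f·b/d, the standard form of the formula 3 referenced above:

```python
import numpy as np

def target_distance(matched_points, focal_px, baseline_m):
    """matched_points: pairs of corresponding pixel coordinates from the
    two target areas. Average the per-feature-point disparities and
    convert the mean disparity to a distance with z = f * b / d."""
    disparities = [abs(p1[0] - p2[0]) for p1, p2 in matched_points]
    mean_d = np.mean(disparities)   # or filter to 'credible' values first
    return focal_px * baseline_m / mean_d
```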
• FIG. 11 is a schematic diagram of the feature point matching result of a target positioning method provided by the present application in an actual application scenario of measuring the distance between person Y and the binocular camera. In this scenario, the target baseline can first be determined according to the measurement accuracy requirements (for example, person Y is a short-distance target and the measurement error is plus or minus 1 meter) in combination with formulas (4) to (6); the target baseline is then sent to the multi-camera 120, the first image and the second image captured by the camera group corresponding to the target baseline are obtained, and the first image and the second image are input into the detection and matching model shown in FIG. 9 to obtain the first detection matching result and the second detection matching result.
• As shown in FIG. 11, according to the target frames and labels in the first detection matching result and the second detection matching result, the first target area and the second target area including person Y are obtained; after feature point detection and matching are performed on the first target area and the second target area, the feature point matching result shown in FIG. 11 can be obtained.
• Finally, according to the parallax, it can be determined that person Y is 14.2m away from the camera. It should be understood that FIG. 11 is used for illustration and is not specifically limited in the present application.
• Since the parallax is determined according to the differences between feature points, rather than the differences between each pixel in the first target area and the second target area, this not only reduces the amount of calculation but also improves the calculation efficiency of the parallax.
• Furthermore, feature points can lie not only on a pixel but also between pixels; in other words, the accuracy of determining the parallax based on pixel matching is at the integer (pixel) level, while the accuracy of determining the parallax based on feature point matching is at the sub-pixel level. Therefore, in the present application, the parallax calculation performed by means of feature point matching has higher accuracy, thereby making the accuracy of ranging and positioning higher.
• Moreover, the solution provided by the present application can also improve the parallax calculation accuracy for textureless objects, thereby improving their ranging and positioning accuracy. It is understandable that when a multi-camera is used to shoot textureless objects, the pixel differences across a textureless object are very small, so methods that determine the target parallax by calculating the differences between the pixels of the different images have very poor accuracy. With the solution provided in this application, the first target area and the second target area where the target is located are extracted first, feature point matching is then performed on the first target area and the second target area to obtain the feature point matching result, and the parallax is determined according to the feature point matching result, which can improve the matching accuracy for untextured objects.
• For example, FIG. 12 is a schematic diagram of a textureless object provided by the present application. Assume the textureless object Z is a checkerboard placed at a distance of 7.5m from the binocular camera. The depth value output by pixel-level matching is 6.7m, while the depth value output by the solution provided by this application is 7.2m. Therefore, the solution provided by the present application has higher parallax calculation accuracy and better ranging and positioning accuracy.
• In addition, the solution provided by the present application can also improve the accuracy of parallax calculation in occluded scenes, thereby improving the accuracy of ranging and positioning of occluded objects. It is understandable that since the pixels of an occluded object are covered and appear as the pixels of the occluder, methods that determine the target parallax by calculating the differences between the pixels of the different images have poor accuracy. With the solution provided by this application, after the detection and matching model shown in FIG. 9 is used to perform target detection and matching on the target, the position of the occluded object can be estimated and the occluded object can be completed, so as to obtain the completed first target area and second target area; feature point detection and matching are then performed on them to obtain the feature point matching result, the parallax information of the target is determined according to the feature point matching result, and the distance between the target and the multi-camera is obtained, so that the calculated parallax accuracy is higher and the ranging accuracy of occluded objects is also higher.
• For example, FIG. 13 is a schematic flowchart of the steps of determining the first target area and the second target area in an occlusion scenario provided by the present application. Assume that in the first target area the target 004 is not occluded by the target 005, while in the second target area the target 004 is occluded by the target 005. If the disparity information of the target is determined directly according to the pixel differences between the first target area and the second target area, since the target 004 is occluded by the target 005 in the right image, the finally obtained parallax will be inaccurate, resulting in low ranging and positioning accuracy.
• When the solution provided by this application is used to perform the parallax calculation on the target 004 in the first target area and the second target area, the position of the target 004 in the second target area can first be estimated, feature point detection and matching are then performed to obtain the feature point matching result, the disparity of the target 004 is thereby obtained, and the ranging and positioning result of the target 004 is then obtained. Therefore, the solution provided by the present application has higher disparity calculation accuracy in occlusion scenes.
• In summary, the present application provides a target positioning method, which can determine a target baseline according to the target to be measured, use the camera group of the target baseline to capture the target to obtain a first image and a second image, perform target detection and matching on the first image and the second image to obtain the first target area and the second target area where the target is located, and finally perform feature point detection and matching on the first target area and the second target area to obtain the feature point matching result, according to which the disparity information of each feature point is determined, thereby determining the position information of the target.
• the system can flexibly select the camera group of the target baseline for data collection according to the target to be measured, avoiding the limited ranging range caused by fixed-baseline multi-cameras and extending the ranging range of the target positioning system.
• the system determines the location information of the target according to the parallax information of the feature points, and does not need to perform matching and parallax calculation for each pixel in the first image area and the second image area, thereby reducing the computational resources required for positioning and ranging, avoiding problems such as background interference and noise, and improving the accuracy of ranging and positioning.
  • the target positioning system 110 may be divided into modules or units according to functions, and there may be various ways of division.
  • the target positioning system 110 may include a baseline determination unit 111 , a synchronization unit 112 and a detection matching unit 113 .
  • the target positioning system 110 may be further divided into units according to functions.
  • FIG. 14 is a schematic structural diagram of another target positioning system 110 provided by the present application.
• As shown in FIG. 14, the present application provides a target positioning system 110, which includes a baseline determination unit 1410, an acquisition unit 1420, a synchronization unit 1430, a detection and matching unit 1440, and a position determining unit 1450.
  • an acquisition unit 1420 configured to acquire a first image and a second image, the first image and the second image are obtained by photographing the same target at the same time by a multi-camera camera;
• the detection and matching unit 1440 is configured to perform target detection and matching on the first image and the second image to obtain the first target area of the first image and the second target area of the second image, wherein the first target area and the second target area include the target;
• the detection and matching unit 1440 is configured to perform feature point detection and matching on the first target area and the second target area to obtain a feature point matching result, wherein the feature point matching result includes the correspondence between the feature points in the first target area and the feature points in the second target area, and feature points with a correspondence describe the same feature of the target;
  • the position determining unit 1450 is configured to determine the position information of the target according to the feature point matching result and the parameter information of the multi-camera.
• Optionally, the parameter information includes at least the baseline length of the multi-camera and the focal length of the multi-camera. The position determination unit 1450 is configured to obtain the disparity information of the target according to the pixel differences between the feature points with correspondences in the feature point matching result, wherein the disparity information includes the differences between the pixel coordinates of the feature points in the first target area and the pixel coordinates of the corresponding feature points in the second target area; the position determination unit 1450 is further configured to determine the distance between the target and the camera according to the disparity information of the target, the baseline length of the multi-camera and the focal length of the multi-camera, and obtain the position information of the target.
• Optionally, the multi-camera includes a plurality of camera groups, and each of the plurality of camera groups includes a plurality of cameras. The baseline determination unit 1410 is configured to acquire baseline data of the multi-camera, the baseline data including the baseline length between the multiple cameras in each camera group; the baseline determination unit 1410 is configured to obtain the target baseline from the baseline data according to the measurement accuracy requirement of the target; and the acquisition unit 1420 is configured to obtain the first image and the second image according to the target baseline, wherein the first image and the second image are captured by the camera group corresponding to the target baseline.
• Optionally, the baseline determination unit 1410 is configured to send a baseline adjustment request carrying the target baseline to the multi-camera, where the baseline adjustment request is used to instruct the multi-camera to adjust the baseline length of the camera group included in the multi-camera to the target baseline; the acquisition unit 1420 is configured to receive the first image and the second image captured by the camera group corresponding to the target baseline.
• Optionally, the baseline determination unit 1410 is configured to determine the first accuracy index and the second accuracy index of each group of cameras, wherein the first accuracy index is inversely proportional to the baseline length of each group of cameras and proportional to the common viewing area of each group of cameras, the second accuracy index is proportional to the baseline length and focal length of each group of cameras, and the common viewing area is the area photographed in common by the multiple cameras in each group; the baseline determination unit 1410 is configured to determine the weights of the first accuracy index and the second accuracy index according to the measurement accuracy requirement of the target, obtain the comprehensive index of each group of cameras according to the first accuracy index, the second accuracy index and the weights, and determine the target baseline according to the comprehensive index of each group of cameras.
• Optionally, the synchronization unit 1430 is configured to receive the first channel video and the second channel video obtained by shooting the target with the multi-camera, and to perform time synchronization processing on the first channel video and the second channel video to obtain the first image and the second image at the same moment, wherein the first image is an image frame in the first channel video and the second image is an image frame in the second channel video.
• Optionally, the synchronization unit 1430 is configured to obtain a reference frame from the first channel video and obtain a plurality of motion frames from the second channel video, wherein the reference frame and the plurality of motion frames include moving objects; to perform feature point matching between the reference frame and the multiple motion frames to obtain a synchronization frame among the multiple motion frames, wherein the parallelism of the lines between the feature points in the synchronization frame and the corresponding feature points in the reference frame satisfies the preset condition; and to perform time synchronization correction on the first channel video and the second channel video according to the reference frame and the synchronization frame to obtain the first image and the second image at the same moment.
• Optionally, the detection and matching unit 1440 is configured to input the first image into the detection and matching model to obtain the first detection matching result of the first image, and input the second image into the detection and matching model to obtain the second detection matching result of the second image, wherein the first detection matching result and the second detection matching result include a target frame and a label, the target frame is used to indicate the area of the target in the image, and the label of the same target is the same; the detection and matching unit 1440 is configured to obtain the first target area according to the first detection matching result and obtain the second target area according to the second detection matching result.
• the unit modules inside the target positioning system 110 may also be divided in multiple other ways, and each module may be a software module, a hardware module, or partly a software module and partly a hardware module, which is not limited in this application.
• FIG. 2 and FIG. 14 are both exemplary division manners.
• For example, in some feasible solutions, the acquisition unit 1420 in FIG. 14 may also be omitted; in other feasible solutions, other units in FIG. 14 may be combined or omitted.
• In summary, the present application provides a target positioning system, which can determine a target baseline according to the target to be measured, use the camera group of the target baseline to capture the target to obtain a first image and a second image, perform target detection and matching on the first image and the second image to obtain the first target area and the second target area where the target is located, and finally perform feature point detection and matching on the first target area and the second target area to obtain the feature point matching result, according to which the disparity information of each feature point is determined, thereby determining the position information of the target.
  • the system can flexibly select the target baseline camera group for data collection according to the target to be measured, avoid the problem of limited ranging range caused by fixed baseline multi-eye cameras, and improve the ranging range of the target positioning system.
• the system determines the location information of the target according to the parallax information of the feature points, and does not need to perform matching and parallax calculation for each pixel in the first image area and the second image area, thereby reducing the computational resources required for positioning and ranging, avoiding problems such as background interference and noise, and improving the accuracy of ranging and positioning.
  • FIG. 15 is a schematic structural diagram of a computing device 900 provided by the present application, and the computing device 900 may be the target positioning system 110 in the foregoing content.
  • the computing device 900 includes a processor 910 , a communication interface 920 and a memory 930 .
  • the processor 910, the communication interface 920 and the memory 930 can be connected to each other through the internal bus 940, and can also communicate through other means such as wireless transmission.
  • the embodiment of the present application takes the connection through the bus 940 as an example, and the bus 940 may be a peripheral component interconnect standard (peripheral component interconnect, PCI) bus or an extended industry standard architecture (extended industry standard architecture, EISA) bus or the like.
  • the bus 940 can be divided into an address bus, a data bus, a control bus, and the like. For ease of presentation, only one thick line is shown in Figure 15, but it does not mean that there is only one bus or one type of bus.
  • the processor 910 may be composed of at least one general-purpose processor, such as a central processing unit (CPU), or a combination of a CPU and a hardware chip.
  • the above-mentioned hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD) or a combination thereof.
  • the above-mentioned PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), a general array logic (generic array logic, GAL) or any combination thereof.
  • Processor 910 executes various types of digitally stored instructions, such as software or firmware programs stored in memory 930, which enable computing device 900 to provide various services.
  • the memory 930 is used for storing program codes, and is controlled and executed by the processor 910 to execute the processing steps of the target positioning system in the above-mentioned embodiment.
• the program code may include one or more software modules, and the one or more software modules may be the software modules provided in the embodiment of FIG. 14, such as the acquisition unit, the detection and matching unit and the position determination unit, wherein the acquisition unit is used to acquire the first image and the second image, the detection and matching unit is used to input the first image and the second image into the detection and matching model to obtain the first target area and the second target area and then perform feature point detection and matching on the first target area and the second target area to obtain the feature point matching result, and the position determination unit is used to determine the position information of the target according to the feature point matching result and the parameter information of the multi-camera. Specifically, the program code can be used to execute steps S310 to S350 in the embodiment of FIG. 6 and their optional steps, and can also be used to implement other functions of the target positioning system 110 described in the embodiments of FIG. 1 to FIG. 13, which will not be repeated here.
• It should be noted that this embodiment can be implemented by a general physical server, for example, an ARM server or an X86 server, or can be implemented by a virtual machine based on a general physical server combined with NFV technology, where the virtual machine is a complete computer system that is simulated by software, has complete hardware system functions and runs in a completely isolated environment, which is not specifically limited in this application.
• The memory 930 may include volatile memory, such as random access memory (RAM); the memory 930 may also include non-volatile memory, such as read-only memory (ROM), flash memory, hard disk drive (HDD) or solid-state drive (SSD); the memory 930 may also include a combination of the above types.
  • the memory 930 may store program codes, and may specifically include program codes for executing other steps described in the embodiments of FIG. 1 to FIG. 13 , which will not be repeated here.
• the communication interface 920 may be a wired interface (such as an Ethernet interface), an internal interface (such as a peripheral component interconnect express (PCIe) bus interface), or a wireless interface (such as a cellular network interface or a wireless local area network interface), and is used to communicate with other devices or modules.
  • FIG. 15 is only a possible implementation manner of the embodiment of the present application.
• the computing device 900 may further include more or fewer components, which is not limited here.
  • the computing device shown in FIG. 15 may also be a computer cluster composed of at least one server, which is not specifically limited in this application.
• Embodiments of the present application further provide a computer-readable storage medium, where instructions are stored in the computer-readable storage medium, and when the instructions are run on a processor, the method flow shown in FIG. 1 to FIG. 13 is implemented.
  • the embodiment of the present application further provides a computer program product, when the computer program product runs on the processor, the method flow shown in FIG. 1-FIG. 13 is realized.
• the above embodiments may be implemented in whole or in part by software, hardware, firmware or any combination thereof.
  • the above-described embodiments may be implemented in whole or in part in the form of a computer program product.
  • The computer program product includes one or more computer instructions; when the computer program instructions are loaded or executed on a computer, all or part of the processes or functions described in the embodiments of the present application are produced.
  • the computer may be a general purpose computer, special purpose computer, computer network, or other programmable device.
  • The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means.
  • The computer-readable storage medium may be any available medium accessible by a computer, or a data storage device, such as a server or data center, that contains one or more available media.
  • The available media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., high-density digital video discs (DVDs)), or semiconductor media.
  • The semiconductor media may be solid-state drives (SSDs).

Abstract

A target positioning method and system, and a related device. The method comprises the following steps: acquiring a first image and a second image; performing target detection and matching on the first image and the second image, so as to obtain a first target area of the first image and a second target area of the second image; performing feature point detection and matching on the first target area and the second target area, so as to obtain a feature point matching result; and determining position information of a target according to the feature point matching result. By means of the method, feature point detection and matching are performed on a target area where a target is located, and parallax information of the target is determined according to a pixel difference between feature points, without the need to match each pixel in a first target area and a second target area, such that computing resources required for target positioning are reduced, and a background image is prevented from interfering with target parallax computation, thereby improving the accuracy of parallax computation, and improving the precision of distance measurement and positioning.

Description

A target positioning method, system, and related device
This application claims priority to Chinese Patent Application No. 202011638235.5, filed with the China National Intellectual Property Administration on December 31, 2020 and entitled "A Data Processing Method, System and Device", and to Chinese Patent Application No. 202110567480.X, filed with the China National Intellectual Property Administration on May 24, 2021 and entitled "A Target Positioning Method, System and Related Device", both of which are incorporated herein by reference in their entireties.
Technical Field
The present application relates to the field of artificial intelligence (AI), and in particular, to a target positioning method, system, and related device.
Background
With the continuous development of AI technology, stereo vision algorithms are now widely used in fields such as intelligent security, autonomous driving, industrial inspection, 3D reconstruction, and virtual reality, demonstrating strong technical competitiveness. A stereo vision algorithm usually uses a multi-camera to photograph a target and obtain multiple images of it, and then determines the target's parallax from those images. Parallax is the difference in apparent direction of the same target observed from two viewpoints separated by a certain distance; from the distance between the cameras (that is, the baseline length) and the parallax, the distance between the target and the cameras can be calculated.
However, when current stereo vision algorithms determine the distance between a target and the cameras, the target is not a single point in the multiple images but an image region, so the parallax of every pixel in the region must be determined, and the distance between the target and the cameras is determined from the per-pixel parallax and the baseline length of the multi-camera. This process not only consumes enormous computing resources but is also prone to noise and calculation errors, resulting in poor target positioning accuracy, which in turn affects subsequent applications such as 3D reconstruction, autonomous driving, and security monitoring.
Summary of the Invention
The present application provides a target positioning method, system, and related device, which are used to solve the problems that the target positioning process consumes enormous resources and that target positioning accuracy is poor.
According to a first aspect, a target positioning method is provided. The method includes the following steps: acquiring a first image and a second image, where the first image and the second image are obtained by a multi-camera photographing the same target at the same moment; performing target detection and matching on the first image and the second image to obtain a first target area of the first image and a second target area of the second image, where the first target area and the second target area include the target; performing feature point detection and matching on the first target area and the second target area to obtain a feature point matching result, where the feature point matching result includes correspondences between feature points in the first target area and feature points in the second target area, and corresponding feature points describe the same feature of the target; and determining position information of the target according to the feature point matching result and parameter information of the multi-camera.
In a specific implementation, the parameter information includes at least the baseline length and the focal length of the multi-camera. The parallax information of the target can be obtained from the pixel differences between corresponding feature points in the feature point matching result, where the parallax information includes the difference between the pixel coordinates of a feature point in the first target area and the pixel coordinates of the corresponding feature point in the second target area. The distance between the target and the cameras is then determined from the target's parallax information, the baseline length, and the focal length, yielding the position information of the target.
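As a hedged illustration of this computation, the Python sketch below aggregates per-feature-point disparities into a single target distance. The median aggregation and the keypoint data layout are assumptions chosen for illustration; the patent does not prescribe them.

```python
import statistics

def target_distance(matches, baseline_m, focal_px):
    """Estimate target distance from matched feature points.

    `matches` is a list of ((xL, yL), (xR, yR)) pixel-coordinate pairs,
    one per corresponding feature point (an assumed representation).
    """
    # Per-feature-point horizontal parallax d = xL - xR.
    disparities = [xl - xr for (xl, _), (xr, _) in matches]
    # Aggregate with the median so a few mismatched points do not dominate.
    d = statistics.median(disparities)
    if d <= 0:
        raise ValueError("non-positive disparity: check rectification/matching")
    # Depth from the standard stereo relation z = b * f / d.
    return baseline_m * focal_px / d
```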
By implementing the method described in the first aspect, the parallax information of the target is determined from the feature point matching result, and the position information of the target is then determined, without matching every pixel in the first image area and the second image area or computing per-pixel parallax. This reduces the computing resources required for positioning and ranging while avoiding problems such as background interference and noise, thereby improving ranging and positioning accuracy.
In a possible implementation of the first aspect, the multi-camera includes multiple camera groups, and each camera group includes multiple cameras. Based on this, baseline data of the multi-camera can be acquired, where the baseline data includes the baseline lengths between the cameras in each group. A target baseline is obtained from the baseline data according to the measurement accuracy requirement of the target, and the first image and the second image are then acquired according to the target baseline, where the first image and the second image are captured by the camera group corresponding to the target baseline.
For example, if the multi-camera includes N cameras, where N is a positive integer and the cameras are numbered 1, 2, ..., N, every two cameras can be combined into a binocular camera group with a corresponding baseline length. For example, the baseline of the binocular group formed by camera 1 and camera N is BL1, the baseline of the group formed by camera 1 and camera N-1 is BL2, and so on, giving C(N,2) = N(N-1)/2 binocular camera groups; the baseline data therefore includes N(N-1)/2 baseline lengths. The above example is for illustration only and does not limit the number of cameras in the multi-camera or in each camera group.
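The pair enumeration can be made concrete with a short sketch. The one-dimensional camera positions (cameras mounted along a single rail) are an assumption for illustration only.

```python
from itertools import combinations

def enumerate_baselines(camera_positions_m):
    """List every binocular pairing and its baseline length.

    `camera_positions_m` maps a camera id to its position in meters
    along the mounting rail (an assumed layout).
    """
    pairs = {}
    for a, b in combinations(sorted(camera_positions_m), 2):
        pairs[(a, b)] = abs(camera_positions_m[a] - camera_positions_m[b])
    return pairs  # len(pairs) == N * (N - 1) // 2

# Example: 4 cameras on one rail -> 6 candidate baselines.
print(enumerate_baselines({1: 0.0, 2: 0.1, 3: 0.25, 4: 0.6}))
```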
In a specific implementation, the target baseline can be determined according to the measurement accuracy requirement of the target. Specifically, a first accuracy index and a second accuracy index of each camera group can first be determined, where the first accuracy index is inversely proportional to the baseline length of the group and directly proportional to its common view area, the second accuracy index is directly proportional to the baseline length and the focal length of the group, and the common view area is the area captured jointly by the cameras in the group. Weights of the first accuracy index and the second accuracy index are then determined according to the measurement accuracy requirement of the target, a comprehensive index of each camera group is obtained from the two accuracy indices and the weights, and the target baseline is determined from the comprehensive index of each camera group.
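To make the weighted combination concrete, the following sketch scores candidate camera groups. The functional forms inside (and the omission of normalization) are illustrative assumptions consistent only with the stated proportionalities; the patent does not prescribe concrete formulas.

```python
def pick_target_baseline(groups, w1, w2):
    """Score each camera group and return the baseline of the best one.

    Each entry of `groups` is (baseline_m, focal_px, common_view_m2),
    an assumed representation for illustration.
    """
    best_score, best_baseline = float("-inf"), None
    for baseline, focal, common_view in groups:
        index1 = common_view / baseline    # favors wide common view, short baseline
        index2 = baseline * focal          # favors long baseline and focal length
        score = w1 * index1 + w2 * index2  # weights set by the accuracy requirement
        if score > best_score:
            best_score, best_baseline = score, baseline
    return best_baseline

# A near target weights the common view heavily (large w1); a far target
# weights ranging accuracy (large w2). All values here are illustrative.
print(pick_target_baseline([(0.1, 1200, 40.0), (0.4, 1200, 12.0)], w1=1.0, w2=0.01))
```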
It should be understood that a multi-camera with a fixed baseline also has a fixed ranging range if ranging accuracy is to be guaranteed. The closer the target is to the cameras, the more the common view area of the multi-camera approaches zero, where the common view area is the area that the cameras of the multi-camera can capture simultaneously; in that case the target may have no imaging point in some of the cameras, and its parallax cannot be computed. Conversely, the farther the target is from the cameras, the blurrier the regions of the target in the first image and the second image become, which affects the parallax calculation; a fixed-baseline multi-camera therefore has a fixed ranging range. Furthermore, the baseline length and the common view area of the multi-camera both affect parallax measurement accuracy: the longer the baseline, the higher the ranging accuracy, but the common view area gradually shrinks as the baseline grows, and the target may fall outside it. Therefore, the target baseline can be determined according to the size of the common view area of the multi-camera and the measurement accuracy requirement of the target to be measured.
The measurement accuracy requirement of the target to be measured may include the approximate distance between the target and the multi-camera, in other words, whether the target is a far target or a near target. Whether the target is far or near can be determined from the size of the image region the target occupies in the images collected by the multi-camera: a far target occupies a very small image region, while a near target occupies a very large one. Therefore, when the image region is larger than a first threshold, the target can be determined to be a near target, and when the image region is smaller than a second threshold, it can be determined to be a far target. The measurement accuracy requirement may also include a measurement error threshold for the target, for example, that the measurement error is no more than 1 meter. It should be understood that the above examples are for illustration and are not specifically limited in the present application.
The baseline data collected for the multi-camera includes not only the baseline length between the cameras in each camera group but may also include the size of the common view area between the cameras, where the common view area can be determined from the shooting range of each camera in the group, the shooting range being the geographic area recorded in the images captured by that camera. In a specific implementation, the outermost edge points visible in the video picture of each video stream can be determined, the pixel coordinates of each edge point obtained and converted into geographic coordinates through a camera calibration algorithm, and the shooting range of that stream determined from the region enclosed by these geographic coordinates, from which the size of the common view area is obtained.
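As one possible (assumed) realization of the pixel-to-geographic conversion and common-view computation, calibration can be summarized by an image-to-ground homography per camera; this concrete form, and the shapely dependency, are illustrative choices not prescribed by the patent.

```python
import numpy as np
import cv2
from shapely.geometry import Polygon

def footprint(homography, width, height):
    """Project a camera's image corners to ground-plane coordinates."""
    corners = np.float32([[0, 0], [width, 0], [width, height], [0, height]])
    ground = cv2.perspectiveTransform(corners.reshape(-1, 1, 2), homography)
    return Polygon(ground.reshape(-1, 2))

def common_view_area(h1, h2, width, height):
    # The common view area is the intersection of the two camera footprints.
    return footprint(h1, width, height).intersection(footprint(h2, width, height)).area
```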
Optionally, a baseline adjustment request carrying the target baseline can be sent to the multi-camera, the baseline adjustment request instructing the multi-camera to adjust the baseline length of one of its camera groups to the target baseline; the first image and the second image captured by the camera group corresponding to the target baseline are then received.
In the above implementation, the target baseline is determined according to the size of the common view area of the multi-camera and the measurement accuracy requirement of the target to be measured. This improves measurement accuracy as much as possible while ensuring that the target is within the shooting range of the binocular camera corresponding to the target baseline, solves the problem that a fixed-baseline multi-camera can only range targets within a fixed distance range, and expands the ranging range of the ranging and positioning system provided by the present application.
In a possible implementation of the first aspect, the multi-camera can photograph the target to obtain a first video and a second video. After the first video and the second video are received, time synchronization processing can be performed on them to obtain the first image and the second image at the same moment, where the first image is an image frame in the first video and the second image is an image frame in the second video.
In a specific implementation, a reference frame can be obtained from the first video and multiple motion frames from the second video, where the reference frame and the motion frames include moving objects. The reference frame is then matched with the motion frames by feature points to obtain a synchronization frame among the motion frames, where the parallelism of the lines connecting feature points in the synchronization frame to the corresponding feature points in the reference frame satisfies a preset condition. Time synchronization correction is then performed on the first video and the second video according to the reference frame and the synchronization frame to obtain the first image and the second image at the same moment. Satisfying the preset condition may mean selecting the frame with the highest parallelism between the connecting lines as the synchronization frame.
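For illustration, the frame-selection rule can be sketched as follows. Scoring "parallelism" by the mean magnitude of the per-point row differences Δs (introduced later in this description) is an assumption, and the function and data layout are hypothetical.

```python
import numpy as np

def pick_sync_frame(ref_rows, candidate_rows_per_frame):
    """Pick the motion frame best synchronized with the reference frame.

    `ref_rows` holds the row coordinates of the reference frame's matched
    feature points; each entry of `candidate_rows_per_frame` holds the row
    coordinates of the corresponding points in one candidate motion frame.
    For synchronized frames the connecting lines are parallel and the row
    differences are all near zero, so we minimize their mean magnitude.
    """
    best_idx, best_score = None, float("inf")
    for idx, rows in enumerate(candidate_rows_per_frame):
        ds = np.abs(np.asarray(rows, float) - np.asarray(ref_rows, float))
        if ds.mean() < best_score:
            best_idx, best_score = idx, ds.mean()
    return best_idx
```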
It should be understood that, because the model, manufacturer, timestamp, and video frame rate of each camera in the multi-camera may differ, and because network transmission delay or poor camera computing performance can cause frame loss, it is difficult to guarantee time synchronization across the videos collected by multiple cameras. For example, suppose camera 1 and camera 2 monitor the same intersection, and camera 1 snapshots a vehicle running a red light at moment T1, so that the video frames of the real-time stream transmitted by camera 1 within 20 ms after T1 are lost, while camera 2 takes no snapshot and loses no frames. Then, after the target positioning system 110 receives the first video and the second video, the video collected by camera 2 runs 20 ms ahead of the video collected by camera 1 from moment T1 onward. If parallax were computed directly from the first video and the second video collected by camera 1 and camera 2, the resulting parallax information would contain errors, creating obstacles for subsequent applications such as ranging, positioning, and 3D reconstruction. Time-synchronizing the first video and the second video solves this problem.
In a specific implementation, the above reference frame and motion frames can be determined by an optical flow method, where optical flow is the instantaneous velocity of the pixel motion of a spatially moving object on the observed imaging plane; when the time interval is very small, optical flow is also equivalent to the displacement of the moving object. Based on this, the procedure for determining the reference frame and the motion frames can be as follows: first, target detection of synchronization targets is performed on each frame of the first video and the second video to obtain one or more synchronization targets in each frame; then the optical flow of each synchronization target is determined by the optical flow method, and whether each synchronization target is a moving object is judged from its optical flow, thereby obtaining the motion frames containing moving objects and the reference frame containing the largest number of moving objects.
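As a hedged sketch of this step, OpenCV's Lucas-Kanade routine (the LK method named later in this description) can track each detected synchronization target between consecutive frames. The anchor-point representation and the 0.5-pixel threshold are illustrative assumptions.

```python
import numpy as np
import cv2

def moving_targets(prev_gray, cur_gray, target_points, min_row_motion=0.5):
    """Flag which detected targets moved between two frames via LK optical flow.

    `target_points` are (x, y) anchor points of detected synchronization
    targets (e.g. detection-box centers) in the previous frame.
    """
    pts = np.float32(target_points).reshape(-1, 1, 2)
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, cur_gray, pts, None)
    moved = []
    for p0, p1, ok in zip(pts.reshape(-1, 2), nxt.reshape(-1, 2), status.ravel()):
        # Only motion along the image row (horizontal) direction contributes
        # to parallax, so purely vertical movers are treated as static.
        moved.append(bool(ok) and abs(p1[0] - p0[0]) > min_row_motion)
    return moved
```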
It is worth noting that, when target detection is performed on each frame, the detected synchronization target should be a target that can possibly move, not one that is certainly stationary, such as a building. The synchronization target may therefore be the target to be measured mentioned above, or another target; this is not specifically limited in the present application. For example, if the target to be measured is a utility pole, the synchronization target used for time synchronization can be a pedestrian or a vehicle; if the target to be measured is vehicle A, the synchronization targets can be vehicles and pedestrians. The above examples are for illustration and are not limited in this application.
It should be noted that the target detection algorithm in the embodiments of the present application may use any of the neural network models already available in the industry that perform well for target detection, for example, the one-stage unified real-time object detection (You Only Look Once: Unified, Real-Time Object Detection, YOLO) model, the Single Shot multibox Detector (SSD) model, the Region Convolutional Neural Network (RCNN) model, or the Fast Region Convolutional Neural Network (Fast-RCNN) model; this is not specifically limited in the present application. Likewise, the optical flow method in the embodiments of the present application may use any of the optical flow methods already available in the industry that perform well, for example, the Lucas-Kanade (LK) optical flow method; this is not specifically limited in the present application.
Optionally, after the optical flow (i.e., the instantaneous velocity) of each object in each frame is obtained, whether an object is a moving object can be judged by determining whether its velocity has a component along the image row direction. Specifically, since the cameras of the multi-camera (for example, the multi-camera shown in FIG. 1) are fixed at the same height, if an object moves along the row direction, its row coordinate changes; therefore, if the row coordinates of the same object X in motion frame Tn and in the previous frame Tn-1 (or the next frame Tn+1) are not equal, the object can be determined to be a moving object. It can be understood that a vertically moving object moves only in the column direction and has no velocity component along the row direction, so it contributes nothing to the parallax calculation; vertically moving objects can therefore be treated as non-moving objects and excluded from the parallax calculation, reducing the amount of computation and improving the accuracy and efficiency of the parallax calculation.
Further, when the reference frame is matched against a motion frame, feature point matching can be performed between the moving objects in the reference frame and those in the motion frame, and the difference Δs of the row coordinates of each pair of feature points calculated. The smaller Δs is, the closer the moving object in the reference frame is to the moving object in the motion frame, and the more parallel the lines connecting feature points in the reference frame to feature points in the motion frame. If Δs is 0, the two frames are synchronized; if it is not 0, the two frames are not synchronized. An accurate synchronization offset time Δt can therefore be calculated from Δs and used to compensate the row coordinates of every subsequent frame, yielding the synchronized first video and second video, and hence the first image and the second image at each moment.
Optionally, before the video synchronization processing is performed on the first video and the second video, stereo rectification can be performed on them. It should be understood that the formulas used to compute parallax are usually derived under the assumption that the multi-camera is in an ideal configuration, so before the multi-camera is used for ranging and positioning, the actual multi-camera must be rectified to the ideal state. Taking a binocular camera as an example, after stereo rectification the image planes of the left and right cameras are parallel, the optical axes are perpendicular to the image planes, and the epipoles lie at infinity; the epipolar line corresponding to a point (x0, y0) is then y = y0. In a specific implementation, the embodiments of the present application may adopt any of the stereo rectification methods already available in the industry that perform well, for example, the Bouguet epipolar rectification method; this is not specifically limited in the present application.
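For illustration, OpenCV's stereoRectify (commonly described as a Bouguet-style rectification) can realize this step; the parameter names below follow OpenCV's calibration outputs, and this is only one possible realization, not the patent's prescribed one.

```python
import cv2

def rectify_maps(K1, D1, K2, D2, image_size, R, T):
    """Build pixel remap tables that put a stereo pair into the ideal geometry.

    K: intrinsics, D: distortion coefficients, R/T: rotation and translation
    between the two cameras, as produced by stereo calibration.
    """
    R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K1, D1, K2, D2, image_size, R, T)
    map1 = cv2.initUndistortRectifyMap(K1, D1, R1, P1, image_size, cv2.CV_32FC1)
    map2 = cv2.initUndistortRectifyMap(K2, D2, R2, P2, image_size, cv2.CV_32FC1)
    return map1, map2  # apply with cv2.remap(frame, *map_i, cv2.INTER_LINEAR)
```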
In the above implementation, time synchronization processing is performed on the first video and the second video to obtain the first image and the second image, and the position information of the target is then determined from the first image and the second image. This improves the accuracy of the position information and hence the accuracy of subsequent applications such as AR, VR, and 3D reconstruction.
In a possible implementation of the first aspect, the first image can be input into a detection matching model to obtain a first detection matching result of the first image, and the second image can be input into the detection matching model to obtain a second detection matching result of the second image, where the first detection matching result and the second detection matching result include target boxes and labels, a target box indicates the region of a target in the image, and the same target carries the same label. The first target area is obtained from the first detection matching result, and the second target area from the second detection matching result.
The target box in a detection matching result may be a rectangular box, a circular box, an elliptical box, and so on, which is not specifically limited in this application. It should be understood that, if there are multiple targets to be measured, the detection matching result may include multiple target boxes for the multiple targets. In that case, the same target is identified with the same label and different targets with different labels, so that when parallax is computed, the same target can be recognized across the video frames of different streams by its label, thereby matching the feature points of the same target in the first image and the second image at the same moment and obtaining the parallax of that target.
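A minimal sketch of this label-based association follows; the dictionary-based detection-result format is a hypothetical representation, since the patent does not define one.

```python
def pair_target_areas(first_result, second_result):
    """Pair target areas across the two images by their shared labels."""
    boxes1 = {det["label"]: det["box"] for det in first_result}
    boxes2 = {det["label"]: det["box"] for det in second_result}
    # Only targets detected in both images can contribute to parallax.
    return {lbl: (boxes1[lbl], boxes2[lbl]) for lbl in boxes1.keys() & boxes2.keys()}

# Illustrative values: the same car carries the same label in both images.
pairs = pair_target_areas(
    [{"label": "car_0", "box": (120, 80, 260, 200)}],
    [{"label": "car_0", "box": (96, 80, 236, 200)}],
)
```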
Optionally, the detection matching model may include a feature extraction module and a detection matching module, where the feature extraction module extracts features from the input first image and second image to generate high-dimensional feature vectors, and the detection matching module 620 generates, from these feature vectors, detection matching results containing the target boxes and labels.
Optionally, before the first image and the second image are acquired, the detection matching model can be trained with a sample set, which may include first image samples, second image samples, and corresponding sample ground truths. The sample ground truth includes a target detection ground truth and a target matching ground truth, where the target detection ground truth includes the target boxes of the targets in the first and second image samples, and the target matching ground truth includes the labels of the targets in the first and second image samples. When the detection matching model is trained with this sample set, the detection matching loss used for backpropagation is determined from the gap between the output of the detection matching module and the sample ground truth, and the parameters of the detection matching model are adjusted according to this loss until the detection matching loss reaches a threshold, yielding the trained detection matching model.
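The combined loss can be sketched in PyTorch as follows. The smooth-L1 and cross-entropy terms and the weights are illustrative assumptions: the patent only states that the loss measures the gap between the module's output and the two ground truths.

```python
import torch
import torch.nn.functional as F

def detection_matching_loss(pred_boxes, true_boxes, pred_label_logits, true_labels,
                            w_det=1.0, w_match=1.0):
    """Combine a detection term and a matching term into one training loss."""
    det_loss = F.smooth_l1_loss(pred_boxes, true_boxes)           # box regression
    match_loss = F.cross_entropy(pred_label_logits, true_labels)  # label/identity
    return w_det * det_loss + w_match * match_loss
```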
In a specific implementation, the feature extraction module can be a neural network backbone used for extracting image features, such as VGG or ResNet, and the detection matching module can be a target detection network, such as a YOLO network, an SSD network, or an RCNN; this is not specifically limited in the present application.
In the above implementation, by marking the same target with the same label, after the first image and the second image are input into the detection matching model, the first target area and the second target area can be determined according to whether the labels are the same, rather than performing image recognition on the targets to determine the same target in the first image and the second image. This reduces computational complexity and improves the efficiency of obtaining the first and second target areas, and hence the efficiency of ranging and positioning.
According to a second aspect, a target positioning system is provided. The system includes: an acquisition unit, configured to acquire a first image and a second image, where the first image and the second image are obtained by a multi-camera photographing the same target at the same moment; a detection matching unit, configured to perform target detection and matching on the first image and the second image to obtain a first target area of the first image and a second target area of the second image, where the first target area and the second target area include the target; the detection matching unit being further configured to perform feature point detection and matching on the first target area and the second target area to obtain a feature point matching result, where the feature point matching result includes correspondences between feature points in the first target area and feature points in the second target area, and corresponding feature points describe the same feature of the target; and a position determination unit, configured to determine position information of the target according to the feature point matching result and parameter information of the multi-camera.
By implementing the system described in the second aspect, a target baseline can be determined according to the target to be measured, the camera group with that target baseline collects the first image and the second image of the target, target detection and matching are performed on the first image and the second image to obtain the first and second target areas where the target is located, and finally feature point detection and matching are performed on the two target areas to obtain a feature point matching result, from which the parallax information of each feature point, and hence the position information of the target, is determined. The system can flexibly select the camera group with the target baseline for data collection according to the target to be measured, avoiding the limited ranging range caused by fixed-baseline multi-cameras and extending the ranging range of the target positioning system. At the same time, the system determines the position information of the target from the parallax information of the feature points, without matching every pixel in the first and second image areas or computing per-pixel parallax, which reduces the computing resources required for positioning and ranging while avoiding background interference and noise, improving ranging and positioning accuracy.
In a possible implementation of the second aspect, the parameter information includes at least the baseline length and the focal length of the multi-camera; the position determination unit is configured to obtain the parallax information of the target from the pixel differences between corresponding feature points in the feature point matching result, where the parallax information includes the difference between the pixel coordinates of feature points in the first target area and the pixel coordinates of the corresponding feature points in the second target area; and the position determination unit is configured to determine the distance between the target and the cameras from the target's parallax information, the baseline length, and the focal length, obtaining the position information of the target.
In a possible implementation of the second aspect, the multi-camera includes multiple camera groups, each camera group includes multiple cameras, and the system further includes a baseline determination unit, configured to acquire baseline data of the multi-camera, where the baseline data includes the baseline lengths between the cameras in each group, and to obtain the target baseline from the baseline data according to the measurement accuracy requirement of the target; the acquisition unit is configured to acquire the first image and the second image according to the target baseline, where the first image and the second image are captured by the camera group corresponding to the target baseline.
In a possible implementation of the second aspect, the baseline determination unit is configured to send to the multi-camera a baseline adjustment request carrying the target baseline, the baseline adjustment request instructing the multi-camera to adjust the baseline length of one of its camera groups to the target baseline; and the acquisition unit is configured to receive the first image and the second image captured by the camera group corresponding to the target baseline.
In a possible implementation of the second aspect, the baseline determination unit is configured to determine a first accuracy index and a second accuracy index of each camera group, where the first accuracy index is inversely proportional to the baseline length of the group and directly proportional to its common view area, the second accuracy index is directly proportional to the baseline length and the focal length of the group, and the common view area is the area captured jointly by the cameras in the group; to determine the weights of the first and second accuracy indices according to the measurement accuracy requirement of the target; to obtain the comprehensive index of each camera group from the first accuracy index, the second accuracy index, and the weights; and to determine the target baseline from the comprehensive index of each camera group.
In a possible implementation of the second aspect, the system further includes a synchronization unit, configured to receive the first video and the second video obtained by the multi-camera photographing the target, and to perform time synchronization processing on the first video and the second video to obtain the first image and the second image at the same moment, where the first image is an image frame in the first video and the second image is an image frame in the second video.
In a possible implementation of the second aspect, the synchronization unit is configured to obtain a reference frame from the first video and multiple motion frames from the second video, where the reference frame and the motion frames include moving objects; to match the reference frame with the motion frames by feature points to obtain a synchronization frame among the motion frames, where the parallelism of the lines connecting feature points in the synchronization frame to the corresponding feature points in the reference frame satisfies a preset condition; and to perform time synchronization correction on the first video and the second video according to the reference frame and the synchronization frame to obtain the first image and the second image at the same moment.
In a possible implementation of the second aspect, the detection matching unit is configured to input the first image into a detection matching model to obtain a first detection matching result of the first image, and to input the second image into the detection matching model to obtain a second detection matching result of the second image, where the first and second detection matching results include target boxes and labels, a target box indicates the region of a target in the image, and the same target carries the same label; the detection matching unit is configured to obtain the first target area from the first detection matching result and the second target area from the second detection matching result.
According to a third aspect, a computer program product is provided, including a computer program that, when read and executed by a computing device, implements the method described in the first aspect.
According to a fourth aspect, a computer-readable storage medium is provided, including instructions that, when run on a computing device, cause the computing device to implement the method described in the first aspect.
According to a fifth aspect, a computing device is provided, including a processor and a memory, where the processor executes code in the memory to implement the method described in the first aspect.
Description of Drawings
FIG. 1 is a schematic diagram of an imaging structure based on a binocular camera;
FIG. 2 is an architecture diagram of a stereo vision system provided by the present application;
FIG. 3 is a schematic diagram of the deployment of a target positioning system provided by the present application;
FIG. 4 is a schematic diagram of the ranging errors of binocular cameras with different baselines for a fixed target, provided by the present application;
FIG. 5 is an imaging schematic diagram of a binocular camera with an overlong baseline when photographing a target point;
FIG. 6 is a schematic flowchart of the steps of a target positioning method provided by the present application;
FIG. 7 is a flowchart of the steps of time-synchronizing the first video and the second video, provided by the present application;
FIG. 8 is an example diagram of a target detection matching result provided by the present application;
FIG. 9 is a schematic structural diagram of a target detection model provided by the present application;
FIG. 10 is a schematic flowchart of feature point detection and matching provided by the present application;
FIG. 11 is a schematic diagram of a feature point matching result of a target positioning method provided by the present application in one application scenario;
FIG. 12 is a schematic diagram of a textureless object provided by the present application;
FIG. 13 is a schematic flowchart of the steps of determining the first target area and the second target area in an occlusion scenario, provided by the present application;
FIG. 14 is a schematic structural diagram of a target positioning system provided by the present application;
FIG. 15 is a schematic structural diagram of a computing device provided by the present application.
Detailed Description
The terms used in the embodiments of the present application are only used to explain specific embodiments of the present application and are not intended to limit the present application.
The application scenarios involved in this application are described below.
Stereo vision positioning refers to determining the position, in three-dimensional world space, of an object in an image from the video or image information acquired by image sensors. The video information collected by image sensors can be analyzed to realize target coordinate positioning, target ranging, 3D reconstruction, and so on, and the results fed back to a terminal or a cloud processor to serve richer applications, such as intelligent security, autonomous driving, industrial inspection, intelligent transportation, AR, VR, ADAS, and medicine.
Stereo vision positioning usually uses a binocular camera to photograph the target and obtain multiple videos of it, and first determines the parallax from the multiple videos, where parallax is the difference in apparent direction of the same target observed from two observation points separated by a certain distance. From the distance between the cameras (that is, the baseline length) and the parallax, the distance between the target and the cameras can be calculated, and the precise position of the target in three-dimensional world space obtained. In a specific implementation, the parallax may be the difference in the pixel coordinates of the target in the images captured by different cameras. For example, assume the binocular camera includes a left camera and a right camera; if the pixel coordinates of target X in the picture captured by the left camera are (x, y) and in the picture captured by the right camera are (x+d, y), then d is the parallax of target X in the horizontal direction. It should be understood that the above example is for illustration and is not specifically limited in the present application.
The above method of determining the distance between the target and the cameras from the baseline length and the parallax is illustrated below with reference to FIG. 1, which is a schematic diagram of an imaging structure based on a binocular camera. The camera parameters of the two cameras of the binocular camera, such as the focal length and the length of the imaging plane, are identical. P is the target point to be detected, O_L is the optical center of the left camera of the binocular camera, O_R is the optical center of the right camera, line segment AB is the imaging plane of the left camera, line segment CD is the imaging plane of the right camera, line segment O_L O_R is the baseline of the binocular camera with length b, and the distance between imaging plane AB and optical center O_L is the focal length f of the binocular camera.
The point P_L on imaging plane AB is the image of target point P in the left camera, and the point P_R on imaging plane CD is its image in the right camera. The distance X_L between P_L and point A at the leftmost edge of imaging plane AB is the image abscissa of target point P in the image captured by the left camera, and the distance X_R between P_R and point C at the leftmost edge of imaging plane CD is the image abscissa of target point P in the image captured by the right camera; the parallax of the target point P to be detected is therefore (X_L - X_R).
Since the imaging planes AB and CD are parallel to the baseline O_L O_R, triangle P O_L O_R is similar to triangle P P_L P_R. Let the perpendicular distance between target P and the baseline of the two cameras be z. By the similar-triangle theorem, the following equation is obtained:
\[
\frac{P_L P_R}{O_L O_R} = \frac{z - f}{z}
\]
Since the parameters of the two cameras of the binocular camera are identical, CD = AB, so the following equations are obtained:
\[
\frac{b - (X_L - X_R)}{b} = \frac{z - f}{z}
\]

\[
z = \frac{b \cdot f}{X_L - X_R}
\]
where X_L - X_R is the parallax, b is the baseline length, and f is the focal length of the camera; therefore, the distance z between the target and the cameras can be obtained from the parallax, the baseline length, and the focal length of the multi-camera. It should be understood that the derivation shown in FIG. 1 is for illustration; the present application does not limit the specific algorithm for determining the distance between the target and the cameras from the parallax.
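As a quick numeric check of this relation (the values are illustrative only, not from the patent): with a baseline b = 0.2 m, a focal length f = 1200 pixels, and a measured parallax X_L - X_R = 48 pixels, the distance is z = (0.2 x 1200) / 48 = 5 m.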
However, when current stereo vision algorithms determine the distance between the target and the cameras, the target is not a single point in the multiple images but an image region, so the parallax of every pixel in the region must be determined, and the distance between the target and the cameras is determined from the per-pixel parallax and the baseline length of the multi-camera. This process not only consumes enormous computing resources, but the large volume of computation is also prone to noise, calculation errors, background interference, and similar problems, so ranging and positioning accuracy cannot be guaranteed, which in turn affects subsequent applications such as 3D reconstruction, autonomous driving, and security monitoring.
To solve the problems that the above stereo vision algorithms consume enormous computing resources for ranging and positioning and deliver poor accuracy, the present application provides a target positioning system that can flexibly set the baseline length of the multi-camera according to the target to be measured, thereby solving the problem of a small ranging range. By performing target detection and matching on the first image and the second image captured by the multi-camera, the target areas where the target is located in the two images are determined; feature point detection and matching are then performed on those target areas, and the parallax information of the target is determined from the pixel differences between the feature points in each target area and the feature points in the other target area, without matching every pixel in the target areas. This reduces the computing resources required for ranging and positioning, avoids interference from the background image in the target parallax calculation, improves the accuracy of the parallax calculation, and improves ranging and positioning accuracy.
FIG. 2 is a system architecture diagram of an embodiment of the present application. As shown in FIG. 2, the system architecture for stereo vision positioning provided by the present application may include a target positioning system 110, a multi-camera 120, and an application server 130. The target positioning system 110 and the multi-camera 120 are connected through a network, and the application server 130 and the target positioning system 110 are connected through a network; the network may be a wireless network or a wired network, which is not specifically limited in this application.
The multi-camera 120 includes multiple camera groups, and each of the camera groups includes multiple cameras. For example, the multi-camera 120 includes N cameras, where N is a positive integer and the cameras are numbered 1, 2, ..., N. Every two cameras can be combined into a binocular camera group with a corresponding baseline length; for example, the baseline of the binocular camera group composed of camera 1 and camera N is BL1, the baseline of the binocular camera group composed of camera 1 and camera N-1 is BL2, and so on, so that N(N-1)/2 binocular camera groups can be obtained. It should be understood that the above example is for illustration and is not specifically limited in this application.
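As a concrete illustration of the pairwise combination described above, the following sketch enumerates the N(N-1)/2 binocular camera groups of an N-camera rig; the camera positions are invented for the example.

```python
from itertools import combinations

# Camera id -> position along the rig in meters (assumed example values).
camera_positions_m = {1: 0.0, 2: 0.1, 3: 0.3, 4: 0.7}

# Each unordered camera pair forms one binocular group with its own baseline.
baseline_data = {
    (i, j): abs(camera_positions_m[j] - camera_positions_m[i])
    for i, j in combinations(sorted(camera_positions_m), 2)
}
print(len(baseline_data))     # N = 4 cameras -> 6 binocular camera groups
print(baseline_data[(1, 4)])  # the longest available baseline, 0.7 m
```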
The multi-camera 120 is configured to send baseline data to the target positioning system 110, where the baseline data includes the baseline lengths between the multiple cameras in each camera group. Still taking the above example, the multi-camera 120 includes N(N-1)/2 binocular camera groups, so the baseline data may include the baseline lengths of those N(N-1)/2 binocular camera groups. The multi-camera 120 is further configured to receive the target baseline sent by the target positioning system 110, and to send the first image and the second image collected by the camera group corresponding to the target baseline to the target positioning system 110. The first image and the second image are images obtained by photographing the same target at the same moment. The target baseline is determined by the target positioning system 110 according to the measurement accuracy requirement of the target.
In a specific implementation, the multi-camera 120 may receive a baseline adjustment request sent by the target positioning system 110, where the baseline adjustment request carries the above target baseline. According to the baseline adjustment request, the multi-camera 120 may obtain the first image and the second image captured by the camera group corresponding to the target baseline, and then send them to the target positioning system 110.
It can be understood that, when a stereo vision algorithm is used to locate a target, the baseline length of the multi-camera is often fixed. With a fixed baseline, the farther the target, the worse the ranging accuracy, and the range of ranging and positioning is therefore also limited. Accordingly, the target positioning system determines the baseline length of the multi-camera 120 according to the measurement accuracy requirement of the target: for example, a binocular camera group with a longer baseline can be used when ranging a long-distance target, and a binocular camera group with a shorter baseline can be used when ranging and positioning a close-range target, thereby expanding the range of ranging and positioning and solving the problem of a small ranging range.
Optionally, the multi-camera 120 is further configured to photograph the target with the camera group corresponding to the target baseline to obtain multiple videos, such as a first video and a second video, and then send the first video and the second video to the target positioning system 110, where the first video and the second video contain the above first image and second image; the target positioning system 110 may obtain the first image and the second image after performing time synchronization processing on the first video and the second video.
It is worth noting that the first video and the second video may be videos collected by the multi-camera 120 in real time, or may be cached historical videos. For example, the multi-camera 120 includes 10 cameras located at the entrance of a residential community; after each camera collects surveillance video of the entrance from 8 a.m. to 8 p.m., the video may be transmitted to the target positioning system 110 at 9 p.m. as the first video and the second video for processing. Alternatively, each camera may collect surveillance video of the entrance in real time and transmit it to the target positioning system 110 through the network in real time for processing, which is not specifically limited in this application.
Optionally, when the target is a stationary target, the cameras in the above multi-camera 120 may also be a single movable monocular camera. Simply put, the camera system includes only one camera mounted on a slidable support rod; by means of the sliding support rod, the camera can collect the first video and the second video of the target from different angles, and the distance the camera moves along the sliding support rod is the above target baseline. It should be understood that the multi-camera 120 may also adopt other architectures capable of collecting the first video and the second video of the same target, which is not specifically limited in this application.
The application server 130 may be a single server or a server cluster composed of multiple servers. A server may be implemented by a general-purpose physical server, for example an ARM server or an X86 server, or may be a virtual machine (VM) implemented with network functions virtualization (NFV) technology, such as a virtual machine in a data center, which is not specifically limited in this application. The application server 130 is configured to implement functions such as 3D reconstruction, industrial inspection, intelligent security, AR, VR and autonomous driving according to the position information sent by the target positioning system 110.
The target positioning system 110 is configured to receive the first image and the second image sent by the multi-camera 120, and to perform target detection and matching on the first image and the second image to obtain a first target region of the first image and a second target region of the second image, where the first target region and the second target region include the above target to be measured. Feature point detection and matching are then performed on the first target region and the second target region to obtain a feature point matching result, where the feature point matching result includes the correspondence between feature points in the first target region and feature points in the second target region, and feature points for which the correspondence exists describe the same feature of the target. Finally, the disparity information of the target can be determined from the feature point matching result and the parameter information of the multi-camera, and the position information of the target is then determined in combination with formula 3 and sent to the above application server 130, so that the application server 130 can implement functions such as 3D reconstruction, AR, VR and autonomous driving according to the position information.
Optionally, the target positioning system 110 may also receive the above baseline data sent by the multi-camera 120, obtain the target baseline from the baseline data according to the measurement accuracy requirement of the target to be measured, and then send it to the multi-camera 120, so that the multi-camera 120 uses the camera group corresponding to the target baseline to capture the target and obtain the first image and the second image.
Optionally, after the multi-camera 120 captures the target with the cameras corresponding to the target baseline, a first video and a second video may also be obtained; after receiving the first video and the second video, the target positioning system 110 may perform time synchronization processing on them to obtain the above first image and second image.
In a specific implementation, the target positioning system 110 provided by this application can be deployed flexibly. It may be deployed in an edge environment, specifically as an edge computing device in the edge environment or as a software system running on one or more edge computing devices. The edge environment refers to a cluster of edge computing devices that are geographically close to the above multi-camera 120 and provide computing, storage and communication resources, such as edge computing all-in-one machines located on both sides of a road. For example, the target positioning system 110 may be one or more edge computing devices close to an intersection, or a software system running on edge computing devices close to the intersection, where camera 1, camera 2 and camera 3 are installed to monitor the intersection. The edge computing device may determine, according to the target to be measured, that the most suitable baseline is BL3, the cameras satisfying baseline BL3 including camera 1 and camera 3. The edge computing device may perform time synchronization processing on the first video and the second video collected by camera 1 and camera 3 to obtain the first image and the second image at the same moment; perform target detection on the first image and the second image to obtain the first target region and the second target region, both of which include the target to be measured; then perform feature point detection and matching on the first target region and the second target region to obtain a feature point matching result; determine the disparity information of the target from the feature point matching result and the parameter information of camera 1 and camera 3; determine the position information of the target in combination with formula 3; and send the position information to the application server 130, so that the application server can implement functions such as 3D reconstruction, AR, VR and autonomous driving according to the position information.
The target positioning system 110 may also be deployed in a cloud environment, which is an entity that uses basic resources to provide cloud services to users under the cloud computing model. The cloud environment includes a cloud data center and a cloud service platform, where the cloud data center includes a large number of basic resources (including computing, storage and network resources) owned by the cloud service provider. The target positioning system 110 may be a server in the cloud data center, a virtual machine created in the cloud data center, or a software system deployed on servers or virtual machines in the cloud data center; the software system may be deployed in a distributed manner on multiple servers, on multiple virtual machines, or across both virtual machines and servers. For example, the target positioning system 110 may be deployed in a cloud data center far from an intersection where camera 1, camera 2 and camera 3 are installed for monitoring. The cloud data center may determine that the most suitable baseline for the target to be measured is BL3, the cameras satisfying baseline BL3 including camera 1 and camera 3; perform time synchronization processing on the first video and the second video collected by camera 1 and camera 3 to obtain the first image and the second image at the same moment; perform target detection on the first image and the second image to obtain the first target region and the second target region, both of which include the target to be measured; then perform feature point detection and matching on the first target region and the second target region to obtain a feature point matching result; determine the disparity information of the target from the feature point matching result and the parameter information of camera 1 and camera 3; determine the position information of the target in combination with formula 3; and send the position information to the application server 130, so that the application server can implement functions such as 3D reconstruction, AR, VR and autonomous driving according to the position information.
The target positioning system 110 may also be deployed partly in the edge environment and partly in the cloud environment; for example, the edge computing device is responsible for determining the baseline of the binocular camera group according to the target to be measured, and the cloud data center determines the disparity information from the first video and the second video collected by that binocular camera group. For example, as shown in FIG. 3, camera 1, camera 2 and camera 3 are installed at the intersection to monitor it. The edge computing device may determine that the most suitable baseline for the target to be measured is BL3, the cameras satisfying baseline BL3 including camera 1 and camera 3; the edge computing device may also perform time synchronization processing on the first video and the second video collected by camera 1 and camera 3 to obtain the first image and the second image at the same moment, and send them to the cloud data center. The cloud data center may perform target detection on the first image and the second image to obtain the first target region and the second target region, both of which include the target to be measured; then perform feature point detection and matching on the first target region and the second target region to obtain a feature point matching result; determine the disparity information of the target from the feature point matching result and the parameter information of camera 1 and camera 3; determine the position information of the target in combination with formula 3; and send the position information to the application server 130, so that the application server can implement functions such as 3D reconstruction, AR, VR and autonomous driving according to the position information.
It should be understood that the unit modules inside the target positioning system 110 can also be divided in multiple ways; each module may be a software module, a hardware module, or partly software and partly hardware, which is not limited in this application. FIG. 2 shows one exemplary division: as shown in FIG. 2, the target positioning system 110 may include a baseline determination unit 111, a synchronization unit 112 and a detection and matching unit 113. It should be noted that, since the target positioning system 110 is deployed flexibly, the modules of the target positioning system may be deployed on the same edge computing device, in the same cloud data center, or on the same physical machine; of course, they may also be deployed partly on edge computing devices and partly in a cloud data center, for example with the baseline determination unit 111 on an edge computing device and the synchronization unit 112 and detection and matching unit 113 in a cloud data center, which is not specifically limited in this application.
The baseline determination unit 111 is configured to receive the baseline data sent by the multi-camera 120, determine the target baseline according to the measurement accuracy requirement of the target to be measured, and send it to the multi-camera 120. It should be understood that the baseline length and the common view region of the multi-camera have a certain influence on the measurement accuracy of the disparity, where the common view region refers to the region that the cameras of the multi-camera can capture simultaneously.
It should be understood that a multi-camera with a fixed baseline also has a fixed ranging range if the ranging accuracy is to be guaranteed. This is because, the closer the target is to the camera, the more the common view region of the multi-camera approaches zero, and individual cameras of the multi-camera may have no imaging point of the target, so the disparity of the target cannot be calculated; conversely, the farther the target is from the camera, the more blurred the region of the target in the first image and the second image becomes, which affects the disparity calculation. A multi-camera with a fixed baseline therefore has a fixed ranging range. In the system provided by this application, the baseline determination unit 111 can flexibly determine the target baseline according to the measurement accuracy requirement of the target to be measured, thereby expanding the ranging range of the ranging and positioning system provided by this application.
The following briefly describes several factors that affect the ranging and positioning error of a multi-camera. Taking a binocular camera as an example, FIG. 4 is a schematic diagram of the ranging error of binocular cameras with different baselines for a fixed target. As can be seen from FIG. 4, when the distance between the target and the multi-camera 120 is 50 meters, binocular camera groups with different baseline lengths have different ranging errors: the shorter the baseline, the larger the ranging error and the lower the corresponding ranging accuracy; the longer the baseline, the smaller the ranging error and the higher the corresponding ranging accuracy.
However, when the baseline of a binocular camera is too long, its common view region keeps shrinking, and the left or right camera may be unable to capture the target. For example, FIG. 5 is a schematic diagram of the imaging of a binocular camera with an excessively long baseline when photographing a target point: if the baseline is too long, the target point P is not within the shooting range of the right camera; in other words, the target point P has no imaging point on the imaging plane CD of the right camera, so the position information of the target cannot be determined from the disparity.
Combining the above factors affecting the ranging accuracy of the multi-camera, the baseline determination unit 111 may determine the target baseline as follows: first, determine a first accuracy index and a second accuracy index for each camera group, where the first accuracy index is inversely proportional to the baseline length of the camera group and directly proportional to its common view region, the second accuracy index is directly proportional to the baseline length and focal length of the camera group, and the common view region is the region jointly captured by the multiple cameras in the camera group; then determine the weights of the first accuracy index and the second accuracy index according to the measurement accuracy requirement of the target; then obtain a comprehensive index for each camera group from the first accuracy index, the second accuracy index and the weights; and finally determine the target baseline according to the comprehensive index of each camera group.
The measurement accuracy requirement may include the approximate distance between the target and the multi-camera 120, that is, whether the target is a long-distance target or a close-range target, which can be determined from the size of the image region occupied by the target in the image captured by the camera: for example, a target whose image region is smaller than a first threshold is a long-distance target, and a target whose image region is larger than a second threshold is a close-range target. The measurement accuracy requirement may also include a measurement error threshold of the target, for example that the measurement error of the target must not exceed 1 meter. It should be understood that the above examples are for illustration and are not specifically limited in this application.
Optionally, after determining the target baseline, the baseline determination unit 111 may send a baseline adjustment request carrying the target baseline to the multi-camera 120, where the baseline adjustment request is used to instruct the multi-camera 120 to adjust the baseline length of a camera group to the above target baseline.
The synchronization unit 112 is configured to receive the multiple videos, such as a first video and a second video, obtained by the multi-camera 120 capturing the target with the camera group corresponding to the target baseline, and then to perform time synchronization processing on the first video and the second video to obtain the first image and the second image, where the first image and the second image are obtained by photographing the same target at the same moment. It should be understood that, because different cameras differ in timestamp, frame rate and network delay, the two videos captured by a binocular camera may be out of sync in time; for example, the image with timestamp T1 from the left camera describes the scene at world time T1, while the image with timestamp T1 from the right camera describes the scene at world time T1+Δt. After the synchronization unit 112 time-synchronizes the first video and the second video, the first image and the second image at the same moment can be obtained; using the first image and the second image for the subsequent disparity calculation can improve the accuracy of the finally obtained position information.
The detection and matching unit 113 is configured to detect and recognize the target to be measured in the first image and the second image to obtain the first target region and the second target region, and then to perform feature point detection and matching on the first target region and the second target region to obtain a feature point matching result, where the feature point matching result includes the correspondence between feature points in the first target region and feature points in the second target region, and feature points for which the correspondence exists describe the same feature of the target. The disparity information of each feature point is then determined from the feature point matching result, and the position information of the target is determined from the disparity information and the parameter information of the multi-camera.
The disparity information includes the disparity of each feature point of the target, which may specifically be the difference between the pixel coordinates of a feature point in the first target region and the pixel coordinates of the corresponding feature point in the second target region. For a description of disparity, reference may be made to the embodiment of FIG. 1, which is not repeated here.
The position information may include the distance between the target and the camera, which can be determined from the disparity and the parameter information of the multi-camera. It can be seen from formula 3 in the embodiment of FIG. 1 that the parameter information includes at least the baseline length and focal length of the multi-camera. The position information may also include the geographic coordinates of the target, which may be determined from the geographic coordinates of the multi-camera combined with the above distance between the target and the camera, depending on the requirements of the application server 130; if the position information includes the geographic coordinates of the target, the above parameter information of the multi-camera may include the geographic coordinates of the multi-camera.
It can be understood that the detection and matching unit 113 determines the position information of the target from the disparity information of the feature points, without performing matching and disparity calculation on every pixel in the first target region and the second target region, thereby reducing the computing resources required for positioning and ranging while avoiding problems such as background interference and noise, and improving the accuracy of ranging and positioning.
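A hedged sketch of this feature-point route is given below: keypoints are detected and matched only inside the two target regions, the column differences of the matched pairs give per-feature disparities, and a robust statistic of those disparities yields the distance. ORB and brute-force matching are stand-in choices here, since this application does not mandate a particular detector; the ROI column offsets are passed in so the disparities are expressed in full-image coordinates.

```python
import cv2
import numpy as np

def target_distance(roi_left, roi_right, x0_left, x0_right,
                    focal_px, baseline_m):
    """roi_*: grayscale crops of the first/second target region;
    x0_*: column of each crop's left edge in its full rectified image."""
    orb = cv2.ORB_create(nfeatures=500)
    kp_l, des_l = orb.detectAndCompute(roi_left, None)
    kp_r, des_r = orb.detectAndCompute(roi_right, None)
    if des_l is None or des_r is None:
        return None
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des_l, des_r)
    # Column difference of each matched pair, in full-image coordinates.
    disp = np.array([(kp_l[m.queryIdx].pt[0] + x0_left) -
                     (kp_r[m.trainIdx].pt[0] + x0_right) for m in matches])
    disp = disp[disp > 0]
    if disp.size == 0:
        return None
    return focal_px * baseline_m / np.median(disp)  # z = f * b / d
```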
In summary, the target positioning system provided by this application can determine the target baseline according to the target to be measured, use the camera group with that target baseline to capture the target and obtain the first image and the second image, perform target detection and matching on the first image and the second image to obtain the first target region and the second target region in which the target is located, and finally perform feature point detection and matching on the first target region and the second target region to obtain a feature point matching result, from which the disparity information of each feature point, and hence the position information of the target, is determined. The system can flexibly select the camera group with the target baseline for data collection according to the target to be measured, avoiding the limited ranging range caused by a multi-camera with a fixed baseline and extending the ranging range of the target positioning system. At the same time, the system determines the position information of the target from the disparity information of the feature points, without performing matching and disparity calculation on every pixel in the first target region and the second target region, thereby reducing the computing resources required for positioning and ranging while avoiding problems such as background interference and noise, and improving the accuracy of ranging and positioning.
As shown in FIG. 6, this application provides a target positioning method, which can be applied to the architecture of the stereo vision system shown in FIG. 2; specifically, the method can be executed by the aforementioned target positioning system 110. The method may include the following steps:
S310: Determine the target baseline according to the measurement accuracy requirement of the target to be measured.
Referring to the foregoing, the above multi-camera 120 may include N cameras, which can be combined in pairs to obtain N(N-1)/2 binocular camera groups. The multi-camera 120 may send the baseline data of each camera group to the target positioning system 110 before step S310. The target positioning system 110 may select the target baseline from the baseline data of the above N(N-1)/2 binocular camera groups according to the measurement accuracy requirement of the target to be measured, and send it to the multi-camera 120.
In an embodiment, referring to the embodiments of FIG. 4 and FIG. 5, the longer the baseline length of the multi-camera 120, the higher the ranging accuracy, but the common view region gradually shrinks as the baseline length grows, and the target may fall outside the common view region of the multi-camera 120. Therefore, the target baseline can be determined according to the size of the common view region of the multi-camera 120 together with the measurement accuracy requirement of the target to be measured.
The baseline data collected by the multi-camera 120 includes not only the baseline length between the multiple cameras in each camera group, but may also include the size of the common view region between the multiple cameras, where the size of the common view region can be determined from the shooting range of each camera in the camera group, the shooting range being the geographic area recorded in the images captured by that camera.
In a specific implementation, the outermost edge points that can be displayed in the video picture of each video may be determined, the pixel coordinates of each edge point obtained and converted into geographic coordinates through a camera calibration algorithm, and the shooting range of that video determined from the region formed by these geographic coordinates, from which the size of the common view region between the multiple cameras is obtained.
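The computation described above can be sketched as follows, assuming each camera's image-to-ground homography H (a 3x3 matrix obtained from camera calibration) is already known, and using the shapely library for the polygon intersection; the image size and the homographies are assumptions of the example.

```python
import numpy as np
from shapely.geometry import Polygon

def ground_footprint(H, width, height):
    """Project the four image corners to ground coordinates through H."""
    corners = np.array([[0, 0], [width, 0], [width, height], [0, height]], float)
    pts = np.hstack([corners, np.ones((4, 1))]) @ H.T
    return Polygon(pts[:, :2] / pts[:, 2:3])  # normalize homogeneous coords

def common_view_area(H1, H2, width=1280, height=720):
    """Area of the ground region captured by both cameras."""
    return ground_footprint(H1, width, height).intersection(
        ground_footprint(H2, width, height)).area
```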
The measurement accuracy requirement of the target to be measured may include the approximate distance between the target and the multi-camera 120, in other words, whether the target is a long-distance target or a close-range target. Whether the target to be measured is a long-distance target or a close-range target can be determined from the size of the image region occupied by the target in the images collected by the multi-camera: a long-distance target occupies a small image region in the collected image, while a close-range target occupies a large image region. Therefore, when the image region is smaller than the first threshold, the target to be measured can be determined to be a long-distance target, and when the image region is larger than the second threshold, the target to be measured can be determined to be a close-range target. The measurement accuracy requirement may also include a measurement error threshold of the target to be measured, for example that the measurement error must not exceed 1 meter. It should be understood that the above examples are for illustration and are not specifically limited in this application.
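A minimal sketch of this area-based classification is given below; the two pixel-area thresholds are invented example values, since this application does not fix them.

```python
def classify_range(box_w: int, box_h: int,
                   first_thresh: float = 2000.0,
                   second_thresh: float = 40000.0) -> str:
    """Classify a target by the image area of its bounding box (pixels^2)."""
    area = box_w * box_h
    if area < first_thresh:
        return "long-distance target"
    if area > second_thresh:
        return "close-range target"
    return "intermediate"

print(classify_range(40, 30))    # 1200 px^2  -> long-distance target
print(classify_range(250, 200))  # 50000 px^2 -> close-range target
```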
In an embodiment, a first accuracy index p1 and a second accuracy index p2 may be determined for each camera group, where the first accuracy index p1 is inversely proportional to the baseline length of the camera group and directly proportional to its common view region, the second accuracy index p2 is directly proportional to the baseline length and focal length of the camera group, and the common view region is the region jointly captured by the multiple cameras in the camera group. The weight α of the first accuracy index p1 and the second accuracy index p2 is then determined according to the measurement accuracy requirement of the target, the comprehensive index p of each camera group is obtained from the first accuracy index p1, the second accuracy index p2 and the weight α, and the target baseline is determined according to the comprehensive index of each camera group.
In a specific implementation, combining the above conclusion that the common view region gradually shrinks as the baseline length grows, the relationship between the accuracy index p1, the common view region FOV and the baseline length b is:

p1 = FOV / b                (4)
Combining the above conclusion that the longer the baseline, the higher the ranging accuracy, the relationship between the baseline length b and the accuracy index p2 is:

p2 = f × b                (5)
Combining the above two conclusions, namely that a longer baseline gives higher ranging accuracy but at the same time a smaller common view region, the comprehensive index is:

p = αλ1p1 + (1 - α)λ2p2                (6)
where λ1 and λ2 are unit conversion coefficients that bring p1 and p2 to consistent units so that both can participate in the calculation of the comprehensive index p. The weight α ∈ (0, 1), and the specific value of α can be determined according to the above ranging and positioning requirement. For example, when the target to be measured is a long-distance target, the baseline length has a greater influence on the ranging accuracy, so the value of α can be appropriately reduced in combination with the measurement error threshold of the target, increasing the influence of the baseline-based accuracy index p2 on the comprehensive index; similarly, when the target to be measured is a close-range target, the size of the common view region has a greater influence on the ranging accuracy, so the value of α can be appropriately increased in combination with the measurement error threshold of the target, increasing the influence of the common-view-based accuracy index p1 on the comprehensive index.
It should be understood that, based on formulas (4) to (6), the comprehensive index p of each of the above N(N-1)/2 camera groups can be determined for the target to be measured, and the baseline length corresponding to the largest comprehensive index p is then determined as the target baseline and sent to the multi-camera 120. It should be understood that the above example is for illustration, and this application does not limit the specific formula of the comprehensive index.
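The selection step can be summarized in the sketch below, which evaluates formulas (4) to (6) for each candidate camera group and returns the baseline with the largest comprehensive index; the candidate groups, unit conversion coefficients λ1 and λ2, and weight α are illustrative assumptions.

```python
def pick_target_baseline(groups, alpha, lam1=1.0, lam2=1.0):
    """groups: list of dicts with baseline 'b' (m), focal 'f' (px),
    common view region 'fov' (m^2); returns the best baseline."""
    def comprehensive_index(g):
        p1 = g["fov"] / g["b"]   # formula (4): common-view term
        p2 = g["f"] * g["b"]     # formula (5): baseline term
        return alpha * lam1 * p1 + (1 - alpha) * lam2 * p2  # formula (6)
    return max(groups, key=comprehensive_index)["b"]

candidates = [{"b": 0.1, "f": 1200, "fov": 900.0},
              {"b": 0.3, "f": 1200, "fov": 700.0},
              {"b": 0.7, "f": 1200, "fov": 400.0}]
# Far target: a small alpha favors the long baseline -> prints 0.7
print(pick_target_baseline(candidates, alpha=0.2, lam1=0.01))
```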
S320: Acquire the first image and the second image according to the target baseline.
Specifically, the target baseline may be sent to the multi-camera 120, and the first image and the second image captured by the camera group corresponding to the target baseline are then received. In some embodiments, a baseline adjustment request carrying the target baseline may be sent to the multi-camera; after the multi-camera adjusts the baseline length of at least one camera group to the target baseline according to the baseline adjustment request and captures videos or images, the first image and the second image captured by the camera group corresponding to the target baseline are received. In short, the baseline adjustment request is used to instruct the multi-camera 120 to adjust the baseline length of a camera group to the above target baseline.
In an embodiment, after the target baseline, or a baseline adjustment request carrying the target baseline, is sent to the multi-camera, a first video and a second video captured by the camera group corresponding to the target baseline may also be received, and the first image and the second image are obtained after time synchronization processing is performed on the first video and the second video, where the first video and the second video are captured by the camera group corresponding to the target baseline and the first image and the second image are images at the same moment; the first image is a video frame in the first video, and the second image is a video frame in the second video.
It should be understood that the cameras of a multi-camera may differ in model, manufacturer, timestamp and video frame rate, that network transmission delay may cause frames to be lost in transit, and that frame loss is also likely when a camera's computing performance is poor, so it is difficult to guarantee time synchronization between the multiple videos collected by multiple cameras. For example, camera 1 and camera 2 monitor the same intersection; because camera 1 takes a snapshot of a vehicle running a red light at time T1, the video frames of the real-time video stream transmitted by camera 1 within 20 ms after the snapshot time T1 are all lost, while camera 2 takes no snapshot and loses no frames. Therefore, after the target positioning system 110 receives the first video and the second video, the video collected by camera 2 runs 20 ms ahead of the video collected by camera 1 from time T1 onward. If disparity calculation were performed directly on the first video and the second video collected by camera 1 and camera 2, the obtained disparity information would contain errors, which would in turn hinder subsequent applications such as ranging and positioning and 3D reconstruction. Therefore, before the disparity calculation, time synchronization processing may be performed on the first video and the second video at step S320, improving the accuracy of the disparity calculation and hence the accuracy of applications such as ranging and positioning and 3D reconstruction.
In an embodiment, a reference frame may be obtained from the first video and multiple motion frames may be obtained from the second video, where the reference frame and the motion frames contain moving objects; the reference frame is then matched against the motion frames by feature points to obtain a synchronization frame among the motion frames, where the parallelism of the lines connecting feature points in the synchronization frame with the corresponding feature points in the reference frame satisfies a preset condition; finally, time synchronization correction is performed on the first video and the second video according to the reference frame and the synchronization frame to obtain the first image and the second image. Satisfying the preset condition may mean that the frame whose connecting lines have the highest parallelism is determined to be the synchronization frame.
Specifically, the above reference frame and motion frames may be determined by an optical flow method. Optical flow refers to the instantaneous velocity of the pixel motion of a spatially moving object on the observed imaging plane; when the time interval is very small, the optical flow is also equivalent to the displacement of the spatially moving object. On this basis, the procedure for determining the reference frame and the motion frames may be as follows: first perform target detection of synchronization targets on every frame of the first video and the second video to obtain one or more synchronization targets in each frame; then determine the optical flow of each synchronization target by the optical flow method, and judge from the optical flow of each synchronization target in each frame whether it is a moving object, thereby obtaining the motion frames that contain moving objects and the reference frame that contains the largest number of moving objects.
It is worth noting that, when target detection is performed on each frame, the detected synchronization target should be a target that can possibly move, rather than a necessarily stationary target such as a building. Therefore, the synchronization target may be the target to be measured in the foregoing, or another target, which is not specifically limited in this application. For example, if the target to be measured is a utility pole, the synchronization target used for time synchronization may be a pedestrian or a vehicle; if the target to be measured is vehicle A, the synchronization targets used for time synchronization may be vehicles and pedestrians. The above examples are for illustration and are not limited in this application.
It should be noted that the target detection algorithm in the embodiments of this application may adopt any of the neural network models known in the industry to perform well for target detection, for example the one-stage unified real-time object detection (You Only Look Once: Unified, Real-Time Object Detection, YOLO) model, the Single Shot multibox Detector (SSD) model, the Region Convolutional Neural Network (RCNN) model or the Fast Region Convolutional Neural Network (Fast-RCNN) model, which is not specifically limited in this application. Similarly, the optical flow method in the embodiments of this application may adopt any of the optical flow methods known in the industry to perform well for computing optical flow, such as the Lucas-Kanade (LK) optical flow method, which is not specifically limited in this application.
In an embodiment, after the optical flow (i.e. the instantaneous velocity) of each object in each frame is obtained, whether an object is a moving object can be judged by determining whether the velocity of the object has a component in the image row direction. Specifically, since the cameras of the multi-camera (for example, the multi-camera 120 shown in FIG. 2) are fixed at the same height, if an object moves in the row direction, its row coordinate will change; therefore, if the row coordinate of object X in motion frame Tn is not equal to the row coordinate of the same object X in the previous frame Tn-1 (or the following frame Tn+1), the object can be determined to be a moving object. It can be understood that a vertically moving object moves only in the column direction; such an object has no velocity component in the image row direction, only in the column direction, and therefore contributes nothing to the disparity calculation. Vertically moving objects can thus also be treated as non-moving objects and excluded from the disparity calculation, reducing the amount of computation and improving the accuracy and efficiency of the disparity calculation.
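A hedged sketch of this test with the Lucas-Kanade optical flow named above follows; the frame file names and the 1-pixel threshold are assumptions of the example.

```python
import cv2
import numpy as np

prev = cv2.imread("frame_t0.png", cv2.IMREAD_GRAYSCALE)  # assumed files
curr = cv2.imread("frame_t1.png", cv2.IMREAD_GRAYSCALE)

pts0 = cv2.goodFeaturesToTrack(prev, maxCorners=200,
                               qualityLevel=0.01, minDistance=7)
pts1, status, _ = cv2.calcOpticalFlowPyrLK(prev, curr, pts0, None)

flow = (pts1 - pts0).reshape(-1, 2)[status.ravel() == 1]
# Keep only points whose motion has a horizontal component; purely
# vertical motion contributes nothing to the horizontal disparity.
moving = flow[np.abs(flow[:, 0]) > 1.0]
print(f"{len(moving)} of {len(flow)} tracked points moved horizontally")
```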
Further, when the reference frame is matched against a motion frame, feature point matching may be performed between the moving objects in the reference frame and those in the motion frame, and the difference Δs of the row coordinates of each pair of feature points calculated. The smaller Δs is, the closer the moving object in the reference frame is to the moving object in the motion frame, and the more parallel the lines connecting feature points in the reference frame with those in the motion frame. If Δs is 0, the two frames are synchronized; if it is not 0, the two frames are out of sync, so an accurate synchronization offset time Δt can be calculated from Δs. Exemplarily, the formula of Δt may be as follows:
Δt = Δs1 / ((Δs1 + Δs2) × fr)                (7)
where Δs1 and Δs2 are the two feature point differences with the smallest absolute values, and fr is the video frame rate. The synchronization offset time Δt can be used as compensation for the row coordinates of each subsequent frame, so as to obtain the synchronized first video and second video, and hence the first image and the second image at each moment.
For example, as shown in FIG. 7, assume the reference frame of camera 1 is frame P1 and the motion frames of camera 2 include frames Q1, Q2 and Q3. Matching the feature points of motion frame Q1 of camera 2 against reference frame P1 yields the mean Δs1 of the row coordinate differences of the feature points; matching motion frame Q2 against reference frame P1 yields the mean Δs2; and matching motion frame Q3 against reference frame P1 yields the mean Δs3, where Δs2 = 0. Therefore, the lines connecting the feature points of motion frame Q2 of camera 2 and reference frame P1 are parallel, and motion frame Q2 and reference frame P1 are the first image and the second image at the same moment, so frame P1 of camera 1 can be aligned with frame Q2 of camera 2; that is, the video collected by camera 1 is 1 frame behind the video collected by camera 2. Of course, camera 1 and camera 2 may also be synchronized after obtaining the synchronization offset time Δt according to formula (7); for example, if the offset time Δt = 3 ms, i.e. camera 1 is 3 ms ahead of camera 2, camera 2 can be advanced by 3 ms to achieve synchronization with the video of camera 1. It should be understood that FIG. 7 is for illustration and is not specifically limited in this application.
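The frame alignment in this example can be sketched as below: each candidate motion frame is matched against the reference frame, and the frame with the smallest mean row-coordinate difference is chosen as the synchronization frame. ORB matching is a stand-in detector choice, and the frame images are assumed inputs.

```python
import cv2
import numpy as np

def mean_row_diff(ref_img, cand_img):
    """Mean row-coordinate difference of matched feature points."""
    orb = cv2.ORB_create()
    kp_r, des_r = orb.detectAndCompute(ref_img, None)
    kp_c, des_c = orb.detectAndCompute(cand_img, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des_r, des_c)
    return np.mean([abs(kp_r[m.queryIdx].pt[1] - kp_c[m.trainIdx].pt[1])
                    for m in matches])

def pick_sync_frame(ref_img, candidate_frames):
    """Return the index of the candidate most parallel to the reference."""
    diffs = [mean_row_diff(ref_img, c) for c in candidate_frames]
    return int(np.argmin(diffs)), diffs
```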
In an embodiment, before the video synchronization processing of the first video and the second video in step S320, stereo rectification may also be performed on the first video and the second video. It should be understood that the formulas used to calculate disparity are usually derived under the assumption that the multi-camera is in an ideal condition, so before the multi-camera is used for ranging and positioning, the actually used multi-camera 120 can be rectified into the ideal state. Taking a binocular camera as an example, after stereo rectification the image planes of the left and right cameras are parallel, the optical axes are perpendicular to the image planes, and the epipoles lie at infinity; at this point, the epipolar line corresponding to a point (x0, y0) is y = y0. In a specific implementation, the embodiments of this application may adopt any of the stereo rectification methods known in the industry to perform well, such as the Bouguet epipolar rectification method, which is not specifically limited in this application.
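A minimal sketch of this rectification with OpenCV, whose stereoRectify implements the Bouguet method, is given below; all calibration values are invented example parameters, not values from this application.

```python
import cv2
import numpy as np

K1 = K2 = np.array([[1200., 0., 640.],
                    [0., 1200., 360.],
                    [0., 0., 1.]])          # intrinsics (assumed)
d1 = d2 = np.zeros(5)                       # assume negligible distortion
R = np.eye(3)                               # relative rotation (assumed)
T = np.array([-0.5, 0., 0.])                # 0.5 m baseline along x (assumed)
size = (1280, 720)

R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K1, d1, K2, d2, size, R, T)
map1x, map1y = cv2.initUndistortRectifyMap(K1, d1, R1, P1, size, cv2.CV_32FC1)
map2x, map2y = cv2.initUndistortRectifyMap(K2, d2, R2, P2, size, cv2.CV_32FC1)
# After cv2.remap with these maps, corresponding points share a row, so
# the epipolar line of (x0, y0) is y = y0 as stated above.
```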
In an embodiment, in step S320 the multi-view camera may also capture the same target at the same moment to obtain the first image and the second image directly. It can be understood that, when the first image and the second image are collected in step S320 rather than the first video and the second video, the time synchronization of the two videos in step S320 can be omitted, and step S330 can be executed directly on the first image and the second image for the subsequent disparity calculation, which is not elaborated further here.
S330: Perform target detection and matching on the first image and the second image to obtain a first target area of the first image and a second target area of the second image, where the first target area and the second target area include the above-mentioned target to be detected.
In an embodiment, the first image may be input into a detection and matching model to obtain a first detection and matching result of the first image, and the second image may be input into the detection and matching model to obtain a second detection and matching result of the second image; the first target area is obtained according to the first detection and matching result, and the second target area is obtained according to the second detection and matching result. The first detection and matching result and the second detection and matching result each include a target frame (bounding box) and a label. The target frame indicates the area of the target to be detected in the image, and different targets have different labels, so the same target in the first image and the second image can be identified from the labels in the two detection and matching results, and the first target area and the second target area can then be determined in combination with the target frames.
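The following minimal sketch (the dataclass and function names are illustrative assumptions, not part of this application) shows how two detection and matching results could be paired by label to obtain the two target areas for each target:

```python
from dataclasses import dataclass

@dataclass
class Detection:
    label: str   # identity label assigned by the model, e.g. "ID:001"
    box: tuple   # target frame (x, y, w, h) in pixel coordinates

def pair_target_areas(result1, result2):
    """Pair detections from the two images that carry the same label;
    each pair gives the first and second target areas of one target."""
    by_label = {d.label: d for d in result2}
    pairs = {}
    for d in result1:
        if d.label in by_label:
            # Same label => same physical target in both images.
            pairs[d.label] = (d.box, by_label[d.label].box)
    return pairs
```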
Specifically, the target frame in the detection and matching result may be a rectangular frame, a circular frame, an elliptical frame, or the like, which is not specifically limited in this application. It should be understood that if there are multiple targets to be detected, the detection and matching result may include multiple target frames for the multiple targets. Therefore, in the detection and matching result, the same target may be identified with the same label and different targets with different labels. In this way, when disparity is computed for a target, the same target can be identified across the video frames of the different channels according to its label, so that feature point matching can be performed on the same target in the first image and the second image captured at the same moment, and the disparity of that target can then be obtained.
For example, continuing the example shown in FIG. 7, in the synchronized first video and second video, frame P3 of camera 1 and frame Q4 of camera 2 are the first image and the second image at the same moment. Illustratively, after frame P3 of camera 1 and frame Q4 of camera 2 are input into the above detection and matching model, the obtained first and second detection and matching results may be as shown in FIG. 8, which is an example diagram of target detection and matching results in a target positioning method provided by this application. The detection and matching results are the rectangular target frames and the ID labels shown in FIG. 8, namely ID:001 and ID:002. From the first and second detection and matching results it can be determined that the tank truck framed in frame P3 and the tank truck framed in frame Q4 are the same vehicle, and that the bus framed in frame P3 and the bus framed in frame Q4 are the same vehicle. It should be understood that FIG. 8 is merely an example: the target frame may also take other forms such as a circular or elliptical frame, and the ID labels in the detection and matching results may also take other forms such as letters or numbers, which is not specifically limited in this application.
Optionally, as shown in FIG. 9, which is a schematic structural diagram of the detection and matching model in a target positioning method provided by this application, the detection and matching model may include a feature extraction module 610 and a detection and matching module 620. The feature extraction module 610 extracts features from the input first image and second image and generates high-dimensional feature vectors, and the detection and matching module 620 generates, from these feature vectors, detection and matching results containing the target frames and labels. For example, frame P3 of camera 1 and frame Q4 of camera 2 are the first image and the second image at the same moment; frame P3 and frame Q4 may first be input into the feature extraction module 610 to generate high-dimensional feature vectors, which are then input into the detection and matching module 620 to generate the detection and matching results shown in FIG. 8. If the target to be detected is ID:001, the first target area and the second target area shown in FIG. 9 can be obtained. It should be understood that FIG. 9 is merely an example and is not specifically limited in this application.
In an embodiment, before step S310, the detection and matching model may be trained with a sample set. The sample set may include first image samples, second image samples, and corresponding ground-truth values, where the ground-truth values include target detection ground truth and target matching ground truth: the target detection ground truth includes the target frames of the targets in the first and second image samples, and the target matching ground truth includes the labels of the targets in the first and second image samples. When the detection and matching model is trained with this sample set, the detection and matching loss used for back-propagation is determined from the gap between the output of the detection and matching module 620 and the ground-truth values, and the parameters of the detection and matching model are adjusted according to this loss until the loss reaches a threshold, at which point the trained detection and matching model is obtained.
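A minimal sketch of such a training procedure is given below (assuming PyTorch; the form of the combined loss, the model interface, and the data loader are illustrative placeholders for the model of FIG. 9 and its loss, not a definitive implementation):

```python
import torch
import torch.nn.functional as F

def detection_matching_loss(pred_boxes, pred_logits, gt_boxes, gt_labels):
    """Illustrative combined loss: box regression + label classification."""
    box_loss = F.smooth_l1_loss(pred_boxes, gt_boxes)      # detection term
    label_loss = F.cross_entropy(pred_logits, gt_labels)   # matching term
    return box_loss + label_loss

def train(model, loader, optimizer, loss_threshold, max_epochs=100):
    """Adjust the model parameters until the loss reaches the threshold."""
    model.train()
    for _ in range(max_epochs):
        for img1, img2, gt_boxes, gt_labels in loader:
            pred_boxes, pred_logits = model(img1, img2)
            loss = detection_matching_loss(pred_boxes, pred_logits,
                                           gt_boxes, gt_labels)
            optimizer.zero_grad()
            loss.backward()                                # back-propagation
            optimizer.step()
            if loss.item() < loss_threshold:
                return model
    return model
```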
In a specific implementation, the feature extraction module 610 may be a neural network backbone used to extract image features, such as VGG or ResNet, and the detection and matching module 620 may be a target detection network such as a YOLO network, an SSD network, or an RCNN, which is not specifically limited in this application.
It should be understood that, by marking the same target with the same label, this application can determine the first target area and the second target area according to whether the labels match after the first image and the second image are input into the detection and matching model, instead of performing image recognition on the target to determine the same target in the two images. This reduces computational complexity and improves the efficiency of obtaining the first and second target areas, thereby improving the efficiency of ranging and positioning.
S340: Perform feature point detection and matching on the first target area and the second target area to obtain a feature point matching result. The feature point matching result includes the correspondences between feature points in the first target area and feature points in the second target area, where corresponding feature points describe the same feature of the target to be detected. For example, if the target is a pedestrian whose feature points include the eyes, nose, and mouth, then the pedestrian's eyes in the first target area correspond to the pedestrian's eyes in the second target area.
In an embodiment, a feature point detection algorithm may be used to detect feature points in the first target area and the second target area. Since the target in both areas is the same target, each feature point in the first target area has a corresponding feature point in the second target area.
Optionally, the feature point detection and matching algorithm in the embodiments of this application may be FAST (features from accelerated segment test), BRIEF (binary robust independent elementary features), ORB (oriented FAST and rotated BRIEF, which combines FAST and BRIEF), SURF (speeded-up robust features), AKAZE (accelerated KAZE features), or the like, which is not specifically limited in this application.
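As an illustration of this step with one of the listed algorithms, the following sketch (assuming OpenCV and the two cropped target areas as inputs; the parameter values are illustrative) detects and matches ORB feature points between the two target areas:

```python
import cv2

def match_target_features(area1, area2, max_matches=50):
    """Detect ORB feature points in the two target areas and match them;
    returns the keypoints and the most reliable correspondences."""
    orb = cv2.ORB_create(nfeatures=500)
    kp1, des1 = orb.detectAndCompute(area1, None)
    kp2, des2 = orb.detectAndCompute(area2, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    # Sort by descriptor distance and keep the strongest correspondences.
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    return kp1, kp2, matches[:max_matches]
```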
Continuing the example described in the embodiments of FIGS. 7 to 9, the target to be detected is the vehicle with label ID:001, and the first target area and the second target area are as shown in FIG. 10, which is a schematic flowchart of feature point detection and matching in a target positioning method provided by this application. After feature point detection and matching are performed on the first target area and the second target area, the feature point matching result shown in FIG. 10 can be obtained. Illustratively, FIG. 10 shows part of the feature point matching result; each feature point detected in the first target area has a corresponding feature point in the second target area. It should be understood that FIG. 10 represents corresponding feature points with connecting lines; in a specific implementation, the feature point matching result may represent the correspondences between feature points in other ways. FIG. 10 is merely an example and is not specifically limited in this application.
S350: Determine the position information of the target according to the feature point matching result and the parameter information of the multi-view camera.
The parameter information of the multi-view camera includes at least the baseline length and the focal length of the multi-view camera, and may further include the geographic coordinate information of the multi-view camera. The position information of the target may include the distance between the target and the multi-view camera, and may further include the geographic coordinates of the target, which is not specifically limited in this application.
Specifically, the disparity information of the target can be obtained from the pixel differences between corresponding feature points in the feature point matching result; the disparity information includes the gaps between the pixel coordinates of the feature points in the first target area and the pixel coordinates of the corresponding feature points in the second target area. With reference to the embodiment of FIG. 1 and formula (3), the distance between the target and the multi-view camera can be determined from the disparity information, the baseline length b, and the focal length f, and the geographic coordinates of the target can be determined from the geographic coordinate information of the multi-view camera.
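For reference, the standard binocular relation presumably underlying formula (3) (stated here in its textbook form for a rectified pair; this is an assumption, since formula (3) itself appears earlier in the document) is:

```latex
% Depth of a point from its disparity in a rectified binocular pair:
% d = x_l - x_r is the disparity, b the baseline length, f the focal length.
Z = \frac{b \cdot f}{d}, \qquad d = x_l - x_r
```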
In a specific implementation, after the gap between the pixel coordinates of each feature point in the first target area and its pixel coordinates in the second target area is determined, a subset of the credible pixel differences may be taken as the disparity, or the average may be taken as the disparity information of the target, and the distance is then computed from this disparity information; this is not specifically limited in this application. For example, if the first target area includes feature points A1 and B1 of target X and the second target area includes feature points A2 and B2 of target X, where A1 and A2 are the same feature point and B1 and B2 are the same feature point, then after the pixel difference D1 between A1 and A2 and the pixel difference D2 between B1 and B2 are determined, the disparity of the target may be determined from the average of D1 and D2, from which the distance between the target and the binocular camera is obtained. It should be understood that this example is for illustration and is not specifically limited in this application.
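A minimal sketch of this computation follows (assuming rectified images, matched keypoints such as those returned by the sketch after step S340, a baseline `baseline_m` in meters, and a focal length `focal_px` in pixels; the interquartile filter is one illustrative way to keep "credible" differences, not a prescription of this application):

```python
import numpy as np

def estimate_distance(kp1, kp2, matches, baseline_m, focal_px):
    """Average credible per-feature disparities, then apply Z = b * f / d.

    Assumes rectified images and keypoint coordinates expressed in
    full-image pixels (crop offsets already added back).
    """
    # Column (x) coordinates of corresponding feature points (sub-pixel).
    x1 = np.array([kp1[m.queryIdx].pt[0] for m in matches])
    x2 = np.array([kp2[m.trainIdx].pt[0] for m in matches])
    disparities = x1 - x2
    disparities = disparities[disparities > 0]      # discard implausible values
    lo, hi = np.percentile(disparities, [25, 75])   # keep the credible middle
    credible = disparities[(disparities >= lo) & (disparities <= hi)]
    d = float(np.mean(credible))
    return baseline_m * focal_px / d                # distance in meters
```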
Illustratively, as shown in FIG. 11, which is a schematic diagram of a feature point matching result of a target positioning method provided by this application in one application scenario, take the practical scenario of measuring the distance between person Y and a binocular camera as an example. Using the target positioning method provided by this application, the target baseline may first be determined from the measurement accuracy requirement (for example, person Y is a short-range target and the measurement error must be within ±1 m) in combination with formulas (4) to (6). The target baseline is then sent to the multi-view camera 120, the first image and the second image captured by the camera group corresponding to the target baseline are obtained, and the two images are input into the detection and matching model shown in FIG. 9 to obtain the first and second detection and matching results shown in FIG. 11. From the target frames and labels in the two detection and matching results, the first target area and the second target area containing person Y are obtained; feature point detection and matching on these two areas yields the feature point matching result shown in FIG. 11; and from the pixel differences between the corresponding groups of feature points, the disparity of person Y is determined, locating person Y at a distance of 14.2 m from the camera. It should be understood that FIG. 11 is merely an example and is not specifically limited in this application.
It can be understood that, because the disparity is determined from the differences between feature points rather than from the differences between every pair of pixels in the first target area and the second target area, the amount of computation is reduced and the efficiency of disparity calculation is improved. Moreover, because a feature point may lie not only on a pixel but also between pixels, the precision of disparity determined by pixel matching is at the integer level, whereas the precision of disparity determined by feature point matching is at the sub-pixel (fractional) level. The disparity computed by feature point matching in this application is therefore more precise, which in turn makes the ranging and positioning more accurate.
The solution provided by this application can also improve the accuracy of disparity calculation for textureless objects, and thus the accuracy of their ranging and positioning. It can be understood that when a multi-view camera captures a textureless object, the pixel differences across the object are very small, so methods that determine the target disparity by comparing pixels between the images of different channels have poor accuracy. With the solution provided by this application, the first target area and the second target area containing the target are extracted, feature point matching is performed on them to obtain a feature point matching result, and the disparity is determined from that result, which improves the matching accuracy for textureless objects.
Illustratively, taking the practical scenario of measuring the distance between a textureless object Z and a binocular camera as an example, as shown in FIG. 12, which is a schematic diagram of a textureless object provided by this application, assume that the textureless object Z is a checkerboard placed 7.5 m from the binocular camera. A certain brand of binocular camera outputs a depth of 6.7 m for the checkerboard, whereas the solution provided by this application outputs a depth of 7.2 m. The solution provided by this application therefore computes disparity more accurately and achieves better ranging and positioning precision.
The solution provided by this application can also improve the accuracy of disparity calculation in occlusion scenarios, and thus the ranging and positioning accuracy for occluded objects. It can be understood that, because the pixels of an occluded object are covered and appear as pixels of the occluder, methods that determine the target disparity by comparing pixels between the images of different channels have poor accuracy. With the solution provided by this application, after target detection and matching are performed with the detection and matching model shown in FIG. 9, the position of the occluded object can be estimated and the occluded portion completed, yielding completed first and second target areas; feature point detection and matching are then performed on them to obtain a feature point matching result, the disparity information of the target is determined from that result, and the distance between the target and the multi-view camera is obtained. The disparity computed in this way is more accurate, and the ranging accuracy for occluded objects is correspondingly higher.
For example, as shown in FIG. 13, which is a schematic flowchart of determining the first target area and the second target area in an occlusion scenario provided by this application, assume that target 004 is not occluded by target 005 in the first target area but is occluded by target 005 in the second target area. If the disparity information of the target were determined directly from the pixel differences between the first target area and the second target area, the occlusion of target 004 in the right image would make the resulting disparity inaccurate and the ranging and positioning accuracy low. With the solution provided by this application, when disparity is computed for target 004 in this pair of target areas, the position of target 004 in the second target area can first be estimated, and feature point detection and matching are then performed to obtain the feature point matching result, from which the disparity of target 004 and then its ranging and positioning result are obtained. The solution provided by this application therefore computes disparity more accurately in occlusion scenarios.
In summary, this application provides a target positioning method. The method determines a target baseline according to the target to be detected, captures the target with the camera group corresponding to that baseline to obtain a first image and a second image, performs target detection and matching on the two images to obtain the first target area and the second target area containing the target, and finally performs feature point detection and matching on the two target areas to obtain a feature point matching result, from which the disparity information of each feature point, and hence the position information of the target, is determined. The method can flexibly select the camera group with the appropriate baseline for data collection according to the target to be detected, avoiding the limited ranging range of a fixed-baseline multi-view camera and extending the ranging range of the target positioning system. At the same time, because the position information of the target is determined from the disparity information of the feature points, there is no need to match and compute disparity for every pixel in the first and second image areas, which reduces the computing resources required for positioning and ranging while avoiding problems such as background interference and noise, improving the accuracy of ranging and positioning.
The methods of the embodiments of this application have been described in detail above. To facilitate better implementation of the above solutions, related devices for implementing them are provided below.
In this application, the target positioning system 110 may be divided into modules or units by function in multiple ways. For example, as shown in FIG. 2 above, the target positioning system 110 may include a baseline determination unit 111, a synchronization unit 112, and a detection and matching unit 113; for the functions of these modules, refer to the foregoing description, which is not repeated here. In another embodiment, the target positioning system 110 may be further divided into units by function; for example, FIG. 14 is a schematic structural diagram of another target positioning system 110 provided by this application.
As shown in FIG. 14, this application provides a target positioning system 110 that includes a baseline determination unit 1410, an acquisition unit 1420, a synchronization unit 1430, a detection and matching unit 1440, and a position determination unit 1450.
The acquisition unit 1420 is configured to acquire a first image and a second image, where the first image and the second image are obtained by a multi-view camera capturing the same target at the same moment.
The detection and matching unit 1440 is configured to perform target detection and matching on the first image and the second image to obtain a first target area of the first image and a second target area of the second image, where the first target area and the second target area include the target.
The detection and matching unit 1440 is further configured to perform feature point detection and matching on the first target area and the second target area to obtain a feature point matching result, where the feature point matching result includes the correspondences between feature points in the first target area and feature points in the second target area, and corresponding feature points describe the same feature of the target.
The position determination unit 1450 is configured to determine the position information of the target according to the feature point matching result and the parameter information of the multi-view camera.
In an embodiment, the parameter information includes at least the baseline length and the focal length of the multi-view camera. The position determination unit 1450 is configured to obtain the disparity information of the target from the pixel differences between corresponding feature points in the feature point matching result, where the disparity information includes the gaps between the pixel coordinates of the feature points in the first target area and the pixel coordinates of the corresponding feature points in the second target area; and to determine the distance between the target and the camera from the disparity information of the target, the baseline length of the multi-view camera, and the focal length of the multi-view camera, obtaining the position information of the target.
In an embodiment, the multi-view camera includes multiple camera groups, each of which includes multiple cameras. The baseline determination unit 1410 is configured to acquire baseline data of the multi-view camera, where the baseline data includes the baseline lengths between the cameras in each camera group, and to obtain a target baseline from the baseline data according to the measurement accuracy requirement of the target. The acquisition unit 1420 is configured to acquire the first image and the second image according to the target baseline, where the first image and the second image are captured by the camera group corresponding to the target baseline.
In an embodiment, the baseline determination unit 1410 is configured to send, to the multi-view camera, a baseline adjustment request carrying the target baseline, where the baseline adjustment request instructs the multi-view camera to adjust the baseline length of a camera group included in the multi-view camera to the target baseline; the acquisition unit 1420 is configured to receive the first image and the second image captured by the camera group corresponding to the target baseline.
In an embodiment, the baseline determination unit 1410 is configured to determine a first precision index and a second precision index for each camera group, where the first precision index is inversely proportional to the baseline length of the camera group and directly proportional to the group's common-view area, the second precision index is directly proportional to the baseline length and the focal length of the camera group, and the common-view area is the area jointly captured by the multiple cameras in the group; to determine the weights of the first and second precision indices according to the measurement accuracy requirement of the target; to obtain a composite index for each camera group from the first precision index, the second precision index, and the weights; and to determine the target baseline according to the composite index of each camera group.
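The following sketch illustrates one way such a weighted composite index could be evaluated (the exact forms of the two indices are given by formulas (4) to (6) earlier in the document; the simple proportional forms used here, and all names, are illustrative assumptions):

```python
def select_target_baseline(groups, w1, w2):
    """Pick the baseline of the camera group with the best composite index.

    groups: list of dicts with 'baseline', 'focal', 'common_view_area'.
    w1, w2: weights derived from the measurement accuracy requirement.
    """
    def composite(g):
        # First index: inversely proportional to baseline, proportional to
        # common-view area (favors coverage for short-range targets).
        idx1 = g["common_view_area"] / g["baseline"]
        # Second index: proportional to baseline * focal length
        # (favors depth resolution for long-range targets).
        idx2 = g["baseline"] * g["focal"]
        return w1 * idx1 + w2 * idx2

    best = max(groups, key=composite)
    return best["baseline"]
```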
In an embodiment, the synchronization unit 1430 is configured to receive the first video and the second video obtained by the multi-view camera capturing the target, and to perform time synchronization on the first video and the second video to obtain the first image and the second image at the same moment, where the first image is an image frame in the first video and the second image is an image frame in the second video.
In an embodiment, the synchronization unit 1430 is configured to obtain a reference frame from the first video and multiple motion frames from the second video, where the reference frame and the motion frames include a moving object; to perform feature point matching between the reference frame and the motion frames to obtain a synchronized frame among the motion frames, where the parallelism of the lines connecting the feature points in the synchronized frame and the corresponding feature points in the reference frame satisfies a preset condition; and to perform time synchronization correction on the first video and the second video according to the reference frame and the synchronized frame, obtaining the first image and the second image at the same moment.
In an embodiment, the detection and matching unit 1440 is configured to input the first image into a detection and matching model to obtain a first detection and matching result of the first image, and to input the second image into the detection and matching model to obtain a second detection and matching result of the second image, where the first and second detection and matching results include target frames and labels, the target frame indicates the area of the target in the image, and the same target has the same label; and to obtain the first target area according to the first detection and matching result and the second target area according to the second detection and matching result.
It should be understood that the unit modules inside the target positioning system 110 may also be divided in multiple ways, and each module may be a software module, a hardware module, or partly software and partly hardware, which is not limited in this application. FIG. 2 and FIG. 14 are both exemplary divisions. For example, in some feasible solutions the acquisition unit 1420 in FIG. 14 may be omitted; in others the position determination unit 1450 in FIG. 14 may be omitted; and in still others the detection and matching unit 1440 in FIG. 14 may be further divided into multiple modules, such as an image detection and matching module for obtaining the first and second target areas and a feature point detection module for obtaining the feature point matching result, which is not limited in this application.
In summary, this application provides a target positioning system. The system determines a target baseline according to the target to be detected, captures the target with the camera group corresponding to that baseline to obtain a first image and a second image, performs target detection and matching on the two images to obtain the first target area and the second target area containing the target, and finally performs feature point detection and matching on the two target areas to obtain a feature point matching result, from which the disparity information of each feature point, and hence the position information of the target, is determined. The system can flexibly select the camera group with the appropriate baseline for data collection according to the target to be detected, avoiding the limited ranging range of a fixed-baseline multi-view camera and extending the ranging range of the target positioning system. At the same time, because the position information of the target is determined from the disparity information of the feature points, there is no need to match and compute disparity for every pixel in the first and second image areas, which reduces the computing resources required for positioning and ranging while avoiding problems such as background interference and noise, improving the accuracy of ranging and positioning.
FIG. 15 is a schematic structural diagram of a computing device 900 provided by this application; the computing device 900 may be the target positioning system 110 described above. As shown in FIG. 15, the computing device 900 includes a processor 910, a communication interface 920, and a memory 930, which may be connected to each other through an internal bus 940 or may communicate by other means such as wireless transmission. The embodiments of this application take connection through the bus 940 as an example; the bus 940 may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like, and may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is shown in FIG. 15, but this does not mean that there is only one bus or one type of bus.
The processor 910 may consist of at least one general-purpose processor, for example a central processing unit (CPU), or a combination of a CPU and a hardware chip. The hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof. The processor 910 executes various types of digitally stored instructions, such as software or firmware programs stored in the memory 930, which enable the computing device 900 to provide a variety of services.
The memory 930 stores program code whose execution is controlled by the processor 910 so as to perform the processing steps of the target positioning system in the above embodiments. The program code may include one or more software modules, which may be the software modules provided in the embodiment of FIG. 14, such as the acquisition unit, the detection and matching unit, and the position determination unit: the acquisition unit acquires the first image and the second image; the detection and matching unit inputs the first image and the second image into the detection and matching model to obtain the first target area and the second target area, and then performs feature point detection and matching on the two target areas to obtain the feature point matching result; and the position determination unit determines the position information of the target according to the feature point matching result and the parameter information of the multi-view camera. The code may specifically be used to execute steps S310 to S350 of the embodiment of FIG. 6 and their optional steps, and may also be used to implement the other functions of the target positioning system 110 described in the embodiments of FIGS. 1 to 13, which are not repeated here.
It should be noted that this embodiment may be implemented by a general-purpose physical server, for example an ARM server or an X86 server, or by a virtual machine implemented on a general-purpose physical server in combination with NFV technology, a virtual machine being a complete software-simulated computer system with complete hardware system functions that runs in a fully isolated environment; this application is not specifically limited in this regard.
The memory 930 may include volatile memory, for example random access memory (RAM); it may also include non-volatile memory, for example read-only memory (ROM), flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); and it may also include combinations of the above. The memory 930 may store program code, which may specifically include code for executing the other steps described in the embodiments of FIGS. 1 to 13, not repeated here.
The communication interface 920 may be a wired interface (for example an Ethernet interface), an internal interface (for example a peripheral component interconnect express (PCIe) bus interface), or a wireless interface (for example a cellular network interface or a wireless local area network interface), used to communicate with other devices or modules.
It should be noted that FIG. 15 is only one possible implementation of the embodiments of this application; in practical applications, the computing device 900 may include more or fewer components, which is not limited here. For content not shown or described in the embodiments of this application, refer to the related descriptions in the embodiments of FIGS. 1 to 13, which are not repeated here.
It should be understood that the computing device shown in FIG. 15 may also be a computer cluster composed of at least one server, which is not specifically limited in this application.
The embodiments of this application further provide a computer-readable storage medium storing instructions that, when run on a processor, implement the method flows shown in FIGS. 1 to 13.
The embodiments of this application further provide a computer program product that, when run on a processor, implements the method flows shown in FIGS. 1 to 13.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented by software, the above embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded or executed on a computer, the processes or functions described in the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another by wired means (for example coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (for example infrared, radio, or microwave). The computer-readable storage medium may be any usable medium accessible to a computer, or a data storage device such as a server or data center containing one or more sets of usable media. The usable medium may be a magnetic medium (for example a floppy disk, hard disk, or magnetic tape), an optical medium (for example a high-density digital video disc (DVD)), or a semiconductor medium; the semiconductor medium may be an SSD.
The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any person skilled in the art can readily conceive of various equivalent modifications or substitutions within the technical scope disclosed by the present invention, and such modifications or substitutions shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (19)

1. A target positioning method, characterized in that the method comprises:
acquiring a first image and a second image, wherein the first image and the second image are obtained by a multi-view camera photographing the same target at the same moment;
performing target detection and matching on the first image and the second image to obtain a first target area of the first image and a second target area of the second image, wherein the first target area and the second target area comprise the target;
performing feature point detection and matching on the first target area and the second target area to obtain a feature point matching result, wherein the feature point matching result comprises correspondences between feature points in the first target area and feature points in the second target area, and feature points for which the correspondence exists describe the same feature of the target;
determining position information of the target according to the feature point matching result and parameter information of the multi-view camera.
2. The method according to claim 1, characterized in that the parameter information comprises at least a baseline length of the multi-view camera and a focal length of the multi-view camera;
the determining position information of the target according to the feature point matching result and the parameter information of the multi-view camera comprises:
obtaining disparity information of the target according to pixel differences between corresponding feature points in the feature point matching result, wherein the disparity information comprises gaps between pixel coordinates of feature points in the first target area and pixel coordinates of the corresponding feature points in the second target area;
determining a distance between the target and the camera according to the disparity information of the target, the baseline length of the multi-view camera, and the focal length of the multi-view camera, to obtain the position information of the target.
3. The method according to claim 1 or 2, characterized in that the multi-view camera comprises a plurality of camera groups, each of the camera groups comprises a plurality of cameras, and the acquiring a first image and a second image comprises:
acquiring baseline data of the multi-view camera, wherein the baseline data comprises baseline lengths between the plurality of cameras in each camera group;
obtaining a target baseline from the baseline data according to a measurement accuracy requirement of the target;
acquiring the first image and the second image according to the target baseline, wherein the first image and the second image are captured by the camera group corresponding to the target baseline.
4. The method according to any one of claims 1 to 3, characterized in that the acquiring a first image and a second image comprises:
sending, to the multi-view camera, a baseline adjustment request carrying a target baseline, wherein the baseline adjustment request is used to instruct the multi-view camera to adjust the baseline length of a camera group comprised in the multi-view camera to the target baseline;
receiving the first image and the second image captured by the camera group corresponding to the target baseline.
5. The method according to claim 3 or 4, characterized in that the obtaining a target baseline from the baseline data according to the measurement accuracy requirement of the target comprises:
determining a first precision index and a second precision index of each camera group, wherein the first precision index is inversely proportional to the baseline length of the camera group and directly proportional to the common-view area of the camera group, the second precision index is directly proportional to the baseline length and the focal length of the camera group, and the common-view area is the area jointly captured by the plurality of cameras in the camera group;
determining weights of the first precision index and the second precision index according to the measurement accuracy requirement of the target;
obtaining a composite index of each camera group according to the first precision index, the second precision index, and the weights;
determining the target baseline according to the composite index of each camera group.
6. The method according to any one of claims 1 to 5, characterized in that the acquiring a first image and a second image comprises:
receiving a first video and a second video obtained by the multi-view camera photographing the target;
performing time synchronization processing on the first video and the second video to obtain the first image and the second image at the same moment, wherein the first image is an image frame in the first video and the second image is an image frame in the second video.
  7. 根据权利要求6所述的方法,其特征在于,所述对所述第一路视频和第二路视频进行时间同步处理,获得同一时刻的所述第一图像和所述第二图像包括:The method according to claim 6, wherein the performing time synchronization processing on the first channel of video and the second channel of video to obtain the first image and the second image at the same moment comprises:
    从所述第一路视频中获取参考帧,从所述第二路视频中获取多个运动帧,其中,所述参考帧和所述多个运动帧中包括运动物体;Obtain a reference frame from the first video, and obtain a plurality of motion frames from the second video, wherein the reference frame and the plurality of motion frames include moving objects;
    将所述参考帧与所述多个运动帧进行特征点匹配,获得所述多个运动帧中的同步帧,其中,所述同步帧中的特征点与所述参考帧中对应的特征点之间连线的平行度满足预设条件;Perform feature point matching on the reference frame and the plurality of motion frames to obtain a synchronization frame in the plurality of motion frames, wherein the feature point in the synchronization frame and the corresponding feature point in the reference frame are determined. The parallelism of the connecting lines satisfies the preset condition;
    根据所述参考帧和所述同步帧对所述第一路视频和所述第二路视频进行时间同步校正,获得同一时刻的所述第一图像和所述第二图像。Time synchronization correction is performed on the first channel of video and the second channel of video according to the reference frame and the synchronization frame to obtain the first image and the second image at the same moment.
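For illustration, the synchronization test of claims 6 and 7 exploits the geometry of a roughly rectified pair: if two frames were captured at the same instant, the lines joining matched feature points across the views are nearly parallel, whereas a time offset on a moving object skews them. A hypothetical sketch using OpenCV ORB features follows; the parallelism measure (low spread of connecting-line angles) and the threshold stand in for the preset condition, which the application does not spell out here.

```python
import cv2
import numpy as np

def find_sync_frame(reference, motion_frames, max_angle_std_deg=2.0):
    """Return the index of the motion frame best time-aligned with the
    reference frame, judged by how parallel the lines between matched
    feature points are; None if no frame meets the threshold."""
    orb = cv2.ORB_create()
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    kp_r, des_r = orb.detectAndCompute(reference, None)

    best_idx, best_std = None, float("inf")
    for idx, frame in enumerate(motion_frames):
        kp_f, des_f = orb.detectAndCompute(frame, None)
        if des_r is None or des_f is None:
            continue
        matches = matcher.match(des_r, des_f)
        if len(matches) < 8:
            continue
        # Angle of the line joining each matched pair of points; for a
        # synchronized, near-rectified pair these cluster tightly.
        angles = []
        for m in matches:
            x1, y1 = kp_r[m.queryIdx].pt
            x2, y2 = kp_f[m.trainIdx].pt
            angles.append(np.degrees(np.arctan2(y2 - y1, x2 - x1)))
        spread = float(np.std(angles))
        if spread < best_std:
            best_idx, best_std = idx, spread

    return best_idx if best_std <= max_angle_std_deg else None
```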
  8. The method according to any one of claims 1 to 7, wherein the performing target detection and matching on the first image and the second image to obtain a first target area of the first image and a second target area of the second image comprises:
    inputting the first image into a detection and matching model to obtain a first detection and matching result of the first image, and inputting the second image into the detection and matching model to obtain a second detection and matching result of the second image, wherein the first detection and matching result and the second detection and matching result include target boxes and labels, a target box is used to indicate the area of a target in an image, and the labels of the same target are identical;
    obtaining the first target area according to the first detection and matching result, and obtaining the second target area according to the second detection and matching result.
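For illustration, because the detection and matching model of claim 8 gives the same physical target the same label in both views, pairing the two target areas reduces to a lookup by label. A minimal sketch with an assumed result format; the application does not specify how the model serializes its output.

```python
def pair_target_areas(first_results, second_results):
    """Pair target boxes across the two views by their shared label.
    Each result is assumed to look like
    {"label": "car_03", "box": (x, y, w, h)} -- hypothetical fields."""
    boxes_by_label = {r["label"]: r["box"] for r in second_results}
    pairs = []
    for r in first_results:
        box2 = boxes_by_label.get(r["label"])
        if box2 is not None:
            # (first target area, second target area) for one target
            pairs.append((r["box"], box2))
    return pairs
```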
  9. A target positioning system, characterized in that the system comprises:
    an acquisition unit, configured to acquire a first image and a second image, wherein the first image and the second image are obtained by a multi-eye camera photographing the same target at the same moment;
    a detection and matching unit, configured to perform target detection and matching on the first image and the second image to obtain a first target area of the first image and a second target area of the second image, wherein the first target area and the second target area include the target;
    the detection and matching unit being further configured to perform feature point detection and matching on the first target area and the second target area to obtain a feature point matching result, wherein the feature point matching result includes correspondences between the feature points in the first target area and the feature points in the second target area, and the feature points having a correspondence describe the same feature of the target; and
    a position determination unit, configured to determine the position information of the target according to the feature point matching result and the parameter information of the multi-eye camera.
  10. The system according to claim 9, wherein the parameter information includes at least the baseline length of the multi-eye camera and the focal length of the multi-eye camera;
    the position determination unit is configured to obtain parallax information of the target according to the pixel differences between the feature points having a correspondence in the feature point matching result, wherein the parallax information includes the difference between the pixel coordinates of a feature point in the first target area and the pixel coordinates of the corresponding feature point in the second target area; and
    the position determination unit is configured to determine the distance between the target and the multi-eye camera according to the parallax information of the target, the baseline length of the multi-eye camera and the focal length of the multi-eye camera, to obtain the position information of the target.
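For illustration, claim 10 states the standard triangulation relation for a rectified camera pair: with focal length f in pixels, baseline length B and disparity d between corresponding feature points, the distance is Z = f·B/d. A minimal sketch, assuming rectified images and pixel-unit disparity:

```python
def depth_from_disparity(disparity_px, baseline_m, focal_length_px):
    """Distance along the optical axis for a rectified pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px

# Example: a 0.5 m baseline, 1200 px focal length and 30 px disparity
# place the target 20 m from the camera.
assert depth_from_disparity(30, 0.5, 1200) == 20.0
```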
  11. The system according to claim 9 or 10, wherein the multi-eye camera includes a plurality of camera groups, each of the plurality of camera groups includes a plurality of cameras, and the system further comprises a baseline determination unit;
    the baseline determination unit is configured to acquire baseline data of the multi-eye camera, wherein the baseline data includes the baseline lengths between the plurality of cameras in each group of cameras;
    the baseline determination unit is configured to obtain a target baseline from the baseline data according to the measurement accuracy requirement of the target; and
    the acquisition unit is configured to acquire the first image and the second image according to the target baseline, wherein the first image and the second image are captured by the camera group corresponding to the target baseline.
  12. The system according to any one of claims 9 to 11, wherein:
    the baseline determination unit is configured to send a baseline adjustment request carrying the target baseline to the multi-eye camera, wherein the baseline adjustment request is used to instruct the multi-eye camera to adjust the baseline length of the camera group comprised in the multi-eye camera to the target baseline; and
    the acquisition unit is configured to receive the first image and the second image captured by the camera group corresponding to the target baseline.
  13. The system according to claim 11 or 12, wherein:
    the baseline determination unit is configured to determine a first accuracy index and a second accuracy index of each group of cameras, wherein the first accuracy index is inversely proportional to the baseline length of each group of cameras and directly proportional to the common viewing area of each group of cameras, the second accuracy index is directly proportional to the baseline length and the focal length of each group of cameras, and the common viewing area is the area jointly captured by the plurality of cameras in each group of cameras;
    the baseline determination unit is configured to determine the weights of the first accuracy index and the second accuracy index according to the measurement accuracy requirement of the target;
    the baseline determination unit is configured to obtain a composite index of each group of cameras according to the first accuracy index, the second accuracy index and the weights; and
    the baseline determination unit is configured to determine the target baseline according to the composite index of each group of cameras.
  14. The system according to any one of claims 9 to 13, wherein the system further comprises a synchronization unit;
    the synchronization unit is configured to receive a first video stream and a second video stream obtained by the multi-eye camera photographing the target; and
    the synchronization unit is configured to perform time synchronization processing on the first video stream and the second video stream to obtain the first image and the second image at the same moment, wherein the first image is an image frame in the first video stream, and the second image is an image frame in the second video stream.
  15. The system according to claim 14, wherein:
    the synchronization unit is configured to obtain a reference frame from the first video stream, and obtain a plurality of motion frames from the second video stream, wherein the reference frame and the plurality of motion frames include a moving object;
    the synchronization unit is configured to perform feature point matching between the reference frame and the plurality of motion frames to obtain a synchronization frame among the plurality of motion frames, wherein the parallelism of the lines connecting the feature points in the synchronization frame with the corresponding feature points in the reference frame satisfies a preset condition; and
    the synchronization unit is configured to perform time synchronization correction on the first video stream and the second video stream according to the reference frame and the synchronization frame to obtain the first image and the second image at the same moment.
  16. The system according to any one of claims 9 to 15, wherein:
    the detection and matching unit is configured to input the first image into a detection and matching model to obtain a first detection and matching result of the first image, and to input the second image into the detection and matching model to obtain a second detection and matching result of the second image, wherein the first detection and matching result and the second detection and matching result include target boxes and labels, a target box is used to indicate the area of a target in an image, and the labels of the same target are identical; and
    the detection and matching unit is configured to obtain the first target area according to the first detection and matching result, and to obtain the second target area according to the second detection and matching result.
  17. A computer-readable storage medium, characterized by comprising instructions which, when run on a computing device, cause the computing device to perform the method according to any one of claims 1 to 8.
  18. A computing device, characterized by comprising a processor and a memory, wherein the processor executes code in the memory to perform the method according to any one of claims 1 to 8.
  19. A computer program product, characterized by comprising a computer program which, when read and executed by a computing device, causes the computing device to perform the method according to any one of claims 1 to 8.
PCT/CN2021/139421 2020-12-31 2021-12-18 Target positioning method and system, and related device WO2022143237A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202011638235.5 2020-12-31
CN202011638235 2020-12-31
CN202110567480.XA CN114693785A (en) 2020-12-31 2021-05-24 Target positioning method, system and related equipment
CN202110567480.X 2021-05-24

Publications (1)

Publication Number Publication Date
WO2022143237A1 (en)

Family

ID=82136525

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/139421 WO2022143237A1 (en) 2020-12-31 2021-12-18 Target positioning method and system, and related device

Country Status (2)

Country Link
CN (1) CN114693785A (en)
WO (1) WO2022143237A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117409340B (en) * 2023-12-14 2024-03-22 上海海事大学 Unmanned aerial vehicle cluster multi-view fusion aerial photography port monitoring method, system and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109579868A (en) * 2018-12-11 2019-04-05 上海元城汽车技术有限公司 The outer object localization method of vehicle, device and automobile
CN110322702A (en) * 2019-07-08 2019-10-11 中原工学院 A kind of Vehicular intelligent speed-measuring method based on Binocular Stereo Vision System
CN110349221A (en) * 2019-07-16 2019-10-18 北京航空航天大学 A kind of three-dimensional laser radar merges scaling method with binocular visible light sensor
US20200134377A1 (en) * 2018-10-25 2020-04-30 Adobe Systems Incorporated Logo detection

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115424353A (en) * 2022-09-07 2022-12-02 杭银消费金融股份有限公司 AI model-based service user feature identification method and system
CN115424353B (en) * 2022-09-07 2023-05-05 杭银消费金融股份有限公司 Service user characteristic identification method and system based on AI model
WO2024060981A1 (en) * 2022-09-20 2024-03-28 深圳市其域创新科技有限公司 Three-dimensional mesh optimization method, device, and storage medium
CN116819229A (en) * 2023-06-26 2023-09-29 广东电网有限责任公司 Distance measurement method, device, equipment and storage medium for power transmission line
CN117315033A (en) * 2023-11-29 2023-12-29 上海仙工智能科技有限公司 Neural network-based identification positioning method and system and storage medium
CN117315033B (en) * 2023-11-29 2024-03-19 上海仙工智能科技有限公司 Neural network-based identification positioning method and system and storage medium
CN117455940A (en) * 2023-12-25 2024-01-26 四川汉唐云分布式存储技术有限公司 Cloud-based customer behavior detection method, system, equipment and storage medium
CN117455940B (en) * 2023-12-25 2024-02-27 四川汉唐云分布式存储技术有限公司 Cloud-based customer behavior detection method, system, equipment and storage medium
CN117876608A (en) * 2024-03-11 2024-04-12 魔视智能科技(武汉)有限公司 Three-dimensional image reconstruction method, three-dimensional image reconstruction device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN114693785A (en) 2022-07-01

Similar Documents

Publication Publication Date Title
WO2022143237A1 (en) Target positioning method and system, and related device
CN110349213B (en) Pose determining method and device based on depth information, medium and electronic equipment
JP6484729B2 (en) Unmanned aircraft depth image acquisition method, acquisition device, and unmanned aircraft
US11328479B2 (en) Reconstruction method, reconstruction device, and generation device
CN112384891B (en) Method and system for point cloud coloring
CN102997891B (en) Device and method for measuring scene depth
US8463024B1 (en) Combining narrow-baseline and wide-baseline stereo for three-dimensional modeling
CN103903263B (en) A kind of 360 degrees omnidirection distance-finding method based on Ladybug panorama camera image
CN113711276A (en) Scale-aware monocular positioning and mapping
CN115376109B (en) Obstacle detection method, obstacle detection device, and storage medium
CN115797408A (en) Target tracking method and device fusing multi-view image and three-dimensional point cloud
CN112950717A (en) Space calibration method and system
CN111882655B (en) Method, device, system, computer equipment and storage medium for three-dimensional reconstruction
WO2023083256A1 (en) Pose display method and apparatus, and system, server and storage medium
Schraml et al. An event-driven stereo system for real-time 3-D 360 panoramic vision
CN116468786A (en) Semantic SLAM method based on point-line combination and oriented to dynamic environment
CN115019208A (en) Road surface three-dimensional reconstruction method and system for dynamic traffic scene
CN113379801B (en) High-altitude parabolic monitoring and positioning method based on machine vision
CN113608234A (en) City data acquisition system
CN115937810A (en) Sensor fusion method based on binocular camera guidance
CN113895482B (en) Train speed measuring method and device based on trackside equipment
CN114782496A (en) Object tracking method and device, storage medium and electronic device
CN112215036B (en) Cross-mirror tracking method, device, equipment and storage medium
CN115761558A (en) Method and device for determining key frame in visual positioning
CN107610170B (en) Multi-view image refocusing depth acquisition method and system

Legal Events

Code 121 (EP): the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 21913978; country of ref document: EP; kind code of ref document: A1.

Code NENP: non-entry into the national phase. Ref country code: DE.

Code 122 (EP): PCT application non-entry in the European phase. Ref document number: 21913978; country of ref document: EP; kind code of ref document: A1.