CN111798496B - Visual locking method and device - Google Patents

Visual locking method and device

Info

Publication number
CN111798496B
CN111798496B
Authority
CN
China
Prior art keywords
tracking target
camera
module
deep learning
right camera
Prior art date
Legal status
Active
Application number
CN202010542145.XA
Other languages
Chinese (zh)
Other versions
CN111798496A (en)
Inventor
熊明磊
陈龙冬
李鑫海
Current Assignee
Boya Gongdao Beijing Robot Technology Co Ltd
Original Assignee
Boya Gongdao Beijing Robot Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Boya Gongdao Beijing Robot Technology Co Ltd filed Critical Boya Gongdao Beijing Robot Technology Co Ltd
Priority to CN202010542145.XA priority Critical patent/CN111798496B/en
Publication of CN111798496A publication Critical patent/CN111798496A/en
Application granted granted Critical
Publication of CN111798496B publication Critical patent/CN111798496B/en

Classifications

    • G06T7/292 Multi-camera tracking (G06T7/00 Image analysis; G06T7/20 Analysis of motion)
    • G06N3/045 Combinations of networks (G06N3/00 Computing arrangements based on biological models; G06N3/02 Neural networks; G06N3/04 Architecture, e.g. interconnection topology)
    • G06N3/08 Learning methods
    • G06T7/579 Depth or shape recovery from multiple images from motion (G06T7/50 Depth or shape recovery; G06T7/55 Depth or shape recovery from multiple images)
    • G06T7/74 Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T2207/10016 Video; Image sequence (G06T2207/10 Image acquisition modality)
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G06T2207/20081 Training; Learning (G06T2207/20 Special algorithmic details)
    • G06T2207/20084 Artificial neural networks [ANN]

Abstract

The invention discloses a visual locking method and device, relates to the technical field of underwater robots, and solves the technical problem that tracking methods in the prior art operate only on two-dimensional images and therefore track poorly. The visual locking method of the invention keeps an unmanned underwater vehicle in six-axis synchronization with a tracking target by acquiring three-dimensional position data of the tracking target and controlling the motion of the unmanned underwater vehicle in real time based on the acquired three-dimensional position data. Under disturbances such as water-current fluctuation, the method can automatically feed back and adjust the position of the unmanned submersible in real time so that it stays synchronized with the tracked target, which makes it convenient for the unmanned submersible to photograph or grasp the tracked target. In other words, the visual locking method of the present invention improves the tracking performance of the unmanned underwater vehicle.

Description

Visual locking method and device
Technical Field
The invention relates to the technical field of underwater robots, in particular to a visual locking method and device.
Background
The ocean covers 71 percent of the Earth's surface and has a volume of about 1.4 billion cubic kilometers; the seabed and the water column contain extremely rich biological and mineral resources, and seabed exploration, much like space exploration, is both highly attractive and highly challenging. The unmanned submersible and its supporting facilities are the product of many modern high technologies and their system integration, and are of special significance to China's marine economy, marine industry, marine development and marine high technology.
Existing unmanned submersibles generally estimate the position of a moving object using correlation filtering (such as KCF) or feature-matching optical flow. These methods operate only on the two-dimensional image, provide only two degrees of freedom, and track extremely poorly under occlusion, motion blur, and the "fog" produced by underwater light absorption and scattering. Therefore, in order to keep the vehicle synchronized with the target, providing a multi-dimensional and precise visual locking method and apparatus is a technical problem to be solved by those skilled in the art.
Disclosure of Invention
One of the purposes of the present invention is to provide a visual locking method and apparatus that solve the technical problem that tracking methods in the prior art operate only on two-dimensional images and track poorly. The various technical effects that the preferred technical solutions of the present invention can produce are described in detail below.
In order to achieve this purpose, the invention provides the following technical solution:
According to the visual locking method of the invention, the unmanned submersible and the tracking target keep six-axis synchronization by acquiring the three-dimensional position data of the tracking target and controlling the motion of the unmanned submersible in real time based on the acquired three-dimensional position data.
According to a preferred embodiment, the visual locking method comprises the following steps:
S1: initializing the equipment to complete the correction of the binocular camera;
S2: acquiring the video stream of the binocular camera, and pushing the acquired video stream to the upper computer and the deep learning module;
S3: selecting a tracking target, and pushing the tracking target image to the deep learning module;
S4: the deep learning module acquires three-dimensional position data of the tracking target in the left camera and/or the right camera based on the acquired binocular video stream and the tracking target image;
S5: the control module judges whether the tracking target is in the left camera and/or the right camera based on the received three-dimensional position data, and controls the motion mode of the unmanned submersible based on the judgment result, so that the unmanned submersible and the tracking target keep six-axis synchronization.
According to a preferred embodiment, in step S1, the correction of the binocular camera is accomplished as follows: after the equipment is initialized, the intrinsic parameters, extrinsic parameters and distortion parameters of the binocular camera are obtained; distortion correction of the binocular camera is completed based on the obtained intrinsic parameters and distortion parameters; and the positional relationship between the two cameras is determined based on the acquired extrinsic parameters.
According to a preferred embodiment, in step S3, the tracking target is selected as follows: a tracking target is manually selected through the upper computer, and/or the tracking target is automatically detected through a target detection system.
According to a preferred embodiment, in step S4, the three-dimensional position data includes an X value, a Y value, and a depth value of the tracking target.
According to a preferred embodiment, the depth value is depth information of the tracking target with respect to the left camera and/or the right camera.
According to a preferred embodiment, in step S4, the deep learning module acquires three-dimensional position data of the tracking target relative to the left camera and/or the right camera by:
S41: extracting backbone features from the tracking target image and the left and right camera views using a residual network;
S42: cascading the feature maps extracted at different depth positions from the tracking target image and the left and/or right camera view, feeding them into a multi-stage twin-network region proposal network, and regressing the predicted position and category of the tracking target in the left and/or right camera view; wherein the predicted position of the tracking target is the X value and the Y value of the tracking target;
S43: calculating a matching cost with a twin network based on the feature maps of the left and right camera views at different depths, restoring the feature maps to the original image size with a convolutional neural network, then optimizing the disparity map in multiple stages with several autoencoders connected in series, and obtaining the disparity diff of the left camera and/or the right camera based on the optimized disparity map;
S44: calculating the depth D of the tracking target relative to the left camera and/or the right camera from the binocular camera baseline B, the focal length F of the left camera and/or the right camera, and the disparity diff, where D = F × B / diff.
According to a preferred embodiment, in step S5, the control module determines whether the tracking target is in the left camera and/or the right camera by: comparing the detected X value and Y value of the tracking target with the visual field range of the left camera and/or the right camera, wherein the visual field range of the left camera and/or the right camera is (width/2 - width/10, width/2 + width/10), (height/2 - height/10, height/2 + height/10), and width and height are respectively the width and height of the image resolution.
According to a preferred embodiment, the control module controls the mode of motion of the unmanned vehicle by: the control module calculates a yaw angle, a pitch angle, and/or a velocity of the unmanned submersible based on a distance between the tracked target and the unmanned submersible, and controls a mode of motion of the unmanned submersible based on the obtained yaw angle, pitch angle, and/or velocity.
The visual locking device of the invention keeps the unmanned submersible and the tracking target in six-axis synchronization by using the visual locking method of any one of the above technical solutions of the invention, and comprises an initialization module, a data acquisition module, a tracking target selection module, a deep learning module and a control module, wherein,
the initialization module is used for initializing equipment and finishing the correction of the binocular camera;
the data acquisition module and the tracking target selection module are connected with the deep learning module, wherein the data acquisition module is used for acquiring a video stream of a binocular camera and pushing the acquired video stream to the deep learning module, the tracking target selection module is used for selecting a tracking target and pushing a tracking target image to the deep learning module, and the deep learning module acquires three-dimensional position data of the tracking target in the left camera and/or the right camera based on the acquired binocular video stream and the tracking target image;
the deep learning module is connected with the control module, the control module is connected with the unmanned submersible, the deep learning module transmits acquired three-dimensional position data of the tracking target in the left camera and/or the right camera to the control module in real time, the control module judges whether the tracking target is in the left camera and/or the right camera or not based on the received three-dimensional position data, and controls the motion mode of the unmanned submersible based on the judgment result, so that the unmanned submersible and the tracking target keep six-axis synchronization.
The visual locking method and the visual locking device provided by the invention at least have the following beneficial technical effects:
according to the vision locking method, the three-dimensional position data of the tracked target is acquired, the running mode of the unmanned submersible is controlled in real time based on the acquired three-dimensional position data, the unmanned submersible and the tracked target keep six-axis synchronization, the position of the unmanned submersible can be automatically fed back and adjusted in real time under the conditions of water flow interference fluctuation and the like, the unmanned submersible and the tracked target keep synchronous, and the unmanned submersible can conveniently carry out shooting or grabbing and other operations on the tracked target.
The vision locking method of the invention ensures that the unmanned submersible and the tracked target keep six-axis synchronization by acquiring the three-dimensional position data of the tracked target and controlling the running mode of the unmanned submersible in real time based on the acquired three-dimensional position data, can improve the tracking effect of the unmanned submersible, and solves the technical problem that the tracking method in the prior art only stays on a two-dimensional image and has poor tracking effect.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic diagram of the steps of a preferred embodiment of the visual locking method of the present invention;
FIG. 2 is a schematic diagram illustrating the steps of the deep learning module for obtaining three-dimensional position data of a tracked target according to an embodiment of the present invention;
fig. 3 is a schematic view of a preferred embodiment of the visual locking apparatus of the present invention.
In the figures: 1. initialization module; 2. data acquisition module; 3. tracking target selection module; 4. deep learning module; 5. control module.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be described in detail below. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the examples given herein without any inventive step, are within the scope of the present invention.
The visual locking method and apparatus of the present embodiment will be described in detail with reference to fig. 1 to 3.
According to the visual locking method, the unmanned underwater vehicle and the tracking target keep six-axis synchronization by acquiring the three-dimensional position data of the tracking target and controlling the operation mode of the unmanned underwater vehicle in real time based on the acquired three-dimensional position data. Preferably, the six axes in this embodiment are the positive X-axis direction, the negative X-axis direction, the positive Y-axis direction, the negative Y-axis direction, the positive Z-axis direction, and the negative Z-axis direction. Specifically, the establishment of the XYZ coordinate system is the same as that in the prior art, and is not described herein again.
According to the visual locking method, three-dimensional position data of the tracked target is acquired and the motion of the unmanned submersible is controlled in real time based on the acquired data, so that the unmanned submersible and the tracked target keep six-axis synchronization. Under disturbances such as water-current fluctuation, the position of the unmanned submersible can be automatically fed back and adjusted in real time so that it stays synchronized with the tracked target, which makes it convenient for the unmanned submersible to photograph or grasp the target. In other words, the visual locking method of this embodiment improves the tracking performance of the unmanned underwater vehicle and solves the technical problem that tracking methods in the prior art operate only on two-dimensional images and track poorly.
As shown in fig. 1, the visual locking method of the preferred embodiment of the present invention includes the following steps:
s1: and initializing equipment to finish the correction of the binocular camera.
S2: and acquiring the video stream of the binocular camera, and pushing the acquired video stream to the upper computer and the deep learning module 4. Preferably, one path of the obtained video stream of the binocular camera is pushed to the upper computer for real-time display, and the other path of the obtained video stream is pushed to the deep learning module 4 for data analysis.
S3: and selecting a tracking target and pushing a tracking target map to the deep learning module 4.
S4: the deep learning module 4 acquires three-dimensional position data of the tracking target in the left camera and/or the right camera based on the acquired binocular video stream and the tracking target map.
S5: and the three-dimensional position data of the tracking target in the left camera and/or the right camera, which is acquired by the deep learning module 4, is transmitted to the control module 5 in real time, the control module 5 judges whether the tracking target is in the left camera and/or the right camera or not based on the received three-dimensional position data, and controls the motion mode of the unmanned submersible based on the judgment result, so that the unmanned submersible and the tracking target keep six-axis synchronization.
According to a preferred embodiment, in step S1, the correction of the binocular camera is accomplished as follows: after the equipment is initialized, the intrinsic parameters, extrinsic parameters and distortion parameters of the binocular camera are obtained; distortion correction of the binocular camera is completed based on the obtained intrinsic parameters and distortion parameters; and the positional relationship between the two cameras is determined based on the acquired extrinsic parameters. Preferably, the methods for obtaining the intrinsic, extrinsic and distortion parameters of the binocular camera, for correcting its distortion, and for determining the positional relationship between the two cameras are the same as those in the prior art, and are not described here again.
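The following is a minimal, illustrative sketch (not the patented implementation) of how the step-S1 correction could be carried out with OpenCV, assuming the intrinsic matrices, distortion coefficients and the rotation/translation between the two cameras have already been obtained by a prior calibration; the calibration file name, variable names and image size are assumptions made for this example.

```python
# Illustrative sketch only: distortion correction and rectification of the binocular
# camera from previously obtained intrinsic, extrinsic and distortion parameters.
import numpy as np
import cv2

calib = np.load("stereo_calibration.npz")     # hypothetical result of a prior calibration
K1, d1 = calib["K1"], calib["d1"]             # left camera intrinsics and distortion
K2, d2 = calib["K2"], calib["d2"]             # right camera intrinsics and distortion
R, T = calib["R"], calib["T"]                 # pose of the right camera relative to the left

image_size = (1280, 720)                      # (width, height), assumed

# Positional relationship between the two cameras -> rectification transforms.
R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K1, d1, K2, d2, image_size, R, T)

# Undistortion / rectification look-up maps for each camera.
map1 = cv2.initUndistortRectifyMap(K1, d1, R1, P1, image_size, cv2.CV_32FC1)
map2 = cv2.initUndistortRectifyMap(K2, d2, R2, P2, image_size, cv2.CV_32FC1)

def rectify_pair(left, right):
    """Return a distortion-corrected, row-aligned stereo pair."""
    left_r = cv2.remap(left, map1[0], map1[1], cv2.INTER_LINEAR)
    right_r = cv2.remap(right, map2[0], map2[1], cv2.INTER_LINEAR)
    return left_r, right_r
```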
According to a preferred embodiment, in step S3, the tracking target is selected as follows: a tracking target is manually selected through the upper computer, and/or the tracking target is automatically detected through a target detection system. That is, the preferred technical solution of this embodiment may select the tracking target manually and/or automatically.
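As an illustration of the manual branch of step S3, the sketch below uses OpenCV's selectROI so an operator can drag a box around the target on a left-camera frame; the image file names are placeholders, and automatic detection by a target detection system is not shown.

```python
# Illustrative sketch of manual tracking-target selection (step S3). The operator drags
# a box around the target; the crop becomes the "tracking target image" pushed to the
# deep learning module. File names are placeholders, not part of the patent.
import cv2

left = cv2.imread("left_frame.png")                    # a frame shown on the upper computer
x, y, w, h = cv2.selectROI("select tracking target", left, showCrosshair=True)
cv2.destroyWindow("select tracking target")

target_image = left[y:y + h, x:x + w]                  # tracking target image
cv2.imwrite("tracking_target.png", target_image)       # hand off to the deep learning module
```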
According to a preferred embodiment, in step S4, the three-dimensional position data includes an X value, a Y value, and a depth value of the tracking target. Preferably, the depth value is depth information of the tracking target with respect to the left camera and/or the right camera. More preferably, the depth value is depth information of the tracking target with respect to the left camera.
According to a preferred embodiment, in step S4, the deep learning module 4 acquires three-dimensional position data of the tracking target relative to the left camera and/or the right camera by:
S41: extracting backbone features from the tracking target image and the left and right camera views using a residual network.
S42: cascading the feature maps extracted at different depth positions from the tracking target image and the left and/or right camera view, feeding them into a multi-stage twin-network region proposal network, and regressing the predicted position and category of the tracking target in the left and/or right camera view. The predicted position of the tracking target is the X value and the Y value of the tracking target.
S43: calculating a matching cost with a twin network based on the feature maps of the left and right camera views at different depths, restoring the feature maps to the original image size with a convolutional neural network, then optimizing the disparity map in multiple stages with several autoencoders connected in series, and obtaining the disparity diff of the left camera and/or the right camera based on the optimized disparity map.
S44: calculating the depth D of the tracking target relative to the left camera and/or the right camera from the binocular camera baseline B, the focal length F of the left camera and/or the right camera, and the disparity diff, where D = F × B / diff.
Fig. 2 is a schematic diagram illustrating a step of acquiring three-dimensional position data of a tracking target by the deep learning module according to a preferred embodiment of the present invention. Preferably, the deep learning module 4 calculates three-dimensional position data of each frame based on the acquired binocular video stream and the tracking target map.
As shown in fig. 2, the method first extracts backbone features from the tracking target image and the left and right camera views using a residual network (ResNet-50). The feature maps extracted at different depth positions from the tracking target image and the left and/or right camera view are then cascaded and fed into a multi-stage twin-network region proposal network, which regresses the predicted position (bbox) and predicted class (cls) of the tracked target in the left camera view. The predicted position of the tracking target is its X and Y values. For the feature maps of the left and right camera views at different depths, a matching cost is computed with a twin (Siamese) network, the feature maps are restored to the original image size with a convolutional neural network, the disparity map is then optimized in multiple stages with several autoencoders connected in series, and the disparity diff of the left camera and/or the right camera is obtained from the optimized disparity map. Finally, the depth D of the tracking target relative to the left camera and/or the right camera is calculated from the binocular camera baseline B, the focal length F of the left camera and/or the right camera, and the disparity diff, where D = F × B / diff.
Preferably, the depth D in this embodiment may be computed with respect to the left camera or with respect to the right camera; the calculation is the same in both cases.
Preferably, the residual network (ResNet-50), the twin (Siamese) network and the convolutional neural network are all prior-art methods, and their specific procedures are not described again.
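To make the step-S44 relation concrete, the following small sketch computes the depth from a disparity value; the baseline, focal length and disparity numbers are example values chosen for illustration, not values taken from the patent.

```python
# Numerical sketch of step S44: depth from disparity, D = F * B / diff.
# The baseline, focal length and disparity below are example values only.
def depth_from_disparity(baseline_m: float, focal_px: float, diff_px: float) -> float:
    """Depth D of the tracked target relative to the camera, D = F * B / diff."""
    if diff_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / diff_px

# Example: 6 cm baseline, 800 px focal length, 20 px disparity -> 2.4 m depth.
D = depth_from_disparity(baseline_m=0.06, focal_px=800.0, diff_px=20.0)
print(f"target depth = {D:.2f} m")
```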
According to a preferred embodiment, in step S5, the control module 5 determines whether the tracking target is in the left camera and/or the right camera by: comparing the detected X value and Y value of the tracking target with the visual field range of the left camera and/or the right camera, wherein the visual field range of the left camera and/or the right camera is (width/2 - width/10, width/2 + width/10), (height/2 - height/10, height/2 + height/10), and width and height are respectively the width and height of the image resolution. Specifically, the deep learning module 4 transmits the calculated three-dimensional position data (X, Y, depth) to the control module 5 in real time, and the control module 5 compares the received position data with the visual field range of the left camera and/or the right camera to determine whether the tracking target is in the left camera and/or the right camera.
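A minimal sketch of this in-view test is given below; the (x, y) positions and the 1280 × 720 resolution are illustrative, and only the central-window comparison described above is implemented.

```python
# Sketch of the step-S5 in-view test: the target is "in" the camera when its pixel
# position lies inside the central window of the view. width and height are the
# image resolution; the example values are illustrative only.
def target_in_view(x: float, y: float, width: int, height: int) -> bool:
    """True if (x, y) lies inside (width/2 ± width/10, height/2 ± height/10)."""
    x_ok = (width / 2 - width / 10) < x < (width / 2 + width / 10)
    y_ok = (height / 2 - height / 10) < y < (height / 2 + height / 10)
    return x_ok and y_ok

# With a 1280 x 720 image the window is x in (512, 768) and y in (288, 432).
print(target_in_view(600, 400, 1280, 720))   # True: target near the view centre
print(target_in_view(100, 400, 1280, 720))   # False: target far to the left
```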
According to a preferred embodiment, the control module 5 controls the motion of the unmanned submersible as follows: the control module 5 calculates a yaw angle, a pitch angle, and/or a speed of the unmanned submersible based on the distance between the tracking target and the unmanned submersible, and controls the motion of the unmanned submersible based on the obtained yaw angle, pitch angle, and/or speed. More specifically, the control module 5 calculates the yaw angle, pitch angle and/or speed of the unmanned submersible based on the judgment result and the distance between the tracking target and the unmanned submersible, and controls the motion of the unmanned submersible accordingly; after the unmanned submersible receives the regulation information from the control module 5, the tracking target is brought back to the center of the visual field of the left camera and/or the right camera, so that the unmanned submersible and the tracking target keep six-axis synchronization.
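The patent does not give explicit formulas for the yaw angle, pitch angle and speed; the sketch below shows one plausible way to derive them from the target's pixel offset and depth, with the atan2-based angles, the proportional speed gain and the hold-off distance all being assumptions for illustration only.

```python
# Hypothetical sketch of the control-module computation: derive yaw, pitch and speed
# commands from the target's pixel position (x, y) and depth. The atan2-based angles,
# the proportional gain k_speed and the hold-off distance are assumptions, not the
# patent's actual control law.
import math

def compute_commands(x, y, depth_m, width, height, focal_px,
                     k_speed=0.5, hold_off_m=1.0):
    """Return (yaw_deg, pitch_deg, speed) that steer the target toward the view centre."""
    dx = x - width / 2                                   # horizontal pixel offset
    dy = y - height / 2                                  # vertical pixel offset
    yaw_deg = math.degrees(math.atan2(dx, focal_px))     # + : target is to the right
    pitch_deg = math.degrees(math.atan2(dy, focal_px))   # + : target is below centre
    speed = k_speed * (depth_m - hold_off_m)             # close in to a hold-off distance
    return yaw_deg, pitch_deg, speed

yaw, pitch, speed = compute_commands(x=700, y=300, depth_m=2.4,
                                     width=1280, height=720, focal_px=800.0)
print(f"yaw={yaw:.1f} deg, pitch={pitch:.1f} deg, speed={speed:.2f} m/s")
```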
The vision locking device of the embodiment utilizes the vision locking method of any technical scheme of the embodiment to enable the unmanned submersible vehicle and the tracking target to keep six-axis synchronization.
Preferably, the visual locking device comprises an initialization module 1, a data acquisition module 2, a tracking target selection module 3, a deep learning module 4 and a control module 5, as shown in fig. 3. The initialization module 1 is used for initializing equipment and finishing correction of the binocular camera. The data acquisition module 2 and the tracking target selection module 3 are connected with the deep learning module 4, wherein the data acquisition module 2 is used for acquiring a video stream of a binocular camera and pushing the acquired video stream to the deep learning module 4, the tracking target selection module 3 is used for selecting a tracking target and pushing a tracking target image to the deep learning module 4, and the deep learning module 4 acquires three-dimensional position data of the tracking target in the left camera and/or the right camera based on the acquired binocular video stream and the tracking target image. The deep learning module 4 is connected with the control module 5, the control module 5 is connected with the unmanned submersible, the deep learning module 4 transmits acquired three-dimensional position data of the tracking target in the left camera and/or the right camera to the control module 5 in real time, the control module 5 judges whether the tracking target is in the left camera and/or the right camera based on the received three-dimensional position data, and controls the motion mode of the unmanned submersible based on the judgment result, so that the unmanned submersible and the tracking target keep six-axis synchronization.
The visual locking device comprises an initialization module 1, a data acquisition module 2, a tracking target selection module 3, a deep learning module 4 and a control module 5. After a tracking target is selected for the remotely operated unmanned submersible, the target is tracked and six-axis synchronization is achieved; that is, under disturbances such as water-current fluctuation, the position of the unmanned submersible can be automatically fed back and adjusted in real time, so that the unmanned submersible remains synchronized with the tracking target, which makes it convenient for the unmanned submersible to photograph or grasp the tracking target.
According to the visual locking method and device, on the one hand, the distance of the tracked target can be accurately inferred, providing depth information and enabling accurate control over six axes of freedom; on the other hand, the method also offers higher-quality feature extraction and can maintain stable tracking under occlusion, multiple similar targets, motion blur and drastic changes in ambient illumination.
It is understood that the same or similar parts in the present embodiment may be mutually referred to, and the same or similar contents in other embodiments may be referred to for the contents which are not described in detail in some embodiments.
It is noted that, in the description of the present application, the meaning of "a plurality" means at least two unless otherwise specified.
The term "connection" as used herein may refer to one or more of a data connection, a communication connection, a wired connection, a wireless connection, a connection via a physical connection, and the like, as will be appreciated by those skilled in the art.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and the scope of the preferred embodiments of the present application includes other implementations in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present application.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (8)

1. A vision locking method for enabling an unmanned underwater vehicle to maintain six-axis synchronization with a tracked target by acquiring three-dimensional position data of the tracked target and controlling an operation mode of the unmanned underwater vehicle in real time based on the acquired three-dimensional position data, comprising the steps of:
S1: initializing the equipment to complete the correction of the binocular camera;
S2: acquiring the video stream of the binocular camera, and pushing the acquired video stream to the upper computer and the deep learning module;
S3: selecting a tracking target, and pushing the tracking target image to the deep learning module;
S4: the deep learning module acquires three-dimensional position data of the tracking target in the left camera and/or the right camera based on the acquired binocular video stream and the tracking target image;
S5: the three-dimensional position data of the tracking target in the left camera and/or the right camera acquired by the deep learning module is transmitted to the control module in real time, the control module judges whether the tracking target is in the left camera and/or the right camera based on the received three-dimensional position data, and controls the motion mode of the unmanned submersible based on the judgment result, so that the unmanned submersible and the tracking target keep six-axis synchronization;
in step S4, the deep learning module acquires three-dimensional position data of a tracking target relative to the left camera and/or the right camera by:
S41: extracting backbone features from the tracking target image and the left and right camera views using a residual network;
S42: cascading the feature maps extracted at different depth positions from the tracking target image and the left and/or right camera view, feeding them into a multi-stage twin-network region proposal network, and regressing the predicted position and category of the tracking target in the left and/or right camera view; wherein the predicted position of the tracking target is the X value and the Y value of the tracking target;
S43: calculating a matching cost with a twin network based on the feature maps of the left and right camera views at different depths, restoring the feature maps to the original image size with a convolutional neural network, then optimizing the disparity map in multiple stages with several autoencoders connected in series, and obtaining the disparity diff of the left camera and/or the right camera based on the optimized disparity map;
S44: calculating the depth D of the tracking target relative to the left camera and/or the right camera from the binocular camera baseline B, the focal length F of the left camera and/or the right camera, and the disparity diff, where D = F × B / diff.
2. The visual locking method according to claim 1, wherein in step S1, the correction of the binocular camera is accomplished by: after the equipment is initialized, obtaining the intrinsic parameters, extrinsic parameters and distortion parameters of the binocular camera, and completing distortion correction of the binocular camera based on the obtained intrinsic parameters and distortion parameters; and determining the positional relationship between the two cameras based on the acquired extrinsic parameters.
3. The visual locking method according to claim 1, wherein in step S3, the tracking target is selected by: manually selecting a tracking target through the upper computer, and/or automatically detecting the tracking target through a target detection system.
4. The visual locking method of claim 1, wherein in step S4, the three-dimensional position data includes an X value, a Y value, and a depth value of a tracking target.
5. The visual locking method of claim 4, wherein the depth value is depth information of a tracking target relative to a left camera and/or a right camera.
6. The visual locking method according to claim 1, wherein in step S5, the control module determines whether the tracking target is in the left camera and/or the right camera by:
comparing the detected X value and Y value of the tracking target with the visual field range of the left camera and/or the right camera, wherein the visual field range of the left camera and/or the right camera is (width/2 - width/10, width/2 + width/10), (height/2 - height/10, height/2 + height/10), and width and height are respectively the width and height of the image resolution.
7. The visual locking method of claim 1, wherein the control module controls the manner of motion of the unmanned vehicle by:
the control module calculates a yaw angle, a pitch angle, and/or a velocity of the unmanned submersible based on a distance between the tracked target and the unmanned submersible, and controls a mode of motion of the unmanned submersible based on the obtained yaw angle, pitch angle, and/or velocity.
8. A visual locking device, characterized in that the unmanned underwater vehicle is kept in six-axis synchronization with a tracking target by the visual locking method as claimed in any one of claims 1 to 7, and
the visual locking device comprises an initialization module, a data acquisition module, a tracking target selection module, a deep learning module and a control module, wherein,
the initialization module is used for initializing equipment and finishing the correction of the binocular camera;
the data acquisition module and the tracking target selection module are connected with the deep learning module, wherein the data acquisition module is used for acquiring a video stream of a binocular camera and pushing the acquired video stream to the deep learning module, the tracking target selection module is used for selecting a tracking target and pushing a tracking target image to the deep learning module, and the deep learning module acquires three-dimensional position data of the tracking target in the left camera and/or the right camera based on the acquired binocular video stream and the tracking target image;
the deep learning module is connected with the control module, the control module is connected with the unmanned submersible, the deep learning module transmits acquired three-dimensional position data of the tracking target in the left camera and/or the right camera to the control module in real time, the control module judges whether the tracking target is in the left camera and/or the right camera or not based on the received three-dimensional position data, and controls the motion mode of the unmanned submersible based on the judgment result, so that the unmanned submersible and the tracking target keep six-axis synchronization.
CN202010542145.XA 2020-06-15 2020-06-15 Visual locking method and device Active CN111798496B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010542145.XA CN111798496B (en) 2020-06-15 2020-06-15 Visual locking method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010542145.XA CN111798496B (en) 2020-06-15 2020-06-15 Visual locking method and device

Publications (2)

Publication Number Publication Date
CN111798496A CN111798496A (en) 2020-10-20
CN111798496B true CN111798496B (en) 2021-11-02

Family

ID=72804310

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010542145.XA Active CN111798496B (en) 2020-06-15 2020-06-15 Visual locking method and device

Country Status (1)

Country Link
CN (1) CN111798496B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114277725A (en) * 2021-12-20 2022-04-05 民航成都电子技术有限责任公司 Airfield runway target foreign matter disposal equipment

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103745460A (en) * 2013-12-30 2014-04-23 深圳市开天源自动化工程有限公司 Method, device and system for tracking water body organisms on basis of three-dimensional image analysis
CN106989730A (en) * 2017-04-27 2017-07-28 上海大学 A kind of system and method that diving under water device control is carried out based on binocular flake panoramic vision
CN107999955A (en) * 2017-12-29 2018-05-08 华南理工大学 A kind of six-shaft industrial robot line laser automatic tracking system and an automatic tracking method
CN108536157A (en) * 2018-05-22 2018-09-14 上海迈陆海洋科技发展有限公司 A kind of Intelligent Underwater Robot and its system, object mark tracking
CN108876855A (en) * 2018-05-28 2018-11-23 哈尔滨工程大学 A kind of sea cucumber detection and binocular visual positioning method based on deep learning
CN108875683A (en) * 2018-06-30 2018-11-23 北京宙心科技有限公司 Robot vision tracking method and system
CN109062229A (en) * 2018-08-03 2018-12-21 北京理工大学 The navigator of underwater robot system based on binocular vision follows formation method
CN110488847A (en) * 2019-08-09 2019-11-22 中国科学院自动化研究所 The bionic underwater robot Hovering control mthods, systems and devices of visual servo
CN110543859A (en) * 2019-09-05 2019-12-06 大连海事大学 sea cucumber autonomous recognition and grabbing method based on deep learning and binocular positioning
CN110539062A (en) * 2019-09-29 2019-12-06 华南理工大学 deep sea pipeline plasma additive manufacturing in-situ repair equipment and method
CN111062990A (en) * 2019-12-13 2020-04-24 哈尔滨工程大学 Binocular vision positioning method for underwater robot target grabbing

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090315704A1 (en) * 2008-06-19 2009-12-24 Global Biomedical Development, Llc, A Georgia Limited Liability Company Method and Integrated System for Tracking Luggage

Also Published As

Publication number Publication date
CN111798496A (en) 2020-10-20

Similar Documents

Publication Publication Date Title
CN108229366B (en) Deep learning vehicle-mounted obstacle detection method based on radar and image data fusion
Beall et al. 3D reconstruction of underwater structures
CN112785702A (en) SLAM method based on tight coupling of 2D laser radar and binocular camera
CN109604777A (en) Welding seam traking system and method based on laser structure light
CN111837144A (en) Enhanced image depth sensing using machine learning
US9940725B2 (en) Method for estimating the speed of movement of a camera
CN109143247B (en) Three-eye underwater detection method for acousto-optic imaging
CN109191504A (en) A kind of unmanned plane target tracking
CN100554877C (en) A kind of real-time binocular vision guidance method towards underwater research vehicle
WO2005033629A2 (en) Multi-camera inspection of underwater structures
CN106780631A (en) A kind of robot closed loop detection method based on deep learning
US20090297036A1 (en) Object detection on a pixel plane in a digital image sequence
CA2870480A1 (en) Hybrid precision tracking
CN110412584A (en) A kind of mobile quick splicing system of underwater Forward-Looking Sonar
Rahman et al. Contour based reconstruction of underwater structures using sonar, visual, inertial, and depth sensor
Wang et al. Monocular visual SLAM algorithm for autonomous vessel sailing in harbor area
CN111798496B (en) Visual locking method and device
Salvi et al. Visual SLAM for 3D large-scale seabed acquisition employing underwater vehicles
Cortés-Pérez et al. A mirror-based active vision system for underwater robots: From the design to active object tracking application
CN115937810A (en) Sensor fusion method based on binocular camera guidance
CN108469729A (en) A kind of human body target identification and follower method based on RGB-D information
Wu et al. Research progress of obstacle detection based on monocular vision
CN112862865A (en) Detection and identification method and device for underwater robot and computer storage medium
Germi et al. Estimation of moving obstacle dynamics with mobile RGB-D camera
CN113534824B (en) Visual positioning and close-range dense formation method for underwater robot clusters

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant