CN117456531B - Multi-view pure rotation anomaly identification and automatic mark training method, equipment and medium


Info

Publication number
CN117456531B
Authority
CN
China
Prior art keywords
view
pure rotation
pure
training
anomaly
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311785200.8A
Other languages
Chinese (zh)
Other versions
CN117456531A (en)
Inventor
陈果
蒋书豪
蔡奇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Leshan Vocational and Technical College
Original Assignee
Leshan Vocational and Technical College
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Leshan Vocational and Technical College filed Critical Leshan Vocational and Technical College
Priority to CN202311785200.8A priority Critical patent/CN117456531B/en
Publication of CN117456531A publication Critical patent/CN117456531A/en
Application granted granted Critical
Publication of CN117456531B publication Critical patent/CN117456531B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G06N 3/0442 Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G06N 3/045 Combinations of networks
    • G06N 3/0464 Convolutional networks [CNN, ConvNet]
    • G06N 3/08 Learning methods
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V 10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V 10/757 Matching configurations of points or features
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/70 Labelling scene content, e.g. deriving syntactic or semantic representations

Abstract

The invention discloses a multi-view pure rotation anomaly identification and automatic marking training method, equipment and medium. Image data acquired by a camera are obtained; two-view relative pose estimation and two-view pure rotation index calculation are performed; a multi-view pure rotation index is calculated; and it is identified whether pure rotation views causing global position solving anomalies exist among the multiple views. The method can accurately identify the view set causing pure rotation anomalies in multiple views. Compared with two-view identification methods, the method starts from multi-view geometric constraints, judges pure rotation anomalies more comprehensively and finely, and ensures the robustness of camera translation estimation and bundle adjustment. Based on the multi-view pure rotation anomaly identification scheme, an automatic marking and training scheme for pure rotation anomaly views is designed; using the trained network model, the view set causing pure rotation anomalies in multiple views can be identified in advance, before visual geometry computation.

Description

Multi-view pure rotation anomaly identification and automatic mark training method, equipment and medium
Technical Field
The invention relates to the technical field of computer vision, in particular to a multi-view pure rotation anomaly identification and automatic marking training method, equipment and medium.
Background
Three-dimensional vision computing is the basis of many computer vision applications. Accurate identification of pure camera rotation is an important link in three-dimensional vision computing and helps improve the accuracy, robustness and performance of computer vision applications such as camera pose estimation, three-dimensional reconstruction, simultaneous localization and mapping (SLAM), and target tracking. For example, bundle adjustment (BA) serves as a core underlying technology of visual navigation and reconstruction; when the camera undergoes pure rotational motion, its optimization may fail to converge, so that the corresponding pose estimation and three-dimensional reconstruction results become anomalous. In addition, the current state-of-the-art camera position estimation method solves for the camera positions through a linear constraint on the camera positions (the LiGT constraint). If one of the multiple views is a pure rotational motion view, the equations of the LiGT constraint for that view degenerate, so that the camera positions cannot be solved normally. Therefore, accurately judging pure rotation anomalies is significant for improving the robustness of visual system operation. This means that the pure rotation view sets that may lead to pose and scene recovery crashes need to be identified from the multiple views before employing the BA technique or the LiGT algorithm.
In fact, if one of the multiple views is a pure rotational motion view, the camera position of the pure rotation view and the large number of 3D feature points observed by that view may fail to initialize, which leads to a breakdown of BA optimization, so early identification is required. Furthermore, the observation matrix L in the LiGT constraint becomes rank-deficient, which also makes it impossible to estimate the camera positions normally. The computer vision field currently judges pure rotation mostly with identification methods based on the two-view geometric model, such as the parallax angle magnitude and intersection (triangulation) constraints. However, two-view pure rotation judgment cannot determine which of the two views may cause the pure rotation anomaly of camera position estimation and three-dimensional reconstruction in the multiple views, nor can it identify whether a potential pure rotation anomaly exists in the current multi-view estimation to be processed, so the robustness of multi-view pose recovery and three-dimensional reconstruction cannot be well guaranteed. In addition, conventional identification schemes can only recognize pure rotation anomaly views after several visual geometry computation steps. Identifying pure rotation anomaly views in advance, before visual geometry computation, with a network model trained by deep learning is an emerging research direction in this field.
Disclosure of Invention
The invention aims to provide a multi-view pure rotation anomaly identification and automatic marking training method, equipment and medium, and designs a multi-view pure rotation anomaly view identification scheme that can accurately identify the view set causing pure rotation anomalies in multiple views. Compared with two-view identification methods, the method starts from multi-view geometric constraints, judges pure rotation anomalies more comprehensively and finely, and ensures the robustness of camera translation estimation and bundle adjustment. Based on the multi-view pure rotation anomaly identification scheme, an automatic marking and training scheme for pure rotation anomaly views is designed; using the trained network model, the view set causing pure rotation anomalies in multiple views can be identified in advance, before visual geometry computation.
The invention is realized by the following technical scheme:
Aiming at the defects of the prior art, a first aspect of the invention provides a multi-view pure rotation anomaly identification and automatic marking training method, which comprises the following specific steps:
anomaly identification for multi-view pure rotation:
acquiring image data collected by a camera, calculating the multi-view pure rotation indexes of all views according to the two-view estimation results, and identifying whether pure rotation views causing global position solving anomalies exist in the multiple views;
automatic labeling and training are carried out on the pure rotation abnormal view:
identifying and marking a view set which causes pure rotation abnormality in multiple views by utilizing the multi-view pure rotation index, and constructing a local view set containing the pure rotation abnormality;
constructing a deep neural network model, and generating a training set for training the deep neural network model according to the local view set containing the pure rotation anomalies;
and identifying pure rotation abnormal views in the view set according to the trained deep neural network model.
According to the invention, by designing the multi-view pure rotation anomaly view identification scheme, the view set causing anomalies in multiple views can be accurately identified. Starting from multi-view geometric constraints, the pure rotation anomaly judgment is more comprehensive and finer, which ensures the robustness of camera translation estimation and bundle adjustment. Based on the multi-view pure rotation anomaly identification scheme, an automatic marking and training scheme for pure rotation anomaly views is designed; using the trained network model, the view set causing pure rotation anomalies in multiple views can be identified in advance, before visual geometry computation.
Further, performing anomaly identification on multi-view pure rotation specifically includes:
Step 11: the image data acquired by the camera form a view set V = {I_1, ..., I_n}; define the co-view set V_i for representing the set of views having a matching relationship with view I_i;
for each view I_j ∈ V_i, carry out image matching between views I_i and I_j and use the matched image feature points to calculate the two-view pure rotation identification index θ_ij, where i = 1, 2, 3, ..., m and j = 1, 2, 3, ..., m;
Step 12: take a two-view pure rotation identification threshold δ and calculate the multi-view pure rotation identification index λ_i = (1/n_i) Σ_{I_j ∈ V_i} H(θ_ij − δ), where the function H(·) is the Heaviside step function and n_i denotes the number of views in V_i;
Step 13: take a multi-view pure rotation identification threshold ε; when λ_i < ε, view I_i is considered a pure rotation anomaly view.
Further, step 12 further includes utilizing the two-view pure rotation index θ_ij to construct the multi-view pure rotation identification index λ_i.
Further, the automatic labeling and training of the pure rotation anomaly view specifically includes:
Step 21: the image data acquired by the camera form a view set V = {I_1, ..., I_n}; based on multi-view pure rotation anomaly identification, the views with pure rotation anomalies compose the set V_outlier;
Step 22: for each view I_i ∈ V, construct the co-view set V_i and generate the i-th local view set W_i = {I_i} ∪ V_i, where I_i is placed at the middle position of the set elements;
Step 23: traverse the view set V to construct the n local view sets {W_i | I_i ∈ V} as the training set of the deep neural network model;
Step 24: mark the training set W_i according to the pure rotation anomaly view set V_outlier; specifically, if the view I_i at the middle position of W_i satisfies I_i ∈ V_outlier, mark it with the pure rotation anomaly label y_i = 1, and otherwise y_i = 0;
Step 25: construct a loss function L(y_i, ŷ_i), where ŷ_i denotes the predicted value given by the deep neural network model;
Step 26: according to the trained model, the view set acquired by the camera generates local view sets for testing following W_i = {I_i} ∪ V_i; if the model outputs ŷ_i = 1, the current view I_i is a pure rotation anomaly view, and otherwise it is a normal view.
Further, calculating the multi-view pure rotation identification index further includes: introducing the multi-view pure rotation identification index to automatically mark pure rotation anomaly views.
Further, when the training set is generated, the method further comprises:
automatically generating a local view set for training;
constructing a loss function for pre-identifying pure rotation anomaly views using the marked pure rotation anomaly views, and training the network model with the training set.
Further, after the model is trained, the method further comprises constructing local view sets for testing from the view set acquired by the camera, and pre-identifying pure rotation anomaly views in the test set using the trained network model.
A second aspect of the invention provides an electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the multi-view pure rotation anomaly identification and automatic marking training method when executing the program.
A third aspect of the invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the multi-view pure rotation anomaly identification and automatic marking training method.
Compared with the prior art, the invention has the following advantages and beneficial effects:
The invention provides a multi-view pure rotation anomaly view identification scheme based on multi-view pure-pose visual geometry; exploiting the property that pure rotation anomaly views drive the pure rotation index toward zero, the scheme is applicable to different scene structures and camera motion conditions and can accurately identify the view set causing pure rotation anomalies in multiple views;
the invention provides a new automatic marking and training scheme for pure rotation anomaly views based on the multi-view pure rotation anomaly identification scheme, and uses the trained network model to pre-identify the view set causing pure rotation anomalies in multiple views before visual geometry computation;
compared with two-view identification methods, the method provided by the invention starts from multi-view geometric constraints, judges pure rotation anomalies more comprehensively and finely, and ensures the robustness of camera translation estimation and bundle adjustment.
Drawings
In order to more clearly illustrate the technical solutions of the exemplary embodiments of the present invention, the drawings that are needed in the examples will be briefly described below, it being understood that the following drawings only illustrate some examples of the present invention and therefore should not be considered as limiting the scope, and that other related drawings may be obtained from these drawings without inventive effort for a person skilled in the art. In the drawings:
FIG. 1 is an overview of an automatic labeling training scheme for pure rotation recognition in an embodiment of the present invention;
FIG. 2 is a comparison of different simulated motion types in an embodiment of the present invention; wherein a is linear motion and pure rotary motion in the Z-axis direction; b is linear motion and pure rotary motion in the X-axis direction; c is spiral motion and pure rotation motion in the Z-axis direction; d is spiral motion and pure rotation motion in the X-axis direction;
FIG. 3 illustrates position errors for various simulated motion types in an embodiment of the present invention; wherein a is linear motion and pure rotary motion in the Z-axis direction; b is linear motion and pure rotary motion in the X-axis direction; c is spiral motion and pure rotation motion in the Z-axis direction; d is spiral motion and pure rotation motion in the X-axis direction;
FIG. 4 is a comparison of the re-projection errors for different motion types in an embodiment of the present invention; wherein a is linear motion and pure rotary motion in the Z-axis direction; b is linear motion and pure rotary motion in the X-axis direction; c is spiral motion and pure rotation motion in the Z-axis direction;
FIG. 5 shows the performance of pure rotation identification in an embodiment of the invention; wherein a is the M3 index; b is the PRI index; c is the position error in 20 Monte Carlo tests;
FIG. 6 is an example of a visualization on the Virtual KITTI dataset in an embodiment of the invention;
FIG. 7 shows the position errors before and after multi-view pure rotation anomaly identification in an embodiment of the invention.
Description of the embodiments
For the purpose of making apparent the objects, technical solutions and advantages of the present invention, the present invention will be further described in detail with reference to the following examples and the accompanying drawings, wherein the exemplary embodiments of the present invention and the descriptions thereof are for illustrating the present invention only and are not to be construed as limiting the present invention.
As a possible embodiment, as shown in fig. 1, the first aspect of this embodiment provides a multi-view pure rotation anomaly identification and automatic marking training method, which designs a multi-view pure rotation anomaly view identification scheme based on the two-view pure rotation identification index and can accurately identify the view set causing anomalies in multiple views. Compared with two-view identification methods, the method starts from multi-view geometric constraints, judges pure rotation anomalies more comprehensively and finely, and ensures the robustness of camera translation estimation and bundle adjustment. Based on the multi-view pure rotation anomaly identification scheme, an automatic marking and training scheme for pure rotation anomaly views is designed; using the trained network model, the view set causing pure rotation anomalies in multiple views can be identified in advance, before visual geometry computation.
Specifically, the multi-view pure rotation anomaly identification scheme comprises the following specific steps:
Step 1: calculate the two-view pure rotation identification indexes;
Step 1.1: assume the image data acquired by the camera constitute a view set V = {I_1, ..., I_n}; define the co-view set V_i as the set of views having a matching relationship with view I_i, where V_i contains the views that can form at least 5 matching point pairs with view I_i; for a view I_j ∈ V_i, the m_ij correctly matched image feature point pairs of views I_i and I_j form a set {(x_k^i, x_k^j), k = 1, ..., m_ij}, where x_k^i and x_k^j respectively denote the k-th matched normalized image feature points of views I_i and I_j;
Step 1.2: define a measurement matrix from the matched point pairs, take the singular vector associated with its smallest singular value, and from this vector construct the 3×3 matrix Q;
Step 1.3: perform an SVD decomposition of Q, i.e. Q = U Σ G^T, where Σ is a diagonal matrix whose singular values are arranged in descending order;
Step 1.4: calculate the vector s = (1, 1, det(UG))^T, where det(UG) denotes the determinant of the matrix UG;
Step 1.5: calculate the two kinds of two-view pure rotation identification indexes θ_ij^(1) and θ_ij^(2), where R_ij = U diag(s) G^T denotes the recovered relative attitude of views I_i and I_j;
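The closed forms of θ_ij^(1) and θ_ij^(2) survive only as images in the source text, so the sketch below is one plausible instantiation rather than the patent's exact formulas: it fits a single relative rotation to the matched normalized points, mirroring the SVD and det(UG) correction of steps 1.3 and 1.4, and returns the mean residual angle, which tends to zero under pure rotation.

```python
import numpy as np

def two_view_pure_rotation_index(xi, xj):
    """xi, xj: (m, 3) arrays of matched normalized image points x_k^i, x_k^j.
    Note: Q is built here as the point correlation matrix, an assumption;
    the patent constructs Q from a singular vector of a measurement matrix."""
    xi = xi / np.linalg.norm(xi, axis=1, keepdims=True)
    xj = xj / np.linalg.norm(xj, axis=1, keepdims=True)
    Q = xj.T @ xi
    U, sv, Gt = np.linalg.svd(Q)                     # Q = U diag(sv) G^T
    s = np.array([1.0, 1.0, np.linalg.det(U @ Gt)])  # cf. step 1.4
    R = U @ np.diag(s) @ Gt                          # relative attitude R_ij
    cosines = np.clip(np.sum(xj * (xi @ R.T), axis=1), -1.0, 1.0)
    return float(np.mean(np.arccos(cosines)))        # small under pure rotation
```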
step 2: taking dual view pure rotation recognition thresholdValue ofCalculating multi-view pure rotation identification indexWherein the function H (·) is a Herveliedel step function, < >>Representation->The number of views in (a); in particular, for all views, a multi-view pure rotation index set will be obtained +.>. The index calculated according to step 1.5 +.>And->Taking the appropriate threshold value +.>And->The corresponding multi-view pure rotation identification index set can be calculated and respectively marked as index +.>And->
Step 13: fetching multi-view pure rotation recognition thresholdWhen->Consider view->Is a pure rotation anomaly view;
step 4: traversing all viewsThe view-set can be divided into a pure rotation abnormal view-set and a normal view-set, respectively denoted +.>And->
Step 5: restoring normal view set according to relative pose resultGlobal pose R of (2);
step 6: for normal view setsSolving the global displacement t by utilizing a LiGT algorithm according to the global attitude R and the image observation;
step 7: restoration of normal view sets using BA optimizationCorresponding camera pose and 3D feature points.
The automatic marking and training scheme of the pure rotation abnormal view comprises the following specific steps:
step 1: composing a view set from image data acquired by a cameraAccording to the multi-view pure rotation abnormality recognition scheme, calculating a multi-view pure rotation recognition index +.>
Step 3: for viewsConstruction structureBuilding and current view->A set of views which can be viewed together +.>Generating the i-th partial view set +.>Wherein->Placed at->The middle position of the collection element;
step 4: traversing the pure rotation abnormal view set V to construct n partial view setsAs a training set of the deep neural network model;
step 5: from a pure rotational anomaly view-setFor training set->Marking, in particular if +.>View of intermediate position->Then use the pure rotation anomaly tag corresponding to the view +.>Marking, otherwise->
Step 6: construction of a loss functionWherein->Representing a predicted value given by the deep neural network model;
step 7: according to the trained model, the view set acquired by the camera is as followsGenerating a local view set for the test if the model outputs +.>Then explain the current view->For the pure rotation abnormal view, on the contrary, for the normal view, the second aspect of the present embodiment provides an electronic device, which includes a memory, a processor, and a computer program stored on the memory and capable of running on the processor, where the processor implements a multi-view pure rotation abnormal recognition and automatic marking training method when executing the program.
A third aspect of this embodiment provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the multi-view pure rotation anomaly identification and automatic marking training method.
As one possible embodiment, to verify the effectiveness of the multi-view pure rotation anomaly identification scheme, this embodiment compares two camera position estimation methods adopted in mainstream global SfM platforms and representing the current state of the art: LUD (Theia platform) and LiGT (openMVG platform). Linear motion easily leads to position estimation anomalies, and LUD exhibits anomalies under collinear motion. To increase the challenge of anomaly identification, the simulation experiments combine linear motion with a large number of pure rotational motions, thereby creating a relatively complex and anomaly-prone simulation environment. Specifically, as shown in fig. 2, the simulation experiments test four different camera motions by adjusting the orientation and motion pattern of the normally moving camera. Each simulated camera motion trajectory includes 40 views: 20 normal motion views and 20 pure rotational motion views. The position of each pure rotational motion view is fixed at the position of the last normal motion view. As shown in fig. 2 (a) and (b), the simulation experiments test two normal camera orientations and analyze the influence of the pure rotation view set on the recovery of multi-view pose estimation. To test the effect of the new scheme under combined nonlinear and pure rotational motion, the spiral motion structure shown in fig. 2 (c) was designed. Furthermore, as shown in fig. 2 (d), the simulation experiments also include an experimental scenario without pure rotational motion. The number of 3D feature points is initially 4000, and the centers of the camera motion lines are uniformly distributed in a cube. The imaging of image feature points in the simulation environment adds chirality constraints, imaging plane length and width limits, and other restrictions so as to approximate a real imaging scene as closely as possible. Because this embodiment mainly verifies the identification effect of the method when pure rotation anomalies exist, the simulation experiments are conducted under ideal zero-noise conditions. Each experiment is repeated in 20 Monte Carlo tests to ensure the repeatability of the experimental results.
Fig. 3 shows the results of the error experiments, which give the following observations: (1) the simulations combining linear motion along the X axis with pure rotational motion are a challenging scenario. Under ideal conditions, the camera positions recovered by the LUD method are anomalous: the position error for the normal motion structure (such as the circular motion in fig. 2 (d)) reaches 10^-13, but the other simulation cases show anomalies (errors of about 10^0). Even after BA optimization, the LUD-BA result hardly converges to the true value, so the simulated motions in fig. 2 (a) - (c) are challenging for the LUD algorithm. In contrast, the LiGT algorithm only shows anomalies for the combined motion of the X axis toward straight line and pure rotation (shown in fig. 2 (b)), with a position error level between 10^-1 and 10^0. The experiments confirm that the combined X-axis linear and pure rotational motion is also challenging for the currently best global translation algorithm (the LiGT algorithm); (2) the LiGT algorithm has a certain robustness and accuracy, i.e., it maintains an ideal camera position estimation level in the tests of fig. 3 (a), (c) and (d). Therefore, in the subsequent pure rotation identification experiments, this embodiment only needs to combine the LiGT algorithm to verify the performance of the multi-view pure rotation identification scheme.
In each Monte Carlo test we calculated the re-projection error of BA. Fig. 4 shows that anomalies in the camera position estimates cannot be identified through the BA re-projection error. Specifically, while fig. 4 shows that BA optimization can reduce the re-projection errors of the LUD and LiGT algorithms, comparing the position errors in fig. 3 (a) - (c) one by one with the results in fig. 4 shows that the camera position estimates remain anomalous even after BA optimization. For example, for the combined motion of the X axis toward straight line and pure rotation, the BA re-projection error of the LiGT algorithm before BA optimization is distributed around 10^-5 (with a position error of 10^-1 to 10^0), while the BA re-projection error of LiGT-BA after BA optimization drops to around 10^-10 (with the position error still at 10^-1 to 10^0).
Figs. 5 (a) - (b) illustrate the multi-view pure rotation metrics M3 and PRI computed for the four simulated motions in the first Monte Carlo test, together with the position error of the LiGT algorithm after multi-view pure rotation identification. In this embodiment, thresholds are taken on M3 and PRI for identifying pure rotation anomalies. For convenience of presentation, this embodiment uses M3-LiGT and PRI-LiGT to denote the results obtained after identifying and eliminating anomalies through the multi-view pure rotation indexes M3 and PRI, respectively. As shown in fig. 5 (a), several views have an M3 value exceeding 0.8, so an additional threshold is applied to them. After filtering out these extreme pure rotational motions, fig. 5 (c) shows that the LiGT position errors of M3-LiGT and PRI-LiGT are about 10^-8: the performance of the LiGT algorithm returns to normal after processing by the multi-view pure rotation anomaly identification scheme, which verifies the effectiveness of the new scheme.
This embodiment uses the multi-view pure rotation index to compare the position errors of the LiGT algorithm before and after processing at noise levels of 0 to 2 pixels. For convenience of presentation, the index-based scheme is abbreviated here as PRD. The results in the table of fig. 7 again verify the effectiveness of the multi-view pure rotation anomaly identification scheme in identifying and correcting pure rotation anomaly views in camera position estimation. In particular, the table shows that the PRD-LiGT position error after identifying and eliminating pure rotation anomaly views is significantly reduced. Under the noiseless condition, the PRD-LiGT position error drops to 3.87×10^-8, several orders of magnitude lower than the position error of the original LiGT algorithm (4.46×10^-1). As the noise level increases, the PRD-LiGT position error maintains an accuracy advantage of nearly 3 orders of magnitude over the original LiGT algorithm.
To verify the effectiveness of the automatic marking and training scheme for pure rotation anomaly views, as one possible embodiment, this example employs a Long-term Recurrent Convolutional Network (LRCN), which combines a convolutional neural network (CNN) and a long short-term memory (LSTM) network to extract the spatio-temporal features of the view sequence. We used the Virtual KITTI dataset to evaluate the effectiveness of the pure rotation anomaly pre-identification model. This dataset is a photo-realistic synthetic video dataset for a variety of video understanding tasks, containing 50 high-resolution monocular videos generated from five virtual worlds under different imaging and weather conditions. The videos are fully annotated, providing 3D camera pose information for each frame, and are thus suitable for evaluating the effectiveness of the automatic marking and training scheme for pure rotation anomaly views. For a full assessment, we divided the dataset into a training set (80%), a validation set (10%) and a test set (10%) by random partitioning. We selected ResNet50 as the backbone network of the model. Training ran for 10 epochs using the Adam optimization algorithm with a learning rate of 0.001. The whole training process uses local view sets with a fixed length of 32 views. The evaluation results show that the automatic marking and the trained pure rotation anomaly pre-identification model of this embodiment are effective, achieving 91% accuracy on the test set and verifying the feasibility and potential of the new scheme in practical applications. For example, fig. 6 shows qualitative visualization results on the Virtual KITTI dataset, which demonstrate the effectiveness and robustness of the proposed model in detecting pure rotation anomaly views under different environmental and illumination conditions.
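A minimal sketch of an LRCN of the shape described above, assuming a PyTorch implementation: a ResNet50 backbone encodes each of the 32 views of a local view set, an LSTM aggregates the sequence, and a sigmoid head predicts the pure rotation anomaly probability. The hidden size and classification head are illustrative; the optimizer and learning rate follow the text.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class LRCN(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        self.backbone = resnet50(weights=None)  # ResNet50 backbone
        self.backbone.fc = nn.Identity()        # expose the 2048-d feature
        self.lstm = nn.LSTM(2048, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                       # x: (B, T=32, 3, H, W)
        b, t = x.shape[:2]
        feats = self.backbone(x.flatten(0, 1)).view(b, t, -1)
        seq, _ = self.lstm(feats)
        return torch.sigmoid(self.head(seq[:, -1]))  # anomaly probability

model = LRCN()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)  # as in the text
criterion = nn.BCELoss()
```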
The foregoing specific embodiments further describe in detail the objects, technical solutions and advantages of the invention. It should be understood that the foregoing is merely a description of specific embodiments of the invention and is not intended to limit the scope of protection of the invention; any modification, equivalent substitution, improvement and the like made within the spirit and principles of the invention shall be included in the scope of protection of the invention.

Claims (7)

1. The multi-view pure rotation anomaly identification and automatic marking training method is characterized by comprising the following specific steps of:
anomaly identification for multi-view pure rotation:
acquiring image data collected by a camera, calculating the multi-view pure rotation indexes of all views according to the two-view estimation results, and identifying whether pure rotation views causing global position solving anomalies exist in the multiple views;
the anomaly identification for the multi-view pure rotation specifically comprises the following steps:
step 11: image data acquired by a camera form a view set V= { I 1 ,...,I n Defining a set of common view V i For representing and viewing I i A set of views having matching relationships;
acquisition of View I j ∈V i View I i And I j Image matching is carried out, and double-view pure rotation identification indexes are calculated by utilizing the matched image feature pointsWhere i=1, 2,3, …, m, j=1, 2,3, …, m;
step 12: taking a double-view pure rotation recognition threshold delta, and calculating a multi-view pure rotation recognition indexWherein the function H (·) is a Herveliedel step function, n i Represents V i The number of views in (a);
step 13: fetching multi-view pure rotation recognition thresholdWhen->View I is considered i Is a pure rotation anomaly view;
automatic labeling and training are carried out on the pure rotation abnormal view:
identifying and marking a view set which causes pure rotation abnormality in multiple views by utilizing the multi-view pure rotation index, and constructing a local view set containing the pure rotation abnormality;
constructing a deep neural network model, and generating a training set for training the deep neural network model according to the local view set containing the pure rotation anomalies;
identifying pure rotation abnormal views in the view set according to the trained deep neural network model;
the automatic marking and training of the pure rotation abnormal view specifically comprises the following steps:
step 21: image data acquired by a camera form a view set V= { I 1 ,...,I n Anomaly recognition based on multi-view pure rotation, composing set V for views with pure rotation anomalies outlier
Step 22: for view I i E, V, constructing a common view set V i Generating the ith local view set W i ={I i }∪V i Wherein I i Placed at W i The middle position of the collection element;
step 23: traversing the pure rotation abnormal view set V to construct n partial view sets { W } j |I j E, V, as a training set of the deep neural network model;
step 24: from a pure rotation anomaly view set V outlier For training set W i Marking, in particular if W i View I of intermediate position i ∈V outlier Then use the pure rotation anomaly label y corresponding to the view i =1, otherwise y i =0;
Step 25: construction of a loss functionWherein the method comprises the steps of/>Representing a predicted value given by the deep neural network model;
step 26: according to the trained model, the set of views acquired by the camera is then recorded as W in step 22 i ={I i }∪V i Is used to generate a set of local views for testing if the model outputs y i =1, then the current view I is explained i The view is a pure rotation abnormal view, and the view is a normal view.
2. The multi-view pure rotation anomaly identification and automatic marking training method according to claim 1, wherein step 12 further comprises utilizing the two-view pure rotation index θ_ij to construct the multi-view pure rotation identification index λ_i.
3. The multi-view pure rotation anomaly identification and automatic marking training method according to claim 1, wherein calculating the multi-view pure rotation identification index further comprises: introducing the multi-view pure rotation identification index to automatically mark pure rotation anomaly views.
4. The multi-view pure rotation anomaly identification and automatic marking training method according to claim 3, wherein generating the training set further comprises:
automatically generating a local view set for training;
constructing a loss function for pre-identifying pure rotation anomaly views using the marked pure rotation anomaly views, and training the network model with the training set.
5. The multi-view pure rotation anomaly identification and automatic marking training method according to claim 4, further comprising, after the model is trained, constructing local view sets for testing from the view set acquired by the camera, and pre-identifying pure rotation anomaly views in the test set using the trained network model.
6. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the multi-view pure rotation anomaly identification and automatic marking training method of any one of claims 1 to 5 when executing the program.
7. A computer readable storage medium having stored thereon a computer program, which when executed by a processor implements the multi-view pure rotation anomaly identification and automatic marking training method of any one of claims 1 to 5.
CN202311785200.8A 2023-12-25 2023-12-25 Multi-view pure rotation anomaly identification and automatic mark training method, equipment and medium Active CN117456531B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311785200.8A CN117456531B (en) 2023-12-25 2023-12-25 Multi-view pure rotation anomaly identification and automatic mark training method, equipment and medium

Publications (2)

Publication Number Publication Date
CN117456531A (en) 2024-01-26
CN117456531B (en) 2024-03-19

Family

ID=89589589

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311785200.8A Active CN117456531B (en) 2023-12-25 2023-12-25 Multi-view pure rotation anomaly identification and automatic mark training method, equipment and medium

Country Status (1)

Country Link
CN (1) CN117456531B (en)

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8837811B2 (en) * 2010-06-17 2014-09-16 Microsoft Corporation Multi-stage linear structure from motion

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20080052309A (en) * 2006-12-05 2008-06-11 한국전자통신연구원 Apparatus for generating multi-view image and its method
CN109214308A (en) * 2018-08-15 2019-01-15 武汉唯理科技有限公司 A kind of traffic abnormity image identification method based on focal loss function
CN111161355A (en) * 2019-12-11 2020-05-15 上海交通大学 Pure pose resolving method and system for multi-view camera pose and scene
WO2021114434A1 (en) * 2019-12-11 2021-06-17 上海交通大学 Pure pose solution method and system for multi-view camera pose and scene
CN111369608A (en) * 2020-05-29 2020-07-03 南京晓庄学院 Visual odometer method based on image depth estimation
CN113947621A (en) * 2021-10-22 2022-01-18 上海交通大学 Method and system for estimating displacement and three-dimensional scene point coordinates of multi-view camera

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Review of image matching methods (图像匹配方法研究综述); Jia Di; Zhu Ningdan; Yang Ninghua; Wu Si; Li Yuxiu; Zhao Mingyuan; Journal of Image and Graphics (中国图象图形学报); 2019-05-16 (05); 17-39 *

Also Published As

Publication number Publication date
CN117456531A (en) 2024-01-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant