CN110335319A - Semantics-driven camera localization and map reconstruction method and system - Google Patents

Semantics-driven camera localization and map reconstruction method and system — Download PDF

Info

Publication number
CN110335319A
CN110335319A (application CN201910557726.8A)
Authority
CN
China
Prior art keywords
frame
matching
camera
point
key frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910557726.8A
Other languages
Chinese (zh)
Other versions
CN110335319B (en)
Inventor
桑农 (Sang Nong)
王玘 (Wang Qi)
高常鑫 (Gao Changxin)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology filed Critical Huazhong University of Science and Technology
Priority to CN201910557726.8A priority Critical patent/CN110335319B/en
Publication of CN110335319A publication Critical patent/CN110335319A/en
Application granted granted Critical
Publication of CN110335319B publication Critical patent/CN110335319B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/05Geographic models
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/80Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Geometry (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Remote Sensing (AREA)
  • Computer Graphics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a semantics-driven camera localization and map reconstruction method, belonging to the technical field of computer vision. The method first performs semantic segmentation on the current frame image, so that each extracted feature point obtains a corresponding semantic class. Then, according to descriptor similarity and semantic class, all feature points in the current frame and a key frame are matched with a similar-matching method to obtain matching pairs. The camera pose is initialized from all matches between the current frame and the key frame. The feature-point matching pairs are then updated with a three-dimensional projection method combined with a semantic check, and the pose is refined by minimizing the reprojection error over all updated matches. Finally, a three-dimensional map is constructed from the camera poses. The invention also provides a semantics-driven camera localization and map reconstruction system. The technical solution not only applies several semantic refinements in the camera localization stage, but also imposes point-cloud constraints in the reconstruction stage, so that semantic segmentation is combined more tightly with the localization and reconstruction system, yielding more accurate localization results and a more complete reconstruction.

Description

Semantics-driven camera localization and map reconstruction method and system
Technical field
The invention belongs to the technical field of computer vision, and more particularly relates to a semantics-driven camera localization and map reconstruction method.
Background technique
At present, camera localization and reconstruction techniques are either not combined with semantic segmentation or are combined only loosely.
For algorithms that do not incorporate semantic segmentation, on the one hand it is difficult to cope with challenging environments such as dynamic scenes or weakly textured scenes. On the other hand, the map models these algorithms reconstruct typically consist of either point clouds or landmarks, i.e. maps built purely from geometric information, and therefore cannot provide any high-level understanding of the surrounding environment.
For algorithms that do incorporate semantic segmentation, it is typically used only to attach class labels to recognized objects or to mitigate the influence of dynamic objects; the semantic segmentation results are not fully exploited, nor tightly integrated into the localization and map reconstruction pipeline.
Summary of the invention
In view of the above defects or improvement requirements of the prior art, the present invention provides a semantics-driven camera localization and map reconstruction method. Its object is to use semantic information to optimize feature-point matching, reprojection-error optimization, reconstruction point-cloud constraints and loop-closure detection during localization and reconstruction, so that camera localization is more accurate and the reconstruction is more complete and includes high-level understanding.
To achieve the above object, the present invention provides a semantics-driven camera localization and map reconstruction method, the method comprising the following steps:
(1) extract the feature points of the current frame image, and perform semantic segmentation on the current frame image with a pre-built fully convolutional neural network, so that each feature point obtains a corresponding semantic class;
(2) according to descriptor similarity and semantic class, match all feature points in the current frame and the key frame with a similar-matching method, obtaining feature-point matching pairs;
The similar-matching method specifically comprises the following sub-steps:
(21) obtain the objects of the same class in the current frame and the key frame according to the semantic classes of the feature points;
(22) compute the point-cloud principal direction of each object among the same-class objects in the current frame and the key frame; if the difference between the point-cloud principal direction of an object in the current frame and that of an object in the key frame is less than a set threshold, the two objects form an object matching pair;
(23) perform similarity matching on the feature points in the image regions of the two objects of the object matching pair, obtaining the final feature-point matching pairs;
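The similar-matching sub-steps above can be sketched as follows. This is an illustrative simplification, not the patented implementation: feature points carry a semantic class label and a unit-normalized descriptor, candidates are restricted to the same class, and the descriptor dot product decides the final match (all names and the threshold value are hypothetical).

```python
import numpy as np

def semantic_match(feats_a, feats_b, sim_thresh=0.8):
    """Match features only within the same semantic class (steps 21-23).

    feats_* : list of (class_label, descriptor) pairs; descriptors are
    unit-normalized numpy vectors, so a dot product is a similarity.
    Returns a list of (i, j) index pairs into feats_a / feats_b.
    """
    matches = []
    for i, (cls_a, da) in enumerate(feats_a):
        best_j, best_sim = -1, sim_thresh
        for j, (cls_b, db) in enumerate(feats_b):
            if cls_a != cls_b:           # step (21): same-class objects only
                continue
            sim = float(np.dot(da, db))  # step (23): descriptor similarity
            if sim > best_sim:
                best_j, best_sim = j, sim
        if best_j >= 0:
            matches.append((i, best_j))
    return matches
```

Restricting candidates to the same class is what shrinks the search range and removes cross-class mismatches, as the beneficial-effects section argues.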
(3) initialize the camera pose from all feature-point matches between the current frame and the key frame;
(4) obtain the three-dimensional point corresponding to a matched feature point d in the current frame using the camera pose, project the three-dimensional point onto the current frame using the camera intrinsics, and judge whether the projected point falls inside the object region containing feature point d; if not, find a new match for feature point d among the unmatched feature points of the key frame using the similar-matching method, forming a new matching pair;
(5) update all feature-point matching pairs with step (4), then update the camera pose by minimizing the reprojection error

ξ* = argmin_ξ Σ_{i=1}^{n} ‖ u_i − (1/s_i) K exp(ξ^) P_i ‖²₂

where exp(ξ^) is the Lie-algebra (exponential-map) representation of the camera pose; n is the number of feature-point matching pairs; u_i is the image coordinate of the i-th matching pair in the current frame; s_i is the i-th scale factor; P_i is the three-dimensional point of the i-th matching pair, triangulated from the key frame; and K is the camera intrinsic matrix;
(6) construct the three-dimensional map using the new camera pose, obtain the shape characteristics of each object in the three-dimensional map according to its semantic class, and delete the three-dimensional points in an object that do not conform to its shape characteristics.
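The pose update of step (5) can be sketched with a toy example. This is a strongly simplified, hypothetical illustration: the pose is reduced to a pure translation (the patent optimizes a full SE(3) pose exp(ξ^)), and the reprojection error is minimized with a Gauss–Newton loop using a numerical Jacobian.

```python
import numpy as np

def project(K, t, pts3d):
    """Project 3D points (camera rotation fixed to I, translation t) with intrinsics K."""
    p = pts3d + t                      # camera-frame coordinates
    uv = (K @ p.T).T                   # homogeneous pixel coordinates
    return uv[:, :2] / uv[:, 2:3]      # perspective division

def refine_translation(K, pts3d, uv_obs, t0, iters=20):
    """Gauss-Newton on the reprojection error sum_i ||u_i - proj(P_i)||^2."""
    t = t0.astype(float).copy()
    eps = 1e-6
    for _ in range(iters):
        r = (project(K, t, pts3d) - uv_obs).ravel()   # residual vector
        # numerical Jacobian of the residuals w.r.t. the 3 translation parameters
        J = np.zeros((r.size, 3))
        for k in range(3):
            dt = np.zeros(3); dt[k] = eps
            J[:, k] = ((project(K, t + dt, pts3d) - uv_obs).ravel() - r) / eps
        t -= np.linalg.solve(J.T @ J, J.T @ r)        # normal-equations step
    return t
```

In the full method, matches whose projection lands outside the object region of the feature point (step (4)) would be removed before this minimization runs.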
Further, the method also comprises the following steps:
(7) further judge whether the current frame forms a loop closure using the semantic classes, point-cloud quantities and point-cloud principal directions of the objects in the current frame; if a loop exists, eliminate the accumulated error with closed-loop optimization;
(8) optimize the global key-frame graph with a nonlinear least-squares graph optimization method, and finally perform global optimization.
Further, the step (23) specifically comprises:
Let the feature-point sets of the image regions of the two objects of the object matching pair be A and B respectively:
(231) choose a feature point ai from set A, and successively compute the similarity between ai and every feature point in set B; if the similarity between some feature point bj in set B and ai is the largest and is greater than a set similarity threshold, then bj and ai form a feature matching pair;
(232) choose another feature point from set A and repeat step (231) until the matching pairs of all feature points in set A have been found.
Further, the step (3) specifically comprises:
(31) compute the essential matrix E with the eight-point method;
(32) decompose the essential matrix by SVD (Singular Value Decomposition), obtaining four possible solutions, i.e. candidate camera poses;
(33) for each possible camera pose, compute the three-dimensional point cloud from the feature-point matching pairs; if the positions of the point cloud satisfy the camera imaging model, the corresponding camera pose is taken as the initialized camera pose.
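Sub-step (32) can be illustrated with the standard SVD decomposition of an essential matrix into its four candidate poses (this is the textbook construction with the auxiliary matrix W; the patent does not spell out the matrices):

```python
import numpy as np

def decompose_essential(E):
    """Decompose an essential matrix into the four (R, t) candidates (step 32)."""
    U, _, Vt = np.linalg.svd(E)
    # enforce proper rotations (det = +1)
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0., -1, 0], [1, 0, 0], [0, 0, 1]])
    R1 = U @ W @ Vt
    R2 = U @ W.T @ Vt
    t = U[:, 2]                       # translation direction (scale is unknown)
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]
```

Sub-step (33) then triangulates points under each candidate and keeps the pose for which the points lie in front of both cameras.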
Further, judging whether the current frame forms a loop closure in the step (7) specifically comprises the following sub-steps:
(41) detect candidate loop frames with a bag-of-words model (Bag of Words, BoW);
(42) compare the semantic classes of the detected candidate loop frames with those of the current frame, keeping the candidate loop frames with the same number and the same semantic classes;
(43) compare the reconstructed point-cloud quantities of the above candidate loop frames, keeping the candidates whose similarity exceeds a set threshold;
(44) finally, compare the principal directions of the point clouds reconstructed with the current frame and with each candidate loop frame, retaining the candidates exceeding the similarity threshold as loop closures;
(45) eliminate the accumulated error with closed-loop optimization.
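Sub-steps (42)–(43) can be sketched as a simple filter over candidate loop frames. The class-histogram comparison and the point-cloud-count ratio below are illustrative assumptions about how "same quantity and same semantic classes" and "similarity" are measured:

```python
from collections import Counter

def filter_loop_candidates(cur_classes, cur_cloud_size, candidates,
                           size_sim_thresh=0.9):
    """Filter candidate loop frames (steps 42-43).

    cur_classes    : semantic class labels of the objects in the current frame
    cur_cloud_size : number of reconstructed 3D points of the current frame
    candidates     : list of (classes, cloud_size) per candidate frame
    Keeps candidates with identical class histograms (step 42) and a
    point-cloud-count ratio above the threshold (step 43).
    """
    kept = []
    cur_hist = Counter(cur_classes)
    for idx, (classes, size) in enumerate(candidates):
        if Counter(classes) != cur_hist:          # same classes, same counts
            continue
        ratio = min(size, cur_cloud_size) / max(size, cur_cloud_size)
        if ratio >= size_sim_thresh:              # similar point-cloud quantity
            kept.append(idx)
    return kept
```

Sub-step (44) would further compare point-cloud principal directions before accepting a loop closure.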
Further, eliminating the accumulated error with closed-loop optimization in the step (45) specifically comprises the following sub-steps:
(451) solve the transformation between the current key frame and the loop key frame by computing the matching pairs between the two frames;
(452) perform closed-loop correction if the feature-point matches satisfy the correction threshold, and compute the correct pose of each key frame with a propagation algorithm.
Further, the step (8) specifically comprises:
(81) take the pose of each key frame and each point cloud as vertices;
(82) establish the binding edges between vertices; the binding edges are the relative-motion estimates between two pose nodes, the mapping constraints between point clouds and cameras, and the semantic constraints between point clouds;
(83) take the vertices as optimization variables and the edges as constraint terms, and solve for the vertices best satisfying the above constraints with the Gauss–Newton method, i.e. find the optimized camera poses and point-cloud positions.
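A minimal sketch of sub-steps (81)–(83), reduced to one-dimensional pose vertices and relative-motion edges only (the actual system optimizes SE(3) poses and point clouds, and also has mapping and semantic edges). Because the residuals of this toy problem are linear, each Gauss–Newton step solves the least-squares problem exactly:

```python
import numpy as np

def optimize_graph(n_vertices, edges, iters=5):
    """Gauss-Newton over 1-D pose vertices x_i (steps 81-83, toy version).

    edges: list of (i, j, z) constraints with residual (x_j - x_i) - z.
    A prior edge pins vertex 0 at 0 to anchor the gauge freedom.
    """
    x = np.zeros(n_vertices)
    for _ in range(iters):
        J = np.zeros((len(edges) + 1, n_vertices))
        r = np.zeros(len(edges) + 1)
        for k, (i, j, z) in enumerate(edges):
            r[k] = (x[j] - x[i]) - z          # relative-motion residual
            J[k, j], J[k, i] = 1.0, -1.0
        J[-1, 0] = 1.0                        # prior edge on vertex 0
        r[-1] = x[0]
        x -= np.linalg.solve(J.T @ J, J.T @ r)  # normal-equations step
    return x
```

With an inconsistent loop-closure edge, the optimizer distributes the error over the chain instead of leaving it accumulated at the last vertex.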
Further, the selection criterion of the key frame is: create a key frame if any of the following conditions is first satisfied:
the N-th frame after the last round of map reconstruction is set as a new key frame;
after N frames have passed since the insertion of the previous key frame, a new key frame is set;
if the number of feature-point matches traced by the current frame is less than ninety percent of the number of feature-point matches of the reference key frame, the current frame is set as a new key frame.
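The key-frame selection criteria can be collected into one predicate. The function name and parameters are hypothetical; `n = 15` follows the embodiment described later, while the claims leave N open:

```python
def need_new_keyframe(frames_since_rebuild, frames_since_keyframe,
                      cur_matches, ref_matches, n=15):
    """Key-frame selection: any one of the three conditions suffices.

    - n frames have passed since the last round of map reconstruction, or
    - n frames have passed since the previous key frame was inserted, or
    - the current frame traces fewer than 90% of the reference
      key frame's feature-point matches.
    """
    return (frames_since_rebuild >= n
            or frames_since_keyframe >= n
            or cur_matches < 0.9 * ref_matches)
```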
According to another aspect of the invention, the present invention provides a semantics-driven camera localization and map reconstruction system, the system comprising the following parts:
a first module, for extracting the feature points of the current frame image and performing semantic segmentation on the current frame image with a pre-built fully convolutional neural network, so that each feature point obtains a corresponding semantic class;
a second module, for matching all feature points in the current frame and the key frame with a similar-matching method according to descriptor similarity and semantic class, obtaining feature-point matching pairs;
the second module comprises a similar-matching unit, which specifically comprises the following parts:
a first subelement, for obtaining the objects of the same class in the current frame and the key frame according to the semantic classes of the feature points;
a second subelement, for computing the point-cloud principal direction of each object among the same-class objects in the current frame and the key frame; if the difference between the point-cloud principal direction of an object in the current frame and that of an object in the key frame is less than a set threshold, the two objects form an object matching pair;
a third subelement, for performing similarity matching on the feature points in the image regions of the two objects of the object matching pair, obtaining the final feature-point matching pairs;
a third module, for initializing the camera pose from all feature-point matches between the current frame and the key frame;
a fourth module, for obtaining the three-dimensional point corresponding to a matched feature point d in the current frame using the camera pose, projecting the three-dimensional point onto the current frame using the camera intrinsics, and judging whether the projected point falls inside the object region containing feature point d; if not, finding a new match for feature point d among the unmatched feature points of the key frame using the similar-matching method, forming a new matching pair;
a fifth module, for updating all feature-point matching pairs with the fourth module, and then updating the camera pose by minimizing the reprojection error

ξ* = argmin_ξ Σ_{i=1}^{n} ‖ u_i − (1/s_i) K exp(ξ^) P_i ‖²₂

where exp(ξ^) is the Lie-algebra (exponential-map) representation of the camera pose; n is the number of feature-point matching pairs; u_i is the image coordinate of the i-th matching pair in the current frame; s_i is the i-th scale factor; P_i is the three-dimensional point of the i-th matching pair, triangulated from the key frame; and K is the camera intrinsic matrix;
a sixth module, for constructing the three-dimensional map using the new camera pose, obtaining the shape characteristics of each object in the three-dimensional map according to its semantic class, and deleting the three-dimensional points in an object that do not conform to its shape characteristics.
Further, the system also comprises:
a seventh module, for further judging whether the current frame forms a loop closure using the semantic classes, point-cloud quantities and point-cloud principal directions of the objects in the current frame, and if a loop exists, eliminating the accumulated error with closed-loop optimization;
an eighth module, for optimizing the global key-frame graph with a nonlinear least-squares graph optimization method, and finally performing global optimization.
In general, compared with the prior art, the above technical solution conceived by the present invention has the following beneficial effects:
(1) the method of the present invention uses a similar-matching process based on semantic segmentation for feature-point matching; this process uses the semantic label information of each frame image, taking the object classes after semantic classification and the object directions as matching constraints. The added constraints shrink the search range of feature-point matching, saving matching time, reducing many erroneous feature-point matching pairs, improving matching accuracy, and providing a good computational environment for camera pose estimation;
(2) the method of the present invention uses a semantics-based reprojection optimization method; this method combines the semantic information of each frame image and adds constraints on the reprojected points, filtering out a portion of the erroneous reprojected points, so that the efficiency of reprojection optimization is improved; and because the erroneous reprojected points are eliminated, the accuracy of camera pose refinement is further improved, making camera tracking more accurate and less prone to drift caused by excessive error;
(3) the method of the present invention uses a semantics-based graph optimization method; this method uses the geometric information of the objects produced by semantic segmentation, so that the camera poses and point clouds are optimized not only according to the transformations between camera poses and the mappings between point clouds and camera poses, but also through geometric constraints between point clouds, which indirectly influences the optimization of the camera poses, yielding more accurate camera poses and point clouds;
(4) the method of the present invention uses a semantics-based loop-closure detection method; this method takes the numbers of semantic label categories of each frame image as constraint terms to further screen the candidate loop frames found by BoW, so that the retained loop frames are more similar to the current frame and the loop-closure accuracy is higher, making the error elimination of loop-closure optimization more precise.
Detailed description of the invention
Fig. 1 is the general flow chart of the method for the present invention;
Fig. 2 shows the four possible camera poses obtained by decomposing the essential matrix with SVD in the method of the present invention;
Fig. 3 is a schematic diagram of three-dimensional point projection in the method of the present invention;
Fig. 4 is a schematic diagram of global optimization in the method of the present invention.
Specific embodiment
In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention is further elaborated below with reference to the accompanying drawings and embodiments. It should be appreciated that the specific examples described herein are only used to explain the present invention and are not intended to limit it. In addition, the technical features involved in the various embodiments of the present invention described below may be combined with each other as long as they do not conflict.
As shown in Fig. 1, the method of the present invention comprises the following steps:
(1) extract the feature points of the current frame image, and perform semantic segmentation on the current frame image with a pre-built fully convolutional neural network, so that each feature point obtains a corresponding semantic class;
(2) according to descriptor similarity and semantic class, match all feature points in the current frame and the key frame with a similar-matching method, obtaining feature-point matching pairs;
The similar-matching method specifically comprises the following sub-steps:
(21) obtain the objects of the same class in the current frame and the key frame according to the semantic classes of the feature points;
(22) compute the point-cloud principal direction of each object among the same-class objects in the current frame and the key frame; if the difference between the point-cloud principal direction of an object in the current frame and that of an object in the key frame is less than a set threshold, the two objects form an object matching pair;
(23) perform similarity matching on the feature points in the image regions of the two objects of the object matching pair, obtaining the final feature-point matching pairs;
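The point-cloud principal direction used in sub-step (22) can be computed as the dominant eigenvector of the point cloud's covariance matrix. This PCA formulation is an assumption; the patent does not fix the exact method or the angular threshold:

```python
import numpy as np

def principal_direction(points):
    """Dominant PCA axis of an (N, 3) point cloud, with a fixed sign convention."""
    centered = points - points.mean(axis=0)
    cov = centered.T @ centered / len(points)
    eigvals, eigvecs = np.linalg.eigh(cov)           # ascending eigenvalues
    v = eigvecs[:, -1]                               # eigenvector of largest eigenvalue
    return v if v[np.argmax(np.abs(v))] > 0 else -v  # resolve sign ambiguity

def directions_match(pts_a, pts_b, angle_thresh_deg=10.0):
    """Sub-step (22): objects match if their principal directions are close."""
    va, vb = principal_direction(pts_a), principal_direction(pts_b)
    cos = abs(float(np.dot(va, vb)))                 # a direction has no sign
    angle = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
    return angle < angle_thresh_deg
```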
(3) initialize the camera pose from all feature-point matches between the current frame and the key frame;
(4) obtain the three-dimensional point corresponding to a matched feature point d in the current frame using the camera pose, project the three-dimensional point onto the current frame using the camera intrinsics, and judge whether the projected point falls inside the object region containing feature point d; if not, find a new match for feature point d among the unmatched feature points of the key frame using the similar-matching method, forming a new matching pair;
(5) update all feature-point matching pairs with step (4), then update the camera pose by minimizing the reprojection error

ξ* = argmin_ξ Σ_{i=1}^{n} ‖ u_i − (1/s_i) K exp(ξ^) P_i ‖²₂

where exp(ξ^) is the Lie-algebra (exponential-map) representation of the camera pose; n is the number of feature-point matching pairs; u_i is the image coordinate of the i-th matching pair in the current frame; s_i is the i-th scale factor; P_i is the three-dimensional point of the i-th matching pair, triangulated from the key frame; and K is the camera intrinsic matrix;
(6) construct the three-dimensional map using the new camera pose, obtain the shape characteristics of each object in the three-dimensional map according to its semantic class, and delete the three-dimensional points in an object that do not conform to its shape characteristics;
(7) further judge whether the current frame forms a loop closure using the semantic classes, point-cloud quantities and point-cloud principal directions of the objects in the current frame; if a loop exists, eliminate the accumulated error with closed-loop optimization;
(8) optimize the global key-frame graph with a nonlinear least-squares graph optimization method, and finally perform global BA (bundle adjustment) optimization.
The method of the present invention is now introduced in conjunction with one embodiment of the present invention:
1. Semantic segmentation: extract the feature points of the current frame image, and perform semantic segmentation on the current frame image with a pre-built fully convolutional neural network, so that each feature point obtains a corresponding semantic class.
2. Tracking: optimize the pose estimate of the current frame by finding as many correspondences as possible between the current frame and the local map. Specifically:
A. ORB feature extraction and semantic segmentation: set the input frame as the current frame, extract the ORB feature points and the corresponding ORB feature descriptors, feed the current frame into the segmentation network, and wait for its prediction result.
B. Estimate camera motion: first, match the feature points of the current frame and the previous frame with the similar-matching method, using the similarity of the ORB feature descriptors and the semantic class information. The specific procedure is as follows:
(b1) obtain the positions of the same-class objects in the current frame and the previous frame according to the semantic classes of the feature points;
(b2) among the feature points within the regions of same-class objects, compute the pairwise descriptor similarities; for each feature point, take the pair with the highest similarity and save it as a final feature-point matching pair;
The camera pose is then predicted with the motion model. The motion model assumes that the camera moves at constant velocity, and estimates the pose of the current frame from the camera pose of the previous frame, the feature-point matching relation between the two frames, and the velocity. If the number of feature-point matches is below a threshold, the method switches to key-frame mode: it attempts feature-point matching with the nearest key frame (the matching procedure is the same as above); if the number of matching pairs between the current frame and the nearest-neighbor key frame is still below the threshold, the current frame is matched against all key frames globally to find the key frame with the most matching pairs, and the camera pose is solved with the PnP algorithm. The specific procedure is as follows:
(bb1) compute the essential matrix E with the eight-point method;
(bb2) decompose the essential matrix by SVD, obtaining four possible solutions (rotation matrix, translation), i.e. candidate poses;
(bb3) for each possible pose, compute the three-dimensional point cloud from the feature-point matching pairs, and determine which solution to choose by judging the positions of the point cloud, i.e. obtain the camera pose, as shown in Fig. 2.
Then, semantics-based reprojection optimization is performed on the pose of the previous frame using the matched feature points, obtaining the pose of the current frame. The specific procedure is as follows:
If a feature point projected into the image lands at a place whose class is not the same as that of its matched feature point in the original image, the reprojection of this feature-point pair is considered unqualified; the pair is rejected and does not participate in the optimization of the objective function. As shown in the figure, the two-dimensional image point corresponding to spatial point P is p1; in the feature-matching stage, feature point p1 of the previous frame is matched with feature point p2 of the current frame, so P should project to the position of p2. However, because of the error of the camera pose estimate, the projection usually lands at p′ rather than at p2. If the semantic label of pixel p′ is not the same as that of p1, the match between p1 and p2 is considered erroneous; this matching pair is rejected and no longer participates in the estimation of camera motion, as shown in Fig. 3.
For all the retained reprojected points, compute the distance between each projected point and its matched feature point in the same image, and update the camera pose by minimizing all the distances:

ξ* = argmin_ξ Σ_{i=1}^{n} ‖ u_i − (1/s_i) K exp(ξ^) P_i ‖²₂

where exp(ξ^) is the Lie-algebra (exponential-map) representation of the camera pose; n is the number of feature-point matching pairs; u_i is the image coordinate of the i-th matching pair in the current frame; s_i is the i-th scale factor; P_i is the three-dimensional point of the i-th matching pair; and K is the camera intrinsic matrix.
C. Track the local map: find the key frames in the local map that share common three-dimensional space points with the current frame, together with the key frames adjacent to them. Project the corresponding three-dimensional space points observed in the above key frames into the current frame, update the matches with the feature points in the current frame, and finally optimize the camera pose again with all the matches, in the same way as in the previous step.
D. Key-frame decision: create a key frame if any of the following conditions is first satisfied: more than 15 frames have passed since the last global relocalization; more than 15 frames have passed since the insertion of the previous key frame; or the number of feature points traced by the current frame is less than ninety percent of the number of matches of the reference key frame (the reference key frame is the key frame that shares the most commonly observed three-dimensional points with the current frame). If no condition is satisfied, perform bundle adjustment and optimize the pose of the previous key frame.
3. Semantic label fusion: after a key frame is created, the key frames in the local map whose covisibility with the current key frame exceeds a certain threshold are used to update the semantic label probability of each pixel in the current key frame. The covisibility is determined by the number of matches between the two frames and the number of three-dimensional space points they observe in common.
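The label-probability update in this step can be sketched as a per-pixel fusion of class probability maps from covisible key frames. The multiplicative (Bayesian-style) fusion rule below is an assumption; the patent only states that covisible key frames update the label probabilities:

```python
import numpy as np

def fuse_label_probs(cur_probs, covisible_probs):
    """Fuse per-pixel class probability maps of shape (H, W, n_classes).

    Multiplies the current key frame's probabilities with those of each
    covisible key frame, then renormalizes per pixel.
    """
    fused = cur_probs.copy()
    for p in covisible_probs:
        fused = fused * p
    fused /= fused.sum(axis=-1, keepdims=True)
    return fused
```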
4. Local mapping: after semantic label fusion, insert the current key frame into the local map, filter out redundant three-dimensional space points and key frames, and finally perform local bundle adjustment.
A. Insert the key frame: add the pose of the key frame to the pose graph as a node, and add the optimization edges to the key frames that share commonly observed three-dimensional space points with the current key frame.
B. Local bundle adjustment: put the current key frame, the adjacent key frames, the key frames sharing commonly observed three-dimensional points, and the corresponding three-dimensional space points into the pose graph for optimization. Check each key frame: if ninety percent of its feature points are observed by more than three other key frames, remove this key frame.
5. Loop-closure detection: if the number of key frames in the map is less than 10, no loop detection is performed. Otherwise, first find in the map the key frames that share at least one common BoW word with the current key frame, then count the largest number of words shared with the BoW of the current key frame; take eighty percent of this number as a threshold and keep the key frames whose shared word count exceeds it, as candidate key frames. The detected candidate loop frames are then compared with the current frame in terms of semantic classes, and the candidates with the same number and the same semantic classes are kept. The reconstructed point-cloud quantities of these candidates are compared next, and the candidates whose similarity exceeds a certain threshold are kept. Finally, by comparing the principal directions of the point clouds reconstructed with the current frame and with each candidate loop frame, the candidates above a certain similarity are retained as loop closures. Afterwards, the transformation between the current key frame and the loop key frame is solved by computing the matching pairs between the two frames; if the number of feature-point matches is sufficient, closed-loop correction is performed, and the correct transformation of each key frame is computed with a propagation algorithm.
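The BoW stage of the loop detection (shared-word counting with an eighty-percent threshold) can be sketched as follows; the data layout is hypothetical:

```python
def bow_candidates(cur_words, keyframe_words, min_shared=1, ratio=0.8):
    """BoW stage of loop detection: keep key frames sharing words with the
    current frame, thresholded at `ratio` of the maximum shared-word count.

    cur_words      : set of BoW word ids of the current key frame
    keyframe_words : dict {keyframe_id: set of word ids}
    """
    shared = {kid: len(cur_words & words)
              for kid, words in keyframe_words.items()}
    shared = {kid: n for kid, n in shared.items() if n >= min_shared}
    if not shared:
        return []
    thresh = ratio * max(shared.values())
    return sorted(kid for kid, n in shared.items() if n > thresh)
```

The semantic-class, point-cloud-quantity and principal-direction comparisons described above then screen these candidates further.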
6. Finally, graph optimization and global optimization are performed.
(1) the pose of each key frame and the point clouds are taken as vertices;
(2) binding edges between the vertices are established, the binding edges being the relative-motion estimate between two pose nodes (denoted T), the mapping constraint between a point cloud and the camera (denoted M), and the semantic constraint between point clouds (as shown in the figure);
(3) with the vertices as optimization variables and the edges as constraint terms, the L-M (Levenberg-Marquardt) method solves for the best vertices satisfying the above constraints, i.e. the optimized camera poses and point-cloud positions, as shown in Figure 4.
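A toy numerical version of this graph optimization, with 1-D poses as vertices, relative-motion edges only, and Levenberg-Marquardt via SciPy (the patent's graph additionally carries mapping and semantic edges):

```python
import numpy as np
from scipy.optimize import least_squares

# Pose-graph toy: vertices are 1-D poses, edges are relative-motion
# measurements (i, j, z_ij); the loop edge (0, 3) is deliberately
# inconsistent, so the optimizer must distribute the error.
measurements = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (0, 3, 2.9)]

def residuals(free):
    x = np.concatenate(([0.0], free))      # gauge-fix: pose 0 at the origin
    return [(x[j] - x[i]) - z for i, j, z in measurements]

sol = least_squares(residuals, x0=np.zeros(3), method='lm')  # Levenberg-Marquardt
poses = np.concatenate(([0.0], sol.x))
```

For this convex quadratic problem L-M converges to the exact least-squares solution; the real system does the same over SE(3) poses and 3-D points.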
The semantics-driven camera positioning and map reconstruction system is further described below in conjunction with a specific embodiment; the system comprises the following parts:
a first module for extracting the feature points of the current frame image and performing semantic segmentation on the current frame image with a pre-built fully convolutional neural network, each feature point obtaining a corresponding semantic class;
a second module for matching all feature points in the current frame and the key frame by a similarity matching method according to similarity and semantic class, obtaining feature-point matches;
The second module contains a similarity matching unit, which specifically includes the following parts:
a first subunit for obtaining the same-class objects in the current frame and the key frame according to the semantic classes of the feature points;
a second subunit for computing the point-cloud principal direction of each of the same-class objects in the current frame and the key frame, the two objects forming an object match if the difference between the point-cloud principal directions of an object in the current frame and an object in the key frame is less than a set threshold;
a third subunit for similarity-matching the feature points within the regions of the two objects of the object match, obtaining the final feature-point matches;
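The point-cloud principal direction used by the second subunit is, in effect, the dominant PCA axis of the object's points; a sketch under that assumption (the 10-degree threshold is illustrative):

```python
import numpy as np

# Principal direction of an object's point cloud: the eigenvector of the
# covariance matrix with the largest eigenvalue (classic PCA).

def principal_direction(points):
    """points: (N, 3) array. Returns a unit vector along the dominant axis."""
    centered = points - points.mean(axis=0)
    cov = centered.T @ centered / len(points)
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    return eigvecs[:, -1]

def directions_match(pts_a, pts_b, angle_thresh_deg=10.0):
    """Two objects match when their principal directions differ by less
    than the threshold (eigenvector sign is arbitrary, hence abs)."""
    da, db = principal_direction(pts_a), principal_direction(pts_b)
    cos = abs(float(da @ db))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))) < angle_thresh_deg
```

In practice the two point clouds would first be expressed in a common frame before their directions are compared.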
a third module for initializing the camera pose from all the feature-point matches between the current frame and the key frame;
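The initialization performed by the third module is spelled out in claim 4 (eight-point essential matrix plus SVD decomposition); the SVD step can be sketched as follows, with the cheirality check that selects the correct one of the four candidates omitted:

```python
import numpy as np

def decompose_essential(E):
    """Decompose an essential matrix into its four candidate (R, t) poses."""
    U, _, Vt = np.linalg.svd(E)
    if np.linalg.det(U) < 0:                 # enforce proper rotations
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0., -1., 0.],
                  [1., 0., 0.],
                  [0., 0., 1.]])
    R1, R2 = U @ W @ Vt, U @ W.T @ Vt
    t = U[:, 2]                              # translation up to scale and sign
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]
```

Triangulating a few matches under each candidate and keeping the pose that places the points in front of both cameras completes step (33).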
a fourth module for computing, from the camera pose, the three-dimensional point corresponding to a matched feature point d in the current frame, projecting the three-dimensional point onto the current frame with the camera intrinsics, and judging whether the projected point falls inside the object region containing feature point d; if not, finding a new match for feature point d among the unmatched feature points of the key frame by the similarity matching method, forming a new match;
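A minimal sketch of the fourth module's projection test, assuming a pinhole intrinsic matrix K and a rectangular object region; the patent does not fix the region representation, so the bounding box is an assumption:

```python
import numpy as np

def project(K, point_cam):
    """Pinhole projection of a 3-D point already in the camera frame."""
    u = K @ point_cam
    return u[:2] / u[2]

def projection_in_region(K, point_cam, box):
    """box: (xmin, ymin, xmax, ymax) of the object region around point d."""
    x, y = project(K, point_cam)
    xmin, ymin, xmax, ymax = box
    return xmin <= x <= xmax and ymin <= y <= ymax
```

When the projection falls outside the region, the match is treated as stale and re-searched among the key frame's unmatched features.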
a fifth module for updating all the feature-point matches with the fourth module, then updating the camera pose by minimizing
ξ* = argmin_ξ (1/2) Σ_{i=1}^{n} ‖ u_i − (1/s_i) K exp(ξ^) p_i ‖²
wherein exp(ξ^) denotes the Lie-algebra representation of the camera pose, K the camera intrinsic matrix, n the number of feature-point matches, u_i the image coordinate of the i-th feature match in the current frame, s_i the i-th scale factor, and p_i the coordinate of the i-th match in the key frame;
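A numerical sketch of this minimization, substituting an axis-angle plus translation vector for the Lie-algebra element; K, the synthetic point set and the ground-truth pose are all assumptions:

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

K = np.array([[500., 0., 320.], [0., 500., 240.], [0., 0., 1.]])

def reproject(pose, pts3d):
    """pose: (rx, ry, rz, tx, ty, tz); returns the (N, 2) pixel projections."""
    rvec, t = pose[:3], pose[3:]
    cam = Rotation.from_rotvec(rvec).apply(pts3d) + t
    uv = (K @ cam.T).T
    return uv[:, :2] / uv[:, 2:3]

def residuals(pose, pts3d, uv_obs):
    # Per-point reprojection error, flattened for the least-squares solver.
    return (reproject(pose, pts3d) - uv_obs).ravel()

# Synthetic ground truth: generate observations, then recover the pose
# starting from the identity.
rng = np.random.default_rng(0)
pts3d = rng.uniform([-1, -1, 4], [1, 1, 6], size=(20, 3))
true_pose = np.array([0.05, -0.02, 0.03, 0.1, -0.1, 0.2])
uv_obs = reproject(true_pose, pts3d)
sol = least_squares(residuals, x0=np.zeros(6), args=(pts3d, uv_obs))
```

With noise-free observations the residual drops to zero and the recovered pose coincides with the ground truth.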
a sixth module for constructing a three-dimensional map with the new camera pose, obtaining the shape features of each object from its semantic class in the three-dimensional map, and deleting from each object the three-dimensional points that do not conform to its shape features.
a seventh module for further judging, from the semantic classes, point-cloud counts and point-cloud principal directions of the objects in the current frame, whether the current frame closes a loop, and if so, eliminating the accumulated error by closed-loop optimization;
an eighth module for optimizing the global key-frame graph by nonlinear least-squares graph optimization, finally performing global optimization.
The above content is merely a preferred embodiment of the present invention, as will be readily appreciated by those skilled in the art, and is not intended to limit the invention; any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall be included in the scope of protection of the present invention.

Claims (10)

1. A semantics-driven camera positioning and map reconstruction method, characterized in that the method specifically comprises the following steps:
(1) extracting the feature points of the current frame image, and performing semantic segmentation on the current frame image with a pre-built fully convolutional neural network, each feature point obtaining a corresponding semantic class;
(2) matching all feature points in the current frame and the key frame by a similarity matching method according to similarity and semantic class, obtaining feature-point matches;
The similarity matching method specifically comprises the following sub-steps:
(21) obtaining the same-class objects in the current frame and the key frame according to the semantic classes of the feature points;
(22) computing the point-cloud principal direction of each of the same-class objects in the current frame and the key frame; if the difference between the point-cloud principal directions of an object in the current frame and an object in the key frame is less than a set threshold, the two objects form an object match;
(23) similarity-matching the feature points within the regions of the two objects of the object match, obtaining the final feature-point matches;
(3) initializing the camera pose from all the feature-point matches between the current frame and the key frame;
(4) computing, from the camera pose, the three-dimensional point corresponding to a matched feature point d in the current frame, projecting the three-dimensional point onto the current frame with the camera intrinsics, and judging whether the projected point falls inside the object region containing feature point d; if not, finding a new match for feature point d among the unmatched feature points of the key frame by the similarity matching method, forming a new match;
(5) updating all the feature-point matches by step (4), then updating the camera pose by minimizing
ξ* = argmin_ξ (1/2) Σ_{i=1}^{n} ‖ u_i − (1/s_i) K exp(ξ^) p_i ‖²
wherein exp(ξ^) denotes the Lie-algebra representation of the camera pose, K the camera intrinsic matrix, n the number of feature-point matches, u_i the image coordinate of the i-th feature match in the current frame, s_i the i-th scale factor, and p_i the coordinate of the i-th match in the key frame;
(6) constructing a three-dimensional map with the new camera pose, obtaining the shape features of each object from its semantic class in the three-dimensional map, and deleting from each object the three-dimensional points that do not conform to its shape features.
2. The semantics-driven camera positioning and map reconstruction method according to claim 1, characterized in that the method further comprises the following steps:
(7) further judging, from the semantic classes, point-cloud counts and point-cloud principal directions of the objects in the current frame, whether the current frame closes a loop, and if so, eliminating the accumulated error by closed-loop optimization;
(8) optimizing the global key-frame graph by nonlinear least-squares graph optimization, and finally performing global optimization.
3. The semantics-driven camera positioning and map reconstruction method according to claim 1, characterized in that step (23) specifically comprises:
letting the feature-point sets of the two object regions of the object match be denoted A and B respectively;
(231) choosing a feature point ai from set A, and computing in turn the similarity between ai and every feature point in set B; if the similarity between some feature point bj in set B and ai is the largest and exceeds a set similarity threshold, bj and ai form a feature-point match;
(232) choosing another feature point from set A and repeating step (231) until matches have been sought for all feature points of set A.
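Sub-steps (231)-(232) amount to a greedy best-match search between the two feature sets; a sketch assuming descriptor vectors and cosine similarity (both illustrative, as the patent does not fix the similarity measure):

```python
import numpy as np

def match_features(A, B, sim_thresh=0.8):
    """A, B: lists of descriptor vectors. Returns (i, j) index pairs where
    B[j] is A[i]'s most similar descriptor and clears the threshold."""
    matches = []
    for i, a in enumerate(A):
        sims = [float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
                for b in B]
        j = int(np.argmax(sims))
        if sims[j] > sim_thresh:          # best match must clear the threshold
            matches.append((i, j))
    return matches
```

Restricting the search to the matched object's region, as claim 3 does, is what keeps this quadratic scan cheap.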
4. The semantics-driven camera positioning and map reconstruction method according to claim 1, characterized in that step (3) is specifically:
(31) computing the essential matrix E by the eight-point method;
(32) decomposing the essential matrix by SVD (Singular Value Decomposition) to obtain four possible solutions, i.e. camera poses;
(33) computing a three-dimensional point cloud from each possible camera pose and the feature-point matches; the camera pose for which the positions of the point cloud conform to the camera imaging model is the initial camera pose.
5. The semantics-driven camera positioning and map reconstruction method according to claim 2, characterized in that judging whether the current frame closes a loop in step (7) specifically comprises the following sub-steps:
(41) detecting candidate loop frames with a bag-of-words (BoW) model;
(42) comparing the detected candidate loop frames with the current frame by semantic class, finding the candidates having the same number of the same semantic classes;
(43) comparing the reconstructed point-cloud counts of the above candidates, keeping the candidates whose similarity exceeds a set threshold;
(44) finally comparing the principal directions of the point clouds reconstructed jointly by the current frame and each candidate loop frame, the candidates remaining above the similarity threshold being accepted as loops;
(45) eliminating the accumulated error by closed-loop optimization.
6. The semantics-driven camera positioning and map reconstruction method according to claim 5, characterized in that eliminating the accumulated error by closed-loop optimization in step (45) specifically comprises the following sub-steps:
(451) computing the matches between the current key frame and the loop key frame, and solving the transformation between the two frames;
(452) performing loop correction if the number of feature-point matches meets the correction threshold, and computing the corrected pose of every key frame with a propagation algorithm.
7. The semantics-driven camera positioning and map reconstruction method according to claim 1, characterized in that step (8) is specifically:
(81) taking the pose of each key frame and the point clouds as vertices;
(82) establishing binding edges between the vertices, the binding edges being the relative-motion estimate between two pose nodes, the mapping constraint between a point cloud and the camera, and the semantic constraint between point clouds;
(83) with the vertices as optimization variables and the edges as constraint terms, solving by the Gauss-Newton method for the best vertices satisfying the above constraints, i.e. the optimized camera poses and point-cloud positions.
8. The semantics-driven camera positioning and map reconstruction method according to any one of claims 1 to 7, characterized in that the selection criterion of the key frame is: a key frame is created when any one of the following conditions is first met:
the N-th frame after the previous round of map reconstruction is set as a new key frame;
N frames have passed since the insertion of the previous key frame, and the current frame is set as a new key frame;
the number of feature-point matches tracked by the current frame is less than 90 percent of the number of feature-point matches of the reference key frame, in which case the current frame is set as a new key frame.
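The three conditions above can be sketched as a single predicate; N and the counter names are assumptions:

```python
def need_new_keyframe(frames_since_reconstruction, frames_since_keyframe,
                      n_tracked, n_ref_matches, N=20):
    """True when any of the three key-frame conditions of claim 8 holds.
    N (here 20) and the counters are illustrative choices."""
    return (frames_since_reconstruction == N        # N-th frame after last rebuild
            or frames_since_keyframe >= N           # N frames since last key frame
            or n_tracked < 0.9 * n_ref_matches)     # tracking dropped below 90%
```

The tracking-ratio condition dominates in practice: it fires as soon as the view has changed enough that the reference key frame no longer explains the scene.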
9. A semantics-driven camera positioning and map reconstruction system, characterized in that the system comprises the following parts:
a first module for extracting the feature points of the current frame image and performing semantic segmentation on the current frame image with a pre-built fully convolutional neural network, each feature point obtaining a corresponding semantic class;
a second module for matching all feature points in the current frame and the key frame by a similarity matching method according to similarity and semantic class, obtaining feature-point matches;
The second module contains a similarity matching unit, which specifically includes the following parts:
a first subunit for obtaining the same-class objects in the current frame and the key frame according to the semantic classes of the feature points;
a second subunit for computing the point-cloud principal direction of each of the same-class objects in the current frame and the key frame, the two objects forming an object match if the difference between the point-cloud principal directions of an object in the current frame and an object in the key frame is less than a set threshold;
a third subunit for similarity-matching the feature points within the regions of the two objects of the object match, obtaining the final feature-point matches;
a third module for initializing the camera pose from all the feature-point matches between the current frame and the key frame;
a fourth module for computing, from the camera pose, the three-dimensional point corresponding to a matched feature point d in the current frame, projecting the three-dimensional point onto the current frame with the camera intrinsics, and judging whether the projected point falls inside the object region containing feature point d; if not, finding a new match for feature point d among the unmatched feature points of the key frame by the similarity matching method, forming a new match;
a fifth module for updating all the feature-point matches with the fourth module, then updating the camera pose by minimizing
ξ* = argmin_ξ (1/2) Σ_{i=1}^{n} ‖ u_i − (1/s_i) K exp(ξ^) p_i ‖²
wherein exp(ξ^) denotes the Lie-algebra representation of the camera pose, K the camera intrinsic matrix, n the number of feature-point matches, u_i the image coordinate of the i-th feature match in the current frame, s_i the i-th scale factor, and p_i the coordinate of the i-th match in the key frame;
a sixth module for constructing a three-dimensional map with the new camera pose, obtaining the shape features of each object from its semantic class in the three-dimensional map, and deleting from each object the three-dimensional points that do not conform to its shape features.
10. The semantics-driven camera positioning and map reconstruction system according to claim 9, characterized in that the system further comprises:
a seventh module for further judging, from the semantic classes, point-cloud counts and point-cloud principal directions of the objects in the current frame, whether the current frame closes a loop, and if so, eliminating the accumulated error by closed-loop optimization;
an eighth module for optimizing the global key-frame graph by nonlinear least-squares graph optimization, and finally performing global optimization.
CN201910557726.8A 2019-06-26 2019-06-26 Semantic-driven camera positioning and map reconstruction method and system Active CN110335319B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910557726.8A CN110335319B (en) 2019-06-26 2019-06-26 Semantic-driven camera positioning and map reconstruction method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910557726.8A CN110335319B (en) 2019-06-26 2019-06-26 Semantic-driven camera positioning and map reconstruction method and system

Publications (2)

Publication Number Publication Date
CN110335319A true CN110335319A (en) 2019-10-15
CN110335319B CN110335319B (en) 2022-03-18

Family

ID=68142729

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910557726.8A Active CN110335319B (en) 2019-06-26 2019-06-26 Semantic-driven camera positioning and map reconstruction method and system

Country Status (1)

Country Link
CN (1) CN110335319B (en)

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110910389A (en) * 2019-10-30 2020-03-24 中山大学 Laser SLAM loop detection system and method based on graph descriptor
CN111046125A (en) * 2019-12-16 2020-04-21 视辰信息科技(上海)有限公司 Visual positioning method, system and computer readable storage medium
CN111310654A (en) * 2020-02-13 2020-06-19 北京百度网讯科技有限公司 Map element positioning method and device, electronic equipment and storage medium
CN111311708A (en) * 2020-01-20 2020-06-19 北京航空航天大学 Visual SLAM method based on semantic optical flow and inverse depth filtering
CN111311742A (en) * 2020-03-27 2020-06-19 北京百度网讯科技有限公司 Three-dimensional reconstruction method, three-dimensional reconstruction device and electronic equipment
CN111325842A (en) * 2020-03-04 2020-06-23 Oppo广东移动通信有限公司 Map construction method, repositioning method and device, storage medium and electronic equipment
CN111368759A (en) * 2020-03-09 2020-07-03 河海大学常州校区 Monocular vision-based semantic map construction system for mobile robot
CN111427373A (en) * 2020-03-24 2020-07-17 上海商汤临港智能科技有限公司 Pose determination method, device, medium and equipment
CN111429517A (en) * 2020-03-23 2020-07-17 Oppo广东移动通信有限公司 Relocation method, relocation device, storage medium and electronic device
CN111815687A (en) * 2020-06-19 2020-10-23 浙江大华技术股份有限公司 Point cloud matching method, positioning method, device and storage medium
CN112085026A (en) * 2020-08-26 2020-12-15 的卢技术有限公司 Closed loop detection method based on deep neural network semantic segmentation
CN112288817A (en) * 2020-11-18 2021-01-29 Oppo广东移动通信有限公司 Three-dimensional reconstruction processing method and device based on image
CN112419512A (en) * 2020-10-13 2021-02-26 南昌大学 Air three-dimensional model repairing system and method based on semantic information
CN112507056A (en) * 2020-12-21 2021-03-16 华南理工大学 Map construction method based on visual semantic information
CN112585946A (en) * 2020-03-27 2021-03-30 深圳市大疆创新科技有限公司 Image shooting method, image shooting device, movable platform and storage medium
CN112927269A (en) * 2021-03-26 2021-06-08 深圳市无限动力发展有限公司 Map construction method and device based on environment semantics and computer equipment
CN113591865A (en) * 2021-07-28 2021-11-02 深圳甲壳虫智能有限公司 Loop detection method and device and electronic equipment
CN114639006A (en) * 2022-03-15 2022-06-17 北京理工大学 Loop detection method and device and electronic equipment

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101645170A (en) * 2009-09-03 2010-02-10 北京信息科技大学 Precise registration method of multilook point cloud
CN102308320A (en) * 2009-02-06 2012-01-04 香港科技大学 Generating three-dimensional models from images
CN104361627A (en) * 2014-11-07 2015-02-18 武汉科技大学 SIFT-based (scale-invariant feature transform) binocular vision three-dimensional image reconstruction method of asphalt pavement micro-texture
CN107392964A (en) * 2017-07-07 2017-11-24 武汉大学 The indoor SLAM methods combined based on indoor characteristic point and structure lines
CN107610175A (en) * 2017-08-04 2018-01-19 华南理工大学 The monocular vision SLAM algorithms optimized based on semi-direct method and sliding window
CN107833236A (en) * 2017-10-31 2018-03-23 中国科学院电子学研究所 Semantic vision positioning system and method are combined under a kind of dynamic environment
CN108230337A (en) * 2017-12-31 2018-06-29 厦门大学 A kind of method that semantic SLAM systems based on mobile terminal are realized
CN108596053A (en) * 2018-04-09 2018-09-28 华中科技大学 A kind of vehicle checking method and system based on SSD and vehicle attitude classification
CN109272577A (en) * 2018-08-30 2019-01-25 北京计算机技术及应用研究所 A kind of vision SLAM method based on Kinect
CN109544629A (en) * 2018-11-29 2019-03-29 南京人工智能高等研究院有限公司 Camera pose determines method and apparatus and electronic equipment
CN109658449A (en) * 2018-12-03 2019-04-19 华中科技大学 A kind of indoor scene three-dimensional rebuilding method based on RGB-D image
CN109815847A (en) * 2018-12-30 2019-05-28 中国电子科技集团公司信息科学研究院 A kind of vision SLAM method based on semantic constraint
CN109816686A (en) * 2019-01-15 2019-05-28 山东大学 Robot semanteme SLAM method, processor and robot based on object example match

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102308320A (en) * 2009-02-06 2012-01-04 香港科技大学 Generating three-dimensional models from images
CN101645170A (en) * 2009-09-03 2010-02-10 北京信息科技大学 Precise registration method of multilook point cloud
CN104361627A (en) * 2014-11-07 2015-02-18 武汉科技大学 SIFT-based (scale-invariant feature transform) binocular vision three-dimensional image reconstruction method of asphalt pavement micro-texture
CN107392964A (en) * 2017-07-07 2017-11-24 武汉大学 The indoor SLAM methods combined based on indoor characteristic point and structure lines
CN107610175A (en) * 2017-08-04 2018-01-19 华南理工大学 The monocular vision SLAM algorithms optimized based on semi-direct method and sliding window
CN107833236A (en) * 2017-10-31 2018-03-23 中国科学院电子学研究所 Semantic vision positioning system and method are combined under a kind of dynamic environment
CN108230337A (en) * 2017-12-31 2018-06-29 厦门大学 A kind of method that semantic SLAM systems based on mobile terminal are realized
CN108596053A (en) * 2018-04-09 2018-09-28 华中科技大学 A kind of vehicle checking method and system based on SSD and vehicle attitude classification
CN109272577A (en) * 2018-08-30 2019-01-25 北京计算机技术及应用研究所 A kind of vision SLAM method based on Kinect
CN109544629A (en) * 2018-11-29 2019-03-29 南京人工智能高等研究院有限公司 Camera pose determines method and apparatus and electronic equipment
CN109658449A (en) * 2018-12-03 2019-04-19 华中科技大学 A kind of indoor scene three-dimensional rebuilding method based on RGB-D image
CN109815847A (en) * 2018-12-30 2019-05-28 中国电子科技集团公司信息科学研究院 A kind of vision SLAM method based on semantic constraint
CN109816686A (en) * 2019-01-15 2019-05-28 山东大学 Robot semanteme SLAM method, processor and robot based on object example match

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
MOHAMMAD NOORUDDIN等: ""Improved 3D Reconstruction for Images having Moving Object using Semantic Image Segmentation and Binary Masking"", 《2018 4TH INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING AND INFORMATION & COMMUNICATION TECHNOLOGY (ICEEICT)》 *
于金山等: ""基于云的语义库设计及机器人语义地图构建"", 《机器人 ROBOT》 *

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110910389A (en) * 2019-10-30 2020-03-24 中山大学 Laser SLAM loop detection system and method based on graph descriptor
CN110910389B (en) * 2019-10-30 2021-04-09 中山大学 Laser SLAM loop detection system and method based on graph descriptor
CN111046125A (en) * 2019-12-16 2020-04-21 视辰信息科技(上海)有限公司 Visual positioning method, system and computer readable storage medium
CN111311708A (en) * 2020-01-20 2020-06-19 北京航空航天大学 Visual SLAM method based on semantic optical flow and inverse depth filtering
CN111310654A (en) * 2020-02-13 2020-06-19 北京百度网讯科技有限公司 Map element positioning method and device, electronic equipment and storage medium
CN111310654B (en) * 2020-02-13 2023-09-08 北京百度网讯科技有限公司 Map element positioning method and device, electronic equipment and storage medium
CN111325842B (en) * 2020-03-04 2023-07-28 Oppo广东移动通信有限公司 Map construction method, repositioning method and device, storage medium and electronic equipment
CN111325842A (en) * 2020-03-04 2020-06-23 Oppo广东移动通信有限公司 Map construction method, repositioning method and device, storage medium and electronic equipment
EP4113451A4 (en) * 2020-03-04 2023-07-19 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Map construction method and apparatus, repositioning method and apparatus, storage medium, and electronic device
WO2021175022A1 (en) * 2020-03-04 2021-09-10 Oppo广东移动通信有限公司 Map construction method and apparatus, repositioning method and apparatus, storage medium, and electronic device
CN111368759A (en) * 2020-03-09 2020-07-03 河海大学常州校区 Monocular vision-based semantic map construction system for mobile robot
CN111368759B (en) * 2020-03-09 2022-08-30 河海大学常州校区 Monocular vision-based mobile robot semantic map construction system
CN111429517A (en) * 2020-03-23 2020-07-17 Oppo广东移动通信有限公司 Relocation method, relocation device, storage medium and electronic device
CN111427373A (en) * 2020-03-24 2020-07-17 上海商汤临港智能科技有限公司 Pose determination method, device, medium and equipment
CN111427373B (en) * 2020-03-24 2023-11-24 上海商汤临港智能科技有限公司 Pose determining method, pose determining device, medium and pose determining equipment
CN111311742A (en) * 2020-03-27 2020-06-19 北京百度网讯科技有限公司 Three-dimensional reconstruction method, three-dimensional reconstruction device and electronic equipment
CN112585946A (en) * 2020-03-27 2021-03-30 深圳市大疆创新科技有限公司 Image shooting method, image shooting device, movable platform and storage medium
CN111815687A (en) * 2020-06-19 2020-10-23 浙江大华技术股份有限公司 Point cloud matching method, positioning method, device and storage medium
CN112085026A (en) * 2020-08-26 2020-12-15 的卢技术有限公司 Closed loop detection method based on deep neural network semantic segmentation
CN112419512A (en) * 2020-10-13 2021-02-26 南昌大学 Air three-dimensional model repairing system and method based on semantic information
CN112288817A (en) * 2020-11-18 2021-01-29 Oppo广东移动通信有限公司 Three-dimensional reconstruction processing method and device based on image
CN112288817B (en) * 2020-11-18 2024-05-07 Oppo广东移动通信有限公司 Three-dimensional reconstruction processing method and device based on image
CN112507056A (en) * 2020-12-21 2021-03-16 华南理工大学 Map construction method based on visual semantic information
CN112927269A (en) * 2021-03-26 2021-06-08 深圳市无限动力发展有限公司 Map construction method and device based on environment semantics and computer equipment
CN113591865A (en) * 2021-07-28 2021-11-02 深圳甲壳虫智能有限公司 Loop detection method and device and electronic equipment
CN113591865B (en) * 2021-07-28 2024-03-26 深圳甲壳虫智能有限公司 Loop detection method and device and electronic equipment
CN114639006A (en) * 2022-03-15 2022-06-17 北京理工大学 Loop detection method and device and electronic equipment
CN114639006B (en) * 2022-03-15 2023-09-26 北京理工大学 Loop detection method and device and electronic equipment

Also Published As

Publication number Publication date
CN110335319B (en) 2022-03-18

Similar Documents

Publication Publication Date Title
CN110335319A (en) Camera positioning and the map reconstruction method and system of a kind of semantics-driven
CN110533722B (en) Robot rapid repositioning method and system based on visual dictionary
CN109631855B (en) ORB-SLAM-based high-precision vehicle positioning method
Eade et al. Monocular graph SLAM with complexity reduction
CN106909877B (en) Visual simultaneous mapping and positioning method based on dotted line comprehensive characteristics
CN113012212B (en) Depth information fusion-based indoor scene three-dimensional point cloud reconstruction method and system
CN107392964B (en) The indoor SLAM method combined based on indoor characteristic point and structure lines
CN110889243B (en) Aircraft fuel tank three-dimensional reconstruction method and detection method based on depth camera
CN110223348A (en) Robot scene adaptive bit orientation estimation method based on RGB-D camera
CN110378997B (en) ORB-SLAM 2-based dynamic scene mapping and positioning method
Tang et al. ESTHER: Joint camera self-calibration and automatic radial distortion correction from tracking of walking humans
CN113205595B (en) Construction method and application of 3D human body posture estimation model
CN112562081B (en) Visual map construction method for visual layered positioning
CN111311708B (en) Visual SLAM method based on semantic optical flow and inverse depth filtering
CN107832786B (en) A kind of recognition of face classification method dictionary-based learning
CN112509044A (en) Binocular vision SLAM method based on dotted line feature fusion
CN113532420B (en) Visual inertial odometer method integrating dotted line characteristics
CN110807828A (en) Oblique photography three-dimensional reconstruction matching method
CN111709331A (en) Pedestrian re-identification method based on multi-granularity information interaction model
Moghaddam et al. A statistical variable selection solution for RFM ill-posedness and overparameterization problems
CN112767546A (en) Binocular image-based visual map generation method for mobile robot
CN114067128A (en) SLAM loop detection method based on semantic features
CN114495170A (en) Pedestrian re-identification method and system based on local self-attention inhibition
Yang et al. Probabilistic projective association and semantic guided relocalization for dense reconstruction
CN117036404A (en) Monocular thermal imaging simultaneous positioning and mapping method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant