CN105069809A - Camera positioning method and system based on planar mixed marker - Google Patents


Info

Publication number
CN105069809A
CN105069809A (application CN201510547761.3A)
Authority
CN
China
Prior art keywords: image, marker, camera, bag-of-words model
Prior art date
Legal status
Granted
Application number
CN201510547761.3A
Other languages
Chinese (zh)
Other versions
CN105069809B (en)
Inventor
吴毅红
雷娟
Current Assignee
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science filed Critical Institute of Automation of Chinese Academy of Science
Priority to CN201510547761.3A priority Critical patent/CN105069809B/en
Publication of CN105069809A publication Critical patent/CN105069809A/en
Application granted granted Critical
Publication of CN105069809B publication Critical patent/CN105069809B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence

Landscapes

  • Image Analysis (AREA)
  • Studio Devices (AREA)

Abstract

The invention provides a camera positioning method and system based on a planar mixed marker. The method includes an off-line stage and an on-line stage. In the off-line stage, the feature points of a marker image are extracted and a bag-of-words model is constructed. In the on-line stage, the image matching the marker image is found in an image database by matching the constructed bag-of-words model against the bag-of-words models of the database images, and the position and pose of the camera are obtained from the match. Because a planar mixed marker is used for camera positioning, the method combines the fast detection of artificial markers with the smooth tracking of natural-image markers, improving the stability and real-time performance of positioning.

Description

Camera localization method and system based on a planar hybrid marker
Technical field
The invention belongs to the field of computer vision, and specifically relates to a camera localization method and system based on a planar hybrid marker.
Background technology
A planar marker is a class of engineered planar object used to assist camera-based visual localization. Because planar markers are simple to deploy and their detection process is fast and robust, camera positioning systems based on planar markers are among the most popular vision-based mobile camera positioning systems. Planar markers fall into two classes: artificial markers and natural-image markers.
Artificial markers are widely used in camera localization scenes that lack image feature information. Many camera positioning systems also use artificial markers in order to simplify feature detection and reduce the influence of scene structure. The first camera positioning system based on artificial markers was the tele-conferencing system developed by the University of Washington in 1999, whose users could collaborate on a virtual whiteboard in a shared space. In 2003, the Vienna University of Technology designed an indoor building navigation system based on artificial markers, the first camera positioning system to run independently on a mobile device using artificial markers. At present, camera positioning systems based on artificial markers appear in many application scenarios such as education and training, manufacturing and maintenance, commercial entertainment, and navigation. Although artificial markers are widely used in augmented-reality systems, problems remain: they are visually intrusive in the scene, they cannot handle partial occlusion, and their positioning results are not smooth.
Unlike methods based on artificial markers, camera localization methods based on natural images use point, edge, and texture features to recognize the marker and localize the camera. In real environments, planar markers usable for camera localization can be seen everywhere, for example books, covers of audio-visual products, photographs and paintings, product packaging, and advertising posters. Mobile camera localization based on planar natural images has therefore attracted wide attention since its inception. Compared with artificial markers, planar natural images have rich texture information, which helps improve the robustness and precision of localization; on the other hand, tracking with natural features involves a large amount of feature extraction and matching computation, so a real-time implementation on mobile platforms is difficult.
Summary of the invention
(1) Technical problem to be solved
The object of the present invention is to provide a camera localization method and system based on a planar hybrid marker that can improve the stability and real-time performance of camera localization.
(2) Technical scheme
The invention provides a camera localization method based on a planar hybrid marker, where the planar hybrid marker comprises a marker image and a frame surrounding the marker image, the method comprising:
S1, an off-line stage: extract the feature points of the marker image, classify them, and count the occurrence frequency of each class of feature points to obtain the corresponding bag-of-words model;
S2, an on-line stage: when the camera shoots the planar hybrid marker, detect the marker image inside the frame, match the bag-of-words model against the bag-of-words models of the images in an image database to find the image matching the marker image, and thereby obtain the position and pose of the camera.
The invention also provides a camera positioning system based on a planar hybrid marker, where the planar hybrid marker comprises a marker image and a frame surrounding the marker image, the system comprising:
an off-line device for extracting the feature points of the marker image, classifying them, and counting the occurrence frequency of each class of feature points to obtain the corresponding bag-of-words model; and
an on-line device for, when the camera shoots the planar hybrid marker, detecting the marker image inside the frame, matching the bag-of-words model against the bag-of-words models of the images in the image database to find the image matching the marker image, and thereby obtaining the position and pose of the camera.
(3) Beneficial effects
The camera localization method and system provided by the invention use a planar hybrid marker for camera localization, combining the fast detection of artificial markers with the smooth tracking of natural-image markers, thereby improving the stability and real-time performance of localization.
Brief description of the drawings
Fig. 1 is a flowchart of the camera localization method based on a planar hybrid marker provided by an embodiment of the present invention.
Fig. 2 is a schematic diagram of the planar hybrid marker provided by an embodiment of the present invention.
Fig. 3 is a flowchart of the off-line stage of the camera localization method provided by an embodiment of the present invention.
Fig. 4 shows the effect of obtaining the marker image in an embodiment of the present invention.
Fig. 5 shows the effect of epipolar-geometry verification in an embodiment of the present invention.
Fig. 6 is a schematic diagram of the camera positioning system based on a planar hybrid marker provided by an embodiment of the present invention.
Fig. 7 shows the effect of augmented reality on a mobile platform in this embodiment.
Detailed description
The invention provides a camera localization method and system based on a planar hybrid marker. In an off-line stage, the feature points of the marker image are extracted and a bag-of-words model is built. In an on-line stage, the built bag-of-words model is matched against the bag-of-words models of the images in an image database to find the image matching the marker image, and the position and pose of the camera are thereby obtained.
According to one embodiment of the invention, the planar hybrid marker comprises a quadrilateral marker image and a frame surrounding it. The frame may be a dark square border that delimits the marker image, so that adaptive threshold binarization quickly yields a stable detection result. Since the natural image inside the dark border is related to the virtual object to be rendered, the marker also intuitively suggests to the user what content will be augmented.
According to one embodiment of the invention, the camera localization method comprises:
S1, an off-line stage: extract the feature points of the marker image, classify them, and count the occurrence frequency of each class of feature points to obtain the corresponding bag-of-words model;
S2, an on-line stage: when the camera shoots the planar hybrid marker, detect the marker image inside the frame, match the bag-of-words model against the bag-of-words models of the images in the image database to find the image matching the marker image, and thereby obtain the position and pose of the camera.
According to one embodiment of the invention, step S1 comprises:
S11: extract the feature points of the marker image and build a SURF descriptor for each feature point;
S12: cluster the feature points according to the distances between their descriptors to obtain a vocabulary, where the clustering may use the hierarchical k-means method and the vocabulary comprises multiple image-feature words, each corresponding to one class of feature points;
S13: count the occurrence frequency of each class of feature points to obtain the bag-of-words model, which comprises the set of feature points with their descriptors and the frequency histogram over the vocabulary. The invention uses the bag-of-words model to quickly retrieve, from the database, matches for the potential marker region in a video frame, avoiding a linear search.
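The database retrieval in S13 can be illustrated with a minimal sketch. The code below is an illustrative stand-in, not the patent's implementation: it assumes histogram intersection as the similarity measure (the description only says the frequency histograms are compared) and returns the top-k most similar database entries.

```python
import numpy as np

def topk_by_histogram(query_hist, db_hists, k=3):
    """Return indices of the k database histograms most similar to the query.

    Histograms are L1-normalized; similarity is histogram intersection,
    a common bag-of-words choice (an assumption here).
    """
    q = query_hist / query_hist.sum()
    sims = []
    for h in db_hists:
        h = h / h.sum()
        sims.append(np.minimum(q, h).sum())  # intersection score in [0, 1]
    order = np.argsort(sims)[::-1]           # most similar first
    return order[:k]

# toy example: 4 database images over a 5-word vocabulary
db = np.array([[5, 1, 0, 0, 0],
               [0, 5, 1, 0, 0],
               [4, 2, 0, 0, 0],
               [0, 0, 0, 3, 3]], dtype=float)
query = np.array([5, 1, 0, 0, 0], dtype=float)
best = topk_by_histogram(query, db, k=2)
```

In a real system the toy histograms would be bag-of-words histograms computed from SURF descriptors, and the top-k candidates would then undergo geometric verification.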
According to one embodiment of the invention, step S2 comprises:
S21, an initialization stage: when the camera shoots the planar hybrid marker, obtain the marker image inside the frame by contour detection; extract the feature points and descriptors of the marker image to obtain its bag-of-words model; compare the frequency histogram of this model with those of the image bag-of-words models in the image database to obtain the matching image; compute the homography between the marker image and the matching image; obtain and optimize the rotation matrix and translation vector; and thereby obtain the initial position and pose of the camera. Because the marker image is surrounded by a dark border, mismatches during marker-image detection are greatly reduced and the precision of feature-point matching is improved; compared with methods that track a natural image alone, the invention has a faster recovery speed.
S22, an inter-frame tracking stage: when the captured video frame is frame t, use the feature points of the marker image recognized in frame t-1 as tracking feature points; obtain the feature points in frame t and their corresponding bag-of-words model by the pyramidal optical flow method; compare the frequency histogram of this model with those of the image bag-of-words models in the image database; obtain and optimize the rotation matrix and translation vector of frame t; and thereby obtain the position and pose of the camera at frame t.
S23, a relocalization stage: when the inter-frame tracking quality falls below a threshold (for example 0.1), perform border detection, extract the feature points of the marker image inside the border and the corresponding bag-of-words model, and compare it with the bag-of-words model detected in the initialization stage; if the matching rate is too low, treat this marker image as a new marker image and re-enter the initialization stage.
According to one embodiment of the invention, step S21 comprises:
S211: detect the image edges of the video frame and apply a pixel dilation operation to make the edges continuous, obtaining multiple contours; apply polygonal approximation to all contours to obtain multiple closed regions; when the ratio of a closed region's area to the video-frame area exceeds a threshold, take that closed region as a marker-image candidate region.
S212: rectify the four vertices of the marker-image candidate region to a quadrilateral area to obtain a rectified image; extract the features of the rectified image and count the normalized frequency histogram of the occurrences of each word in the vocabulary to obtain the corresponding bag-of-words model; compare this frequency histogram with those of the image bag-of-words models in the image database and select the k markers with the most similar histograms as candidates; after normalizing the k candidate markers to the size of the rectified image, perform feature matching and homography estimation against the rectified image, and take the candidate marker with the highest homography inlier rate as the recognition result for the rectified image.
S213: compute the matches between the feature points of the marker-image candidate region and those of the recognition result, compute the homography between them, decompose it to obtain the rotation matrix and translation vector, and optimize them with the following formula to obtain the initial position and pose of the camera:
$$\{K, R, t\} = \arg\min_{K, R, t} \sum_{i=1,\dots,N} d\left(m_i,\, K[R\;t]\,X_i\right)^2$$
where K denotes the intrinsic parameters of the camera (including its focal length and principal point), R the rotation matrix, t the translation vector, and N the number of image features on the marker image; X_i denotes the i-th image feature on the marker image, and d(m_i, K[R t]X_i) the geometric distance between the reprojection of X_i and the image point m_i. The minimization is solved with the Levenberg-Marquardt iterative optimization algorithm.
The invention also provides a camera positioning system based on a planar hybrid marker, comprising:
an off-line device for extracting the feature points of the marker image, classifying them, and counting the occurrence frequency of each class of feature points to obtain the corresponding bag-of-words model; and
an on-line device for, when the camera shoots the planar hybrid marker, detecting the marker image inside the frame, matching the bag-of-words model against the bag-of-words models of the images in the image database to find the image matching the marker image, and thereby obtaining the position and pose of the camera.
According to one embodiment of the invention, the off-line device extracts the feature points and SURF descriptors of the marker image and clusters the feature points according to the distances between their descriptors to obtain a vocabulary, where the vocabulary comprises multiple image-feature words, each corresponding to one class of feature points; it counts the occurrence frequency of each class of feature points to obtain the bag-of-words model, which comprises the set of feature points with their descriptors and the frequency histogram over the vocabulary.
According to one embodiment of the invention, the on-line device comprises:
an initialization module for, when the camera shoots the planar hybrid marker, obtaining the marker image inside the frame by contour detection, extracting the feature points and descriptors of the marker image to obtain its bag-of-words model, comparing the frequency histogram of this model with those of the image bag-of-words models in the image database to obtain the matching image, computing the homography between the marker image and the matching image, obtaining and optimizing the rotation matrix and translation vector, and thereby obtaining the initial position and pose of the camera;
an inter-frame tracking module for, when the captured video frame is frame t, using the feature points of the marker image recognized in frame t-1 as tracking feature points, obtaining the feature points in frame t and their corresponding bag-of-words model by the pyramidal optical flow method, comparing the frequency histogram of this model with those of the image bag-of-words models in the image database, obtaining and optimizing the rotation matrix and translation vector of frame t, and thereby obtaining the position and pose of the camera at frame t; and
a relocalization module for, when the inter-frame tracking quality falls below a threshold, performing border detection, extracting the feature points of the marker image inside the border and the corresponding bag-of-words model, comparing it with the bag-of-words model detected in the initialization stage, and, if the matching rate is too low, treating this marker image as a new marker image and re-entering the initialization stage.
According to one embodiment of the invention, the initialization module comprises:
a detection unit for detecting the image edges of the video frame, applying a pixel dilation operation to make the edges continuous to obtain multiple contours, applying polygonal approximation to all contours to obtain multiple closed regions, and, when the ratio of a closed region's area to the video-frame area exceeds a threshold (for example 0.1), taking that closed region as a marker-image candidate region;
a recognition unit for rectifying the four vertices of the marker-image candidate region to a quadrilateral area to obtain a rectified image, extracting the features of the rectified image and counting the normalized frequency histogram of the occurrences of each word in the vocabulary to obtain the corresponding bag-of-words model, comparing this frequency histogram with those of the image bag-of-words models in the image database, selecting the k markers with the most similar histograms as candidates, normalizing the k candidate markers to the size of the rectified image and then performing feature matching and homography estimation against the rectified image, and taking the candidate marker with the highest homography inlier rate as the recognition result for the rectified image; and
a localization unit for computing the matches between the feature points of the marker-image candidate region and those of the recognition result, computing the homography between them, decomposing it to obtain the rotation matrix and translation vector, and optimizing them with the following formula to obtain the initial position and pose of the camera:
$$\{K, R, t\} = \arg\min_{K, R, t} \sum_{i=1,\dots,N} d\left(m_i,\, K[R\;t]\,X_i\right)^2$$
where K denotes the intrinsic parameters of the camera (including its focal length and principal point), R the rotation matrix, t the translation vector, and N the number of image features on the marker image; X_i denotes the i-th image feature on the marker image, and d(m_i, K[R t]X_i) the geometric distance between the reprojection of X_i and the image point m_i. The minimization is solved with the Levenberg-Marquardt iterative optimization algorithm.
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in more detail below with reference to specific embodiments and the accompanying drawings.
Fig. 1 is a flowchart of the camera localization method based on a planar hybrid marker provided by an embodiment of the present invention. The planar hybrid marker, shown in Fig. 2, comprises a quadrilateral marker image and a frame surrounding it; the frame may be a dark square border that delimits the marker image, so that adaptive threshold binarization quickly yields a stable detection result. Since the natural image inside the dark border is related to the virtual object to be rendered, the marker also intuitively suggests to the user what content will be augmented. As shown in Fig. 1, the camera localization method comprises:
S1, an off-line stage: extract the feature points of the marker image, classify them, and count the occurrence frequency of each class of feature points to obtain the corresponding bag-of-words model.
As shown in Fig. 3, step S1 specifically comprises, in the off-line stage:
S11: extract the feature points of the marker image and build a SURF descriptor for each feature point;
S12: cluster the feature points according to the distances between their descriptors using the hierarchical k-means method to obtain a vocabulary, where the vocabulary comprises multiple image-feature words, each corresponding to one class of feature points;
S13: count the occurrence frequency of each class of feature points to obtain the bag-of-words model, which comprises the set of feature points with their descriptors and the frequency histogram over the vocabulary. This embodiment uses the bag-of-words model to quickly retrieve, from the database, matches for the potential marker region in a video frame, avoiding a linear search.
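Steps S11-S13 can be sketched as follows. This is an illustrative sketch only: a naive k-means stands in for the hierarchical k-means named in S12, and random vectors stand in for real 64-dimensional SURF descriptors.

```python
import numpy as np

def build_vocabulary(descriptors, n_words, n_iters=20, seed=0):
    """Naive k-means over descriptors; each centroid is one 'visual word'."""
    rng = np.random.default_rng(seed)
    centers = descriptors[rng.choice(len(descriptors), n_words, replace=False)]
    for _ in range(n_iters):
        # assign each descriptor to its nearest centroid, then re-average
        d = np.linalg.norm(descriptors[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        for w in range(n_words):
            if np.any(labels == w):
                centers[w] = descriptors[labels == w].mean(axis=0)
    return centers

def bow_histogram(descriptors, vocabulary):
    """Normalized word-frequency histogram -- the bag-of-words model."""
    d = np.linalg.norm(descriptors[:, None] - vocabulary[None], axis=2)
    words = d.argmin(axis=1)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / hist.sum()

# toy stand-ins for 64-D SURF descriptors: two well-separated clusters
rng = np.random.default_rng(1)
desc = np.vstack([rng.normal(0.0, 0.05, (30, 64)),
                  rng.normal(1.0, 0.05, (10, 64))])
vocab = build_vocabulary(desc, n_words=2)
hist = bow_histogram(desc, vocab)
```

A production vocabulary would be built hierarchically (a tree of k-means splits) so that word lookup is logarithmic rather than linear in vocabulary size.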
S2, an on-line stage: when the camera shoots the planar hybrid marker, detect the marker image inside the frame, match the bag-of-words model against the bag-of-words models of the images in the image database to find the image matching the marker image, and thereby obtain the position and pose of the camera.
The on-line stage specifically comprises an initialization stage, an inter-frame tracking stage, and a relocalization stage:
S21, the initialization stage: when the camera shoots the planar hybrid marker, obtain the marker image inside the frame by contour detection; extract the feature points and descriptors of the marker image to obtain its bag-of-words model; compare the frequency histogram of this model with those of the image bag-of-words models in the image database to obtain the matching image; compute the homography between the marker image and the matching image; obtain and optimize the rotation matrix and translation vector; and thereby obtain the initial position and pose of the camera. Because the marker image is surrounded by a dark border, mismatches during marker-image detection are greatly reduced and the precision of feature-point matching is improved; compared with methods that track a natural image alone, the invention has a faster recovery speed. This embodiment describes the initialization stage in terms of three processes: detection, recognition, and localization.
S211: as shown in Fig. 4, detect the image edges of the video frame and apply a pixel dilation operation to make the edges continuous, obtaining multiple contours; apply polygonal approximation to all contours to obtain multiple closed regions; when the ratio of a closed region's area to the video-frame area exceeds 0.1, take that closed region as a marker-image candidate region.
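The edge detection, dilation, and polygonal approximation of S211 are typically performed by an image-processing library (OpenCV's findContours and approxPolyDP are common choices); the sketch below illustrates only the final filtering step, keeping four-vertex closed regions whose area exceeds the 0.1 frame-area ratio. The polygon inputs and frame size are hypothetical.

```python
import numpy as np

def polygon_area(pts):
    """Shoelace formula for the area of a closed polygon given as (N, 2) vertices."""
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, 1)) - np.dot(y, np.roll(x, 1)))

def candidate_regions(polygons, frame_w, frame_h, ratio=0.1):
    """Keep 4-vertex closed regions covering more than `ratio` of the frame."""
    frame_area = frame_w * frame_h
    return [p for p in polygons
            if len(p) == 4 and polygon_area(p) / frame_area > ratio]

# hypothetical polygonal approximations of detected contours in a 640x480 frame
polys = [
    np.array([[10, 10], [300, 10], [300, 220], [10, 220]]),  # large quad -> kept
    np.array([[0, 0], [20, 0], [20, 20], [0, 20]]),          # too small -> dropped
    np.array([[0, 0], [100, 0], [50, 200]]),                 # triangle -> dropped
]
kept = candidate_regions(polys, frame_w=640, frame_h=480, ratio=0.1)
```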
S212: as shown in Fig. 5, rectify the four vertices of the marker-image candidate region to a quadrilateral area to obtain a rectified image; extract the features of the rectified image and count the normalized frequency histogram of the occurrences of each word in the vocabulary to obtain the corresponding bag-of-words model; compare this frequency histogram with those of the image bag-of-words models in the image database and select the k markers with the most similar histograms as candidates; after normalizing the k candidate markers to the size of the rectified image, perform feature matching and homography estimation against the rectified image, and take the candidate marker with the highest homography inlier rate as the recognition result for the rectified image. Fig. 5 shows the result of geometric verification, using epipolar geometry, of the top 3 images returned by the bag-of-words query: the inlier rates of matching against class 3, class 7, and class 8 are 44%, 17.6%, and 12% respectively, so class 3 is the final query result; here k is 3.
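The rectification in S212 maps the four detected vertices onto the corners of an axis-aligned rectangle through a homography. The sketch below estimates that homography from the four point pairs by the direct linear transform; the vertex coordinates and the 200x200 target size are illustrative assumptions (a library routine such as OpenCV's getPerspectiveTransform would normally be used).

```python
import numpy as np

def homography_from_4pts(src, dst):
    """Direct Linear Transform: 3x3 homography H with dst ~ H @ src (homogeneous)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.array(A, dtype=float))
    H = Vt[-1].reshape(3, 3)        # null-space vector of the 8x9 system
    return H / H[2, 2]

def warp_point(H, p):
    q = H @ np.array([p[0], p[1], 1.0])
    return q[:2] / q[2]

# four vertices of a tilted candidate region -> corners of a 200x200 rectified image
src = [(50, 40), (260, 70), (240, 270), (30, 230)]
dst = [(0, 0), (200, 0), (200, 200), (0, 200)]
H = homography_from_4pts(src, dst)
corner = warp_point(H, src[2])      # should land on (200, 200)
```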
S213: compute the matches between the feature points of the marker-image candidate region and those of the recognition result, compute the homography between them, decompose it to obtain the rotation matrix and translation vector, and optimize them with the following formula to obtain the initial position and pose of the camera:
$$\{K, R, t\} = \arg\min_{K, R, t} \sum_{i=1,\dots,N} d\left(m_i,\, K[R\;t]\,X_i\right)^2$$
where K denotes the intrinsic parameters of the camera (including its focal length and principal point), R the rotation matrix, t the translation vector, and N the number of image features on the marker image; X_i denotes the i-th image feature on the marker image, and d(m_i, K[R t]X_i) the geometric distance between the reprojection of X_i and the image point m_i. The minimization is solved with the Levenberg-Marquardt iterative optimization algorithm.
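The decomposition of the homography into a rotation matrix and translation vector follows the standard planar-target relation H ∝ K[r1 r2 t] for a world plane Z = 0. The sketch below recovers R and t from a synthetic homography with known intrinsics K (all values illustrative); in the method described here, this result would then be refined by Levenberg-Marquardt minimization of the reprojection error.

```python
import numpy as np

def pose_from_homography(H, K):
    """Recover R, t for a world plane Z = 0 from H ~ K [r1 r2 t] (up to scale)."""
    A = np.linalg.inv(K) @ H
    lam = 1.0 / np.linalg.norm(A[:, 0])   # fix scale so r1 has unit norm
    r1 = lam * A[:, 0]
    r2 = lam * A[:, 1]
    r3 = np.cross(r1, r2)                 # complete the right-handed basis
    t = lam * A[:, 2]
    R = np.column_stack([r1, r2, r3])
    U, _, Vt = np.linalg.svd(R)           # project onto the nearest true rotation
    return U @ Vt, t

# synthetic check: build H from a known pose, then recover it
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
th = 0.3
R_true = np.array([[np.cos(th), -np.sin(th), 0],
                   [np.sin(th),  np.cos(th), 0],
                   [0, 0, 1]])
t_true = np.array([0.1, -0.2, 2.0])
H = 2.5 * K @ np.column_stack([R_true[:, 0], R_true[:, 1], t_true])  # arbitrary scale
R, t = pose_from_homography(H, K)
```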
S22, the inter-frame tracking stage: when the captured video frame is frame t, use the feature points of the marker image recognized in frame t-1 as tracking feature points; obtain the feature points in frame t and their corresponding bag-of-words model by the pyramidal optical flow method; compare the frequency histogram of this model with those of the image bag-of-words models in the image database; obtain and optimize the rotation matrix and translation vector of frame t; and thereby obtain the position and pose of the camera at frame t.
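The pyramidal optical flow of S22 repeats a basic Lucas-Kanade step across image scales so that larger motions can be handled (in practice via a library routine such as OpenCV's calcOpticalFlowPyrLK). The single-level sketch below recovers a small global shift between two synthetic frames; it is a didactic simplification of per-feature pyramidal tracking, with all inputs hypothetical.

```python
import numpy as np

def lk_shift(img0, img1):
    """Single-level Lucas-Kanade: least-squares estimate of a global (dx, dy)
    shift from image gradients, valid for small displacements."""
    Iy, Ix = np.gradient(img0)           # spatial gradients (rows = y, cols = x)
    It = img1 - img0                     # temporal difference
    A = np.column_stack([Ix.ravel(), Iy.ravel()])
    b = -It.ravel()                      # from I1 ~ I0 - dx*Ix - dy*Iy
    d, *_ = np.linalg.lstsq(A, b, rcond=None)
    return d                             # (dx, dy)

def gaussian_frame(cx, cy, size=64, sigma=6.0):
    y, x = np.mgrid[0:size, 0:size]
    return np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))

# frame 1 is frame 0 shifted by (+0.6, +0.4) pixels
f0 = gaussian_frame(30.0, 30.0)
f1 = gaussian_frame(30.6, 30.4)
dx, dy = lk_shift(f0, f1)
```

The pyramid extends this by estimating the shift on downsampled images first, then refining it level by level at full resolution.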
S23, the relocalization stage: when the inter-frame tracking quality falls below a threshold (for example 0.1), perform border detection, extract the feature points of the marker image inside the border and the corresponding bag-of-words model, and compare it with the bag-of-words model detected in the initialization stage; if the matching rate is too low, treat this marker image as a new marker image and re-enter the initialization stage.
As shown in Fig. 6, an embodiment of the present invention also provides a camera positioning system based on a planar hybrid marker, comprising:
an off-line device for extracting the feature points of the marker image, classifying them, and counting the occurrence frequency of each class of feature points to obtain the corresponding bag-of-words model; and
an on-line device comprising an initialization module, an inter-frame tracking module, and a relocalization module.
The initialization module comprises a detection unit, a recognition unit, and a localization unit. The detection unit detects the image edges of the video frame, applies a pixel dilation operation to make the edges continuous to obtain multiple contours, applies polygonal approximation to all contours to obtain multiple closed regions, and, when the ratio of a closed region's area to the video-frame area exceeds 0.1, takes that closed region as a marker-image candidate region. The recognition unit rectifies the four vertices of the marker-image candidate region to a quadrilateral area to obtain a rectified image, extracts the features of the rectified image and counts the normalized frequency histogram of the occurrences of each word in the vocabulary to obtain the corresponding bag-of-words model, compares this frequency histogram with those of the image bag-of-words models in the image database, selects the k markers with the most similar histograms as candidates, normalizes the k candidate markers to the size of the rectified image and then performs feature matching and homography estimation against the rectified image, and takes the candidate marker with the highest homography inlier rate as the recognition result. The localization unit computes the matches between the feature points of the marker-image candidate region and those of the recognition result, computes the homography between them, decomposes it to obtain the rotation matrix and translation vector, and optimizes them with the following formula to obtain the initial position and pose of the camera:
$$\{K, R, t\} = \arg\min_{K, R, t} \sum_{i=1,\dots,N} d\left(m_i,\, K[R\;t]\,X_i\right)^2$$
where K denotes the intrinsic parameters of the camera (including its focal length and principal point), R the rotation matrix, t the translation vector, and N the number of image features on the marker image; X_i denotes the i-th image feature on the marker image, and d(m_i, K[R t]X_i) the geometric distance between the reprojection of X_i and the image point m_i. The minimization is solved with the Levenberg-Marquardt iterative optimization algorithm.
The inter-frame tracking module, when the captured video frame is frame t, uses the feature points of the marker image recognized in frame t-1 as tracking feature points, obtains the feature points in frame t and their corresponding bag-of-words model by the pyramidal optical flow method, compares the frequency histogram of this model with those of the image bag-of-words models in the image database, obtains and optimizes the rotation matrix and translation vector of frame t, and thereby obtains the position and pose of the camera at frame t.
The relocalization module, when the inter-frame tracking quality falls below a threshold, performs border detection, extracts the feature points of the marker image inside the border and the corresponding bag-of-words model, compares it with the bag-of-words model detected in the initialization stage, and, if the matching rate is too low, treats this marker image as a new marker image and re-enters the initialization stage.
The planar hybrid marker of this embodiment was placed in a relatively cluttered desktop scene; representative video frames with superimposed objects under large rotation, scale change, and viewpoint change of the marker are shown in Fig. 7. Panels (a)-(i) are run-time screenshots captured directly on an iPad. Panels (j)-(l) show the mobile tracking effect more intuitively from other viewing angles. As the figures show, even when the marker undergoes large in-image rotation or scale changes, the mobile augmented-reality system still superimposes the virtual object well.
For mobile augmented reality applications with planar markers, the present invention combines the advantages of artificial markers and natural-image markers to design a new planar hybrid marker, and proposes a camera localization method based on this type of planar marker. The method reduces the time needed to detect the marker in a video frame through border detection; because the border also delimits the region from which feature points and descriptors are extracted, the influence of irrelevant content on the retrieval method is avoided to a certain extent, improving the accuracy of marker recognition. Experiments on the recognition efficiency and accuracy of 10 markers show that, thanks to the border, the algorithm can recognize the image content after rectification and thus copes well with large viewpoint, scale, and rotation changes; since SURF descriptors are used for recognition, the algorithm is also somewhat robust to illumination changes. Without geometric verification the recognition rate is 88%; with geometric verification it reaches 98%. Experiments on a mobile platform show that inter-frame optical flow on natural features copes well with the rotation, scale, and viewpoint changes that occur during tracking; the localization speed is 18 frames per second, achieving real-time mobile augmented reality.
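The geometric verification credited with raising the recognition rate from 88% to 98% amounts to checking putative matches against a homography and counting inliers. The following sketch uses synthetic correspondences and a plain direct-linear-transform fit; a real system would estimate the homography robustly (e.g. with RANSAC) rather than knowing in advance which matches are clean:

```python
import numpy as np

def homography_dlt(src, dst):
    """Direct linear transform: fit a homography H (dst ~ H src) from >= 4 matches."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.array(A))
    H = Vt[-1].reshape(3, 3)      # right singular vector of smallest singular value
    return H / H[2, 2]

def inlier_rate(H, src, dst, tol=2.0):
    """Fraction of matches whose transfer error under H is below tol pixels."""
    proj = np.hstack([src, np.ones((len(src), 1))]) @ H.T
    proj = proj[:, :2] / proj[:, 2:3]
    return np.mean(np.linalg.norm(proj - dst, axis=1) < tol)

# Synthetic matches: most follow a ground-truth homography, five are wrong.
H_gt = np.array([[1.1, 0.02, 5.0], [-0.01, 0.95, -3.0], [1e-4, 2e-4, 1.0]])
rng = np.random.default_rng(1)
src = rng.uniform(0.0, 300.0, (40, 2))
dst = np.hstack([src, np.ones((40, 1))]) @ H_gt.T
dst = dst[:, :2] / dst[:, 2:3]
dst[:5] += rng.uniform(50.0, 100.0, (5, 2))    # five corrupted matches

H = homography_dlt(src[5:], dst[5:])           # fit on the clean matches
rate = inlier_rate(H, src, dst)                # 35 of 40 matches verify
```

A candidate marker whose inlier rate stays low under every homography hypothesis is rejected, which is exactly how geometric verification filters out false bag-of-words retrievals.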
The specific embodiments described above further explain the objects, technical solutions, and beneficial effects of the present invention. It should be understood that the foregoing is merely a specific embodiment of the present invention and is not intended to limit it; any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (10)

1. A camera localization method based on a planar hybrid marker, characterized in that the planar hybrid marker comprises a quadrilateral marker image and a border surrounding the marker image, the method comprising:
S1, an off-line phase: extracting feature points of the marker image, classifying the feature points, and counting the occurrence frequency of each class of feature points to obtain a corresponding bag-of-words model;
S2, an on-line phase: when the camera captures the planar hybrid marker, detecting the marker image inside the border, performing feature matching between said bag-of-words model and the bag-of-words models of the images in an image database, and finding the image in the image database that matches the marker image, thereby obtaining the position and attitude of the camera.
2. The camera localization method according to claim 1, characterized in that step S1 comprises:
S11, extracting the feature points of the marker image and building SURF descriptors of the feature points;
S12, clustering the feature points according to the distances between their descriptors to obtain a vocabulary, wherein the vocabulary comprises a plurality of image-feature words, each image-feature word corresponding to one class of feature points;
S13, counting the occurrence frequency of each class of feature points to obtain the bag-of-words model, wherein the bag-of-words model comprises the set of feature points and descriptors together with the frequency histogram over the vocabulary.
3. The camera localization method according to claim 2, characterized in that step S2 comprises:
S21, an initialization phase: when the camera captures the planar hybrid marker, obtaining the marker image inside the border by contour detection; extracting the feature points and descriptors of the marker image to obtain its bag-of-words model; comparing this bag-of-words model, via frequency histograms, with the bag-of-words models of the images in the image database to obtain a matching image; computing the homography between the marker image and the matching image; and obtaining and optimizing the rotation matrix and translation vector, thereby obtaining the initial position and attitude of the camera;
S22, an inter-frame tracking phase: when the video frame captured by the camera is frame t, using the feature points identified on the marker image in frame t-1 as tracking feature points; obtaining the feature points in frame t and their corresponding bag-of-words model by a pyramid optical-flow method; comparing this bag-of-words model, via frequency histograms, with the bag-of-words models of the images in the image database; and obtaining and optimizing the rotation matrix and translation vector of frame t, thereby obtaining the position and attitude of the camera at frame t;
S23, a relocation phase: when the inter-frame tracking quality falls below a given threshold, performing border detection, extracting the feature points of the marker image inside the border and their corresponding bag-of-words model, and comparing this bag-of-words model with the one detected in the initialization phase; if the matching rate is too low, taking this marker image as a new marker image and re-entering the initialization phase.
4. The camera localization method according to claim 3, characterized in that step S21 comprises:
S211, detecting the image edges of the video frame, applying a pixel dilation operation to make the edges continuous, and obtaining a plurality of contours; performing polygonal approximation on all contours to obtain a plurality of closed regions; and, when the ratio of the area of a closed region to the area of the video frame exceeds a threshold, taking that closed region as a marker-image candidate region;
S212, rectifying the four vertices of the marker-image candidate region to a quadrilateral area to obtain a rectified image; extracting the features of the rectified image and counting the normalized frequency histogram of occurrences of each word of the vocabulary to obtain the corresponding bag-of-words model; comparing this bag-of-words model, via frequency histograms, with the bag-of-words models of the images in the image database and selecting the k markers with the most similar frequency histograms as candidates; normalizing the k candidate markers to the size of the rectified image, then performing feature matching against the rectified image and estimating a homography; and taking the candidate marker with the highest inlier rate under the homography model as the recognition result for the rectified image;
S213, computing the matches between the feature points in the marker-image candidate region and the feature points in the recognition result, computing the homography between them and decomposing it to obtain the rotation matrix and translation vector, and optimizing with the following formula to obtain the initial position and attitude of the camera:
{K, R, t} = arg min_{K, R, t} Σ_{i=1,…,N} d(m_i, K[R t]X_i)²
Here K denotes the intrinsic parameter matrix of the camera (comprising the focal length and principal point of the camera), R the rotation matrix, t the translation vector, and N the number of image features on the marker image; X_i denotes the i-th image feature on the marker image, and d(m_i, K[R t]X_i) denotes the geometric distance between the reprojection of X_i and the image point m_i; the minimization is solved with the Levenberg-Marquardt iterative optimization algorithm.
5. The camera localization method according to claim 4, characterized in that the threshold is 0.1.
6. A camera positioning system based on a planar hybrid marker, characterized in that the planar hybrid marker comprises a quadrilateral marker image and a border surrounding the marker image, the system comprising:
an off-line device for extracting feature points of the marker image, classifying the feature points, and counting the occurrence frequency of each class of feature points to obtain a corresponding bag-of-words model;
an on-line device for, when the camera captures the planar hybrid marker, detecting the marker image inside the border, performing feature matching between said bag-of-words model and the bag-of-words models of the images in an image database, and finding the image in the image database that matches the marker image, thereby obtaining the position and attitude of the camera.
7. The camera positioning system according to claim 6, characterized in that the off-line device extracts the feature points of the marker image and builds SURF descriptors of the feature points; clusters the feature points according to the distances between their descriptors to obtain a vocabulary, wherein the vocabulary comprises a plurality of image-feature words, each image-feature word corresponding to one class of feature points; and counts the occurrence frequency of each class of feature points to obtain the bag-of-words model, wherein the bag-of-words model comprises the set of feature points and descriptors together with the frequency histogram over the vocabulary.
8. The camera positioning system according to claim 7, characterized in that the on-line device comprises:
an initialization module for, when the camera captures the planar hybrid marker, obtaining the marker image inside the border by contour detection, extracting the feature points and descriptors of the marker image to obtain its bag-of-words model, comparing this bag-of-words model, via frequency histograms, with the bag-of-words models of the images in the image database to obtain a matching image, computing the homography between the marker image and the matching image, and obtaining and optimizing the rotation matrix and translation vector, thereby obtaining the initial position and attitude of the camera;
an inter-frame tracking module for, when the video frame captured by the camera is frame t, using the feature points identified on the marker image in frame t-1 as tracking feature points, obtaining the feature points in frame t and their corresponding bag-of-words model by a pyramid optical-flow method, comparing this bag-of-words model, via frequency histograms, with the bag-of-words models of the images in the image database, and obtaining and optimizing the rotation matrix and translation vector of frame t, thereby obtaining the position and attitude of the camera at frame t;
a relocation module for, when the inter-frame tracking quality falls below a given threshold, performing border detection, extracting the feature points of the marker image inside the border and their corresponding bag-of-words model, and comparing this bag-of-words model with the one detected in the initialization phase, wherein, if the matching rate is too low, this marker image is taken as a new marker image and the initialization phase is re-entered.
9. The camera positioning system according to claim 8, characterized in that the initialization module comprises:
a detection unit for detecting the image edges of the video frame, applying a pixel dilation operation to make the edges continuous, obtaining a plurality of contours, performing polygonal approximation on all contours to obtain a plurality of closed regions, and, when the ratio of the area of a closed region to the area of the video frame exceeds a threshold, taking that closed region as a marker-image candidate region;
a recognition unit for rectifying the four vertices of the marker-image candidate region to a quadrilateral area to obtain a rectified image, extracting the features of the rectified image and counting the normalized frequency histogram of occurrences of each word of the vocabulary to obtain the corresponding bag-of-words model, comparing this bag-of-words model, via frequency histograms, with the bag-of-words models of the images in the image database, selecting the k markers with the most similar frequency histograms as candidates, normalizing the k candidate markers to the size of the rectified image, performing feature matching against the rectified image and estimating a homography, and taking the candidate marker with the highest inlier rate under the homography model as the recognition result for the rectified image;
a positioning unit for computing the matches between the feature points in the marker-image candidate region and the feature points in the recognition result, computing the homography between them and decomposing it to obtain the rotation matrix and translation vector, and optimizing with the following formula to obtain the initial position and attitude of the camera:
{K, R, t} = arg min_{K, R, t} Σ_{i=1,…,N} d(m_i, K[R t]X_i)²
Here K denotes the intrinsic parameter matrix of the camera (comprising the focal length and principal point of the camera), R the rotation matrix, t the translation vector, and N the number of image features on the marker image; X_i denotes the i-th image feature on the marker image, and d(m_i, K[R t]X_i) denotes the geometric distance between the reprojection of X_i and the image point m_i; the minimization is solved with the Levenberg-Marquardt iterative optimization algorithm.
10. The camera positioning system according to claim 9, characterized in that the threshold is 0.1.
CN201510547761.3A 2015-08-31 2015-08-31 A kind of camera localization method and system based on planar hybrid marker Active CN105069809B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510547761.3A CN105069809B (en) 2015-08-31 2015-08-31 A kind of camera localization method and system based on planar hybrid marker

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510547761.3A CN105069809B (en) 2015-08-31 2015-08-31 A kind of camera localization method and system based on planar hybrid marker

Publications (2)

Publication Number Publication Date
CN105069809A true CN105069809A (en) 2015-11-18
CN105069809B CN105069809B (en) 2017-10-03

Family

ID=54499166

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510547761.3A Active CN105069809B (en) 2015-08-31 2015-08-31 A kind of camera localization method and system based on planar hybrid marker

Country Status (1)

Country Link
CN (1) CN105069809B (en)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106651942A (en) * 2016-09-29 2017-05-10 苏州中科广视文化科技有限公司 Three-dimensional rotation and motion detecting and rotation axis positioning method based on feature points
CN106875446A (en) * 2017-02-20 2017-06-20 清华大学 Camera method for relocating and device
WO2017143979A1 (en) * 2016-02-22 2017-08-31 中兴通讯股份有限公司 Image search method and device
CN108447092A (en) * 2018-02-06 2018-08-24 中国科学院自动化研究所 The method and device of vision positioning marker
CN108615248A (en) * 2018-04-27 2018-10-02 腾讯科技(深圳)有限公司 Method for relocating, device, equipment and the storage medium of camera posture tracing process
CN108946488A (en) * 2017-12-20 2018-12-07 江苏耐维思通科技股份有限公司 One kind unloading volume platform region driving lifting identification device
CN109269493A (en) * 2018-08-31 2019-01-25 北京三快在线科技有限公司 A kind of localization method and device, mobile device and computer readable storage medium
CN110517319A (en) * 2017-07-07 2019-11-29 腾讯科技(深圳)有限公司 A kind of method and relevant apparatus that camera posture information is determining
CN110673727A (en) * 2019-09-23 2020-01-10 浙江赛伯乐众智网络科技有限公司 AR remote assistance method and system
CN110728711A (en) * 2018-07-17 2020-01-24 北京三快在线科技有限公司 Positioning and mapping method and device, and positioning method, device and system
CN111383270A (en) * 2018-12-27 2020-07-07 深圳市优必选科技有限公司 Object positioning method and device, computer equipment and storage medium
CN112800806A (en) * 2019-11-13 2021-05-14 深圳市优必选科技股份有限公司 Object pose detection tracking method and device, electronic equipment and storage medium
CN112950677A (en) * 2021-01-12 2021-06-11 湖北航天技术研究院总体设计所 Image tracking simulation method, device, equipment and storage medium
CN114979470A (en) * 2022-05-12 2022-08-30 咪咕文化科技有限公司 Camera rotation angle analysis method, device, equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102592285A (en) * 2012-03-05 2012-07-18 上海海事大学 Online calibration method of vision system of unmanned surface vessel
CN103903013A (en) * 2014-04-15 2014-07-02 复旦大学 Optimization algorithm of unmarked flat object recognition
CN104359461A (en) * 2014-11-06 2015-02-18 中国人民解放军装备学院 Binocular vision measuring system having variable structure and parameter determining method

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102592285A (en) * 2012-03-05 2012-07-18 上海海事大学 Online calibration method of vision system of unmanned surface vessel
CN103903013A (en) * 2014-04-15 2014-07-02 复旦大学 Optimization algorithm of unmarked flat object recognition
CN104359461A (en) * 2014-11-06 2015-02-18 中国人民解放军装备学院 Binocular vision measuring system having variable structure and parameter determining method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JUAN LEI ET AL: "《EFFICIENT POSE TRACKING ON MOBILE PHONES WITH 3D POINTS GROUPING》", 《2014 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO》 *
ZHENGYOU ZHANG: "《A Flexible New Technique for Camera Calibration》", 《IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE》 *
GUO TAO ET AL: "《Camera Calibration in a Small-Field-of-View Environment》", 《CHINESE JOURNAL OF LASERS》 *

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017143979A1 (en) * 2016-02-22 2017-08-31 中兴通讯股份有限公司 Image search method and device
CN106651942A (en) * 2016-09-29 2017-05-10 苏州中科广视文化科技有限公司 Three-dimensional rotation and motion detecting and rotation axis positioning method based on feature points
CN106651942B (en) * 2016-09-29 2019-09-17 苏州中科广视文化科技有限公司 Three-dimensional rotating detection and rotary shaft localization method based on characteristic point
CN106875446A (en) * 2017-02-20 2017-06-20 清华大学 Camera method for relocating and device
CN106875446B (en) * 2017-02-20 2019-09-20 清华大学 Camera method for relocating and device
CN110517319A (en) * 2017-07-07 2019-11-29 腾讯科技(深圳)有限公司 A kind of method and relevant apparatus that camera posture information is determining
CN108946488A (en) * 2017-12-20 2018-12-07 江苏耐维思通科技股份有限公司 One kind unloading volume platform region driving lifting identification device
CN108447092A (en) * 2018-02-06 2018-08-24 中国科学院自动化研究所 The method and device of vision positioning marker
CN108615248A (en) * 2018-04-27 2018-10-02 腾讯科技(深圳)有限公司 Method for relocating, device, equipment and the storage medium of camera posture tracing process
US11481923B2 (en) 2018-04-27 2022-10-25 Tencent Technology (Shenzhen) Company Limited Relocalization method and apparatus in camera pose tracking process, device, and storage medium
CN108615248B (en) * 2018-04-27 2022-04-05 腾讯科技(深圳)有限公司 Method, device and equipment for relocating camera attitude tracking process and storage medium
CN110728711B (en) * 2018-07-17 2021-11-12 北京三快在线科技有限公司 Positioning and mapping method and device, and positioning method, device and system
CN110728711A (en) * 2018-07-17 2020-01-24 北京三快在线科技有限公司 Positioning and mapping method and device, and positioning method, device and system
CN109269493A (en) * 2018-08-31 2019-01-25 北京三快在线科技有限公司 A kind of localization method and device, mobile device and computer readable storage medium
CN111383270A (en) * 2018-12-27 2020-07-07 深圳市优必选科技有限公司 Object positioning method and device, computer equipment and storage medium
CN111383270B (en) * 2018-12-27 2023-12-29 深圳市优必选科技有限公司 Object positioning method, device, computer equipment and storage medium
CN110673727A (en) * 2019-09-23 2020-01-10 浙江赛伯乐众智网络科技有限公司 AR remote assistance method and system
CN110673727B (en) * 2019-09-23 2023-07-21 浙江赛弘众智网络科技有限公司 AR remote assistance method and system
CN112800806A (en) * 2019-11-13 2021-05-14 深圳市优必选科技股份有限公司 Object pose detection tracking method and device, electronic equipment and storage medium
CN112800806B (en) * 2019-11-13 2023-10-13 深圳市优必选科技股份有限公司 Object pose detection tracking method and device, electronic equipment and storage medium
CN112950677A (en) * 2021-01-12 2021-06-11 湖北航天技术研究院总体设计所 Image tracking simulation method, device, equipment and storage medium
CN114979470A (en) * 2022-05-12 2022-08-30 咪咕文化科技有限公司 Camera rotation angle analysis method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN105069809B (en) 2017-10-03

Similar Documents

Publication Publication Date Title
CN105069809A (en) Camera positioning method and system based on planar mixed marker
Cui et al. Deep learning for image and point cloud fusion in autonomous driving: A review
CN103530881B (en) Be applicable to the Outdoor Augmented Reality no marks point Tracing Registration method of mobile terminal
CN103839277B (en) A kind of mobile augmented reality register method of outdoor largescale natural scene
Holte et al. A local 3-D motion descriptor for multi-view human action recognition from 4-D spatio-temporal interest points
CN107369183A (en) Towards the MAR Tracing Registration method and system based on figure optimization SLAM
Qin et al. Semantic loop closure detection based on graph matching in multi-objects scenes
Han et al. Line-based initialization method for mobile augmented reality in aircraft assembly
Li et al. Place recognition based on deep feature and adaptive weighting of similarity matrix
CN110389995A (en) Lane information detection method, device, equipment and medium
CN111767854B (en) SLAM loop detection method combined with scene text semantic information
Yang et al. CubeSLAM: Monocular 3D object detection and SLAM without prior models
CN111400423B (en) Smart city CIM three-dimensional vehicle pose modeling system based on multi-view geometry
Wang et al. A zero-watermarking scheme for three-dimensional mesh models based on multi-features
Hao et al. Recognition of basketball players’ action detection based on visual image and Harris corner extraction algorithm
Ramisa et al. Robust vision-based robot localization using combinations of local feature region detectors
Alcantarilla et al. Visibility learning in large-scale urban environment
Guo et al. 3d detection and pose estimation of vehicle in cooperative vehicle infrastructure system
CN108932275B (en) Bayesian methods for geospatial object/feature detection
Huang et al. Overview of LiDAR point cloud target detection methods based on deep learning
Munoz et al. Improving Place Recognition Using Dynamic Object Detection
Tao et al. 3d semantic vslam of indoor environment based on mask scoring rcnn
Li et al. RaP-Net: A region-wise and point-wise weighting network to extract robust features for indoor localization
Wang et al. Stream query denoising for vectorized hd map construction
Liang et al. The Design of an Intelligent Monitoring System for Human Action

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant