CN101794444A - Coordinate cyclic approach type dual orthogonal camera system video positioning method and system - Google Patents

Coordinate cyclic approach type dual orthogonal camera system video positioning method and system

Info

Publication number
CN101794444A
CN101794444A
Authority
CN
China
Prior art keywords
coordinate
axis
target
camera
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201010102367A
Other languages
Chinese (zh)
Other versions
CN101794444B (en)
Inventor
顾宏斌
汤勇
顾人舒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics filed Critical Nanjing University of Aeronautics and Astronautics
Priority to CN2010101023676A priority Critical patent/CN101794444B/en
Publication of CN101794444A publication Critical patent/CN101794444A/en
Application granted granted Critical
Publication of CN101794444B publication Critical patent/CN101794444B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Abstract

The invention discloses a coordinate cyclic approach type dual orthogonal camera system video positioning method and system. The optical axes of four or more planar cameras coincide with the axes of an orthogonal coordinate system, and at least one axis carries a pair of cameras facing each other. On this basis, a coordinate cyclic approach method is proposed for visual spatial localization, and an iterative algorithm with excellent convergence is designed, so that coarse initial values converge quickly to accurate ones; the algorithm is simple, and the target can therefore be located quickly and accurately.

Description

Coordinate cyclic approach type dual orthogonal camera system video positioning method and system
Technical field
The present invention relates to a video-based three-dimensional localization method and camera system device, and in particular to a multi-camera system arranged in a particular configuration and its localization method: the optical axes of four or more planar cameras are arranged orthogonally, and on at least one axis there is a pair of cameras facing each other. For a camera system in this special arrangement, a specially designed coordinate cyclic approach iteration can quickly and accurately locate a static target or track the three-dimensional position of a moving target, and the invention can be used in many situations requiring three-dimensional measurement and localization.
Background technology
The visual spatial localization technique is a three-dimensional measurement method built on computer vision theory. Using several cameras fixed relative to one another, it obtains images of the same scene from different angles and derives the three-dimensional coordinates of a spatial point from its parallax between two images. Vision localization based on camera systems is contactless, fast, and highly automated, and its convenience and low cost have made it very widely used.
Camera systems for visual spatial localization are divided into monocular, binocular, and multi-camera systems. Monocular and binocular vision have been studied fairly thoroughly, and most existing localization methods are based on a monocular or binocular camera system, or apply monocular algorithms in a multi-camera environment [1]. In the measurement process, such methods must first determine the positional relations between the image coordinate systems of the different viewpoints before the projection matrix of each viewpoint, and hence the three-dimensional information, can be obtained. A binocular vision system therefore generally places two cameras side by side with parallel optical axes, matches and locates image feature points by the triangulation principle, and solves linear equations derived from analytic geometry; its positioning accuracy is also related to the size of the parallax.
Multi-view vision technology is still developing; the camera arrangement of a typical multi-view system resembles that of a binocular system, the purpose of the additional cameras being a larger field of view. Multi-camera systems in orthogonal arrangements also exist at home and abroad, and recent foreign research has proposed extracting image features from an orthogonal camera system for three-dimensional tracking; for example, Enrique Muñoz et al. [2] proposed efficient 3D tracking of objects under orthogonal cameras based on estimating the parameters of a function characterizing the relative position of camera and target. However, existing target localization methods based on orthogonal camera systems still follow the localization methods of the monocular camera system, using complicated stereoscopic vision models and complicated computation in an attempt to determine all three target coordinates simultaneously in a single step.
Computer vision systems also employ an orthogonal iteration idea for pose estimation [3], but the camera arrangement there is not orthogonal, and the algorithm is still a monocular vision algorithm based on feature points.
[1] You Suya, Xu Guangyou. The present situation and progress of stereoscopic vision research. Journal of Image and Graphics [J]. 1997, 2(1): 17-23.
[2] Enrique Munoz. Efficient Tracking of 3D Objects Using Multiple Orthogonal Cameras [C]. Electronic Proceedings of the 19th British Machine Vision Conference, Leeds, UK, 2008.
[3] Xu Yunxi, Jiang Yunliang, Chen Fang. A generalized orthogonal iterative algorithm for pose estimation of multi-camera systems. Acta Optica Sinica [J]. 2009, 29(1): 72-77.
Summary of the invention
The objective of the present invention is to overcome the shortcomings of existing target localization techniques, namely complicated models, laborious computation, slow speed, and heavy consumption of computer resources, by providing a video positioning method and system that uses an orthogonal camera system. The system adopts a coordinate cyclic approach method and constructs an iterative algorithm with good convergence, achieving fast and accurate localization; moreover, because two paired cameras are arranged on one axis, error compensation is realized, so positioning accuracy and convergence speed are further improved. The method improves the efficiency, precision, and sensitivity of localization and can be used in robot vision, intelligent human-machine interaction, virtual reality, intelligent monitoring, and other fields.
To achieve the above objective, the present invention adopts the following technical scheme:
In the coordinate cyclic approach type dual orthogonal camera system video positioning method of the present invention, a pair of cameras facing each other is arranged on the X axis and one camera each on the Y axis and the Z axis, with the coordinate origin at the intersection point of the three optical axes. Between the target's position U on the imaging plane, the distance L from the target to the camera's optical centre along the optical axis, the perpendicular distance H between the target and the optical axis, and the corresponding camera focal length F, the following proportional relation holds:
L/F=H/U (1)
The method comprises the following steps:
The first step, initialization: set the target position initial value (x0, y0, z0), where x0, y0, z0 are the coordinates of the target on the X, Y, and Z axes respectively;
The second step: for one camera S1 on the X axis, obtain from the coordinate x0 the distance of the target along this camera's optical axis, L1 = |P1 - x0|; by formula (1), the two perpendicular distances H from the target to this optical axis with respect to the Y and Z axes give the target's coordinate y1 on the Y axis and coordinate z1 on the Z axis. For the other camera S4 on the X axis, obtain from x0 the distance L4 = |P4 - x0| along its optical axis and, again by formula (1), the coordinates y4 on the Y axis and z4 on the Z axis. Average the coordinates computed from the two cameras: y14 = (y1 + y4)/2, z14 = (z1 + z4)/2.
The third step: for the camera S2 on the Y axis, using the target coordinate y14 computed in the second step, obtain the distance L2 = |P2 - y14| of the target along this camera's optical axis; by formula (1), the perpendicular distances from the target to this optical axis with respect to the X and Z axes give the coordinates x2 on the X axis and z2 on the Z axis.
The fourth step: for the camera S3 on the Z axis, using the coordinate z2 computed in the third step, obtain the distance L3 = |P3 - z2| of the target along this camera's optical axis; by formula (1), the perpendicular distances from the target to this optical axis with respect to the X and Y axes give the coordinates x3 on the X axis and y3 on the Y axis.
The fifth step: average the coordinates of the target on the X, Y, and Z axes obtained in the second to fourth steps: x = (x2 + x3)/2, y = (y14 + y3)/2, z = (z14 + z2)/2;
The sixth step, convergence test: if the averaged coordinates x, y, z computed in the fifth step have converged to the initial values x0, y0, z0 of this iteration within a given accuracy ε, take x, y, z as the final target position and end the iteration; otherwise take x, y, z as the new initial values, i.e. x0 = x, y0 = y, z0 = z, and return to the second step.
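As a sketch, the six steps above can be simulated end to end. The following Python fragment is illustrative only: it assumes a single focal length F for all cameras, all optical centres at the same distance P from the origin (so P1 = P2 = P3 = P4 = P), and image measurements synthesized from a hypothetical true target position; none of these values come from the patent.

```python
F = 1.0   # focal length, assumed identical for all cameras
P = 1.2   # distance of every optical centre from the origin (assumed)

target = (0.3, -0.2, 0.25)   # hypothetical true position, used only to
                             # synthesize the image measurements below

def project(along, a, b):
    """Pinhole relation (1): L/F = H/U, hence U = F*H/L."""
    L = abs(P - along)       # distance along this camera's optical axis
    return F * a / L, F * b / L

tx, ty, tz = target
u1y, u1z = project(tx, ty, tz)    # camera S1 on the +X axis
u4y, u4z = project(-tx, ty, tz)   # camera S4 faces S1, so L4 = |P + x|
u2x, u2z = project(ty, tx, tz)    # camera S2 on the Y axis
u3x, u3y = project(tz, tx, ty)    # camera S3 on the Z axis

def locate(eps=1e-6, max_iter=50):
    x0 = y0 = z0 = 0.0                            # first step: initial values
    for it in range(1, max_iter + 1):
        L1, L4 = abs(P - x0), abs(P + x0)         # second step: S1 and S4
        y14 = (L1 * u1y / F + L4 * u4y / F) / 2   # averaged Y estimate
        z14 = (L1 * u1z / F + L4 * u4z / F) / 2   # averaged Z estimate
        L2 = abs(P - y14)                         # third step: S2 gives x2, z2
        x2, z2 = L2 * u2x / F, L2 * u2z / F
        L3 = abs(P - z2)                          # fourth step: S3 gives x3, y3
        x3, y3 = L3 * u3x / F, L3 * u3y / F
        x = (x2 + x3) / 2                         # fifth step: average
        y = (y14 + y3) / 2
        z = (z14 + z2) / 2
        if max(abs(x - x0), abs(y - y0), abs(z - z0)) < eps:
            return (x, y, z), it                  # sixth step: converged
        x0, y0, z0 = x, y, z
    return (x0, y0, z0), max_iter

pos, iters = locate()
```

Starting from the coarse initial value (0, 0, 0), the loop converges to the synthesized target position in a handful of iterations, consistent with the fast convergence claimed for the method.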
The beneficial effect of the present invention is that, by arranging a paired orthogonal camera system for vision localization, a coordinate cyclic approach method is proposed. Because a pair of cameras is arranged on one axis, the errors caused by the initial-value deviation compensate each other between these two cameras, so the computed result is more accurate and convergence is extremely fast. First, this iterative method avoids complicated stereoscopic vision models and complicated computation and has outstanding efficiency and error robustness: the error of any coordinate value at one step is not amplified or propagated into the result of the next step. Second, the method converges well and can locate a static target quickly and accurately. Third, for a moving target, the iterative nature of the localization method keeps improving precision while guaranteeing timely and sensitive tracking. Fourth, the method is identical for static and moving targets, so no judging or switching operations need be introduced. Finally, since the algorithm involves only addition, subtraction, multiplication, and division, it is easy to implement in hardware on a simple chip. In short, the localization method of the present invention is efficient, accurate, and sensitive.
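The error-compensation effect of the two facing cameras can be checked numerically. In this illustrative sketch (the values of P, F, and the target are assumptions, not from the patent), an error e in the x estimate multiplies the S1 estimate of y by roughly (1 + e/L1) and the S4 estimate by roughly (1 - e/L4), so the first-order errors largely cancel in the average y14:

```python
P, F = 1.2, 1.0                  # assumed optical-centre distance and focal length
x_true, y_true = 0.05, -0.2      # hypothetical true target coordinates

L1, L4 = P - x_true, P + x_true  # true optical-axis distances for S1 and S4
u1, u4 = F * y_true / L1, F * y_true / L4   # synthesized image coordinates

e = 0.1                          # deliberate error in the x estimate
x0 = x_true - e
y1 = abs(P - x0) * u1 / F        # S1 estimate: overshoots |y|
y4 = abs(P + x0) * u4 / F        # S4 estimate: undershoots |y|
y14 = (y1 + y4) / 2              # averaged dual-camera estimate

err_single = abs(y1 - y_true)    # error of one camera alone
err_avg = abs(y14 - y_true)      # error after averaging
```

With these numbers the averaged estimate lands far closer to the true y than either single-camera estimate, illustrating why the paired arrangement improves accuracy and convergence speed.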
Description of drawings
Fig. 1 is a schematic diagram of the pinhole imaging principle;
Fig. 2 is a schematic diagram of the orthogonal camera video positioning method;
Fig. 3 is a schematic diagram of locating a fingertip with the paired orthogonal camera system;
Fig. 4 is a schematic diagram of the coordinate systems in the positioning system.
Embodiment
Taking fingertip localization in a virtual reality system as an example, an embodiment of the present invention is described.
Fig. 1 is a schematic diagram of the pinhole imaging principle.
As shown in Fig. 2, the optical axes of three cameras S1, S2, S3 are arranged along the X, Y, and Z axes, and a fourth camera S4 facing S1 is arranged on the X axis; all camera optical axes point to the origin, and each optical centre is at distance P from the origin. According to the range of hand motion in the virtual reality system, the distance from each camera's optical centre to the origin is set to about 120 centimetres. The cameras are calibrated to determine the camera parameters and obtain the focal lengths. A colour mark is stuck on the fingertip, and the image position of the target is detected by a colour-mark-based method. Let ximg1, ximg2, ximg3, ximg4 be the x-axis image coordinates of the target in cameras S1, S2, S3, S4 respectively, yimg1, yimg2, yimg3, yimg4 the corresponding y-axis image coordinates, and zimg1, zimg2, zimg3, zimg4 the corresponding z-axis image coordinates; the image coordinate system of each camera takes the optical centre as origin, with axis directions consistent with the world coordinate axes X, Y, Z, as shown in Fig. 4.
Taking a single fingertip as the example, the concrete implementation steps are as follows:
1. Initialization: set the target initial value (x0, y0, z0) = (0, 0, 0), and let x2 = x3 = x0, y1 = y3 = y4 = y0, z1 = z2 = z4 = z0. Here y1, z1 are the Y- and Z-axis coordinates of the target computed from camera S1 during the iteration, x2, z2 those computed from camera S2, x3, y3 those computed from camera S3, and y4, z4 those computed from camera S4. Start video acquisition and go to step 2.
2. For the video image of the current sampling period, start fingertip detection (Fig. 3) and determine the fingertip position in image coordinates:
The fingertip is detected by a block algorithm based on the colour mark, as follows: first the raw video image is acquired by the image acquisition device and converted into HSV space by the computer. The image is then divided into blocks and the H component of every pixel in a block is examined; if a pixel's H component lies within a given threshold range it is counted, blocks whose count of qualifying pixels exceeds a given threshold are kept, and the largest connected region formed by the adjacent qualifying blocks is the target image. The target image position is obtained as the mean of the horizontal and vertical coordinates of all pixels in this region. If at least one camera detects the target image, go to step 3; otherwise wait for the next sampling period and repeat this step.
3. Use the coordinate cyclic approach method to iteratively compute the spatial position of the fingertip:
① From the target's X-axis coordinate x0 and the image coordinates obtained by cameras S1 and S4, compute the Y- and Z-axis coordinates of the target and average them; if camera S1 or S4 fails to recognize the target, keep the previous y1, z1 or y4, z4 unchanged:
y1 = L1·yimg1/fy1 (2)
z1 = L1·zimg1/fz1 (3)
y4 = L4·yimg4/fy4 (4)
z4 = L4·zimg4/fz4 (5)
y14 = (y1 + y4)/2 (6)
z14 = (z1 + z4)/2 (7)
where L1 = |P - x0|, L4 = |P + x0|, and fy1, fy4, fz1, fz4 are the focal lengths of the corresponding cameras.
② Using y14 and the image coordinates obtained by camera S2, compute the X- and Z-axis coordinates of the target:
x2 = L2·ximg2/fx2 (8)
z2 = L2·zimg2/fz2 (9)
where L2 = |P - y14| and fx2, fz2 are the focal lengths of the corresponding camera. If camera S2 fails to recognize the target, keep the previous x2, z2 unchanged.
③ Using z2 and the image coordinates obtained by camera S3, compute the X- and Y-axis coordinates of the target:
x3 = L3·ximg3/fx3 (10)
y3 = L3·yimg3/fy3 (11)
where L3 = |P - z2| and fx3, fy3 are the focal lengths of the corresponding camera. If camera S3 fails to recognize the target, keep the previous x3, y3 unchanged.
④ Average:
x = (x2 + x3)/2 (12)
y = (y14 + y3)/2, or y = (y1 + y4 + y3)/3 (13)
z = (z14 + z2)/2, or z = (z1 + z4 + z2)/3 (14)
⑤ If the averaged coordinates x, y, z of ④ have converged to the initial values x0, y0, z0 of this iteration within the given accuracy ε, take them as the final target position, end the iteration, and wait for the next sampling period; otherwise set x0 = x, y0 = y, z0 = z (and correspondingly x2 = x3 = x0, y1 = y3 = y4 = y0, z1 = z2 = z4 = z0) and go to ①.
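The colour-mark block detection of step 2 can be sketched in pure NumPy. This is an illustrative reading of the algorithm, not the patent's implementation: the block size, hue thresholds, and pixel-count threshold are assumed values, and the final merge of adjacent qualifying blocks into the largest connected region is simplified here to keeping all qualifying blocks.

```python
import numpy as np

BLOCK = 8                   # block side length in pixels (assumed)
H_LO, H_HI = 0.55, 0.65     # hue range of the colour mark (assumed)
MIN_COUNT = 20              # minimum in-range pixels for a block to be kept

def detect_mark(hsv):
    """Block-based colour-mark detection: threshold the H channel,
    keep blocks with more than MIN_COUNT in-range pixels, and return
    the mean (x, y) of the in-range pixels inside the kept blocks."""
    h = hsv[..., 0]
    mask = (h >= H_LO) & (h <= H_HI)
    keep = np.zeros_like(mask)
    rows, cols = mask.shape
    for r in range(0, rows, BLOCK):
        for c in range(0, cols, BLOCK):
            blk = mask[r:r + BLOCK, c:c + BLOCK]
            if blk.sum() > MIN_COUNT:           # block qualifies
                keep[r:r + BLOCK, c:c + BLOCK] = blk
    ys, xs = np.nonzero(keep)
    if xs.size == 0:
        return None                             # no target in this frame
    return xs.mean(), ys.mean()                 # target image position
```

Feeding a frame in which no pixel falls inside the hue range returns None, matching the "wait for the next sampling period" branch of step 2.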
Experiments show that in this system the fingertip position is obtained after about 4 iterations; the iteration is fast, converges stably, and the localization iteration completes within one video sampling period. This verifies the correctness of the method and system of the invention.

Claims (3)

1. A coordinate cyclic approach type dual orthogonal camera system video positioning method, characterized in that a pair of cameras facing each other is arranged on the X axis and one camera each on the Y axis and the Z axis, with the coordinate origin at the intersection point of the three optical axes; between the target's position U on the imaging plane, the distance L from the target to the camera's optical centre along the optical axis, the perpendicular distance H between the target and the optical axis, and the corresponding camera focal length F, the following proportional relation holds:
L/F=H/U (1)
The method comprises the following steps:
The first step, initialization: set the target position initial value (x0, y0, z0), where x0, y0, z0 are the coordinates of the target on the X, Y, and Z axes respectively;
The second step: for one camera S1 on the X axis, obtain from the coordinate x0 the distance of the target along this camera's optical axis, L1 = |P1 - x0|; by formula (1), the two perpendicular distances H from the target to this optical axis with respect to the Y and Z axes give the target's coordinate y1 on the Y axis and coordinate z1 on the Z axis. For the other camera S4 on the X axis, obtain from x0 the distance L4 = |P4 - x0| along its optical axis and, again by formula (1), the coordinates y4 on the Y axis and z4 on the Z axis. Average the coordinates computed from the two cameras: y14 = (y1 + y4)/2, z14 = (z1 + z4)/2.
The third step: for the camera S2 on the Y axis, using the target coordinate y14 computed in the second step, obtain the distance L2 = |P2 - y14| of the target along this camera's optical axis; by formula (1), the perpendicular distances from the target to this optical axis with respect to the X and Z axes give the coordinates x2 on the X axis and z2 on the Z axis.
The fourth step: for the camera S3 on the Z axis, using the coordinate z2 computed in the third step, obtain the distance L3 = |P3 - z2| of the target along this camera's optical axis; by formula (1), the perpendicular distances from the target to this optical axis with respect to the X and Y axes give the coordinates x3 on the X axis and y3 on the Y axis.
The fifth step: average the coordinates of the target on the X, Y, and Z axes obtained in the second to fourth steps: x = (x2 + x3)/2, y = (y14 + y3)/2, z = (z14 + z2)/2;
The sixth step, convergence test: if the averaged coordinates x, y, z computed in the fifth step have converged to the initial values x0, y0, z0 of this iteration within a given accuracy ε, take x, y, z as the final target position and end the iteration; otherwise take x, y, z as the new initial values, i.e. x0 = x, y0 = y, z0 = z, and return to the second step.
2. The coordinate cyclic approach type dual orthogonal camera system video positioning method according to claim 1, characterized in that the method by which a camera detects the target image coordinates is as follows:
For the video image photographed by the camera in a sampling period, fingertip detection is started and the fingertip position in image coordinates is determined, i.e. the fingertip is detected by a block algorithm based on the colour mark: first the raw video image is acquired by the image acquisition device and converted into HSV space by the computer; the image is then divided into blocks and the H component of every pixel in a block is examined; if a pixel's H component lies within a given threshold range it is counted, blocks whose count of qualifying pixels exceeds a given threshold are kept, and the largest connected region formed by the adjacent qualifying blocks is the target image; the target image position is obtained as the mean of the horizontal and vertical coordinates of all pixels in this region.
3. A coordinate cyclic approach type dual orthogonal camera system video positioning system, characterized by comprising an image acquisition device, a computer, and four cameras, wherein a pair of cameras facing each other is arranged on the X axis and one camera each on the Y axis and the Z axis, the coordinate origin lies at the intersection point of the four cameras' optical axes, and the output terminals of the four cameras are connected to the image acquisition device, whose output terminal is connected to the computer.
CN2010101023676A 2010-01-28 2010-01-28 Coordinate cyclic approach type dual orthogonal camera system video positioning method and system Active CN101794444B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010101023676A CN101794444B (en) 2010-01-28 2010-01-28 Coordinate cyclic approach type dual orthogonal camera system video positioning method and system

Publications (2)

Publication Number Publication Date
CN101794444A (en) 2010-08-04
CN101794444B CN101794444B (en) 2012-05-23

Family

ID=42587117

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010101023676A Active CN101794444B (en) 2010-01-28 2010-01-28 Coordinate cyclic approach type dual orthogonal camera system video positioning method and system

Country Status (1)

Country Link
CN (1) CN101794444B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105340258A (en) * 2013-06-28 2016-02-17 夏普株式会社 Location detection device
CN107274400A (en) * 2017-06-21 2017-10-20 歌尔股份有限公司 Space positioning apparatus, location processing method and device, virtual reality system
CN107274400B (en) * 2017-06-21 2021-02-12 歌尔光学科技有限公司 Space positioning device, positioning processing method and device, and virtual reality system

Also Published As

Publication number Publication date
CN101794444B (en) 2012-05-23

Similar Documents

Publication Publication Date Title
CN102999910B (en) Image depth calculating method
CN109993793B (en) Visual positioning method and device
Clipp et al. Robust 6dof motion estimation for non-overlapping, multi-camera systems
CN109752003B (en) Robot vision inertia point-line characteristic positioning method and device
WO2017077925A1 (en) Method and system for estimating three-dimensional pose of sensor
WO2015134795A2 (en) Method and system for 3d capture based on structure from motion with pose detection tool
CN107358633A (en) Join scaling method inside and outside a kind of polyphaser based on 3 points of demarcation things
CN108700946A (en) System and method for parallel ranging and fault detect and the recovery of building figure
CN111709973A (en) Target tracking method, device, equipment and storage medium
Li et al. Binocular vision positioning for robot grasping
CN103020988A (en) Method for generating motion vector of laser speckle image
CN116222543B (en) Multi-sensor fusion map construction method and system for robot environment perception
CN113888639B (en) Visual odometer positioning method and system based on event camera and depth camera
Li et al. A binocular MSCKF-based visual inertial odometry system using LK optical flow
Yang et al. Vision system of mobile robot combining binocular and depth cameras
CN104376323A (en) Object distance determining method and device
CN101777182B (en) Video positioning method of coordinate cycling approximation type orthogonal camera system and system thereof
CN105719290A (en) Binocular stereo depth matching method adopting time domain visual sensor
CN106595595A (en) Indoor robot orientation method based on depth sensor
CN101794444B (en) Coordinate cyclic approach type dual orthogonal camera system video positioning method and system
Jaramillo et al. 6-DoF pose localization in 3D point-cloud dense maps using a monocular camera
Chang et al. YOLOv4‐tiny‐based robust RGB‐D SLAM approach with point and surface feature fusion in complex indoor environments
CN114092564B (en) External parameter calibration method, system, terminal and medium for non-overlapping vision multi-camera system
Li et al. A real-time indoor visual localization and navigation method based on tango smartphone
Zhang et al. Passive 3D reconstruction based on binocular vision

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant