Disclosure of Invention
The invention aims to overcome the defects of the prior art. Existing simulation systems mostly adopt a perspective projection mode with a fixed viewpoint, so that once the geometric size and placement position of a display screen are determined, all parameters of the projection matrix remain unchanged throughout the simulation. The simulator multi-screen visual simulation method based on dynamic viewpoints solves the problems that the virtual scene displayed on the projection screens cannot be dynamically updated as the eyes or viewpoint of a flight trainee move relative to the cockpit, so that the imaging content and imaging effect differ from what the trainee would observe of the external environment through the cockpit window under real conditions.
In order to achieve the purpose, the invention adopts the technical scheme that:
The simulator multi-screen visual simulation method based on dynamic viewpoints comprises the following steps,
step (A), placing a Kinect sensor on each side of the multi-screen display to acquire raw point cloud data of the driving trainee's head, and performing conditional filtering and denoising on the acquired raw point cloud data to obtain processed point cloud data;
step (B), adopting a three-plane target algorithm to calibrate the extrinsic rotation matrix R and translation vector T of the Kinect sensors, and unifying the processed point cloud data into a global coordinate system to complete the fusion of the processed point cloud data of the two Kinect sensors;
step (C), based on the fused processed point cloud data, adopting a YOLOv4-based algorithm to obtain the head pose parameters in real time, completing the face region detection and head pose estimation of the driving trainee, and obtaining an object model of the virtual three-dimensional scene;
step (D), mapping the obtained virtual three-dimensional scene object model to the screen coordinate system for visualization through view transformation, projection transformation, perspective division and viewport transformation;
and (E), calculating in 3D space a perspective matrix from the virtual viewpoint to each screen according to the actual placement position and size of each screen, enhancing the stability of the projection parameters based on Accela filtering, and smoothing the head-parameter motion curve to complete the multi-screen visual simulation of the simulator.
Preferably, in step (A), a Kinect sensor is placed on each side of the multi-screen display to collect raw point cloud data of the driving trainee's head, and the collected raw point cloud data is subjected to conditional filtering and denoising to obtain processed point cloud data; the specific steps are as follows,
step (A1), performing conditional filtering on the acquired raw point cloud data: the conditional filtering algorithm processes the point cloud data along the x-axis, y-axis and z-axis to filter out the useless information and background on each axis; the filtering of a single axis is shown in formula (1), wherein P_out represents the conditionally filtered point cloud data set and P_in represents the input raw point cloud data set;
step (A2), denoising the acquired point cloud data with the bilateral filtering algorithm shown in formula (2), wherein p represents the point cloud characteristic information before bilateral filtering, p' represents the point cloud characteristic information obtained by the bilateral filtering algorithm, λ represents the overall weight parameter and n represents the normal-vector characteristic information before bilateral filtering; the overall weight parameter λ is defined by the following specific steps,
step (A21), λ is defined as shown in formula (3), wherein W_s represents the feature-preserving weight calculation function, W_c represents the smoothing filter weight calculation function, N(p) represents the k-neighborhood of point p, q − p represents the vector from point p to any point q in its neighborhood, and n represents the normal-vector feature information of point p in the neighborhood;
step (A22), W_c and W_s in formula (3) are shown in formula (4) and formula (5) respectively, wherein σ_c and σ_s represent the Gaussian filter coefficients of the tangent plane at point p.
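For reference, the standard point-cloud bilateral filter consistent with the definitions above can be written in LaTeX as follows; since the original formulas (2)-(5) are not reproduced in this text, this exact form is an assumption:

\[ p' = p + \lambda\, n \tag{2} \]
\[ \lambda = \frac{\displaystyle\sum_{q \in N(p)} W_c\!\left(\lVert q - p \rVert\right)\, W_s\!\left(\langle n,\, q - p \rangle\right)\, \langle n,\, q - p \rangle}{\displaystyle\sum_{q \in N(p)} W_c\!\left(\lVert q - p \rVert\right)\, W_s\!\left(\langle n,\, q - p \rangle\right)} \tag{3} \]
\[ W_c(x) = \exp\!\left(-\frac{x^2}{2\sigma_c^2}\right), \qquad W_s(x) = \exp\!\left(-\frac{x^2}{2\sigma_s^2}\right) \tag{4, 5} \]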
Preferably, in step (B), a three-plane target algorithm is adopted to calibrate the extrinsic rotation matrix R and translation vector T of the Kinect sensors, and the processed point cloud data is unified into the global coordinate system to complete the fusion of the point cloud data processed by the two Kinect sensors; the specific steps are as follows,
step (B1), setting a vector α as a vector in the three-dimensional coordinate system, i.e. α = P1P2, where P1 and P2 are points of the three-dimensional coordinate system; the representations of α in the coordinate systems C1 and C2 of the two Kinect sensors are defined as α1 and α2 respectively; α1 and α2 are normalized to obtain the unit vectors β1 and β2, and the relation between the unit vectors β1 and β2 is shown in formula (6), wherein R represents the extrinsic rotation matrix;
step (B2), setting K as the matrix formed by the rotation vector n; the relation between the matrix K and the rotation matrix R is shown in formula (7), wherein I represents the third-order identity matrix;
step (B3), using three groups of vectors whose unit vectors in the C1 and C2 coordinate systems are β1^i and β2^i respectively, the rotation matrix R can be solved; let B1 = [β1^1 β1^2 β1^3] and B2 = [β2^1 β2^2 β2^3]; the process of obtaining the rotation matrix R is shown in formula (8), formula (9) and formula (10), wherein n represents the rotation vector, B1 represents the matrix of unit vectors in C1 and B2 represents the matrix of unit vectors in C2;
step (B4), setting a three-dimensional space point D whose coordinates in C1 and C2 are defined as D1 and D2 respectively; then D1 = R·D2 + T, and averaging over n point pairs gives the translation vector T as shown in formula (11), wherein T represents the translation vector, D1^i represents the series of coordinate points obtained under the Kinect1 sensor and D2^i represents the series of coordinate points obtained under the Kinect2 sensor.
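For reference, a hedged LaTeX reconstruction of the calibration relations that matches the definitions above; the original formulas (6)-(11) are not reproduced in this text, and in particular the Rodrigues form given for formula (7) is an assumption:

\[ \beta_1^{\,i} = R\, \beta_2^{\,i}, \qquad i = 1, 2, 3 \tag{6} \]
\[ R = I + \sin\theta\, K + (1 - \cos\theta)\, K^2 \tag{7} \]
\[ B_1 = \left[\beta_1^{\,1}\ \ \beta_1^{\,2}\ \ \beta_1^{\,3}\right], \qquad B_2 = \left[\beta_2^{\,1}\ \ \beta_2^{\,2}\ \ \beta_2^{\,3}\right], \qquad R\,B_2 = B_1 \tag{8, 9} \]
\[ R = B_1 B_2^{-1} \tag{10} \]
\[ T = \frac{1}{n} \sum_{i=1}^{n} \left( D_1^{\,i} - R\, D_2^{\,i} \right) \tag{11} \]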
Preferably, in step (C), based on the fused point cloud data, a YOLOv4-based algorithm obtains the head pose parameters in real time, completes the face region detection and head pose estimation of the driving trainee, and obtains the object model of the virtual three-dimensional scene; the specific steps are as follows,
step (C1), establishing a loss function, which reflects the error between the model's prediction for a sample and the sample's actual label; the total loss function of the model is the sum of the loss functions of the three hierarchical feature maps, as shown in formula (12), wherein L represents the total loss function and L1, L2 and L3 represent the loss functions of the three hierarchical feature maps respectively;
step (C2), the loss function of each layer contains a part for calculating the position deviation of the face region, a part for calculating the classification error and a part for judging whether the face region contains the target object; the objective function of each layer is shown in formula (13), wherein the classification and confidence parts use Sigmoid as the activation function to convert the corresponding results into probability values;
step (C3), each part of each layer's function adopts cross entropy as the loss function, as shown in formula (14), wherein L_CE represents the output loss value, y represents the actual label corresponding to the sample and ŷ represents the corresponding prediction result of the sample.
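For reference, a hedged LaTeX reconstruction of the loss functions consistent with the definitions above (the exact weighting of the terms in the original formulas (12)-(14) is not recoverable from this text):

\[ L = L_1 + L_2 + L_3 \tag{12} \]
\[ L_i = L_{\mathrm{loc}} + L_{\mathrm{conf}} + L_{\mathrm{cls}} \tag{13} \]
\[ L_{\mathrm{CE}}(y, \hat{y}) = -\left[\, y \log \hat{y} + (1 - y) \log (1 - \hat{y}) \,\right] \tag{14} \]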
Preferably, in step (D), the obtained virtual three-dimensional scene object model is mapped to the screen coordinate system for visualization through view transformation, projection transformation, perspective division and viewport transformation, wherein the view transformation converts the world coordinate system into the camera coordinate system, the projection transformation maps three-dimensional coordinates into two-dimensional coordinates, the perspective division divides by the homogeneous coordinate so that the w component becomes 1, and the viewport transformation converts the processed coordinates into the screen coordinate system space.
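As an illustration of step (D), the following minimal Python/numpy sketch applies the four stages in order; the matrix contents are assumed to be the standard view and projection matrices rather than the particular ones of the invention:

import numpy as np

def view_transform(view_matrix, p_world):
    """View transformation: world coordinates -> camera coordinates."""
    return view_matrix @ p_world          # p_world is homogeneous, shape (4,)

def project(projection_matrix, p_camera):
    """Projection transformation: camera coordinates -> clip coordinates."""
    return projection_matrix @ p_camera

def perspective_divide(p_clip):
    """Perspective division: divide by w so the w component becomes 1."""
    return p_clip / p_clip[3]

def viewport_transform(p_ndc, width, height):
    """Viewport transformation: NDC in [-1, 1] -> screen pixel coordinates."""
    x = (p_ndc[0] + 1.0) * 0.5 * width
    y = (1.0 - (p_ndc[1] + 1.0) * 0.5) * height   # flip y for screen space
    return np.array([x, y])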
Preferably, in step (E), a perspective matrix from the virtual viewpoint to each screen is calculated in 3D space according to the actual placement position and size of each screen, the stability of the projection parameters is enhanced based on Accela filtering, and the head-parameter motion curve is smoothed to complete the multi-screen visual simulation of the simulator; the specific steps are as follows,
step (E1), adopting a fixed-viewpoint mode, mapping the head coordinate system and the screen coordinate systems into the same world coordinate system, obtaining the perspective matrix by calculating the transformation matrix between the head coordinate system and each screen coordinate system, and completing the splicing of the multiple screens;
step (E2), on the basis of fixed-viewpoint multi-screen splicing, calculating the perspective matrix from the virtual viewpoint to each screen in real time for every frame according to the dynamic viewpoint technology;
and (E3) enhancing the stability of the projection parameters based on Accela filtering, and smoothing the motion curve of the head parameters.
Preferably, the specific steps of step (E1) are as follows,
step (E11), constructing the calculation formula of the perspective matrix, as shown in formula (15), wherein n represents the distance from the virtual viewpoint to the near clipping plane and f represents the distance from the virtual viewpoint to the far clipping plane;
and (E12), solving the constructed perspective matrix; the specific steps are as follows,
step (E121), obtaining the center coordinates and the vertex coordinates of each screen; the binormal vector, tangent vector and normal vector of each screen are then obtained from the center coordinates and the screen vertex coordinates and are normalized in turn to obtain u, v and w, from which the rotation matrix R_s and the translation vector T_s are obtained as shown in formula (16) and formula (17);
step (E122), obtaining the view transformation matrix V from the rotation matrix R_s and the translation vector T_s, as shown in formula (18);
step (E123), setting the distances from the near and far clipping planes to the viewpoint as n and f, and then calculating the scaling factors of the perspective projection, as shown in formula (19) and formula (20);
step (E124), calculating the boundary conditions l, r, b and t of the view frustum and substituting them into formula (15), as shown in formula (21).
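For reference, the standard off-axis perspective (frustum) matrix that the definitions of n, f, l, r, b and t suggest for formulas (15) and (21) is the OpenGL form below; whether the invention uses exactly this form is an assumption:

\[
P = \begin{bmatrix}
\dfrac{2n}{r - l} & 0 & \dfrac{r + l}{r - l} & 0 \\[6pt]
0 & \dfrac{2n}{t - b} & \dfrac{t + b}{t - b} & 0 \\[6pt]
0 & 0 & -\dfrac{f + n}{f - n} & -\dfrac{2 f n}{f - n} \\[6pt]
0 & 0 & -1 & 0
\end{bmatrix}
\]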
Preferably, the specific steps of step (E2) are as follows,
first, a world coordinate system, the screen coordinate systems, the Kinect camera coordinate systems and a head coordinate system are constructed; the specific steps are as follows,
step (E21), the Kinect camera coordinate systems C_k1 and C_k2 are unified by an external calibration algorithm that maps the acquisition information of one Kinect into the camera coordinate system space of the other Kinect; the mapping from the Kinect camera coordinate system to the world coordinate system is likewise completed by the external calibration algorithm, and R_kw and T_kw are solved;
step (E22), each screen coordinate system S_i is unified to the screen center coordinate system S through the mutual position relations and the screen sizes, obtaining R_si and T_si; the transformation from the screen center coordinate system S to the world coordinate system W is completed by manual measurement, and R_sw and T_sw are solved, wherein the errors caused by manual measurement can be corrected by setting an effective compensation value in the program;
step (E23), there being only a translation transformation between the coordinate systems concerned, a compensation vector is defined and the translation-vector part of the target transformation matrix is zeroed to obtain the corresponding rotation matrix and translation vector; the pose of the head coordinate system H relative to the camera coordinate system C can then be obtained from the motion posture of the head, as shown in formula (22), wherein R_h and T_h represent the corresponding rotation matrix and translation vector respectively;
step (E24), calculating the transformation matrix from each screen coordinate system to the head coordinate system, as shown in formula (23).
Preferably, the Accela algorithm in step (E3) splits the head parameters P into a position part P_t and a rotation part P_r; the specific steps are as follows,
step (E31), constructing the noise filtering function shown in formula (24), wherein x represents any one of the independent variables in P_t; d_t represents the noise threshold corresponding to the position part, disturbance noise lower than d_t being ignored; and s_t represents the smoothing coefficient corresponding to the position part;
step (E32), the noise threshold can suppress tiny noise on a single channel but cannot filter the jitter generated by the superposition of noise, so position noise suppression factors are set, as shown in formula (25) and formula (26) respectively.
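One plausible form of the formula (24) noise filtering function, consistent with the threshold d_t and smoothing coefficient s_t defined above (an assumption, since the original formula is not reproduced in this text), is:

\[
f(x) =
\begin{cases}
0, & |x| \le d_t \\[4pt]
\operatorname{sgn}(x)\, \dfrac{|x| - d_t}{s_t}, & |x| > d_t
\end{cases}
\]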
the invention has the beneficial effects that:
(1) According to the invention, Kinect sensors are arranged on both sides of the screen to acquire point cloud data of the driving trainee's head, and the head point cloud is obtained through point cloud fusion, which avoids the problem of self-occlusion of the trainee's head; meanwhile, conditional filtering is used to filter out the background and useless information.
(2) The head pose estimation method used by the invention can track the head in real time, simultaneously detecting the head and estimating its pose parameters, and then dynamically updates the visual scene to the multi-screen display unit in real time using the multi-screen splicing technology.
(3) The multi-screen visual scene splicing based on dynamic viewpoints builds, on the basis of fixed-viewpoint splicing, a global coordinate system model and uses the obtained head pose parameters to update the position and angle of the virtual viewpoint in real time, so that the screen splicing scheme can follow the pose changes of the driver's head and realize a virtual scene with a high sense of presence.
(4) Since a dynamic viewpoint system is particularly sensitive to noise, the invention uses Accela filtering to enhance the stability of the projection parameters and to smooth the head-parameter motion curve, so that the jitter generated during head motion does not affect the visual content on the screens.
Detailed Description
The invention will be further described with reference to the accompanying drawings.
As shown in fig. 1, the simulator multi-screen visual simulation method based on dynamic viewpoints according to the present invention comprises the following steps,
step (A), placing a Kinect sensor on each side of the multi-screen display to collect raw point cloud data of the driving trainee's head, and performing conditional filtering and denoising on the collected raw point cloud data to obtain processed point cloud data,
The raw point cloud data collected by the Kinect sensors contains much noise and useless information. A conditional filtering algorithm can filter out the useless information well, but some noise remains, so a bilateral filtering algorithm is applied afterwards. The bilateral filtering algorithm achieves noise reduction while smoothing and preserving the edge information of the picture; when distinguishing noise points from outliers, it considers both the Euclidean distance between the current point and its neighboring points and the local geometric information of the current point within its neighborhood.
Step (A1), performing conditional filtering on the acquired raw point cloud data: the conditional filtering algorithm processes the point cloud data along the x-axis, y-axis and z-axis to filter out the useless information and background on each axis; the filtering of a single axis is shown in formula (1), wherein P_out represents the conditionally filtered point cloud data set and P_in represents the input raw point cloud data set;
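A minimal Python sketch of such a conditional (pass-through) filter; the axis ranges are illustrative values for a head-tracking working volume, not the patent's parameters:

import numpy as np

def conditional_filter(points,
                       x_range=(-1.0, 1.0),   # assumed working volume (m)
                       y_range=(-1.0, 1.0),
                       z_range=(0.4, 1.5)):
    """Conditional (pass-through) filtering of an (N, 3) point cloud:
    keep only points whose x, y and z values fall inside the given
    ranges, discarding background and useless information per axis."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    mask = ((x_range[0] <= x) & (x <= x_range[1]) &
            (y_range[0] <= y) & (y <= y_range[1]) &
            (z_range[0] <= z) & (z <= z_range[1]))
    return points[mask]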
step (A2), denoising the acquired point cloud data with the bilateral filtering algorithm shown in formula (2), wherein p represents the point cloud characteristic information before bilateral filtering, p' represents the point cloud characteristic information obtained by the bilateral filtering algorithm, λ represents the overall weight parameter and n represents the normal-vector characteristic information before bilateral filtering; the overall weight parameter λ is defined by the following specific steps,
step (A21), λ is defined as shown in formula (3), wherein W_s represents the feature-preserving weight calculation function, W_c represents the smoothing filter weight calculation function, N(p) represents the k-neighborhood of point p, q − p represents the vector from point p to any point q in its neighborhood, and n represents the normal-vector feature information of point p in the neighborhood;
step (A22), W_c and W_s in formula (3) are shown in formula (4) and formula (5) respectively, wherein σ_c and σ_s represent the Gaussian filter coefficients of the tangent plane at point p.
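A minimal Python sketch of the bilateral filter as reconstructed above; the neighbourhood size and Gaussian coefficients are illustrative assumptions:

import numpy as np

def bilateral_filter_point(points, normals, i, k=16,
                           sigma_c=0.05, sigma_s=0.02):
    """Move point i along its normal by the weighted offset lambda
    (formulas (2)-(5) as reconstructed above); sigma values assumed."""
    p, n = points[i], normals[i]
    # brute-force k-nearest neighbours (a KD-tree would be used in practice)
    d = np.linalg.norm(points - p, axis=1)
    neighbours = np.argsort(d)[1:k + 1]
    num = den = 0.0
    for j in neighbours:
        diff = points[j] - p
        dist = np.linalg.norm(diff)          # ||q - p||
        proj = float(np.dot(n, diff))        # <n, q - p>
        w_c = np.exp(-dist**2 / (2 * sigma_c**2))   # smoothing weight
        w_s = np.exp(-proj**2 / (2 * sigma_s**2))   # feature-preserving weight
        num += w_c * w_s * proj
        den += w_c * w_s
    lam = num / den if den > 0 else 0.0      # overall weight parameter
    return p + lam * n                       # filtered point p'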
Step (B), adopting a three-plane target algorithm to calibrate the extrinsic rotation matrix R and translation vector T of the Kinect sensors, and unifying the processed point cloud data into the global coordinate system to complete the fusion of the processed point cloud data of the two Kinect sensors,
step (B1), setting a vector α as a vector in the three-dimensional coordinate system, i.e. α = P1P2, where P1 and P2 are points of the three-dimensional coordinate system; the representations of α in the coordinate systems C1 and C2 of the two Kinect sensors are defined as α1 and α2 respectively; α1 and α2 are normalized to obtain the unit vectors β1 and β2, and the relation between the unit vectors β1 and β2 is shown in formula (6), wherein R represents the extrinsic rotation matrix;
step (B2), setting K as the matrix formed by the rotation vector n; the relation between the matrix K and the rotation matrix R is shown in formula (7), wherein I represents the third-order identity matrix;
step (B3), using three groups of vectors whose unit vectors in the C1 and C2 coordinate systems are β1^i and β2^i respectively, the rotation matrix R can be solved; let B1 = [β1^1 β1^2 β1^3] and B2 = [β2^1 β2^2 β2^3]; the process of obtaining the rotation matrix R is shown in formula (8), formula (9) and formula (10), wherein n represents the rotation vector, B1 represents the matrix of unit vectors in C1 and B2 represents the matrix of unit vectors in C2;
step (B4), setting a three-dimensional space point D whose coordinates in C1 and C2 are defined as D1 and D2 respectively; then D1 = R·D2 + T, and averaging over n point pairs gives the translation vector T as shown in formula (11), wherein T represents the translation vector, D1^i represents the series of coordinate points obtained under the Kinect1 sensor and D2^i represents the series of coordinate points obtained under the Kinect2 sensor.
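A minimal Python sketch of the fusion step under the reconstruction R = B1·B2⁻¹ and the averaged translation of formula (11); the input conventions are assumptions:

import numpy as np

def solve_extrinsics(b1, b2, d1, d2):
    """Solve R and T between the two Kinect coordinate systems.
    b1, b2: (3, 3) arrays whose COLUMNS are the three unit vectors
            beta_1^i and beta_2^i measured in C1 and C2;
    d1, d2: (N, 3) arrays of matching 3-D points under Kinect1/Kinect2.
    Assumes the three vectors are not coplanar (as with a three-plane
    target), so B2 is invertible."""
    B1 = np.asarray(b1, dtype=float)
    B2 = np.asarray(b2, dtype=float)
    R = B1 @ np.linalg.inv(B2)              # rotation from C2 to C1
    T = np.mean(d1 - (R @ d2.T).T, axis=0)  # formula (11)
    return R, T

def fuse(points_k2, R, T):
    """Map Kinect2 points into the Kinect1 (global) coordinate system."""
    return (R @ points_k2.T).T + T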
As shown in FIG. 2, in step (C), based on the fused point cloud data, a YOLOv4-based algorithm obtains the head pose parameters in real time, completes the face region detection and head pose estimation of the driving trainee, and obtains the object model of the virtual three-dimensional scene; the specific steps are as follows,
wherein the YOLOv4-based algorithm is used to acquire the head pose parameters in real time, completing head detection and head pose estimation simultaneously;
its processing flow is essentially identical to that of YOLOv4, except that a front-view BEV image replaces the RGB image that the YOLOv4 network takes as input; to enable the head pose algorithm to predict the target parameters, the processed three-dimensional point cloud data is programmatically converted into the front-view BEV image. To make the algorithm suitable for face region detection and head pose estimation, each anchor frame of the model corresponds to an output containing 9 values, of which 1 indicates whether the anchor frame is a positive sample, 2 are the position parameters of the anchor frame, 3 are the offsets from the anchor frame center to the nose at the center of the head, and 3 carry the rotation-angle information of the head.
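To illustrate the 9-value anchor output described above, a hypothetical decoding sketch in Python follows; the field order and activations are assumptions, not the patent's specification:

import numpy as np

def decode_anchor_output(raw):
    """Split the 9-value output of one anchor frame as described above.
    Field order and activations are illustrative assumptions."""
    raw = np.asarray(raw, dtype=float)
    assert raw.shape == (9,)
    sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
    return {
        "is_positive": sigmoid(raw[0]),   # probability the anchor is a positive sample
        "box_params":  raw[1:3],          # 2 position parameters of the anchor frame
        "nose_offset": raw[3:6],          # offset from anchor centre to the nose
        "head_angles": raw[6:9],          # head rotation angles (e.g. yaw/pitch/roll)
    }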
Step (C1), establishing a loss function, which reflects the error between the model's prediction for a sample and the sample's actual label; the total loss function of the model is the sum of the loss functions of the three hierarchical feature maps, as shown in formula (12), wherein L represents the total loss function and L1, L2 and L3 represent the loss functions of the three hierarchical feature maps respectively;
step (C2), the loss function of each layer contains a part for calculating the position deviation of the face region, a part for calculating the classification error and a part for judging whether the face region contains the target object; the objective function of each layer is shown in formula (13), wherein the classification and confidence parts use Sigmoid as the activation function to convert the corresponding results into probability values;
step (C3), each part of each layer's function adopts cross entropy as the loss function, as shown in formula (14), wherein L_CE represents the output loss value, y represents the actual label corresponding to the sample and ŷ represents the corresponding prediction result of the sample.
And in step (D), the obtained virtual three-dimensional scene object model is mapped to the screen coordinate system for visualization through view transformation, projection transformation, perspective division and viewport transformation, wherein the view transformation converts the world coordinate system into the camera coordinate system, the projection transformation maps three-dimensional coordinates into two-dimensional coordinates, the perspective division divides by the homogeneous coordinate so that the w component becomes 1, and the viewport transformation converts the processed coordinates into the screen coordinate system space.
The projection transformation is a key step: it maps three-dimensional coordinates into two-dimensional coordinates in either an orthographic projection mode or a perspective projection mode. Orthographic projection uses a rectangular viewing volume and does not scale an object according to its distance from the virtual viewpoint; it is equivalent to discarding the depth-axis information and projecting the three-dimensional object directly onto the two-dimensional plane. Perspective projection uses a viewing-frustum volume that imitates the way the eyes observe the world: the three-dimensional object is projected onto the two-dimensional plane following the rule that near objects appear large and distant objects small, and any part of the three-dimensional scene outside the frustum is not projected.
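A short Python sketch of the off-axis perspective projection matrix used for per-screen projection, in the standard OpenGL form (assumed; the invention's exact matrix layout is given by its formulas (15) and (21)):

import numpy as np

def frustum(l, r, b, t, n, f):
    """Off-axis perspective projection matrix (standard OpenGL form);
    n and f are the near/far clipping distances, l/r/b/t the frustum
    boundaries on the near plane."""
    return np.array([
        [2*n/(r-l), 0.0,        (r+l)/(r-l),  0.0],
        [0.0,       2*n/(t-b),  (t+b)/(t-b),  0.0],
        [0.0,       0.0,       -(f+n)/(f-n), -2*f*n/(f-n)],
        [0.0,       0.0,       -1.0,          0.0],
    ])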
Step (E), calculating in 3D space a perspective matrix from the virtual viewpoint to each screen according to the actual placement position and size of each screen, enhancing the stability of the projection parameters based on Accela filtering, and smoothing the head-parameter motion curve to complete the multi-screen visual simulation of the simulator; the specific steps are as follows,
The screen display unit of the aircraft simulator is formed into an arc by upper and lower rows of liquid crystal screens, a design that conforms to human engineering and realizes a large-field-angle virtual visual environment with a strong sense of presence. The multi-screen view splicing technology calculates, in 3D space, a projection matrix from the virtual scene viewpoint to each screen according to the actual placement position and size of each liquid crystal screen. Because the screen edges abut one another in the actual placement, the pictures projected onto the individual screens by the calculated perspective matrices splice into the complete picture of the virtual scene. When the driving trainee's head moves, the head position and posture calculated by the method are used to update the parameters of the virtual viewpoint, and the perspective matrix from the virtual viewpoint to each screen is recalculated, thereby realizing the multi-screen view splicing scheme under a dynamic viewpoint.
Step (E1), adopting a fixed-viewpoint mode, mapping the head coordinate system and the screen coordinate systems into the same world coordinate system, obtaining the perspective matrix by calculating the transformation matrix between them, and completing the splicing of the multiple screens; the specific steps are as follows,
step (E11), constructing the calculation formula of the perspective matrix, as shown in formula (15), wherein n represents the distance from the virtual viewpoint to the near clipping plane and f represents the distance from the virtual viewpoint to the far clipping plane;
and (E12), solving the constructed perspective matrix; the specific steps are as follows,
step (E121), obtaining the center coordinates and the vertex coordinates of each screen; the binormal vector, tangent vector and normal vector of each screen are then obtained from the center coordinates and the screen vertex coordinates and are normalized in turn to obtain u, v and w, from which the rotation matrix R_s and the translation vector T_s are obtained as shown in formula (16) and formula (17);
step (E122), obtaining the view transformation matrix V from the rotation matrix R_s and the translation vector T_s, as shown in formula (18);
step (E123), setting the distances from the near and far clipping planes to the viewpoint as n and f, and then calculating the scaling factors of the perspective projection, as shown in formula (19) and formula (20);
step (E124), calculating the boundary conditions l, r, b and t of the view frustum and substituting them into formula (15), as shown in formula (21).
Step (E2), on the basis of fixed-viewpoint multi-screen splicing, calculating the perspective matrix from the virtual viewpoint to each screen in real time for every frame according to the dynamic viewpoint technology; the specific steps are as follows.
In the fixed-viewpoint mode, multi-screen splicing can be completed by mapping the head coordinate system and the screen coordinate systems into the same world coordinate system in advance and calculating their transformation matrices to obtain the perspective matrices; multi-screen view splicing based on the dynamic viewpoint technology, by contrast, must calculate the perspective matrix from the virtual viewpoint to each screen anew for every frame.
Step (E21), a world coordinate system, a screen coordinate system, a Kinect camera coordinate system and a head coordinate system are constructed, the specific steps are as follows,
step (E21), the Kinect camera coordinate systems C_k1 and C_k2 are unified by an external calibration algorithm that maps the acquisition information of one Kinect into the camera coordinate system space of the other Kinect; the mapping from the Kinect camera coordinate system to the world coordinate system is likewise completed by the external calibration algorithm, and R_kw and T_kw are solved;
step (E22), each screen coordinate system S_i is unified to the screen center coordinate system S through the mutual position relations and the screen sizes, obtaining R_si and T_si; the transformation from the screen center coordinate system S to the world coordinate system W is completed by manual measurement, and R_sw and T_sw are solved, wherein the errors caused by manual measurement can be corrected by setting an effective compensation value in the program;
step (E23), there being only a translation transformation between the coordinate systems concerned, a compensation vector is defined and the translation-vector part of the target transformation matrix is zeroed to obtain the corresponding rotation matrix and translation vector; the pose of the head coordinate system H relative to the camera coordinate system C can then be obtained from the motion posture of the head, as shown in formula (22), wherein R_h and T_h represent the corresponding rotation matrix and translation vector respectively;
step (E24), calculating the transformation matrix from each screen coordinate system to the head coordinate system, as shown in formula (23).
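As an illustration of the coordinate chain of steps (E21)-(E24), a minimal Python sketch composing homogeneous transforms follows; the matrix names are hypothetical:

import numpy as np

def to_homogeneous(R, T):
    """Pack a 3x3 rotation and a 3-vector translation into a 4x4 matrix."""
    M = np.eye(4)
    M[:3, :3] = R
    M[:3, 3] = T
    return M

# Hypothetical per-frame update: screen_i -> world -> camera -> head.
# M_sw, M_wc, M_ch are 4x4 matrices built from the (R, T) pairs solved
# in steps (E21)-(E23); the names are illustrative.
def screen_to_head(M_sw, M_wc, M_ch):
    return M_ch @ M_wc @ M_sw     # formula (23), as a matrix chain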
And (E3), enhancing the stability of the projection parameters based on Accela filtering and smoothing the head-parameter motion curve, wherein the Accela algorithm splits the head parameters P into a position part P_t and a rotation part P_r; the specific steps are as follows,
A dynamic viewpoint system is very sensitive to noise: the head parameters comprise six independent variables, and even if the driving trainee keeps the head still, noise affecting any one variable makes the screen appear to shake. A real-time system must compute many frames per second to fit a nonlinear continuous model in time, and reasonable smoothing filtering is required to reduce the eye's perception of this frequency and enhance comfort.
Step (E31), constructing the noise filtering function shown in formula (24), wherein x represents any one of the independent variables in P_t; d_t represents the noise threshold corresponding to the position part, disturbance noise lower than d_t being ignored; and s_t represents the smoothing coefficient corresponding to the position part;
step (E32), the noise threshold can suppress tiny noise on a single channel but cannot filter the jitter generated by the superposition of noise, so position noise suppression factors are set, as shown in formula (25) and formula (26) respectively.
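A minimal Python sketch of an Accela-style dead-zone-plus-smoothing filter applied to the six head parameters; the thresholds and smoothing coefficients are illustrative, not the patent's values:

import numpy as np

class AccelaLikeFilter:
    """Per-axis dead zone plus smoothing for the 6-DOF head parameters
    (x, y, z, yaw, pitch, roll). A sketch of the behaviour the text
    describes; parameter values are illustrative assumptions."""
    def __init__(self, deadzone=(1.0,)*3 + (0.5,)*3,
                 smoothing=(4.0,)*3 + (6.0,)*3):
        self.d = np.asarray(deadzone, dtype=float)
        self.s = np.asarray(smoothing, dtype=float)
        self.state = np.zeros(6)

    def update(self, raw):
        delta = np.asarray(raw, dtype=float) - self.state
        mag = np.maximum(np.abs(delta) - self.d, 0.0)  # ignore noise below d
        step = np.sign(delta) * mag / self.s           # smooth the remainder
        self.state += step
        return self.state

# usage: flt = AccelaLikeFilter(); smoothed = flt.update(head_params)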
In order to verify the validity and effectiveness of the method according to the invention, a specific embodiment is described below.
the screen display unit of the embodiment adopts eight 3D liquid crystal televisions as the projection screen, and generates a full 3D virtual simulation environment in a mode of splicing multiple screen views. The 4 televisions correspond to a front window of the cockpit, are vertically arranged at 135 degrees between every two televisions and are integrally inclined forward by 30 degrees; the rest 4 televisions correspond to the windows under the feet of the cockpit, are vertically arranged by 180 degrees between every two televisions and form an included angle of 40 degrees with the horizontal plane. Two are adopted
The cameras are arranged right above the front television pairwise junction positions, and the optical axis faces the head position of a flying student in a normal sitting posture to complete dynamic detection of the viewpoint of the student. Each liquid crystal display screen is driven by a graphic computer, and the computers finish the synchronization of data such as control data, viewpoint parameters, simulation entity pose and the like through an open source CIGI distributed network protocol.
The algorithm needs to be trained in advance; the training data comes from a public head pose database. The database provides 42 sets of head depth data collected with Kinect, 32000 frames in total, from 26 men, 10 women and 6 wearers of glasses. Each Kinect frame contains a depth image and a color image, and the pose parameters were labeled afterwards with the Faceshift technology, which gives the head pose parameters of each frame of depth image as the tag information. Table 1 gives the recognition results of four consecutive image sequences in the data set.
Comparing the output of the head pose estimation algorithm with the actual tags shows that the algorithm achieves good accuracy and can track the head in real time while outputting the head pose parameters of each frame.
This embodiment realizes the aircraft simulator multi-screen view splicing scheme based on eight high-definition LCD screens arranged in upper and lower rows. Four different scenes, including a residential area, an offshore platform, open country and an airport, were selected, and the multi-screen view splicing effect was shown from both inside and outside viewing angles. The multi-screen splicing scheme splices the virtual views well and provides a large-field-angle virtual environment with a high sense of presence, bringing the driving trainee a better sense of reality and immersion and thus a better training effect.
Because an actual system inevitably mixes in sensor noise and algorithmic random noise, such as involuntary head movements, mechanical shaking of the cockpit base and recognition errors of the head pose estimation method, the visual content on the screens can appear to shake, which affects the training effect. The projection-parameter stability enhancement algorithm based on Accela filtering is used to suppress this; experimental analysis of the algorithm shows that its smoothing effect is good and that it can handle the noise fluctuations of the data in real time.
The visual display system designed in this scheme consists of 8 high-resolution liquid crystal displays arranged four up and four down. The angles of the upper 4 displays and their distance to the driving trainee can be adjusted through the screen adjusting structure, and the distance of the lower displays to the trainee can likewise be adjusted. The visual display system of the simulator splices the images together using the multi-screen splicing technology, and the 8 liquid crystal displays together provide large horizontal and vertical fields of view, giving the driving trainee a wide visual range. Meanwhile, the liquid crystal screens provide high resolution, brightness and contrast, so the visual display system combines a large field of view with high brightness, high contrast and high resolution, providing trainees with a continuous and complete out-of-window scene of high fidelity. The multi-screen splicing scheme achieves a low price while splicing the virtual scene well, provides a large-viewing-angle virtual scene with a high sense of presence, and brings the driving trainee a better sense of immersion, so that a better training effect is achieved.
In summary, the simulator multi-screen visual simulation method based on dynamic viewpoints of the invention first collects point cloud data of the driving trainee's head in a non-contact manner with Kinect sensors and estimates the trainee's head pose information in real time, adopting an arrangement of two Kinects to solve the self-occlusion problem of the head; it then uses the head pose estimation method to detect the head and estimate its pose simultaneously with real-time tracking; on the basis of the fixed-viewpoint multi-screen view splicing scheme, it builds a global coordinate system model and updates the position and angle of the virtual scene viewpoint in real time from the obtained head pose parameters, realizing the multi-screen view splicing scheme based on dynamic viewpoints; and for the jitter that may occur during head motion, it provides Accela filtering, a curve interpolation smoothing algorithm based on empirical estimation.
The foregoing illustrates and describes the principles, general features, and advantages of the present invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are given by way of illustration of the principles of the present invention, but that various changes and modifications may be made without departing from the spirit and scope of the invention, and such changes and modifications are within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.