A GPU-accelerated method and system for rendering ambient occlusion maps
Technical field
The present invention relates to the field of photorealistic computer imaging, and in particular to a GPU-accelerated method and system for rendering ambient occlusion maps.
Background technology
In recent years computer graphics has developed rapidly in fields such as games, virtual reality, and film special effects. As application demands keep growing, so do expectations for the realism of rendered images. Ambient occlusion (AO, Ambient Occlusion) is an important component of global illumination. AO describes how much a point on an object's surface is occluded by other objects in the scene; in global illumination, AO is used to attenuate the light that reaches the surface, producing shading that makes the image more realistic. Ambient occlusion is a major advance in shadow computation: by adding gradations of light and dark within shadows, it keeps the shading of objects from looking flat and monotonous, and thereby greatly enhances the realism of the scene.
The conventional ambient occlusion algorithm depends on the number of vertices in the object model. Multiple rays are cast from each vertex of the object for ray detection, the AO information of each pixel is calculated according to how the rays are occluded by surrounding objects, and the AO information is stored in a texture, namely the AO map. This algorithm has notable defects. First, the more complex the object model, the more vertices must be computed; the time complexity is O(n). Second, the algorithm computes ambient occlusion information only at the vertices and relies heavily on logical judgment, rather than being a pixel-level lighting computation. Finally, because the algorithm relies heavily on logical judgment, it can only be executed on the CPU and cannot be accelerated by the graphics processing unit (GPU, Graphics Processing Unit) of a graphics card, which is weaker at logical judgment but very strong at parallel processing. It therefore cannot exploit the rapidly advancing capabilities of graphics cards and is unsuitable for the next-generation engines now in general use. Because of these defects, computing a large game scene of 4 square kilometres with the conventional ambient occlusion algorithm takes more than 1 month of continuous computation even on the fastest commonly used computers, so the algorithm has no practical value.
For example, the invention patent "System and method for photorealistic imaging using ambient occlusion" (application number 200910222067.9, priority date 2008.12.05) emits n rays into the hemisphere centered on the surface normal, determines whether each ray is occluded, and computes the distance from the surface point to the intersection point as the AO weight. The AO values of the rays are accumulated as the AO contribution of the current surface point, and finally the AO data of each surface is stored in a texture. The defects of this method are as follows. First, the complexity of the algorithm is linearly related to the complexity of the object model: the more complex the object model, the more surfaces must be computed. Second, the algorithm computes ambient occlusion information only at the vertices and relies heavily on logical judgment, rather than being a pixel-level lighting computation. Finally, because the algorithm relies heavily on logical judgment, it can only be executed on the CPU and cannot be accelerated by the GPU, which is weaker at logical judgment but very strong at parallel processing.
Summary of the invention
In view of the deficiencies of the prior art, the present invention proposes a GPU-accelerated ambient occlusion map rendering method that does not depend on the complexity of the object model and performs its computation in parallel on the GPU, together with a GPU-accelerated ambient occlusion map rendering system that implements this method.
The technical solution of the present invention is as follows:
A GPU-accelerated ambient occlusion map rendering method, comprising the following steps:
1) establishing, in the CPU, a dome camera system in which cameras are uniformly distributed, the lens parameters of each camera comprising a parallel projection matrix;
2) inputting the dome camera system into the GPU, and selecting a camera to capture the scene within a certain range around the object to be rendered, thereby obtaining a scene depth map of that range, the scene depth map carrying the scene depth values of the pixels of the object to be rendered;
3) receiving the vertex data of the object to be rendered, the vertex data comprising vertex coordinates; multiplying the vertex coordinates by the world coordinate matrix and then by the parallel projection matrix to obtain the rendering depth value;
4) calculating the AO information of each pixel according to the scene depth value and the rendering depth value of that pixel;
5) storing the AO information of the pixels in an AO map according to a pre-generated AO map ID;
6) rendering the AO map online into the scene as displayable ambient light for output.
In step 4), the AO information is calculated as follows: the scene depth value of the pixel is compared with its rendering depth value. When the rendering depth value is less than the scene depth value, the pixel is regarded as affected by the camera, and the AO test function outputs 1; when the rendering depth value is greater than or equal to the scene depth value, the pixel is regarded as not affected by the camera, and the AO test function outputs 0. Summing the AO test function values of all cameras in the dome camera system with respect to the pixel yields the AO information of that pixel.
In step 6), the AO map first undergoes pixel diffusion and Gaussian blur processing; the processed AO maps that are adjacent in position are then stitched into one larger AO map, and finally the larger AO map is used for online scene rendering.
A GPU-accelerated ambient occlusion map rendering system, characterized in that it comprises a display and a GPU, wherein an offline rendering module, an AO map generation module, and an online rendering module are configured in the GPU and connected in sequence. The CPU passes a dome camera system to the offline rendering module; the dome camera system comprises uniformly and randomly distributed cameras, which capture the scene depth map of the object to be rendered. The offline rendering module reads the scene depth value of each pixel from the scene depth map, calculates the rendering depth value of the pixel according to the vertex coordinates of the object to be rendered, and calculates the AO information of the pixel according to the scene depth value and the rendering depth value. The AO map generation module generates the AO map according to the AO information of the pixels and passes the AO map to the online rendering module, which renders it online into the scene as displayable ambient light for output by the display.
The camera captures the object to be rendered by parallel projection, and the lens parameters of the camera comprise a parallel projection matrix.
In the offline rendering module, the rendering depth value and the scene depth value of a pixel are calculated from the vertex coordinates of the object to be rendered as follows. First, the world coordinate matrix of the object to be rendered is received from the AGP bus; the vertex coordinates are multiplied by the world coordinate matrix to obtain world coordinates, and the world coordinates are multiplied by the parallel projection matrix to obtain the rendering depth value. Next, the vertex coordinates multiplied by the parallel projection matrix are rendered to obtain screen coordinates. The screen coordinates are then used as texture coordinates to sample the scene depth map, which yields the scene depth value of the pixel of the object to be rendered.
In the offline rendering module, the AO information of a pixel is obtained as follows: the scene depth value of the pixel is compared with its rendering depth value. When the rendering depth value is less than the scene depth value, the pixel is regarded as affected by the camera, and the AO test function outputs 1; when the rendering depth value is greater than or equal to the scene depth value, the pixel is regarded as not affected by the camera, and the AO test function outputs 0. Summing the AO test function values of all cameras in the dome camera system with respect to the pixel yields the AO information of that pixel.
An image processing module is arranged between the AO map generation module and the online rendering module. It applies pixel diffusion and then Gaussian blur to the AO map, and stitches the processed AO maps that are adjacent in position into one larger AO map.
The technical effects of the present invention are as follows:
The GPU-accelerated ambient occlusion map rendering method of the present invention comprises the following steps: 1) establishing, in the CPU, a dome camera system in which cameras are uniformly distributed, the lens parameters of each camera comprising a parallel projection matrix; 2) inputting the dome camera system into the GPU, and selecting a camera to capture the scene within a certain range around the object to be rendered, thereby obtaining a scene depth map, carrying scene depth values, of that range; 3) receiving the vertex data of the object to be rendered, the vertex data comprising vertex coordinates, and calculating the rendering depth value of each pixel according to the vertex coordinates; 4) calculating the AO information of each pixel according to its scene depth value and rendering depth value; 5) storing the AO information of the pixels in an AO map according to a pre-generated AO map ID; 6) rendering the AO map online into the scene as displayable ambient light for output.
The prior art emits multiple rays into the hemisphere from each vertex of the object to be rendered and calculates the occlusion coefficient of each vertex according to how the rays are occluded by surrounding objects. In contrast, after the dome camera system is established, the method of the present invention takes the object to be rendered as the center, has the cameras on the hemisphere emit rays toward this center, and calculates the AO information of each pixel of each object to be rendered according to how the rays are occluded by the surrounding objects. Because the AO calculation of this method consists only of basic comparison, summation, and averaging operations, it avoids the complex logical operations required in the prior art. The computation can therefore be completed on the GPU, whose strong parallel processing capability greatly improves computational efficiency.
The method of the present invention stitches the processed AO maps that are adjacent in position into one larger AO map, and then uses the larger AO map for online scene rendering. The AO information of objects at adjacent positions is thus stored in as few AO maps as possible, which reduces the number of texture switches during real-time rendering and improves real-time rendering efficiency.
Based on the above method, the GPU-accelerated ambient occlusion map rendering system of the present invention provides, in the GPU, an offline rendering module, an AO map generation module, and an online rendering module connected in sequence. The dome camera system generated by the CPU is passed to the offline rendering module, in which the cameras capture the object to be rendered by parallel projection. While a camera captures the scene, the parallel projection matrix converts the 3D world coordinates of the scene into 2D screen coordinates and eliminates the perspective effect of the scene, so that the size of the object to be rendered is independent of the distance between the camera and the object.
Description of drawings
Fig. 1 is a schematic diagram of the ambient light simulation principle of the present invention.
Fig. 2 is a schematic diagram of the system structure of the present invention.
Fig. 3 is an unprocessed AO map.
Fig. 4 is the AO map after pixel diffusion.
Fig. 5 is the AO map after Gaussian blur.
Fig. 6 is a large AO map stitched from small AO maps.
Fig. 7 is an early-morning scene after online scene rendering.
Fig. 8 is a daytime scene after online scene rendering.
Fig. 9 is a dusk scene after online scene rendering.
Embodiment
The present invention is described below with reference to the accompanying drawings.
In the following description, certain details are set forth to provide a thorough understanding of the present invention. In the embodiments, well-known elements that implement specific functions are shown in schematic or block-diagram form, so as to highlight the technical focus and avoid obscuring the present invention with unnecessary detail. In addition, details concerning network communication, electromagnetic signaling techniques, user interfaces, or input/output techniques are common knowledge within the understanding of those of ordinary skill in the art; such technical details are therefore omitted from the embodiments as far as possible and are not considered necessary features for obtaining a complete technical solution of the present invention.
As those of ordinary skill in the art will appreciate, embodiments of the present invention may be a system, a method, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, microcode, and the like), or an embodiment combining the two, all of which may generally be referred to as a "module" or "system". The present invention may employ any combination of one or more computer-usable or computer-readable media, where the computer-usable or computer-readable medium may be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium, among others.
Specific embodiments of the present invention are described with reference to the accompanying drawings:
Because the GPU adopts a parallel architecture, its computing speed in graphics processing is far higher than that of the CPU. More importantly, the GPU has become increasingly programmable, which allows the present invention to move computation-intensive operations from the CPU to the GPU through programming. The rendering process of the GPU is carried out by two functional modules, the vertex shader (VS, Vertex Shader) and the pixel shader (PS, Pixel Shader). When rendering begins, the graphics card receives the vertex data of the object to be rendered from the graphics acceleration bus (AGP, Accelerated Graphics Port). The vertex data comprise vertex coordinates, UV values, and so on. The vertex coordinates give the position of a vertex of the object to be rendered in the model space of that object, and the UV values are used to sample the texture map in the PS stage; the graphics card automatically interpolates the UV values linearly and maps them onto the pixels of the display screen. Each vertex is sent in turn to the vertex shader for operations such as coordinate transformation and ambient light calculation, and the result of the transformation is used directly to transform each triangle containing the vertex into the screen coordinate system. The pixels to be drawn are then sent to the pixel shader for operations such as texture sampling and texture blending; which pixels are filled is determined by linear interpolation of the screen coordinates of the vertices. In the VS stage the graphics card receives the world coordinate matrix of the object to be rendered from the AGP bus.
As shown in Fig. 1, ambient light in reality results from the combined action of a great many cameras: the atmosphere 1 can be imagined as a large hemisphere, the scattering of sunlight by the atmosphere 1 forms countless cameras 2, and the sum of the influences of the cameras 2 on the object to be rendered is the final ambient light. The present invention models the atmosphere 1 as a hemisphere system in which a sufficient number of cameras 2 are uniformly distributed over the hemisphere, and each camera 2 emits rays toward the center of the object to be rendered. Each camera 2 is then used as a viewpoint to capture the object to be rendered and its neighboring objects, yielding the scene, centered on the object to be rendered, that is affected by this camera 2. Finally, the occlusion coefficient of each pixel is calculated according to how the rays are occluded by the objects surrounding the object to be rendered; the ray detection is performed by rendering on the GPU of the graphics card.
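The hemispherical arrangement described above can be sketched in Python as follows. This is only an illustrative sketch: the function name `sample_dome_cameras`, the choice of the z axis as "up", the fixed random seed, and the particular sampling formula are assumptions and are not details given in this specification.

```python
import math
import random

def sample_dome_cameras(count, radius=1.0, seed=0):
    """Return `count` camera positions spread uniformly at random over a hemisphere.

    Assumption: the hemisphere is centered on the object to be rendered (the
    origin) and opens toward +z. Drawing the height z uniformly in [0, 1) and
    the azimuth uniformly in [0, 2*pi) gives an area-uniform distribution.
    """
    rng = random.Random(seed)
    cameras = []
    for _ in range(count):
        z = rng.random()                      # normalized height above the ground plane
        phi = 2.0 * math.pi * rng.random()    # azimuth angle around the up axis
        r = math.sqrt(1.0 - z * z)            # ring radius at height z on the unit sphere
        cameras.append((radius * r * math.cos(phi),
                        radius * r * math.sin(phi),
                        radius * z))
    return cameras

cameras = sample_dome_cameras(64, radius=100.0)
```

Each returned position would serve as one viewpoint looking toward the center; how many cameras count as "a sufficient number" is left open by the text.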
As shown in Fig. 2, the system of the present invention comprises a dome camera system 4, an offline rendering module 5, an AO map generation module 6, an image processing module 7, and an online rendering module 8, of which the offline rendering module 5, the AO map generation module 6, the image processing module 7, and the online rendering module 8 operate using the rendering pipeline of the GPU. The dome camera system 4 is established in advance in the CPU and simulates the cameras 2 that are uniformly and randomly distributed over the atmosphere in reality. The lens parameters of the dome camera system 4 are generated in advance by a program and then passed to the GPU pipeline as shader parameters. A camera 2 captures the scene within a certain range around the object to be rendered, yielding a scene depth map of that range expressed in the camera coordinate system. Lens parameters are set for each camera 2 in the dome camera system 4 and comprise the lens position and the parallel projection matrix; parallel projection is used here in order to make full use of the texture space. While a camera 2 captures the scene, the parallel projection matrix converts the 3D world coordinates of the scene into 2D screen coordinates and eliminates the perspective effect of the scene, so that the size of the object to be rendered is independent of the distance between the camera 2 and the object.
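To illustrate why a parallel projection removes the perspective effect, the following Python sketch builds one common form of parallel (orthographic) projection matrix and applies it to two points at different distances. The function names, the view-volume parameters, the assumption that points are already expressed in the camera's coordinate system, and the convention of mapping depth to the range [0, 1] are all assumptions made for illustration; the specification does not give the matrix itself.

```python
def parallel_projection_matrix(left, right, bottom, top, near, far):
    """Build a 4x4 parallel projection matrix (row-major, for column vectors).

    Assumed convention: x and y are mapped to [-1, 1], depth to [0, 1].
    """
    return [
        [2.0 / (right - left), 0.0, 0.0, -(right + left) / (right - left)],
        [0.0, 2.0 / (top - bottom), 0.0, -(top + bottom) / (top - bottom)],
        [0.0, 0.0, 1.0 / (far - near), -near / (far - near)],
        [0.0, 0.0, 0.0, 1.0],
    ]

def transform_point(matrix, point):
    """Multiply a 3D point (extended with w = 1) by a 4x4 matrix."""
    column = (point[0], point[1], point[2], 1.0)
    out = [sum(m * c for m, c in zip(row, column)) for row in matrix]
    return out[0], out[1], out[2]   # projected x, projected y, depth value

projection = parallel_projection_matrix(-2.0, 2.0, -1.0, 1.0, 0.0, 10.0)
near_point = transform_point(projection, (1.0, 0.5, 5.0))
far_point = transform_point(projection, (1.0, 0.5, 9.0))
# near_point and far_point share the same projected x and y; only the depth differs.
```

Because x and y are never divided by the distance, an object keeps the same footprint in the depth map regardless of how far it is from the camera, which matches the stated purpose of making full use of the texture space.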
The AO map generation module 6 generates in advance an AO map ID and a UV offset for each object to be rendered, so that the pixel-level AO information generated in the offline rendering module 5 for objects at adjacent positions is stored in as few AO maps as possible; this reduces the number of texture switches during real-time rendering and improves rendering efficiency. The AO map ID indicates the storage path of the AO map. The UV value is the coordinate of a pixel in the AO map. The UV offset is calculated from the positional relationship of the same pixel between a small AO map and the large AO map into which the image processing module 7 stitches multiple small AO maps; the UV offset is provided so that the online rendering module 8 can conveniently read AO information from the stitched large AO map during real-time rendering.
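One way to picture the role of the UV offset is the following Python sketch, which converts a UV coordinate in a small AO map into the corresponding UV coordinate in the stitched large AO map. The function names, the pixel-based placement, and the accompanying scale factor are assumptions for illustration; the specification only states that a UV offset is derived from the position of the same pixel in the small and large maps.

```python
def uv_offset_and_scale(small_size, placement, large_size):
    """Derive the UV offset (and an assumed UV scale) of a small AO map
    placed at pixel position `placement` inside a large AO map."""
    small_w, small_h = small_size
    offset_x, offset_y = placement
    large_w, large_h = large_size
    offset = (offset_x / large_w, offset_y / large_h)
    scale = (small_w / large_w, small_h / large_h)
    return offset, scale

def remap_uv(uv, offset, scale):
    """Map a UV coordinate of the small AO map into the large AO map."""
    return (offset[0] + uv[0] * scale[0], offset[1] + uv[1] * scale[1])

# A 256 x 256 map placed at pixel (512, 0) of a 1024 x 1024 map.
offset, scale = uv_offset_and_scale((256, 256), (512, 0), (1024, 1024))
large_uv = remap_uv((0.5, 0.5), offset, scale)   # -> (0.625, 0.125)
```

With the offset stored per object, the online renderer can keep using each object's original UV values and still sample the shared large map.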
When the offline rendering module 5 renders the object to be rendered, it proceeds as follows. First, a camera 2 in the dome camera system 4 is selected as the viewpoint to capture the object to be rendered, yielding a scene depth map, expressed in the camera coordinate system, of a certain range around the object; the scene depth map carries the scene depth value of each pixel of the objects in the scene. Second, the world coordinate matrix of the object to be rendered is received from the AGP bus; the vertex coordinates are multiplied by the world coordinate matrix to obtain world coordinates, and the world coordinates are multiplied by the parallel projection matrix to obtain the rendering depth value. Next, the vertex coordinates multiplied by the parallel projection matrix are rendered to obtain screen coordinates, and the screen coordinates are used as texture coordinates to sample the scene depth map, which yields the scene depth value of the pixel of the object to be rendered. Finally, the scene depth value of the pixel is compared with its rendering depth value. When the rendering depth value is less than the scene depth value, the pixel is regarded as affected by the camera 2, and the AO test function outputs 1; when the rendering depth value is greater than or equal to the scene depth value, the pixel is regarded as not affected by the camera 2, and the AO test function outputs 0. The AO test function values of all cameras 2 in the dome camera system 4 with respect to the pixel are summed and then normalized, which yields the AO information of the pixel. The AO information of each pixel of the object to be rendered is passed, according to the AO map ID, to the AO map generation module 6, which generates and saves the AO map.
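The comparison and accumulation described above can be sketched in Python for a single pixel. The function names are hypothetical, normalization is assumed to mean dividing by the number of cameras, and the `bias` parameter is an assumed addition (defaulting to zero so that the comparison matches the text exactly) of the kind often used to absorb depth-precision error; none of these details are stated in the specification.

```python
def ao_test(render_depth, scene_depth, bias=0.0):
    """AO test function for one camera: 1 if the pixel is affected by the
    camera (rendering depth less than scene depth), otherwise 0.

    `bias` is a hypothetical tolerance and is not part of the source text.
    """
    return 1 if render_depth - bias < scene_depth else 0

def ao_information(depth_pairs, bias=0.0):
    """Sum the AO test function over all cameras and normalize.

    `depth_pairs` holds one (render_depth, scene_depth) pair per camera
    for the same pixel. Normalizing by the camera count is an assumption.
    """
    if not depth_pairs:
        raise ValueError("at least one camera is required")
    total = sum(ao_test(render, scene, bias) for render, scene in depth_pairs)
    return total / len(depth_pairs)

# Four cameras looking at the same pixel: two tests pass, two fail.
pairs = [(0.3, 0.5), (0.7, 0.5), (0.2, 0.9), (0.5, 0.5)]
ao = ao_information(pairs)   # -> 0.5
```

On a GPU the same comparison would run once per pixel and per camera in the pixel shader; the loop here only stands in for that parallel evaluation.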
The above process requires only two rendering passes to obtain the AO information contributed by a camera 2 to the object to be rendered, and the process of calculating the AO information is independent of the complexity of the object. This greatly simplifies the logical computation, so that the calculation can be accelerated by the GPU of the graphics card, which is weaker at logical judgment but very strong at parallel processing.
As shown in Fig. 3 and Fig. 4, because the AO map is small, sampling can easily be off by one pixel and thus pick up an invalid pixel value in the AO map. To solve this problem, the present invention passes the AO map saved in the AO map generation module 6 to the image processing module 7 for pixel diffusion. The image processing module 7 determines whether the current pixel is invalid; if it is, the 8 surrounding pixels are sampled to construct a 3 × 3 coordinate array. The array is traversed for valid pixel color values, the color value of the first valid pixel is taken as the color value of the current invalid pixel, and the current pixel is marked as valid, finally producing a valid AO map. As shown in Fig. 5, to eliminate the jagged edges present in the AO map, a Gaussian blur is applied once to the valid AO map to obtain a new AO map.
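A minimal Python sketch of the pixel diffusion step follows. It assumes that the AO map is a list of rows, that invalid pixels are marked with `None`, and that the 3 × 3 neighborhood is traversed row by row; these representations and the function names are assumptions for illustration, since the specification does not say how invalid pixels are flagged or in what order the array is traversed.

```python
def first_valid_neighbor(ao_map, x, y):
    """Return the first valid value in the 3 x 3 neighborhood of (x, y), or None."""
    height, width = len(ao_map), len(ao_map[0])
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width and ao_map[ny][nx] is not None:
                return ao_map[ny][nx]
    return None

def diffuse_invalid_pixels(ao_map):
    """Fill each invalid pixel with the first valid value found around it.

    Neighbors are read from the original map, so one call spreads valid
    values outward by exactly one pixel.
    """
    result = [row[:] for row in ao_map]
    for y, row in enumerate(ao_map):
        for x, value in enumerate(row):
            if value is None:
                result[y][x] = first_valid_neighbor(ao_map, x, y)
    return result

ao_map = [[None, 0.8],
          [None, None],
          [None, None]]
filled = diffuse_invalid_pixels(ao_map)
# -> [[0.8, 0.8], [0.8, 0.8], [None, None]]
```

Pixels with no valid neighbor stay invalid after one pass, as the bottom row shows; whether further passes are applied is not stated in the text.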
As shown in Fig. 6, in order to reduce the number of AO maps, after the Gaussian blur the image processing module 7 stitches the smaller AO maps whose world coordinates lie adjacent to one another in the same region into one larger AO map. First, all objects in the scene are traversed, the small AO maps that meet the stitching condition are placed in the same list, and the sum of the sizes of the small AO maps in the list is calculated. Then, according to the capacity of the hardware, candidate sizes are chosen from formats such as 1024 × 1024, 1024 × 512, 512 × 512, 512 × 256, 256 × 256, 256 × 128, 128 × 128, 128 × 64, 64 × 64, 64 × 32, and 32 × 32, and the corresponding format is selected for stitching on the principle that large map sizes take priority, the purpose being to reduce the number of maps.
As shown in Fig. 7, Fig. 8, and Fig. 9, in order to vary the ambient light of the scene and realize natural changes such as early morning, daytime, dusk, and late night, the present invention passes the large AO map produced by the image processing module 7 to the online rendering module 8, which renders the processed large AO map online into the scene as ambient light according to adjustment parameters supplied online by the game server. The adjustment parameters of online scene rendering comprise the ambient light base and the occlusion coefficient. The ambient light base does not depend on the AO information and controls the overall brightness of the scene; the occlusion coefficient is the weight of the AO information and adjusts the strength of the local light-dark contrast. The ambient light of the scene, which is displayed to the user, can then be calculated by equation (1):
Ambient light = (occlusion coefficient × AO information + ambient light base) × ambient light color × ambient light material (1)
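Equation (1) can be evaluated per color channel as in the following Python sketch. The function and parameter names, and the representation of the ambient light color and material as RGB triples, are assumptions for illustration.

```python
def ambient_light(ao, occlusion_coefficient, ambient_base, light_color, material):
    """Evaluate equation (1) for one pixel.

    `ao` is the AO information of the pixel; `light_color` and `material`
    are assumed to be (r, g, b) triples.
    """
    factor = occlusion_coefficient * ao + ambient_base
    return tuple(factor * c * m for c, m in zip(light_color, material))

# AO information 0.5, occlusion coefficient 0.5, ambient light base 0.25.
color = ambient_light(0.5, 0.5, 0.25, (1.0, 0.5, 0.0), (1.0, 1.0, 1.0))
# -> (0.5, 0.25, 0.0)
```

Raising the ambient light base brightens every pixel uniformly, while raising the occlusion coefficient widens the gap between occluded and unoccluded pixels, which is how the two parameters could be varied to suggest different times of day.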
It should be noted that the embodiments described above enable those skilled in the art to understand the invention more fully, but do not limit the invention in any way. Therefore, although this specification describes the invention in detail with reference to the drawings and embodiments, those skilled in the art should understand that the invention may still be modified or subjected to equivalent replacement. In short, all technical solutions and improvements that do not depart from the spirit and scope of the invention shall fall within the scope of protection of this patent.