KR20100128337A - Architectures for parallelized intersection testing and shading for ray-tracing rendering

Info

Publication number: KR20100128337A
Application number: KR1020107023579A
Authority: KR
Inventors: Luke Tilman Peterson; James Alexander McCombe; Ryan R. Salsbury; Stephen Purcell
Original assignee: Caustic Graphics, Inc.
Priority date: 2008-03-21
Filing date: 2009-03-20
Publication date: 2010-12-07
Also published as: CN102037497B; WO2009117691A3; KR101550477B1; JP5485257B2; JP5740704B2; JP2014089773A; JP2011515766A; WO2009117691A2; WO2009117691A4; CN104112291B; CN104112291A; CN102037497A

Abstract

In one example, a method of ray tracing a scene uses a plurality of intersection testing resources coupled with a plurality of shading resources, which communicate collectively through links/queues. A queue from testing to shading carries indications of individual ray/primitive intersections, each including a ray identifier. A queue from shading to testing carries identifiers of new rays to be tested, while the data defining the rays is stored separately in memories distributed among the intersection testing resources. Ray definition data can be retained in the distributed memories until a ray completes intersection testing, and the ray can be selected for testing multiple times based on its identifier. A structure of acceleration shapes can be used. Packets of ray identifiers and shape data can be circulated among the intersection testing resources, and each resource can test those rays that are identified in the packet and whose definition data is present in its memory. Results of testing acceleration shapes allow rays to be collected based on the shapes they intersect, and the closest detected ray/primitive intersection is indicated by queuing the ray identifier for shading.

Description

ARCHITECTURES FOR PARALLELIZED INTERSECTION TESTING AND SHADING FOR RAY-TRACING RENDERING

The present invention relates to methods of rendering two-dimensional representations from three-dimensional scenes, and more particularly to using ray tracing for accelerated rendering of photorealistic two-dimensional representations of scenes.

Rendering photorealistic images with ray tracing is well known in the computer graphics arts. Ray tracing is known to produce photorealistic images, including realistic shadow and lighting effects, because it can model the physical behavior of light interacting with elements of a scene. However, ray tracing is also known to be computationally intensive, and at present even a state-of-the-art graphics workstation requires a substantial amount of time to render a complicated scene using ray tracing.

Ray tracing usually involves obtaining a scene description composed of geometric primitives, such as triangles, that describe the surfaces of structures in the scene. It then models how light interacts with those primitives by tracing light rays, starting from a camera, through numerous potential interactions with scene objects, until each ray either terminates at a light source or exits the scene without intersecting a light source.

For example, a scene may comprise a car on a street with buildings on either side of the street. The car in such a scene may be defined by a large number of triangles (e.g., one million triangles) that approximate a continuous surface. A camera position from which the scene is viewed is defined. A ray cast from the camera is often termed a primary ray, while a ray cast from one object to another, for example to enable reflection, is called a secondary ray. An image plane of a selected resolution (e.g., 1024x768 for an SVGA display) is disposed at a selected position between the camera and the scene.

The simplest ray tracing algorithm casts one or more rays from the camera through each pixel of the image into the scene. Each ray is then tested against each primitive composing the scene to identify a primitive that the ray intersects, and the effect of that primitive on the ray, for example reflecting and/or refracting it, is determined. Such reflection and/or refraction causes the ray to proceed in a different direction and/or to split into multiple secondary rays that take different paths. All of these secondary rays are then tested against the scene primitives to determine which primitives they intersect, and the process continues recursively until the secondary (and tertiary, etc.) rays terminate, for example by leaving the scene or hitting a light source. While these ray/primitive intersections are being determined, a tree mapping them is created. After a ray terminates, the contribution of the light source is traced back through the tree to determine its effect on the pixel of the image. As can be readily understood, testing 1024x768 rays (for example) for intersection with millions of triangles is computationally expensive, and that ray count does not even include the additional rays spawned when rays interact with materials at intersections.
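
To make the cost of the naive algorithm concrete, the following is a minimal sketch of brute-force closest-hit testing, in which every ray is tested against every triangle. The text names no particular ray/triangle test, so the Moller-Trumbore test is used here as an assumption, and all function names (`ray_triangle_t`, `closest_hit`, `naive_test_count`) are hypothetical.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ray_triangle_t(origin, direction, tri, eps=1e-9):
    """Distance along the ray to the triangle, or None on a miss.

    Moller-Trumbore test; chosen for illustration, not taken from the text.
    """
    v0, v1, v2 = tri
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:           # ray is parallel to the triangle plane
        return None
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv          # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return t if t > eps else None  # ignore hits behind the origin

def closest_hit(origin, direction, triangles):
    """Test one ray against every triangle; return (index, t) or None."""
    best = None
    for index, tri in enumerate(triangles):
        t = ray_triangle_t(origin, direction, tri)
        if t is not None and (best is None or t < best[1]):
            best = (index, t)
    return best

def naive_test_count(width, height, triangle_count):
    """Ray/triangle tests for one primary ray per pixel, no secondary rays."""
    return width * height * triangle_count
```

With one primary ray per pixel at 1024x768 and one million triangles, `naive_test_count(1024, 768, 1_000_000)` gives 786,432,000,000 tests before any secondary ray is considered.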

Rendering a scene with ray tracing has been termed an "embarrassingly parallel problem" because the color information accumulated for each pixel of the image being produced can be accumulated independently of the other pixels of the image. Thus, although there may be some filtering, interpolation, or other processing of pixels before a final image is output, color information for the image pixels can be determined in parallel. It is therefore straightforward to segment the task of ray tracing an image over a given set of processing resources by dividing the pixels to be rendered among the processing resources and rendering those pixels in parallel.
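
The pixel-level partitioning described above can be sketched as follows. This is only an illustration: the band-per-worker split, the thread pool, and the names `split_rows`, `render_band`, and `render_parallel` are assumptions, and `shade_pixel` stands in for whatever per-pixel ray tracing work a real renderer would do.

```python
from concurrent.futures import ThreadPoolExecutor

def split_rows(height, workers):
    """Divide image rows into contiguous bands, one band per worker."""
    base, extra = divmod(height, workers)
    bands, start = [], 0
    for w in range(workers):
        rows = base + (1 if w < extra else 0)
        bands.append(range(start, start + rows))
        start += rows
    return bands

def render_band(rows, width, shade_pixel):
    """Render every pixel of one band; pixels do not depend on each other."""
    return {(x, y): shade_pixel(x, y) for y in rows for x in range(width)}

def render_parallel(width, height, workers, shade_pixel):
    """Render the bands concurrently and merge them into one image."""
    image = {}
    bands = split_rows(height, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda rows: render_band(rows, width, shade_pixel), bands)
        for part in parts:
            image.update(part)
    return image
```

Because no band reads another band's pixels, the merge step is a plain dictionary update and needs no synchronization beyond waiting for the workers.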

In some cases, the processing resources can be a computing platform that supports multithreading, while other cases may involve a cluster of computers linked over a LAN, or a cluster of compute cores. In these types of systems, a given processing resource, e.g., a thread, can be instantiated to process an assigned ray or group of rays through completion of intersection testing and shading. In other words, using the property that pixels can be rendered independently of each other, rays known to contribute to different pixels can be divided among threads or processing resources to be intersection tested; the intersections are then shaded, and the results of the shading calculations are written to a screen buffer for processing or display.

Some algorithmic approaches directed at this sort of problem have been proposed. One such approach is disclosed by Matt Pharr et al. in "Rendering Complex Scenes with Memory-Coherent Ray Tracing," Proceedings of SIGGRAPH (1997) (hereinafter "Pharr"). Pharr discloses dividing a scene to be ray traced into geometry voxels, where each geometry voxel is a cube that encloses scene primitives (e.g., triangles). Pharr also discloses superimposing a scheduling grid, where each element of the scheduling grid is a scheduling voxel that can overlap some portion of the geometry voxels (i.e., the scheduling voxel is also a volumetric cube in the scene, and can be sized differently from the cubes of the geometry voxels). Each scheduling voxel has an associated ray queue, which holds the rays that are currently inside, i.e., enclosed within, that scheduling voxel, together with information about which geometry voxels overlap that scheduling voxel.

Pharr discloses that when a scheduling voxel is processed, the rays in its associated queue are tested for intersection with the primitives in the geometry voxels enclosed by that scheduling voxel. If an intersection between a ray and a primitive is found, shading calculations are performed, which can spawn rays that are added to the ray queue. If no intersection is found in that scheduling voxel, the ray is advanced to the next non-empty scheduling voxel and placed in that scheduling voxel's ray queue.
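
The queue-per-scheduling-voxel idea attributed to Pharr can be illustrated with a deliberately reduced sketch. Everything here is a simplifying assumption rather than Pharr's implementation: the scene is one-dimensional, voxel `i` spans `[i, i + 1)`, all rays travel in the +x direction, "primitives" are points on the x axis, shading (and therefore ray spawning) is omitted, and the busiest queue is always processed first.

```python
from collections import deque

class SchedulingVoxel:
    def __init__(self, planes):
        self.planes = sorted(planes)  # x positions of toy "primitives"
        self.queue = deque()          # rays currently inside this voxel

def trace(voxels, rays):
    """Trace rays given as (ray_id, start_x); return {ray_id: hit_x or None}."""
    hits = {}
    for ray_id, x in rays:
        voxels[int(x)].queue.append((ray_id, x))
    while any(v.queue for v in voxels):
        # Scheduling choice (an assumption): process the fullest queue first.
        i = max(range(len(voxels)), key=lambda k: len(voxels[k].queue))
        voxel = voxels[i]
        while voxel.queue:
            ray_id, x = voxel.queue.popleft()
            ahead = [p for p in voxel.planes if p >= x]
            if ahead:
                hits[ray_id] = ahead[0]   # closest primitive in this voxel
                continue
            # No hit here: advance to the next voxel that contains geometry.
            nxt = next((j for j in range(i + 1, len(voxels))
                        if voxels[j].planes), None)
            if nxt is None:
                hits[ray_id] = None       # ray leaves the scene
            else:
                voxels[nxt].queue.append((ray_id, float(nxt)))
    return hits
```

The point of the structure is that all rays waiting in one voxel are tested together, so that voxel's geometry needs to be loaded only once per batch.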

Pharr discloses that an advantage sought by this approach is to help scene geometry fit within the cache normally provided with a general purpose processor, so that if the scene geometry within each scheduling voxel fits within the cache, the cache does not thrash excessively during intersection testing of rays against that scene geometry.

Pharr also discloses that by queuing rays for testing in a scheduling voxel, more work can be performed on the primitives once they have been fetched into the geometry cache. Where multiple scheduling voxels could be processed next, the scheduling algorithm can choose the scheduling voxel that minimizes the amount of geometry that needs to be loaded into the geometry cache.

Pharr recognizes that the proposed regular scheduling grid may not perform well if a particular scene has non-uniform complexity (e.g., a higher density of primitives in some portions of the scene). Pharr hypothesizes that an adaptive data structure, such as an octree, could be used in place of the regular scheduling grid. An octree introduces a spatial subdivision in the three-dimensional scene by causing, at each level of the hierarchy, a subdivision along each principal axis (i.e., the x, y, and z axes) of the scene, such that an octree subdivision results in eight smaller sub-volumes, each of which can in turn be divided into eight smaller sub-volumes, and so on. At each sub-volume, a divide/do not divide flag is set, which determines whether that sub-volume will be divided further. Sub-volumes are marked for subdivision until the number of primitives in a sub-volume is low enough for testing. Thus, for an octree, the amount of subdivision can be controlled according to how many primitives are in a particular portion of the scene. As such, the octree allows varying degrees of volumetric subdivision of the volume to be rendered.
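
The adaptive subdivision described above can be sketched with a small octree builder. This is a minimal sketch under stated assumptions: primitives are reduced to representative points (for example triangle centroids), and the `max_prims` and `max_depth` parameters and the class name `OctreeNode` are hypothetical. The `divide` attribute plays the role of the divide/do not divide flag.

```python
class OctreeNode:
    def __init__(self, lo, hi, points, max_prims=2, depth=0, max_depth=8):
        self.lo, self.hi = lo, hi      # opposite corners of this sub-volume
        self.points = points           # primitives held by this node
        self.children = []
        # Divide only while the node holds too many primitives to test directly.
        self.divide = len(points) > max_prims and depth < max_depth
        if not self.divide:
            return
        mid = tuple((a + b) / 2 for a, b in zip(lo, hi))
        buckets = {}
        for p in points:
            key = tuple(int(p[k] >= mid[k]) for k in range(3))
            buckets.setdefault(key, []).append(p)
        # One split along each of x, y, z yields eight sub-volumes.
        for octant in range(8):
            key = ((octant >> 2) & 1, (octant >> 1) & 1, octant & 1)
            c_lo = tuple(mid[k] if key[k] else lo[k] for k in range(3))
            c_hi = tuple(hi[k] if key[k] else mid[k] for k in range(3))
            self.children.append(OctreeNode(c_lo, c_hi, buckets.get(key, []),
                                            max_prims, depth + 1, max_depth))
        self.points = []               # interior nodes keep no primitives

    def height(self):
        """Number of levels below and including this node."""
        return 1 + max((c.height() for c in self.children), default=0)

    def count(self):
        """Total primitives stored in the leaves under this node."""
        return len(self.points) + sum(c.count() for c in self.children)
```

In a scene where primitives cluster in one corner, only the octant containing the cluster keeps subdividing, while sparse octants stay as single leaves.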

A similar approach is disclosed in U.S. Pat. No. 6,556,200 to Pfister (hereinafter "Pfister"). Pfister also discloses partitioning a scene into a plurality of scheduling blocks. A ray queue is provided for each block, and the rays in each queue are ordered spatially and temporally using a dependency graph. The rays are traced through each of the scheduling blocks according to the order defined in the dependency graph. Pfister references the Pharr paper and adds that Pfister aims to render more than a single type of graphical primitive (e.g., not just triangles) and to devise more complicated scheduling algorithms for the scheduling blocks. Pfister also contemplates staging sub-portions of scene geometry at multiple cache levels in a memory hierarchy.

Yet another approach is known as packet tracing, and a common reference for packet tracing is "Interactive Rendering through Coherent Ray Tracing" by Ingo Wald, Philip Slusallek, Carsten Benthin, et al., Proceedings of EUROGRAPHICS 2001, pp. 153-164, 20(3), Manchester, United Kingdom (Sep. 2001). In this reference, packet tracing involves tracing a packet of rays having similar origins and directions through a grid. Because the rays emit from substantially the same grid location and travel in a substantially similar direction, most of them pass through the same grid locations. Packet tracing therefore requires identifying rays that travel in a similar direction from a similar origin. Another variation on packet tracing uses frustum rays to bound the edges of the ray packet, so that the frustum rays are used to determine which voxels are intersected, which helps reduce the number of computations for a given ray packet (i.e., not all rays are tested for intersection, only those on the outer edges of the packet). Packet tracing still requires identifying rays that originate from a similar place and travel in a similar direction. Such rays can become increasingly difficult to identify as rays are reflected, refracted, and/or generated during ray tracing.

Still other approaches exist in the area of accelerating ray tracing. One such approach attempts to improve cache utilization by managing ray state more actively. "Dynamic Ray Scheduling for improved System Performance" by Navratil et al., 2007 IEEE Symposium on Interactive Ray Tracing (Sep. 2007) (hereinafter "Navratil"), describes Pharr's approach as having the drawback of a "ray state explosion" that makes it poorly suited to traffic between main memory and the processor cache. To address this, Navratil proposes avoiding "ray state explosion" by imposing the constraints needed to "actively manage" ray state and geometry state during ray tracing. One proposal is to trace generations of rays separately; accordingly, Navratil discloses tracing primary rays first, then tracing secondary rays after the primary rays have finished, and so on.

The above background shows the variety of thinking and approaches in the area of accelerating rendering based on ray tracing. These references also indicate that further advances remain to be made in the field of ray tracing. However, discussion of any of these references and techniques is not an admission or implication that any such reference, or its subject matter, is prior art to any subject matter disclosed in this specification. Rather, these references are discussed to help illustrate differences in approaches to rendering with ray tracing. Moreover, the treatment of each of these references is necessarily abbreviated for the sake of clarity and is not exhaustive.

In one aspect, a method uses a plurality of computing resources in ray tracing a 2-D representation of a 3-D scene. The method comprises using a first subset of the computing resources for intersection testing geometric shapes, comprising one or more of primitives and geometry acceleration elements, with rays traveling in the 3-D scene. Each computing resource of the first subset is operable to communicate with a respective localized memory resource that stores a respective subset of the rays traveling in the scene. The method comprises communicating indications of intersections between rays and primitives from the first subset of computing resources to a second subset of the computing resources, and using the second subset to execute shading routines associated with the identified intersections between rays and primitives, where outputs from the shading routines comprise new rays to be intersection tested. Membership in the subsets can vary over time, or can be determined statically at system configuration time or at identified points during rendering of a scene or one of a series of scenes.

The method further comprises distributing data defining the new rays among the localized memory resources, and passing groupings of ray identifiers, with shape data, among the computing resources of the first subset. Each ray identifier comprises data other than the ray definition data for that ray. Passing the ray identifiers activates intersection testing of the identified rays with the shapes identified by the shape data. Such testing comprises, by each computing resource, retrieving the data defining an identified ray that is stored in its localized memory, testing the identified shape for intersection based on the retrieved ray definition data, and outputting for communication an indication of any detected intersection.
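
The separation between ray identifiers (which travel in packets) and ray definition data (which stays in each resource's local memory) can be sketched as below. This is an illustration only: the packet layout, the class name `IntersectionTester`, and the use of a sphere as the tested shape are assumptions, and ray directions are assumed to be normalized.

```python
import math

def hits_sphere(origin, direction, center, radius):
    """True if the ray (unit-length direction) intersects the sphere at t >= 0."""
    oc = tuple(o - c for o, c in zip(origin, center))
    b = sum(d * x for d, x in zip(direction, oc))
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - c
    if disc < 0:
        return False
    # The larger root must be non-negative for a hit in front of the origin.
    return -b + math.sqrt(disc) >= 0

class IntersectionTester:
    """One testing resource with its own local store of ray definitions."""
    def __init__(self, local_rays):
        self.local_rays = dict(local_rays)  # ray_id -> (origin, direction)

    def test_packet(self, packet):
        """packet: (shape_id, (center, radius), [ray ids]); ids only, no ray data."""
        shape_id, (center, radius), ray_ids = packet
        hits = []
        for ray_id in ray_ids:
            ray = self.local_rays.get(ray_id)
            if ray is None:
                continue                     # another resource holds this ray
            origin, direction = ray
            if hits_sphere(origin, direction, center, radius):
                hits.append((ray_id, shape_id))
        return hits
```

The same packet can be handed to every tester in turn; each one silently skips the identifiers whose definition data it does not hold, so ray data never has to move between local memories.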

In another aspect, the invention comprises a system for rendering, using ray tracing, a 2-D representation of a 3-D scene composed of primitives. The system comprises a plurality of intersection testing resources having access to respective cache memories. Each cache memory stores a subset of a master copy of ray definition data, and the ray definition data for each ray is maintained in the cache memory until testing of that ray has completed.

The system also comprises control logic operable to assign an identifier to each ray and to control testing of each ray by a respective test resource that has access to the definition data for that ray in its respective cache memory. Control of testing is effected by providing ray identifiers to the respective test cells that store data for the rays to be tested. The system comprises an output queue for identifying rays that have completed intersection testing and the respective primitive each ray intersected. The control logic assigns new rays resulting from shading calculations to replace rays in the cache memories that have completed intersection testing.

In some aspects, one or more of the following may be provided: the control logic effects the replacement by reusing the identifier of a completed ray as the identifier of a new ray; each ray identifier relates to a memory location storing the respective data defining that ray; and the data defining the new ray overwrites the data stored in the memory location of the completed ray.
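
Identifier reuse of this kind can be sketched with a fixed-capacity store in which the ray identifier is simply the index of the memory slot holding that ray's definition data. The class and method names (`RayStore`, `emit`, `complete`, `fetch`) are hypothetical, and the free-list policy is an assumption.

```python
class RayStore:
    """Fixed-size ray memory; a ray's identifier is its slot index."""
    def __init__(self, capacity):
        self.slots = [None] * capacity
        # Free identifiers, arranged so that low indices are handed out first.
        self.free = list(range(capacity - 1, -1, -1))

    def emit(self, origin, direction):
        """Store a new ray, overwriting whatever a completed ray left behind."""
        if not self.free:
            raise MemoryError("no free ray slots; wait for rays to complete")
        ray_id = self.free.pop()
        self.slots[ray_id] = (origin, direction)
        return ray_id

    def complete(self, ray_id):
        """Mark a ray as finished; its identifier becomes reusable."""
        self.free.append(ray_id)

    def fetch(self, ray_id):
        return self.slots[ray_id]
```

For example, after `a = store.emit(...)` and `store.complete(a)`, the next `emit` returns the same identifier and the slot now holds the new ray's data.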

Still another aspect of the invention comprises a system for rendering, using ray tracing, a 2-D representation of a 3-D scene composed of primitives. The system comprises a memory storing the primitives composing the 3-D scene, and a plurality of intersection testing resources. Each intersection testing resource is operable to test at least one ray traveling in the scene against at least one of the primitives and to output an indication of a detected intersection. The system also comprises a plurality of shading resources, each operable to run a shading routine associated with a primitive in response to an indication of a detected ray/primitive intersection. The system further comprises a first communication link for outputting indications of detected intersections to the shading resources, and a second communication link for communicating new rays, generated by running the shading routines, to the intersection testing resources. The new rays sent to the intersection testing resources can complete intersection testing in an order different from the relative order in which they were sent. The communication links can be implemented as queues, such as FIFO queues.
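
The two communication links can be sketched as a pair of FIFO queues that connect a testing stage and a shading stage. This single-threaded loop is only an illustration of the data flow; in the described system the two sides are separate resources, and the `intersect` and `shade` callables here are hypothetical stand-ins supplied by the caller.

```python
from collections import deque

def run_pipeline(primary_rays, intersect, shade):
    """Alternate between testing and shading until both queues drain.

    intersect(ray) -> primitive or None
    shade(ray, primitive) -> (color, list of new rays)
    """
    to_test = deque(primary_rays)  # second link: new rays into testing
    to_shade = deque()             # first link: detected hits into shading
    colors = []
    while to_test or to_shade:
        while to_test:
            ray = to_test.popleft()
            primitive = intersect(ray)
            if primitive is not None:
                to_shade.append((ray, primitive))
        while to_shade:
            ray, primitive = to_shade.popleft()
            color, new_rays = shade(ray, primitive)
            colors.append(color)
            to_test.extend(new_rays)   # shading output feeds testing again
    return colors
```

Nothing in the queues ties a ray's completion to its submission order, which is consistent with the statement that rays may finish testing in a different relative order from the one in which they were sent.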

Yet another aspect of the invention comprises a method of ray tracing a scene composed of primitives in a system having a plurality of computing resources coupled to a hierarchical memory structure that comprises a main memory and memories distributed among the computing resources, where the main memory has higher latency than the distributed memories. The method comprises distributing data defining rays to be intersection tested in the scene among the distributed memories, such that subsets of the rays are stored in different ones of the distributed memories, and determining to intersection test a group of rays against one or more geometric shapes, where the members of the ray group are stored in several of the distributed memories. The method comprises fetching data defining the one or more shapes from the main memory, and providing the geometric shapes and identifiers for the ray group to at least those computing resources associated with a distributed memory that stores data for a ray of the group. The method also comprises testing each ray of the group for intersection using the computing resource associated with the distributed memory that stores the data for that ray, and collecting the intersection testing results from the computing resources.

Another aspect comprises a system for intersection testing rays with primitives composing a 3-D scene. The system comprises a plurality of intersection testing resources, each operable to test a respective ray for intersection with a geometric shape. Each respective ray is identified by a reference provided to each intersection testing resource, and each testing resource is operable to output an indication of an intersection between the ray and the geometric shape to either a first output or a second output.

One output is for primitive intersections, and the other is for intersections with geometry acceleration elements. For example, the first output receives indications of intersections between rays and primitives and provides input to a plurality of shading resources, while the second output receives indications of intersections between rays and geometry acceleration elements and provides input to a ray collection manager.
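
The split between the two outputs can be sketched as a small router. The names below (`TesterOutputs`, `report`, and the `PRIMITIVE`/`ACCEL` tags) are hypothetical, and grouping ray identifiers per intersected acceleration element is an assumption about how a ray collection manager might organize its input.

```python
from collections import defaultdict, deque

PRIMITIVE = "primitive"
ACCEL = "accel"

class TesterOutputs:
    def __init__(self):
        self.to_shading = deque()             # first output: ray/primitive hits
        self.collections = defaultdict(list)  # second output: rays per element

    def report(self, ray_id, shape_kind, shape_id):
        """Route one detected intersection to the appropriate output."""
        if shape_kind == PRIMITIVE:
            self.to_shading.append((ray_id, shape_id))
        elif shape_kind == ACCEL:
            # The collection manager gathers rays by the element they hit,
            # so they can later be tested together against its contents.
            self.collections[shape_id].append(ray_id)
        else:
            raise ValueError("unknown shape kind: %r" % (shape_kind,))
```

Primitive hits leave the testing side toward shading, whereas acceleration-element hits stay on the testing side and only determine which shapes a ray should be tested against next.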

Yet another aspect comprises a ray tracing method comprising storing, in a main memory resource, primitives composing a 3-D scene representation and geometry acceleration elements that respectively bound selections of the primitives; defining rays to be intersection tested in the scene; and defining an identifier for each ray. The method comprises, in a system comprising a plurality of separately programmable processing resources, storing a portion of the ray origin and direction data in a localized memory resource respectively associated with each processing resource. The method also comprises scheduling rays for intersection testing by providing, to the processing resources, identifiers for the rays scheduled for testing and an indication of a geometric shape. Each processing resource determines whether its localized memory resource stores ray definition data for any identified ray and, if so, intersection tests those rays with the identified geometric shape.

Still another aspect comprises a computer readable medium comprising computer readable instructions for a system that controls a plurality of processing resources to intersection test geometric shapes with rays for use in rendering a 2-D representation of a 3-D scene. The instructions are for implementing a method comprising accessing a packet of identifiers for rays determined to intersect a first geometry acceleration element that bounds a first selection of primitives, and determining other geometry acceleration elements that bound portions of the primitives bounded by the first geometry acceleration element. The method also comprises instantiating a plurality of packets, each containing the ray identifiers and a respective indication of a different one of the other geometry acceleration elements, and providing the plurality of packets to each of a plurality of computing resources, each configured to intersection test fewer than all of the rays identified in each packet. The method further comprises receiving indications of detected intersections from the plurality of computing resources, tracking the received indications by geometry acceleration element until a next geometry acceleration element is identified that has fewer than a threshold number of received indications, and repeating the accessing with a next packet.

Still another aspect of the invention includes a ray tracing system comprising a plurality of computing resources configured to intersection test shapes with rays, and respective caches each coupled with one of the computing resources, where each cache stores data defining a portion of a plurality of rays traveling in a scene. The system also comprises a channel for passing messages among the plurality of computing resources. Each computing resource is configured to interpret data in a received message as containing a plurality of ray identifiers, to determine whether any of the identified rays is stored in its cache, and to test any such stored rays with an associated shape.

Still another aspect includes a system for intersection testing rays with primitives composing a 3-D scene. The system comprises a plurality of intersection test resources, each operable to test a respective ray for intersection with a geometric shape. The respective rays are identified by references provided to each intersection test resource. Each intersection test resource is further configured to output indications of detected intersections on either a first output or a second output. The system also comprises a plurality of shading resources, each operable to run shading code for detected intersections, and a ray collection manager operable to maintain references to rays and to provide ray references to the plurality of intersection test resources in order to identify rays to be tested. The first output provides input to the plurality of shading resources and carries indications of intersections between rays and primitives. The second output provides input to the ray collection manager and carries indications of intersections between rays and geometry acceleration elements.
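The two-output arrangement described above can be sketched as a small routing function. This is a minimal illustration only, not the patented implementation; the names `route_indication`, `shading_queue`, and `collection_inbox`, and the string tags `"primitive"` and `"gad"`, are hypothetical.

```python
from collections import deque

def route_indication(ray_id, shape_id, shape_kind, shading_queue, collection_inbox):
    """Forward one detected intersection to the consumer that handles it."""
    if shape_kind == "primitive":
        # First output: ray/primitive hits feed the shading resources.
        shading_queue.append((ray_id, shape_id))
    elif shape_kind == "gad":
        # Second output: ray/acceleration-element hits feed the collection manager.
        collection_inbox.append((ray_id, shape_id))
    else:
        raise ValueError(f"unknown shape kind: {shape_kind!r}")

shading_queue, collection_inbox = deque(), deque()
route_indication(7, "tri_12", "primitive", shading_queue, collection_inbox)
route_indication(7, "box_3", "gad", shading_queue, collection_inbox)
```

Keeping the two outputs separate lets shading and further traversal proceed independently of each other.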

Still another aspect includes a computing configuration for use in parallelized ray-tracing-based rendering of a 2-D representation of a 3-D scene. The configuration comprises a processor coupled with a local cache, the local cache storing data defining a plurality of rays to be tested for intersection with specified geometric shapes, and an input queue serviced by the processor. Data received on the input queue is interpretable by the processor as containing a plurality of identifiers for rays to be tested for intersection with an indicated geometric shape. The processor is configured to retrieve definition data only for those rays identified in the queue whose data is stored in the processor's local cache, to intersection test those rays with the indicated geometric shape, and to output indications of any intersections identified.
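As a hedged sketch of this behavior, the following code models a processor that receives a packet of ray identifiers plus a shape identifier and tests only the rays whose definition data it holds locally. The classes `Ray`, `Packet`, and `IntersectionResource`, and the pluggable `test_fn`, are illustrative assumptions rather than structures named in the source.

```python
from dataclasses import dataclass

@dataclass
class Ray:
    origin: tuple
    direction: tuple

@dataclass
class Packet:
    shape_id: str    # the geometric shape every listed ray is tested against
    ray_ids: list    # identifiers only; no ray definition data travels here

class IntersectionResource:
    def __init__(self, local_rays, test_fn):
        self.local_rays = local_rays   # ray_id -> Ray, standing in for the local cache
        self.test_fn = test_fn         # hypothetical (Ray, shape_id) -> bool

    def process(self, packet):
        """Test only the identified rays that are stored locally."""
        hits = []
        for ray_id in packet.ray_ids:
            ray = self.local_rays.get(ray_id)
            if ray is None:
                continue               # definition data lives in another resource
            if self.test_fn(ray, packet.shape_id):
                hits.append((ray_id, packet.shape_id))
        return hits

# Toy test function: a ray "hits" when it points in the +x direction.
resource = IntersectionResource(
    {1: Ray((0, 0, 0), (1, 0, 0)), 3: Ray((0, 0, 0), (-1, 0, 0))},
    lambda ray, shape_id: ray.direction[0] > 0,
)
hits = resource.process(Packet("box_7", [1, 2, 3]))
```

Ray 2 is skipped because this resource does not store it; another resource receiving the same packet would be expected to handle it.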

Still another aspect includes a computer-readable medium. The medium stores computer-readable instructions for implementing a ray tracing method comprising accessing a packet of identifiers for rays determined to intersect a geometry acceleration element that bounds a selection of primitives, and determining other geometry acceleration elements that bound portions of the primitives bounded by the intersected geometry acceleration element. The method also comprises instantiating a plurality of packets, each containing the ray identifiers and an indication of a respective one of the geometry acceleration elements, and providing the plurality of packets to each of a plurality of computing resources separately configured to intersection test the rays identified in each packet. The method further comprises receiving indications of detected intersections from the plurality of computing resources and tracking the received indications by geometry acceleration element.

Yet another aspect includes a ray tracing method. The method comprises determining ray definition data defining a plurality of rays to be tested for intersection with primitives composing a 3-D scene. The method also comprises distributing subsets of the ray definition data among respective local memories of a plurality of computing resources, the computing resources being configured to intersection test rays with geometric shapes, and determining, in a management module, collections of rays of the plurality to be intersection tested by the computing resources. Each collection is defined by a plurality of ray identifiers, each ray identifier comprising data other than the definition data for the ray, and is associated with a bounding shape that bounds a portion of the primitives. The method further comprises causing the computing resources to test the rays of a determined collection by passing the ray identifiers for that collection among the computing resources. Each computing resource responds independently by intersection testing the identified rays whose definition data is stored in its local memory.
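One way to picture this method is the sketch below, which distributes ray definition data among local memories and then passes a collection of ray identifiers to every resource. The assignment policy (ray ID modulo the number of resources) and the function names are assumptions chosen for illustration; the source does not prescribe a particular distribution scheme.

```python
def distribute_rays(ray_defs, num_resources):
    """Split ray definition data into one local memory per resource.

    Assumed policy: ray_id % num_resources selects the owning resource.
    """
    local_memories = [dict() for _ in range(num_resources)]
    for ray_id, definition in ray_defs.items():
        local_memories[ray_id % num_resources][ray_id] = definition
    return local_memories

def pass_collection(local_memories, ray_ids):
    """Give every resource the same ID list; each keeps only the IDs it stores."""
    return [[r for r in ray_ids if r in memory] for memory in local_memories]

# Six rays with placeholder definition data, spread over three resources.
ray_defs = {ray_id: ("origin", "direction") for ray_id in range(6)}
memories = distribute_rays(ray_defs, 3)
work = pass_collection(memories, [1, 2, 4])
```

Only identifiers move between resources in this sketch; the definition data stays where it was first stored.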

In these aspects, the plurality of rays stored in a local cache may be a disjoint subset of a second plurality of rays, such that some of the plurality of ray identifiers identify rays stored in the local cache, while some rays of the second plurality are not stored in the local cache.

The functional aspects described can be implemented as modules, for example modules of computer-executable code that configure suitable hardware resources operable to produce inputs and outputs as described.

For a fuller understanding of the aspects and examples described in this specification, reference is made to the accompanying drawings, described below.
FIG. 1 illustrates a first example of a system for rendering a scene using ray tracing.
FIG. 2 illustrates further aspects of a portion of FIG. 1.
FIG. 3 illustrates another implementation of the intersection testing portion of a ray tracing rendering system.
FIG. 4 illustrates an example of a computing resource for intersection testing that can be used in the systems of FIGS. 1-3.
FIG. 5 illustrates a further example of an intersection testing system architecture for use in ray tracing.
FIG. 6 illustrates aspects of another example of an architecture for intersection testing.
FIG. 7 illustrates a system architecture implementing aspects of the examples of FIGS. 1-6, comprising intersection test resources and shading resources connected by queues.
FIG. 8A illustrates aspects of a method of providing identifiers for rays that can be used to control ray tracing in systems according to FIGS. 1-7.
FIGS. 9A and 9B illustrate examples of using ray IDs to identify ray data in memories that can be provided with any of the intersection test resources of FIGS. 1-7.
FIG. 10 illustrates aspects of intersection test control and of shapes distributed among a plurality of intersection test resources, which can be implemented in the systems of FIGS. 1-7.
FIG. 11 illustrates a multiprocessor architecture in which aspects of the systems of FIGS. 1-10 can be implemented as an architecture for ray tracing.
FIG. 12 illustrates an organization of a plurality of computing resources with inter-resource communication and localized ray data storage, relevant to the implementations of FIGS. 1-11.
FIG. 13 illustrates an example of multiple threads or cores operating as part of the computing resources of FIG. 12.
FIGS. 14A-14C illustrate several different queue implementations that can be used in systems and architectures according to FIGS. 1-13.
FIG. 15 is used to illustrate different ways in which ray data can be distributed among private L1 caches from an L2 cache shared by a plurality of computing resources.
FIG. 16 illustrates examples of packets that may be present in queues according to examples of the invention.
FIG. 17 depicts a method in which a particular computing resource processes ray IDs from a packet, using locally available ray data in intersection testing, and writes back the test results.
FIGS. 18A and 18B illustrate examples of an exemplary SIMD architecture for processing packets of ray ID information.
FIG. 19 depicts distributing ray identifiers, testing rays, and coalescing test results into further packets for further testing.
FIG. 20 illustrates method steps in the context of data structure contents, generally applicable to systems according to the preceding figures.
FIG. 21 illustrates further method aspects according to the invention.

Hereinafter, the invention is described in detail with reference to the accompanying drawings and examples.

The following description is presented to enable a person of ordinary skill in the art to make and use various examples of the invention. Various modifications to the examples described herein will be apparent to those skilled in the art, and the general principles described herein may be applied to other examples and applications without departing from the scope of the invention. The description begins by introducing a number of aspects relating to an example of a three-dimensional (3-D) scene (FIG. 1), which can be abstracted with geometry acceleration data, as in the example of FIG. 2. Such a 3-D scene can be rendered into a two-dimensional representation with systems and methods according to the examples illustrated and described.

As introduced in the background, a 3-D scene needs to be converted into a 2-D representation for display. The conversion requires selecting a camera position from which the scene is viewed. The camera position frequently represents the location of a viewer of the scene (for example, a gamer or a person watching an animated film). The 2-D representation usually lies at a plane location between the camera and the scene, and comprises an array of pixels at a desired resolution. A color vector for each pixel is determined through rendering. During ray tracing, rays can be cast initially from the camera position so as to intersect the plane of the 2-D representation at a desired point, and then continue into the 3-D scene. The location at which a ray intersects the 2-D representation is retained in a data structure associated with that ray.

A camera position is not necessarily a single point defined in space. Instead, the camera position can be diffuse, so that rays can be cast from a number of points considered to be within the camera position. Each ray intersects the 2-D representation within a pixel, which can also be called a sample. In some implementations, the more precisely the location at which a ray intersected a pixel can be recorded, the more precise the interpolation and blending of colors can be.

For clarity of description, data for a certain type of object (for example, the coordinates of the three vertices of a triangle) is often described simply as the object itself, rather than as data for the object. For example, a reference to "fetching a primitive" should be understood as fetching the data representing that primitive, not a physical realization of the primitive. With regard to rays in particular, however, this description distinguishes between an identifier for a ray and the data defining the ray itself. Where the term "ray" is used, it is to be understood broadly as referring to both the data defining the ray and the ray ID, unless the contrary is indicated.

Realistic and finely detailed objects in a 3-D scene are usually represented by providing a large number of small geometric primitives that approximate the surface of the object (that is, a wireframe model). As such, a more intricate object may need to be represented with more primitives, and smaller primitives, than a simpler object. Although this provides the benefit of higher resolution, performing intersection tests between rays and larger numbers of primitives (as described above, and as described further below) is computationally intensive, especially since a complex scene may have many objects. Without some external structure imposed on the scene for intersection testing, each ray would have to be tested for intersection with every primitive, which would make intersection testing extremely slow. A way to reduce the number of ray/primitive intersection tests required per ray therefore helps accelerate ray intersection testing in the scene. One way to reduce the number of such tests is to provide extra bounding surfaces that abstract the surfaces of a number of primitives. Rays can first be intersection tested against the bounding surfaces in order to identify, for each ray, a smaller subset of primitives to be intersection tested. Such bounding surfaces can be provided in a variety of shapes. In this specification, a collection of such bounding surface elements is called Geometry Acceleration Data (GAD).

A more extensive treatment of GAD organization, elements, and usage can be found in U.S. patent application Ser. No. 11/856,612, filed September 17, 2007, which is incorporated herein by reference. Accordingly, a briefer treatment of GAD is provided below for context, and further detail on these matters can be obtained from the referenced application.

As introduced, GAD elements generally include a geometric shape that encloses, in 3-D space, a respective collection of primitives, such that failure of a ray to intersect the surface of the geometric shape indicates that the ray does not intersect any primitive within the shape. GAD elements can include shapes, axis-aligned bounding boxes, kd-trees, octrees, and other kinds of bounding volume hierarchies. Implementations according to this description can accordingly use bounding schemes such as the cutting planes of a kd-tree, or any other way of delimiting or specifying the extent of bounding surfaces that bound one or more primitives. In sum, because GAD elements are primarily useful for abstracting primitives so that intersections between rays and primitives can be identified more quickly, GAD elements are preferably shapes that can be tested easily for intersection with a ray.
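To illustrate why a shape such as an axis-aligned bounding box is easy to test against a ray, the following sketch uses the widely known slab method. The source does not name a specific test, so this choice, along with the function name `ray_hits_aabb`, is an assumption for illustration.

```python
import math

def ray_hits_aabb(origin, direction, box_min, box_max):
    """Slab test: True if the ray (parameter t >= 0) intersects the box."""
    t_near, t_far = 0.0, math.inf
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray is parallel to this pair of planes: it must start between them.
            if o < lo or o > hi:
                return False
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        # Shrink the parameter interval in which the ray is inside every slab.
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
        if t_near > t_far:
            return False
    return True

box_min, box_max = (1.0, -1.0, -1.0), (2.0, 1.0, 1.0)
toward = ray_hits_aabb((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), box_min, box_max)
away = ray_hits_aabb((0.0, 0.0, 0.0), (-1.0, 0.0, 0.0), box_min, box_max)
```

A miss against the box rules out every primitive the box bounds, which is the property the paragraph above relies on.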

GAD elements can be interrelated. The interrelation of GAD elements can be described as a graph comprising nodes and edges, where each node represents a GAD element and each edge represents an interrelation between two GAD elements. Where a pair of elements is connected by an edge, the edge indicates that one of the nodes has a different granularity than the other, which can mean that one of the nodes connected by that edge bounds more or fewer primitives than the other. In some cases the graph can be hierarchical, so that the graph has a direction and is traversed in order from parent nodes to child nodes, narrowing the remaining bounded primitives along the way. In some cases the graph can comprise homogeneous GAD elements, such that if a given GAD element bounds other GAD elements, that element does not also directly bound primitives. In other words, in a homogeneous GAD structure, primitives are bounded directly by leaf-node GAD elements, and non-leaf nodes directly bound other GAD elements rather than primitives.

A graph of GAD elements can be constructed with the goal of maintaining some uniformity in the number of elements and/or primitives bounded by each GAD element. A given scene can be subdivided until that goal is achieved.

The following description assumes that there is a mechanism for determining, based on a ray having been determined to intersect a given GAD element, which GAD elements should be tested next in response. In the example of a hierarchical graph, the elements tested next are generally the child nodes of the tested node.
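For a hierarchical, homogeneous GAD graph, such a mechanism can be sketched as follows. The class `GadNode` and the function `next_to_test` are hypothetical names, and the two-level tree is a made-up example, not scene data from the source.

```python
from dataclasses import dataclass, field

@dataclass
class GadNode:
    name: str
    children: list = field(default_factory=list)    # other GAD elements (non-leaf)
    primitives: list = field(default_factory=list)  # primitive IDs (leaf only)

def next_to_test(node):
    """After a ray hits `node`, report what should be tested next.

    Homogeneous structure assumed: a node bounds either child GAD
    elements or primitives, never both.
    """
    if node.children:
        return ("gad", [child.name for child in node.children])
    return ("primitive", list(node.primitives))

leaf_a = GadNode("leaf_a", primitives=["tri_0", "tri_1"])
leaf_b = GadNode("leaf_b", primitives=["tri_2"])
root = GadNode("root", children=[leaf_a, leaf_b])

after_root = next_to_test(root)
after_leaf = next_to_test(leaf_a)
```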

One usage of GAD implemented in many of the examples herein is that when a ray is found to intersect a given GAD element, it is collected with other rays that have also been determined to intersect that element. When a number of rays have been collected, a stream of the GAD elements connected to that element is fetched from main memory and streamed through testers, each of which holds a different ray of the collection. Each tester thus keeps its ray fixed in fast local memory, while geometry is fetched from slower memory when needed and is allowed to be overwritten. More generally, this description provides a series of examples of how computing resources can be organized to process rays in order to detect intersections of those rays with geometric shapes (GAD elements and primitives), and ultimately to identify which rays hit which primitives.
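The collection step described above can be sketched with a small bookkeeping class. The explicit `ready_threshold` is an assumption: the text says only that testing proceeds once a number of rays have been collected, without fixing a rule. The name `RayCollector` is likewise hypothetical.

```python
from collections import defaultdict

class RayCollector:
    """Group ray IDs by the GAD element they were found to intersect."""

    def __init__(self, ready_threshold):
        self.ready_threshold = ready_threshold   # assumed readiness rule
        self.collections = defaultdict(list)     # gad_id -> [ray_id, ...]

    def record_hit(self, ray_id, gad_id):
        self.collections[gad_id].append(ray_id)

    def pop_ready(self):
        """Remove and return collections large enough to be worth testing."""
        ready = {gad_id: ids for gad_id, ids in self.collections.items()
                 if len(ids) >= self.ready_threshold}
        for gad_id in ready:
            del self.collections[gad_id]
        return ready

collector = RayCollector(ready_threshold=3)
for ray_id, gad_id in [(1, "A"), (2, "A"), (3, "B"), (4, "A")]:
    collector.record_hit(ray_id, gad_id)
ready = collector.pop_ready()
```

Collections that are not yet large enough stay in place and keep accumulating rays, which is what allows the geometry for a popular element to be fetched once for many rays.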

Further aspects of the invention that these examples can implement include any of the following: (1) queues are provided to carry outputs from intersection testing to shading; (2) ray data is localized to some degree to the computing resources, while geometric shapes are fetched from slower memory once it has been decided to test particular rays against those shapes; and (3) intersection testing is driven by identifying rays (by using ray identifiers) to the computing resources that perform the testing, so that each computing resource fetches the data corresponding to the identified rays from its own localized memory.

The following description presents examples of systems and devices for rendering 2-D representations of 3-D scenes using ray tracing. Two conceptual functional components of such systems are (1) tracing rays to identify intersections and (2) shading the identified intersections.

FIG. 1 illustrates aspects of a system for use in ray tracing a scene composed of primitives. In general, any of the functions or duties of any of the functional units in FIG. 1 and the other figures may be implemented in multiple hardware units, or in software or software subroutines, and may run on different computers. In some cases such implementation is described more particularly, because it can affect system function and performance.

FIG. 1 shows a geometry unit 101, an intersection processing unit 102, a sample processing resource 110, a frame buffer 111, and a memory resource 139 that stores, is configured to store, or is otherwise operable to store geometric shapes comprising GAD elements and primitives (primitive and GAD storage 103), samples 106, ray shading data 107, and texture data 108. Geometry unit 101 inputs a description of a scene to be rendered and an acceleration structure comprising GAD elements that bound the primitives. Intersection processing 102 shades identified intersections between rays and primitives, using inputs such as textures, shading code, and other sample information obtained from the data sources shown. The output of intersection processing 102 includes new rays (described below) and color information used to produce the 2-D representation of the scene being rendered. All of these functional components can be implemented on one or more host processing resources, indicated generally by dashed line 185.

As described above, while shading identified ray/primitive intersections, intersection processing 102 can generate new rays to be intersection tested. Driver 188 receives the new rays and can serve as an interface with intersection processing 102, managing communication between intersection processing resource 102 and a localized intersection testing region 140 that comprises ray data storage 105 and intersection test unit 109. Intersection testing region 140 tests rays for intersection, has read access to primitive and GAD storage 103 through interface 112, and outputs indications of identified intersections to intersection processing 102 through results interface 121. Local ray data storage 105 is preferably implemented in relatively fast memory, which may be relatively small. Primitive and acceleration structure storage, by contrast, is implemented in the relatively large and slow main memory 139, which can be the main dynamic memory of host 185.

One issue in ray tracing high-resolution scenes relates to the volume of ray data and shape data involved. For example, rendering a full HD film at 30 frames per second requires determining a color for more than 60 million pixels per second (1920 x 1080 is more than 2 million pixels, 30 times per second). Many rays may be needed to determine each pixel color. Thus hundreds of millions of rays may need to be processed every second, and because each ray requires a number of bytes of storage, ray tracing a full HD scene can involve several gigabytes or more of ray data per second. In addition, a large amount of ray data must be held in memory at any given time. There is almost always a trade-off between access speed and memory size, so cost-effective large memories are comparatively slow. Large memories also cannot be used effectively unless sufficiently large blocks of data are accessed at a time. One challenge, then, is to identify groups of rays large enough that data can be accessed from memory efficiently. However, as approaches such as searching for and group-testing rays with similar origins and directions have shown, identifying such rays can incur processing overhead, sometimes severe. In one aspect, the following example architectures describe how to organize and use multiple computing resources, faster and more expensive memory, and slower, larger memory so as to increase the processing efficiency of ray intersection testing and shading for scene rendering.
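The data-volume estimate above can be reproduced with a few lines of arithmetic. The pixel rate follows directly from the figures in the text; the number of rays per pixel and the size of a ray record are not given in this description and are purely illustrative assumptions.

```python
# Pixel rate from the text: full HD at 30 frames per second.
WIDTH, HEIGHT, FPS = 1920, 1080, 30
pixels_per_second = WIDTH * HEIGHT * FPS

# Hypothetical workload figures, not taken from this description.
RAYS_PER_PIXEL = 10   # assumed average number of rays per pixel
BYTES_PER_RAY = 32    # assumed size of one ray record (origin, direction, etc.)

rays_per_second = pixels_per_second * RAYS_PER_PIXEL
bytes_per_second = rays_per_second * BYTES_PER_RAY

print(f"pixels/s: {pixels_per_second:,}")
print(f"rays/s:   {rays_per_second:,}")
print(f"ray data: {bytes_per_second / 1e9:.1f} GB/s")
```

Under these assumed values the ray count lands in the hundreds of millions per second and the ray data rate in the gigabytes per second, which matches the order of magnitude stated above.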

Accordingly, FIG. 1 illustrates decoupling intersection testing from shading of identified intersections by means of a data flow in which ray definition data is stored in fast memory localized to the computing resources 109 that test rays for intersection with GAD elements and primitives. The output of intersection testing 109 includes indications of identified rays that intersect identified primitives. Intersection processing 102 receives these indications, performs shading according to them, and may instantiate new rays for testing, which are ultimately stored in fast ray data memory 105. This decoupling can be provided in a variety of implementations that use one or more of fixed-function hardware and general-purpose computers programmed with software according to this description, with communication means selected according to the processing resources used. One aspect common to these implementations, however, is that shape data tested for intersection with rays in intersection testing region 140 is transient compared with ray definition data. Put differently, where applicable, fast memory is allocated primarily to ray data, while shapes are streamed through the testers, and few if any computing resources are spent optimizing the caching of such shape data. The figures that follow show more specific examples of this decoupling, the data flows, ray data storage, and its collocation with intersection testing resources.

FIG. 1 also illustrates that frame buffer 111 may ultimately be used to drive display 197. This, however, is only one example of an output resulting from the intersection testing and shading operations, which for convenience may be called rendering. For example, the output may be written to a computer-readable medium containing a rendering product, such as a sequence of rendered images, for later display, for distribution on tangible computer-readable media, or for transmission over a network of computing resources connected by communication links. In some cases, the 3-D scene being rendered may represent a real 3-D scene, as when rendering an image containing a perspective view of a 3-D CAD model or in immersive virtual reality conferencing. In such cases, the rendering method operates on, or transforms, data representing physical objects. In other cases, the 3-D scene may have some objects that represent physical objects and others that do not. In still other 3-D scenes, the entire scene may be fictional, as in a video game. In each case, however, the result of the method is generally a transformation of an article such as a memory, a display, and/or a computer-readable medium.

Furthermore, rendering using ray tracing has been practiced since 1979, and various techniques have been developed for the intersection testing and other functions needed to implement it. The particular architectures and methods described here therefore do not preempt the functional principle of ray tracing for use in rendering 2-D representations of 3-D scenes.

FIG. 2 illustrates that intersection testing unit 109 of intersection testing region 140 includes one or more individual test resources (also called test cells), each of which can test a geometric shape against a ray. Region 140 includes test cells 205a-205n, which receive ray data from ray data storage 105 and geometry data from memory 139. Each test cell 205a-205n produces results for communication through results interface 121 to intersection processing 102; these results may include indications of whether a given ray has intersected a given primitive. By contrast, the results of testing rays against GAD elements are provided to logic 203. Logic 203 maintains collections 210 of references to rays, which associate those rays with the GAD elements they have been determined to intersect.

In general, the system components are designed to accommodate an unknown time to completion for any given ray test. Intersection testing unit 109 has read access to geometry memory and takes as input a queue of references to rays. As the output of intersection testing, each ray is associated with the one geometric shape (called a primitive here for convenience) that it intersects first. Other geometric shapes (i.e., other primitives) may be indicated as not associated with the ray.

As introduced above, region 140 includes management logic 203 associated with a ray reference buffer, which maintains a list 210 of ray collections to be tested in test cells 205a-205n. Buffer management logic 203 may be implemented in fixed-function processing resources or in hardware configured with instructions obtained from a computer-readable medium. Such instructions may be organized into modules according to the functions and tasks attributed to logic 203 in this description. A person of ordinary skill in the art would be able to provide further implementations of logic 203 based on this description.

Logic 203 can assign rays and geometric shapes to test cells and can handle communication with other units in the design. In one aspect, each ray collection in list 210 contains a plurality of ray identifiers, all of which are to be tested for intersection with one or more geometric shapes, and logic 203 maintains these collections. In a more specific example, the rays identified in a collection have been determined to intersect an identified GAD element, and the next GAD elements to be tested against those rays are the elements related to the intersected GAD element in a graph of GAD elements. The related elements for a given collection are fetched from memory 139 when intersection testing against them begins.

Stated alternatively, logic 203 can maintain, in a temporary ray reference buffer, references to rays that intersect the sub-portion of geometry data corresponding to a respective child node, thereby deferring further processing of those rays. In the example of hierarchically arranged GAD, this deferral can postpone processing of the sub-portions of geometry acceleration data below the child node until a later time at which the accumulated number of rays intersecting the child node's sub-portion is found suitable for further processing.
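The deferral described above can be illustrated with a minimal sketch, assuming that "suitable for further processing" is modeled as a simple count threshold. The class name `CollectionBuffer`, its methods, and the `ready_threshold` parameter are hypothetical and are introduced only for illustration; this description does not specify how readiness is decided.

```python
from collections import defaultdict

class CollectionBuffer:
    """Groups ray identifiers by the GAD element they were found to intersect."""

    def __init__(self, ready_threshold):
        self.ready_threshold = ready_threshold   # hypothetical tuning parameter
        self.collections = defaultdict(list)     # GAD element id -> ray ids

    def add_hit(self, gad_id, ray_id):
        """Record that ray_id intersected gad_id; testing below gad_id is deferred."""
        self.collections[gad_id].append(ray_id)

    def ready(self):
        """Return GAD element ids whose collections are large enough to process."""
        return [g for g, rays in self.collections.items()
                if len(rays) >= self.ready_threshold]

    def take(self, gad_id):
        """Remove and return a collection so its rays can be tested against children."""
        return self.collections.pop(gad_id)
```

In this sketch a collection simply waits in the buffer until enough rays have accumulated, at which point the child elements need to be fetched only once for the whole group.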

Logic 203 can communicate with memory 139 to set up memory transactions that provide geometric shapes for testing to cells 205a-205n. Logic 203 also communicates with ray data storage 105 and determines which rays are stored there. In some implementations, logic 203 can obtain or receive rays from memory 139 or from a shading process executing in intersection processing unit 102, and can provide those rays to memory 105 for storage, when space is available, so that they can be used during intersection testing.

Logic 203 can thus include a temporary ray reference buffer containing data that associates ray identifiers with identifiers of GAD shapes. In one implementation, the identifier of a GAD element can be hashed to identify a location in the buffer at which to store the collection associated with that element. This association data is generally called a collection when referring to its storage or accumulation in memory, and in some places in this application the term "packet" is generally used to connote movement of collection data during testing and the return of intersection test results. Results returned in this way are merged into the collections stored in memory in association with GAD shapes, as described below.

In summary, FIG. 2 shows that ray definition data is stored in a first memory 105 while shape data to be tested for intersection with the rays is fetched from memory 139. The description above also shows that, preferably, a plurality of shapes to be tested next are fetched from memory 139 once and then tested for intersection against a group of rays known to have intersected a "parent" GAD element.

FIG. 3 includes a block diagram of an example implementation of an intersection testing unit (ITU) 350 for region 140 (FIG. 1), which can be used in a rendering system for ray tracing a two-dimensional representation of a three-dimensional scene. ITU 350 includes a plurality of test cells 310a-310n and 340a-340n. GAD elements are shown as coming from GAD data storage 103b, and primitive data is shown as coming from primitive data storage 103a.

Test cells 310a-310n receive GAD elements and ray data to test against those elements (i.e., these cells test GAD elements). Test cells 340a-340n receive primitives and ray data to test against those primitives (i.e., these cells test primitives). ITU 350 can therefore test one collection of rays for intersection with primitives and a separate collection of rays for intersection with GAD elements.

ITU 350 also includes collection management logic 203a and collection buffer 203b. Collection buffer 203b and ray data 105 are stored in memory 340, which can receive ray data from memory 139, for example. Collection buffer 203b maintains the ray references associated with GAD elements. Collection manager 203a manages these collections based on intersection information from the test cells. Collection manager 203a can also initiate fetching of primitives and GAD elements from memory 139 for testing ray collections.

ITU 350 returns indications of identified intersections, which can be buffered in output buffer 375 for eventual delivery through results interface 121 to intersection processing 102. An indication contains information sufficient to identify a ray and the primitive that the ray was determined, within a given degree of precision, to have intersected.

ITU 350 can be viewed as a function or utility that is called through a control process or driver (e.g., driver 188), which provides ITU 350 with rays and with the shapes against which those rays are to be tested for intersection. For example, ITU 350 can be fed information through driver 188, a process that interfaces ITU 350 with other rendering processes such as shading and with the unit that generates the initial rays. From the perspective of ITU 350, the origin of the information provided to it does not matter, because region 140 can perform intersection testing using the rays, GAD, and primitives (or, more generally, scene geometry) that are provided to it or that it obtains based on other information provided to it.

As described above, ITU 350 can control how, when, and what data is provided to it, so ITU 350 is not passive; it can, for example, fetch the ray data, geometry data, or acceleration data that intersection testing requires. For example, ITU 350 can be provided with a large number of rays for intersection testing, together with information sufficient to identify the scene in which the rays are to be tested. For instance, ITU 350 may be given more than ten thousand rays for intersection testing at a given time, and as testing of those rays completes, new rays (generated by intersection processing 102) can be supplied so that the number of rays being processed in ITU 350 stays at about the initial number, as described below. ITU 350 thereafter controls (in logic 203a, see FIG. 3) the temporary storage of rays during processing (in ray collection buffer 203b, see FIG. 3) and can also initiate fetches of the primitives and GAD elements needed during that processing.
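The idea of holding the number of rays in the tester near its initial value can be sketched as a simple refill loop. This is a sequential toy model under stated assumptions, not the asynchronous hardware behavior described here: the function `run`, its parameters, and the callbacks `test_ray` and `shade` are hypothetical names introduced for illustration.

```python
from collections import deque

def run(initial_rays, target_in_flight, test_ray, shade):
    """Keep about target_in_flight rays under test, refilling as tests complete."""
    pending = deque(initial_rays)      # rays waiting to enter the tester
    in_flight = deque()                # rays currently held by the tester
    completed = 0
    while pending or in_flight:
        # Top up the tester whenever earlier rays have finished.
        while pending and len(in_flight) < target_in_flight:
            in_flight.append(pending.popleft())
        ray = in_flight.popleft()
        hit = test_ray(ray)
        completed += 1
        # Shading may produce new rays, which join the pending queue.
        pending.extend(shade(ray, hit))
    return completed
```

For example, if each ray is a `(ray_id, depth)` tuple and `shade` spawns one child ray until a depth limit is reached, the loop runs until no rays remain while never holding more than `target_in_flight` rays at once.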

As described above, GAD elements are transient in ITU 350 compared with rays: the data defining the rays is maintained in ray data 105, whereas ray identifiers are maintained in buffer 203b and organized with respect to GAD elements. Buffer 203b and ray data 105 can each be kept in memory 340, which can be physically implemented in a variety of ways, such as one or more banks of SRAM cache.

As described above, logic 203a tracks the status of the ray collections stored in memory 340 and determines which collections are ready for processing. As shown in FIG. 3, logic 203a is communicatively coupled to memory 340 and can initiate delivery of rays for testing to each connected test cell. Where a GAD element bounds either only GAD elements or only primitives, and not a combination of the two, logic 203a can assign the rays of a given collection to test cells 340a-340n or to test cells 310a-310n, depending on whether the GAD element associated with that collection bounds primitives or other GAD elements.

In examples where a given GAD element can bound both GAD elements and primitives, ITU 350 can have data paths that provide both GAD elements and primitives, along with rays, to each test cell, so that logic 203a can arrange for the rays of a collection to be tested across the test resources. In such examples, because of the typical differences in shape between GAD elements and primitives (e.g., spheres versus triangles), logic 203a can provide an indication used to select test logic, or to load an intersection test algorithm, optimized for the shape being tested.
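Selecting a test routine according to the kind of shape can be sketched as a small dispatch table. The sphere test below solves the usual quadratic, and the triangle test uses the well-known Moller-Trumbore method; this description does not state which algorithms the test cells use, so both are stand-ins, and the names `TESTERS` and `intersects` are hypothetical. Ray directions are assumed to be non-zero.

```python
import math

def sub(a, b): return (a[0] - b[0], a[1] - b[1], a[2] - b[2])
def dot(a, b): return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def hits_sphere(origin, direction, center, radius):
    """True if the ray (parameter t >= 0) touches the sphere."""
    oc = sub(origin, center)
    a = dot(direction, direction)
    b = 2.0 * dot(oc, direction)
    c = dot(oc, oc) - radius * radius
    disc = b * b - 4 * a * c
    if disc < 0:
        return False
    # Larger root; if it is non-negative, part of the sphere lies ahead of the ray.
    return (-b + math.sqrt(disc)) / (2 * a) >= 0

def hits_triangle(origin, direction, v0, v1, v2, eps=1e-9):
    """Moller-Trumbore ray/triangle test; True if the ray hits in front of its origin."""
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:
        return False                      # ray is parallel to the triangle plane
    inv = 1.0 / det
    t_vec = sub(origin, v0)
    u = dot(t_vec, p) * inv
    if u < 0 or u > 1:
        return False
    q = cross(t_vec, e1)
    v = dot(direction, q) * inv
    if v < 0 or u + v > 1:
        return False
    return dot(e2, q) * inv > eps         # distance along the ray must be positive

# The shape kind plays the role of the indication that selects the test logic.
TESTERS = {"sphere": hits_sphere, "triangle": hits_triangle}

def intersects(kind, origin, direction, *shape):
    return TESTERS[kind](origin, direction, *shape)
```

A caller would pass `"sphere"` for a bounding element and `"triangle"` for a primitive, so the same test cell code path can serve both shape types.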

Logic 203a can provide information to test cells 340a-340n and test cells 310a-310n either directly or indirectly. In the indirect case, logic 203a provides information to each test cell so that the test cell can itself initiate fetching of ray data from memory 340. Logic 203a is shown separately from memory 340 for simplicity of description, but it can be implemented within the circuitry of memory 340, because the management functions performed by logic 203a relate largely to data stored in memory 340.

The ability to increase the parallelism of access to memory 340 by intersection test resources is an advantage of some aspects described here. Accordingly, a larger number of access ports to memory 340, preferably at least one per test cell, is advantageous. Example organizations related to such parallelism are described further below.

ITU 350 can also operate asynchronously with respect to the units that provide its input data and receive its outputs. Here, "asynchronous" can include that the ITU receives and begins intersection testing of additional rays while intersection testing continues for previously received rays. "Asynchronous" also includes that intersection testing of rays need not complete in the order in which ITU 350 received them. Asynchronous further includes that intersection test resources in ITU 350 are available for assignment or scheduling of intersection tests without regard to the position of a ray within the three-dimensional scene or within a scheduling grid superimposed on the scene, and without regard to testing only rays that share a generational relationship, such as a parent ray and the child rays spawned from a small number of parent rays, or only rays of a particular generation (e.g., camera rays or secondary rays).

ITU 350 also includes output buffer 375, which receives indications of identified intersections between primitives and rays. In one example, an indication includes an identifier of a primitive paired with information sufficient to identify the ray that intersected it. The identifying information for a ray can include a reference, such as an index, that identifies a particular ray in a list of rays. For example, the list can be managed by driver 188 running on host 185 and can be kept in memory 139. Where memory 139 does not contain such information, information sufficient to reconstruct the ray, such as its origin and direction, can be included instead. Passing a reference usually requires fewer bits, which can be an advantage.
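The bit-count advantage of passing a reference can be made concrete with a short sketch. The record type `Hit` and both helper functions are hypothetical, and the 32-bit component width is an assumption; this description does not specify a ray layout.

```python
from typing import NamedTuple

class Hit(NamedTuple):
    """One output-buffer entry: a primitive paired with a ray reference."""
    primitive_id: int
    ray_index: int        # index into the list of rays kept elsewhere

def reference_bits(num_rays):
    """Bits needed to index any ray in a list of num_rays entries."""
    return max(1, (num_rays - 1).bit_length())

def definition_bits(bits_per_component=32):
    """Bits for a full ray: 3 origin + 3 direction components (assumed layout)."""
    return 6 * bits_per_component
```

With a list of ten thousand rays, an index needs far fewer bits than a full origin and direction under the assumed layout, which is the advantage noted above.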

FIG. 4 illustrates an example of test cell 310a, which can include working memory 410 and test logic 420. Working memory 410 can be as simple as several registers containing information sufficient to test a line segment for intersection with a surface, and can be more complex in other implementations. For example, working memory 410 can store instructions that configure test logic 420 to test a particular received shape for intersection, and can detect which shape was received based on the received data. Working memory 410 can also cache detected hits, where each test cell is configured to test a series of rays against a geometric shape, or vice versa; the cached hits can then be output as a group. Working memory also receives incoming shape data from storage 103b.

Test logic 420 performs the intersection test at an available or selectable resolution and can return a binary value indicating whether an intersection was detected. The binary value can be stored in working memory for readout, or can be output for latching or caching during a read cycle, such as a read cycle in memory 340 for GAD element testing.

FIG. 5 illustrates aspects of an implementation of intersection testing unit 500, focusing in more detail on an example memory organization. ITU 500 is shown with test cells 510a-510n and 540a-540n, which in this example correspond to test cells 310a-310n and 340a-340n; this is not meant to imply any limitation on the number of test cells. In ITU 500, primitives and GAD elements can thus be tested in parallel. If, however, it is determined that more test cells of one kind or the other are needed, any test cell can be reconfigured as appropriate (reallocated in the case of hardware, reprogrammed in the case of software). As transistor density continues to increase, more such test cells can be accommodated, whether in hardware implementations or as resources capable of executing software. As described above, portions of the test cells can be treated as an operational group, in that they will test rays against a common shape (i.e., a primitive or a GAD element). Test cells 540a-540n can return a binary value indicating intersection with a primitive at a specified precision level (e.g., 16 bits), and can also return a more precise indication of where on the primitive the ray intersected, which can be useful for larger primitives.

In ITU 500, memory 540 includes a plurality of independently operating banks 510-515, each of which has two ports (ports 531 and 532 of bank 515 are identified). One port is accessed through GAD test logic 505 and the other through primitive test logic 530. GAD test logic 505 and primitive test logic 530 each manage the flow of data among respective working buffers 560-565 and 570-575, and obtain GAD elements and primitives for testing from GAD storage 103b and primitive storage 103a, respectively.

Banks 510-515 are intended to operate, for the most part, to provide non-conflicting access to ray data by GAD and primitive test logic 505 and 530, so that each of test cells 510a-510n and 540a-540n can be supplied with a ray from a separate bank 510-515. Such non-conflicting access can be implemented, as would be understood from this description, by separate cache banks and a crossbar architecture that allows ports to access different physical portions of memory, for example. Where rays stored in a bank are allowed to be tested by more than one test cell, a conflict can arise when two rays to be tested reside in the same bank, in which case test logic 505 and 530 can handle the accesses sequentially. In some cases, working buffers 560-565 and 570-575 can be loaded for the next processing cycle while other processing completes. For example, region 578 comprises a test region for GAD elements, because it includes GAD tester 510a and memory bank 510. Region 579, by contrast, comprises a test region for both GAD elements and primitives, because it includes testers 510a and 540a (one each for GAD elements and primitives) and has access to memory bank 510, which stores the ray data to be used in tests involving the test cells of regions 578 and 579.
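The cost of bank conflicts can be illustrated with a minimal model in which one port of each bank serves one ray read per cycle and conflicting reads are handled one after another. The function `access_cycles` and the mapping of a ray to bank `ray_id % num_banks` are assumptions made for illustration only; this description does not specify how rays are placed in banks.

```python
from collections import Counter

def access_cycles(ray_ids, num_banks):
    """Cycles needed to read the given rays if each bank serves one read per cycle.

    Assumes (hypothetically) that the ray with identifier r lives in bank
    r % num_banks, and that reads to the same bank are serialized.
    """
    if not ray_ids:
        return 0
    per_bank = Counter(r % num_banks for r in ray_ids)
    # The busiest bank determines how long the whole group takes.
    return max(per_bank.values())
```

When every ray of a group sits in a different bank, the group is read in a single cycle; when several rays share a bank, the reads to that bank are taken sequentially and the group takes correspondingly longer.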

By testing rays in a consistent arrangement, the work of tracking which ray is assigned to which test cell can be reduced. For example, each collection can have 32 rays, and there can be 32 of test cells 310a-310n (510a-510n). By consistently providing, for example, the fourth ray in a collection to test cell 310d, cell 310d need not maintain information about which ray was provided to it and need only return an indication of intersection. As will be shown, other implementations for maintaining this consistency can be provided, including passing packets of ray identifiers among the test cells and having each test cell write its intersection results into the packet.

Storage for ray collections can be implemented as an n-way interleaved cache, such that any given ray collection can be stored in one of n portions of ray collection buffer 203b or 520. Ray collection buffer 203b or 520 can then maintain a list of the ray collections stored in each of the n portions of the buffer. An implementation of ray collection buffer 203b or 520 can use an identifying characteristic of the element of GAD associated with a ray collection, for example an identifier string that is unique among the GAD elements used in rendering the scene. The alphanumeric character string can be a number, a hash, or the like. For example, the hash can reference one of the n portions of ray collection buffer 203b or 520.
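The hash-based placement described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `buffer_portion` and `RayCollectionBuffer`, the use of CRC-32 as the hash, and the dictionary-per-portion layout are all assumptions made for the example.

```python
import zlib


def buffer_portion(gad_element_id: str, n_portions: int) -> int:
    """Map a GAD element identifier string to one of n buffer portions.

    CRC-32 is an arbitrary, assumed choice of hash for this sketch.
    """
    return zlib.crc32(gad_element_id.encode("utf-8")) % n_portions


class RayCollectionBuffer:
    """Hypothetical n-way buffer: each portion lists the collections it holds."""

    def __init__(self, n_portions: int):
        # One dict per portion: GAD element id -> list of ray identifiers.
        self.portions = [dict() for _ in range(n_portions)]

    def _portion_for(self, gad_element_id: str) -> dict:
        index = buffer_portion(gad_element_id, len(self.portions))
        return self.portions[index]

    def add_ray(self, gad_element_id: str, ray_id: int) -> None:
        self._portion_for(gad_element_id).setdefault(gad_element_id, []).append(ray_id)

    def collection(self, gad_element_id: str) -> list:
        return self._portion_for(gad_element_id).get(gad_element_id, [])
```

Because the portion is derived from the identifier alone, a lookup for a given GAD element only needs to examine one of the n portions.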

In other implementations, elements of GAD can be predestined for storage in a given portion of ray collection buffer 203b or 520, for example by mapping segments of the alphanumeric strings in use to portions of the buffer. Primitive/ray intersection output 580 represents an output for identifying potential primitive/ray intersections, and output 580 can be serial or parallel. For example, where there are 32 primitive test cells 540a-540n, output 580 can include 32 bits indicating the presence or absence of an intersection of each ray with the primitive just tested. Of course, in other implementations, such as the packet implementation, outputs can come directly from the test cells. The output can also be serial, with results stored serially by the test cells in the packet.
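As an illustration of a parallel output with one bit per test cell, the sketch below packs per-cell hit flags into an integer and recovers the intersecting rays from it. The function names and the convention that bit i corresponds to test cell i are assumptions for the example; the description does not fix a bit ordering.

```python
def pack_hits(hits):
    """Pack per-cell hit flags into an integer; bit i is set if cell i hit."""
    mask = 0
    for cell_index, hit in enumerate(hits):
        if hit:
            mask |= 1 << cell_index
    return mask


def rays_hit(mask, collection_ray_ids):
    """Return the ray identifiers whose test cell reported an intersection.

    Assumes the i-th ray of the collection was tested in the i-th cell.
    """
    return [ray_id for i, ray_id in enumerate(collection_ray_ids) if (mask >> i) & 1]
```

With 32 test cells, `pack_hits` yields a value that fits in 32 bits, matching the width given in the example above.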

Ray data is received in memory 340 (520) from a source of rays (e.g., a shader). Collection management logic (e.g., 203a of FIGS. 2 and 3) operates to initially assign rays to collections, where each collection is associated with an element of GAD. For example, the element of GAD can be the root node of a graph, and all received rays are initially assigned to one or more collections associated with the root node. Rays can also be received in groups sized to be a full collection (e.g., from an input queue), and each such group can be treated like a collection identified in, for example, ray collection buffer 203b.

Focusing on the processing of one collection, with the understanding that multiple collections can be processed in parallel, retrieval of the rays of a collection associated with a test node is initiated by collection management logic 203a. The logic provides the addresses (ray identifiers) of those rays, which are stored as data in the collection, in order to retrieve the rays from memory 340 or, in the example of FIG. 5, from banks 510-515, which provide the ray data on a plurality of output ports for receipt by the test cells (e.g., test cells 560-565).

Turning to the testing of the GAD elements specified by the node selected for test (i.e., the GAD element associated with the selected node specifies other GAD elements), distribution of ray data for the rays of the collection under test is completed, and the specified GAD elements are fetched (this fetching need not follow the ray distribution). For this fetching, logic 203a can input addressing information to GAD storage 103b (or through whatever memory management means is provided), which outputs the addressed GAD elements to test cells 310a-310n. Where, as here, multiple GAD elements are specified, the elements can be arranged to stream serially to the test cells (e.g., using a serial buffer), so that the multiple GAD elements can be block read.

In the test cells (e.g., 310a-310n), the rays of the collection can be tested for intersection with the serially provided GAD elements (e.g., a different ray in each test cell). When a ray is determined to intersect a GAD element, it is determined whether a collection exists for the intersected GAD element. If so, the ray is added to that collection, space permitting; if not, the collection is created and the ray is added. If an existing collection has no space, a new collection can be created.
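The collection-update rule in the preceding paragraph can be sketched as below. The class name `CollectionStore`, the default capacity of 32 rays, and the choice to keep several collections per GAD element in a list are assumptions for illustration only.

```python
from collections import defaultdict

MAX_RAYS_PER_COLLECTION = 32  # assumed capacity, echoing the 32-ray example


class CollectionStore:
    """Hypothetical store of ray collections keyed by GAD element."""

    def __init__(self, capacity: int = MAX_RAYS_PER_COLLECTION):
        self.capacity = capacity
        # GAD element id -> list of collections, each a list of ray ids.
        self.by_element = defaultdict(list)

    def add(self, element_id, ray_id) -> None:
        collections = self.by_element[element_id]
        # Create a collection if none exists yet, or if the newest one is full.
        if not collections or len(collections[-1]) >= self.capacity:
            collections.append([])
        collections[-1].append(ray_id)
```

A store created with `CollectionStore(capacity=2)` that receives three rays for the same element ends up holding one full collection and one partly filled collection for that element.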

In some implementations, a 1:1 correspondence between the maximum number of rays in a collection and the number of test cells 310a-310n is provided, so that all rays of a collection can be tested in parallel against a given GAD element; this constitutes one possible architecture. In such an architecture, throughput is generally similar to what can be obtained with a 1:1 correspondence of rays to test cells. Other implementations, however, provide for serial passing of packets (e.g., information representing a collection, as described above) among different test cells, so that the test cells can be testing rays from different packets even though all rays of a given collection appear to be tested in parallel.

Thereafter, rays are tested for intersection with the primitive provided to the test cells (i.e., in this example, each test cell holds a different ray and tests that ray against a common primitive). After testing, each test cell indicates any detected intersection.

Each ray of the collection is tested in its test cell for intersection with the GAD elements provided to that test cell (for example, in the multi-bank example of FIG. 5, where regions 578 and 579 are shown, rays can be regarded as localized to a GAD element test region and/or a primitive test region, such that a bank serves one or more testers of each kind with ray data).

Because the output from testing rays for intersection with GAD elements differs from the output of testing the same rays for intersection with primitives (i.e., an intersection with a GAD element results in collection into a collection for that GAD element, whereas an intersection with a primitive results in determination of the closest intersection with that primitive and output of that intersection), there generally is no conflict in writing back collection data or outputting intersection results, even if a particular ray is in two collections that are tested in parallel. Where further parallelism is implemented, for example by testing multiple collections of rays for primitive intersection in multiple instantiations of test cells 340a-340n, features can be provided to enforce orderly completion of such testing, such as storage of multiple intersections or lock bits. In the example of FIG. 5, where data for a given ray can be supplied to a given tester type from only one bank (e.g., a given ray resides in one memory bank), multiple GAD testers would not, for example, test the same ray at the same time, which avoids write-back conflicts.

In summary, the method can include receiving rays, assigning them to collections, selecting collections that are ready for test, assigning the rays of a selected collection to appropriate test cells, and streaming appropriate geometry for intersection testing through the test cells, where readiness can be determined algorithmically. The output depends on whether the geometry is a scene primitive or a GAD element. For rays tested against GAD elements, the GAD elements are identified based on graph connections with the node associated with the collection under test, and the rays are added to the collections associated with the GAD elements under test. Each collection indicates its readiness and is selected for test when ready. For ray intersections with primitives, the closest intersection is tracked with the ray. Because rays are tested when associated with a ready collection, intersection testing of a particular ray is deferred until a collection with which it is associated is determined to be ready for test. Rays can be collected coherently into multiple collections, which allows such rays to be tested against disparate portions of the scene geometry (i.e., they need not be tested in order of traversal).
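The summarized method can be sketched as a small software loop. This is a simplified model under stated assumptions, not the hardware described: the function name `trace`, the callback parameters `hits_element` and `hit_distance`, the dictionary-based graph, and the rule that any non-empty collection counts as ready are all hypothetical choices made for the example.

```python
from collections import defaultdict, deque


def trace(rays, root, children, primitives, hits_element, hit_distance):
    """Collection-based traversal sketch.

    rays:         dict ray_id -> ray data
    children:     dict node -> list of connected (child) GAD nodes
    primitives:   dict node -> list of primitives bounded by that node
    hits_element: callable(ray, node) -> bool
    hit_distance: callable(ray, primitive) -> distance, or None on a miss
    Returns dict ray_id -> (distance, primitive) for the closest hit found.
    """
    collections = defaultdict(list)
    collections[root].extend(rays)  # all rays start in the root collection
    ready = deque([root])           # assumed: a non-empty collection is ready
    closest = {}
    while ready:
        node = ready.popleft()
        ray_ids = collections.pop(node, [])
        # Primitive testing: keep only the closest intersection per ray.
        for prim in primitives.get(node, ()):
            for rid in ray_ids:
                d = hit_distance(rays[rid], prim)
                if d is not None and (rid not in closest or d < closest[rid][0]):
                    closest[rid] = (d, prim)
        # GAD testing: rays that hit a connected element join its collection.
        for child in children.get(node, ()):
            hit_ids = [rid for rid in ray_ids if hits_element(rays[rid], child)]
            if hit_ids:
                collections[child].extend(hit_ids)
                if child not in ready:
                    ready.append(child)
    return closest
```

The order in which collections are taken from `ready` is independent of any single ray's traversal order, which is the deferral property the summary describes.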

As previously described, the ITU stores in memory information representing rays previously received on its ray input. For those rays, the ITU maintains an association of each ray with one or more ray collections of a plurality of collections. The ITU also maintains an indication of collection fullness for the plurality of collections stored in memory. These indications can be individual flags indicating full collections, or can be numbers representing the number of rays associated with a given collection. More detail, and other implementations and variations relating to implementing the test algorithm, are provided in the related applications referenced above; the information provided here is not to be treated as limiting them.

As is apparent from the description up to this point, rays are loaded (or accessed) from memory based on information provided in a collection of rays. Such loading therefore includes determining the respective memory locations at which data representing each ray is stored. That data can be included in the ray collection; for example, a ray collection can include a list of memory locations, or other references to storage, at which the ray data for the rays in the collection is stored. For example, a ray collection can comprise references to locations in a memory (e.g., memory 340), in a bank of memory (e.g., bank 510), or in another implementation. These references can be absolute, offsets from a base, or another suitable way of referencing such data. These aspects were described from the perspective that separate ray data and ray collection data are maintained. In some implementations, however, that separation need not be so explicit or apparent, in that the ray collection data and the ray data can be kept in, for example, a content-associative database, where associations between collections and rays, and between collections and elements of GAD, are maintained and used to identify the rays associated with a collection, and the GAD elements associated with that collection, for testing.
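A collection that holds references rather than ray data can be sketched as follows. The tuple encoding of a reference (`"absolute"` or `"offset"` plus a value), the function names, and the use of a dictionary to stand in for memory are assumptions for the example.

```python
def ray_address(reference, base_address=0):
    """Resolve a reference to a memory location.

    A reference is an assumed (kind, value) pair: an absolute address, or an
    offset that is added to a base address.
    """
    kind, value = reference
    if kind == "absolute":
        return value
    if kind == "offset":
        return base_address + value
    raise ValueError(f"unknown reference kind: {kind!r}")


def load_collection(memory, collection, base_address=0):
    """Fetch the ray data for every reference listed in a collection."""
    return [memory[ray_address(ref, base_address)] for ref in collection]
```

The collection itself stays small because it stores only references; the ray data remains wherever it was originally written.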

It is also apparent that the ray data is "stationary" in the test cells while primitives or GAD elements are cycled through the test cells. Other implementations are possible, as described in the related applications. The principal focus of this description, however, is on providing rays that are localized to, or stationary with, the test cells while geometry is fetched and tested.

Aspects of such an implementation are presented with reference to FIG. 6. In particular, another implementation of intersection testing logic comprises a processor 605 that includes test control logic 603 (similar to the test logic of FIG. 2), where the test control logic comprises a fetch unit 620 coupled to a memory interface 625, an instruction cache 630, an instruction decoder 645, and a data cache 650. Data cache 650 feeds test cells 610a-610n. Instruction decoder 645 also provides input to test cells 610a-610n. An instruction generator 665 provides instruction input to instruction decoder 645. The test cells output indications of detected intersections to a write-back unit 660, which in turn can store data in data cache 650. Output from write-back unit 660 is also used as input to instruction generator 665 when generating instructions. The instructions used in processor 605 are contemplated to be of a single-instruction, multiple-data variety, where the instruction processed in the test cells is an intersection test between a defined surface (e.g., a primitive or a GAD element) and a ray.

In one example, the "instruction" includes data defining a geometric shape, such as a primitive or a GAD element, and the multiple data elements can comprise separate references to rays to be tested against the geometric shape provided in the "instruction". The combination of the geometric shape and the multiple ray references can thus be viewed as a discrete packet of information that can be passed among the several illustrated test cells. In some cases, packet passing is pipelined, so that multiple packets are "in flight" among the plurality of test cells.

The test cells can take the form of full-featured processors with large instruction sets, and each such packet can then include other information sufficient to distinguish the purpose of the packet. For example, a number of bits can be included to differentiate a packet formed for intersection testing from packets that exist for other purposes. A variety of intersection test instructions can also be provided, including for different primitive shapes, different GAD element shapes, or different test algorithms.

In a typical example, each intersection test packet initially contains a reference to, or data for, a geometric element, that element being either an element of GAD or a primitive, together with references to a plurality of rays to be tested for intersection with the geometric element (i.e., the "packet" described above).

Decoder 645 interprets the instruction to determine the reference to the geometric element and can initiate a fetch of the element through fetch unit 620 (which controls a memory interface such as memory interface 625). In some implementations, decoder 645 can look ahead a number of instructions to initiate fetching of geometric elements needed in the future. The geometric element can be provided by fetch unit 620 to decoder 645, which in turn provides the geometric element to test cells 610a-610n.

Decoder 645 also provides the ray references from the instruction as functional addresses to data cache 650, which provides to each of test cells 610a-610n the respective data sufficient for intersection testing of each ray. Data associated with a ray that is not needed for intersection testing need not be provided. Data cache 650 thus can function as ray data storage localized to one or more computational resources operating as intersection test cells.

The geometric element is tested for intersection with a respective ray in each test cell 610a-610n, and each test cell 610a-610n outputs an indication of intersection for receipt by write-back unit 660. Depending on the nature of the geometric element tested, write-back unit 660 performs one of two different functions. Where test cells 610a-610n tested a primitive for intersection, write-back unit 660 outputs an indication of each ray that intersected the primitive under test. Where test cells 610a-610n tested a GAD element, write-back unit 660 provides the output of test cells 610a-610n to instruction unit 665.

Instruction unit 665 operates to assemble the virtual instructions that will direct the test cells in further intersection testing. Instruction unit 665 operates as follows, using as inputs the output of test cells 610a-610n specifying which rays intersected a given GAD element, instruction cache 630, and GAD input 670. Using the input from test cells 610a-610n, instruction unit 665 determines, based on the GAD input, the GAD elements connected to the GAD element specified in that input (i.e., instruction unit 665 determines which GAD elements should be tested next based on the intersections identified for the given GAD element).

Instruction unit 665 determines, for each element of GAD identified as connected to the intersected element, whether an instruction already stored in the instruction cache exists for that element and whether that instruction can accept further ray references (i.e., are all data slots of the instruction filled?). Instruction unit 665 adds as many of the rays identified as intersecting in the test cell input to that instruction as it can hold, and creates further instructions sufficient to receive the remaining ray references. Instruction unit 665 does this for each element of GAD identified as connected to the element identified in the test cell input. Thus, after processing the test cell input (the intersection indications), the rays identified as intersecting the same GAD element are each added to instructions specifying testing of those rays against the elements of GAD connected to that GAD element. The instructions so created are stored in instruction cache 630.
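The slot-filling behavior of the instruction unit can be sketched as below. The `Instruction` dataclass, the `add_rays` function, the slot count of 4, and the dictionary used to model the instruction cache are assumptions for illustration; the description does not specify a slot count or a cache layout.

```python
from dataclasses import dataclass, field

SLOTS_PER_INSTRUCTION = 4  # assumed number of ray-reference slots


@dataclass
class Instruction:
    """Hypothetical packet: one GAD element plus references to rays."""
    element_id: str
    ray_refs: list = field(default_factory=list)


def add_rays(cache, connected_elements, intersecting_rays,
             slots=SLOTS_PER_INSTRUCTION):
    """Add intersecting rays to instructions for each connected GAD element.

    cache models the instruction cache: element id -> list of Instructions.
    An existing instruction is topped up first; further instructions are
    created only for the rays that do not fit.
    """
    for element_id in connected_elements:
        instructions = cache.setdefault(element_id, [])
        for ray in intersecting_rays:
            if not instructions or len(instructions[-1].ray_refs) >= slots:
                instructions.append(Instruction(element_id))
            instructions[-1].ray_refs.append(ray)
    return cache
```

Calling `add_rays` a second time for the same element continues filling the most recent partly filled instruction before any new one is created.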

Instructions can be organized in instruction cache 630 based on the organization of the GAD elements received from GAD input 670. Instruction unit 665 performs a function similar to logic 203a, in that both logic 203a and instruction unit 665 receive indications of which rays intersected which GAD elements and group those rays together for subsequent testing. The system of FIG. 6 is a more general-purpose system, in that a ray packet for testing can be one of several types of packets accommodating different functions.

For example, GAD input 670 can provide a graph of GAD, where nodes of the graph represent elements of GAD and pairs of nodes are connected by edges. The edges identify which nodes are connected to which other nodes, and instruction unit 665 can search instruction cache 630 by following the edges connecting nodes, in order to identify whether an instruction for a given GAD element already exists in the cache. Where multiple instructions exist for a given GAD element, they can be linked in a list, ordered, or otherwise associated with each other. Other methods can also be implemented, such as hashing a GAD element ID to identify potential locations in instruction cache 630 where relevant instructions can be found.

An instruction can reference the GAD node under test, such that the instruction causes fetching of the connected nodes of GAD in response to the instruction being issued and decoded (as opposed to storing an instruction for each connected node). Each such connected node can be streamed through test cells 610a-610n for testing against the respective rays held in each test cell (i.e., the ray data remains stationary in each test cell while a plurality of GAD elements is provided to it, and each test cell tests its ray against each GAD element in turn).
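The stationary-ray arrangement can be sketched as a loop in which the geometry is the moving operand. The function name `stream_elements` and the `intersects` callback are assumptions for the example, and the loop models in software what the test cells would do in parallel.

```python
def stream_elements(cell_rays, elements, intersects):
    """Stream GAD elements past rays that stay put in their test cells.

    cell_rays:  list with one ray per test cell (index = cell number)
    elements:   iterable of GAD elements, delivered one after another
    intersects: callable(ray, element) -> bool
    Returns dict element -> list of cell indices reporting an intersection.
    """
    results = {}
    for element in elements:  # only the geometry moves
        results[element] = [
            cell for cell, ray in enumerate(cell_rays) if intersects(ray, element)
        ]
    return results
```

Ray data is read once into `cell_rays` and reused for every streamed element, which is the property that lets ray data remain in fast local storage.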

Thus, a processor implemented according to these examples provides functionality for creating or obtaining instructions that collect rays identified as intersecting a first node for intersection testing against the nodes connected to it. As in the examples described above, where the GAD provided to processor 605 is hierarchical, the graph of GAD can be connected in hierarchical order.

The illustrated connections and sources of GAD are exemplary, and other arrangements are possible. For example, memory 615 can be a source of GAD elements. It is preferable, however, to store rays (i.e., data defining the rays, along with data such as the closest primitive intersection found so far), rather than geometry data, in the faster memories, where a given processing architecture allows. Also, in the example above, the next node to be tested (i.e., the next acceleration element or primitive) was determined based on test results and, responsively, a packet was instantiated per geometric shape. Other implementations apparent from this description can include instantiating a packet per "child" node when it is determined to begin testing the child nodes of a given node, so that the given node spawns its child instructions/collections at a later time.

FIG. 7 further illustrates aspects of a ray tracing system (e.g., system 700) that can use queues to decouple the operations of intersection testing and ray shading (including generation of new rays, camera rays among them). System 700 allows rays to be submitted for intersection testing and to complete intersection testing in different orders, and thus, like the systems of FIGS. 1-6, produces outputs for shading out of order. Likewise, the intersection testing resources can proceed with intersection testing of rays without being limited by the resolution of shading of previously identified intersections.

FIG. 7 illustrates a plurality of intersection test resources (ITRs) 705a-705n, each coupled to a respective ray data storage 766a-766n. Each storage holds data defining the rays to be tested for intersection in its resource. Each grouping of an ITR and ray data storage (e.g., ray data 766a and ITR 705a) can be viewed as a localized grouping of test resources (e.g., the grouping shown at 704). This is similar to the earlier groupings, such as groupings 578 and 579 of FIG. 5.

Ray data storages 766a-766n can be memories such as private L1 caches, shared or mapped portions of an L2 cache, and the like. As in the previous examples, the faster memories are preferably used to store ray data localized to a particular processing resource, rather than geometry data. Localized storage of ray data is facilitated by the intersection testing algorithms used here, which increase the length of time that a ray can remain in more localized, faster memory and reduce thrashing of small memories. Such ray storage can thus be viewed as quasi-static, in that the data for a given ray stays in the same local memory until intersection testing of that ray in the scene completes.

Data defining rays is loaded via output 743 from test control unit 703 (similar to logic 203b and the like in the previous figures). Test control unit 703 receives, through ray completion queue 730, input comprising identifiers for rays that have completed intersection testing in ITRs 705a-705n.

Queue 730 stores ray identifiers (Ray IDs 1, 18, 106, and 480 are shown as examples). Queue 730 obtains its inputs from ITRs 705a-705n; these inputs indicate rays that have completed testing in the scene to identify the closest intersected primitive. As such, queue 730 can be fed from decision point 751, which can determine whether a given output from ITRs 705a-705n indicates an intersection with a GAD element or with a closest primitive (this can be used where ITRs 705a-705n can test both types of shapes).

Decision point 751 represents the two types of intersection test control functionality described previously. One is that GAD/ray intersections are fed back toward the intersection testers; the other is that closest detected primitive/ray intersections are output for shading. In some of the earlier architectures, where separate test cells were used for each, the decision point could track when a closest (possible) primitive intersection was present.
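
The two roles of decision point 751 can be pictured as a simple routing step. The following is a minimal sketch only, under the assumption that each tester output carries a flag saying whether it concerns a GAD element or a closest primitive; the names `HitResult`, `route_results`, `gad_feedback`, and `ray_complete` are hypothetical and do not come from the description.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class HitResult:
    """Hypothetical record of one tester output."""
    ray_id: int
    shape_id: int
    is_gad: bool  # True: a GAD element was hit; False: closest primitive found


def route_results(results, gad_feedback, ray_complete):
    """Route tester outputs: GAD hits go back for further testing,
    closest-primitive hits go to the ray complete queue for shading."""
    for r in results:
        if r.is_gad:
            gad_feedback.append((r.shape_id, r.ray_id))
        else:
            ray_complete.append(r.ray_id)


gad_feedback, ray_complete = deque(), deque()
route_results(
    [HitResult(1, 40, False), HitResult(7, 12, True), HitResult(18, 41, False)],
    gad_feedback,
    ray_complete,
)
```

In this sketch, `ray_complete` plays the part of queue 730 and `gad_feedback` stands in for the GAD results that are returned for further testing.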

From decision point 751, GAD results are input to mux 752. Mux 752 also receives ray ID inputs from queue 725, which stores the ray IDs received on input 742; those IDs are in turn supplied by test control unit 703. Input 742 carries the ray identifiers that correspond to the ray information provided from test control unit 703 to ray data 766a-766n via output 743. Thus, the data defining the rays identified in queue 725 (identified by their ray identifiers, or ray IDs) is provided via output 743 to ray data 766a-766n for storage in those memories. An example of how ray IDs can be formed is given below.

Both queues 730 and 725 are shown as holding a series of individual ray identifiers (ray IDs). However, as described below, a plurality of rays is generally tested concurrently against a given geometric shape. In such cases, queue 725 preferably stores ray IDs in packets of ray IDs, and queue 730 can accordingly represent a series of entries, each having a plurality of ray IDs associated with a given shape.

As a specific example, the algorithms driving this architecture generally wait until it is determined that a plurality of rays needs to be tested against a given shape; that testing is then performed and the results are output. It is therefore common for a plurality of rays to begin testing at about the same time and to complete testing together. In effect, the rays that complete such testing can be entirely unrelated to one another in terms of how and when they were initially instantiated, or in terms of the path they took through the acceleration hierarchy. Conversely, queue 725 can be regarded as containing groups or packets of new rays to be tested against a fixed GAD element of the scene (e.g., the root node of a hierarchy of GAD elements).

These new rays come from sources that include camera shader 735 and other shaders 710a-710n. Camera shader 735 is identified separately because it generates the primary rays to be tested in the scene. Shaders 710a-710n run on computing resources such as threads and/or cores of one or more processors, and represent the execution of instructions or logic that implements an appropriate response to an identified intersection between a ray and a primitive. Usually, such a response is determined at least in part by shading code associated with the primitive, although various other influences and considerations can apply.

Shaders 710a-710n receive identifiers of intersected rays and primitives through distribution point 772, and receive the corresponding ray data from output 745 of test control unit 703 (see FIG. 8A). Distribution point 772 can be used to provide such ray data to a computing resource that has capacity to execute the code for a given primitive. Any means of determining such capacity can therefore be used to control the distribution, including load measurement, flags set by the computing resources, or separate FIFOs with fullness indicators. A round-robin or pseudo-random distribution scheme can be used as well.
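
One of the capacity-based options named above, choosing the shader whose FIFO is least full, can be sketched as follows. This is an illustrative sketch only: `ShaderQueue`, `dispatch_least_full`, and the capacities are assumptions made for the example, not elements of the described system.

```python
class ShaderQueue:
    """Hypothetical per-shader FIFO with a fullness indicator."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.items = []

    def fullness(self):
        return len(self.items) / self.capacity

    def has_room(self):
        return len(self.items) < self.capacity


def dispatch_least_full(shaders, work):
    """Send work to the shader with the lowest fullness that still has room.
    Returns the chosen shader name, or None if every FIFO is full."""
    candidates = [s for s in shaders if s.has_room()]
    if not candidates:
        return None
    target = min(candidates, key=ShaderQueue.fullness)
    target.items.append(work)
    return target.name


shaders = [ShaderQueue("710a", 2), ShaderQueue("710b", 4)]
# Each work item pairs a ray ID with an intersected primitive ID (made-up values).
chosen = [dispatch_least_full(shaders, (ray_id, 40)) for ray_id in (1, 18, 106, 480, 7)]
```

A round-robin or pseudo-random scheme would replace only the selection line; the rest of the dispatch step would stay the same.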

The outputs of shaders 710a-710n can include further rays (called secondary rays for convenience; the output of camera shader 735 also includes rays). In this example, a ray at this point comprises the origin and direction data that defines it. It need not, however, have an associated ray ID; the ray ID is preferably provided by test control unit 703.

As can be seen, test control unit 703 can monitor the status of rays in the intersection test resources and, as described in more detail with reference to FIGS. 8-9, assigns new rays to replace completed rays in ray data 766a-766n. Distribution of ray IDs to ITRs 705a-705n is performed by distributor 780, which is described in detail with reference to FIG. 10. This distribution is controlled primarily by which memory of ray data 766a-766n stores the data defining the ray identified by a given identifier. In addition, as described with reference to FIG. 10, distributor 780 controls when ray IDs are obtained from queue 725, based on considerations such as the readiness of collections awaiting test.

Turning now to FIG. 8A, a portion of test control unit 703 is shown. The control unit includes memory banks associated respectively with ray data 766a-766n; each bank has slots populated with ray data and can be addressed by memory address. FIG. 8A shows that output 744 from ray complete queue 730 includes ray identifiers 1, 18, 106, and 480, each of which has space allocated in memory 803. That space is allowed to be overwritten or refilled in response to receipt of such ray identifiers on output 744. Output 745 to distribution point 772 includes ray data for use in shading, and can also include other data. In practice, memory 803 can be implemented in a memory that is also used by other processes, such as the processes executing shaders 710a-710n. In such a case, output 745 can represent (or be implemented by) a fetch of such data from memory 803 by a computing resource.

Various communication links are identified in FIG. 7, such as links 741, 742, 743, 744, 745, 750, and 790. Any of these links can be implemented in a manner appropriate to the overall architecture, and can comprise a private memory area, a physical link, a virtual channel established on an expansion bus, a shared register space, and so on.

FIG. 8B shows data for new rays arriving on output 741 (e.g., from a shading operation such as camera shader 735). Such ray data includes at least ray origin and direction information. Test control unit 703 assigns each new ray to a location in memory 803, as distinct from ray data 766a-766n. The identifier associated with each ray origin and direction depends on the location where that data was stored. Input 742 (the input to queue 725) therefore receives ray identifiers determined on that basis, and output 743 includes both the ray identifiers and the associated origin and direction information stored in memory 803. The arrangement of ray IDs shown in FIGS. 8A and 8B is convenient in that a ray ID can be used to index memory in order to locate the associated data. However, other kinds of ray identifiers can be used, so long as ray data in ITRs 705a-705n and in memory 803 can be identified by means of the ray identification data.
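
The convenience of letting a ray ID double as a memory index can be illustrated with a small slot allocator. This is a minimal sketch under assumptions: `RayStore` and its methods are hypothetical names, and a Python list stands in for memory 803.

```python
class RayStore:
    """Hypothetical slot store in which the ray ID is the slot index."""

    def __init__(self, num_slots):
        self.slots = [None] * num_slots
        # Stack of free slot indices; pop() hands out the lowest index first.
        self.free = list(range(num_slots - 1, -1, -1))

    def add_ray(self, origin, direction):
        """Store a new ray and return its ID (the index of the slot used)."""
        if not self.free:
            raise MemoryError("no free ray slot; wait for a ray to complete")
        ray_id = self.free.pop()
        self.slots[ray_id] = (origin, direction)
        return ray_id

    def lookup(self, ray_id):
        """Index memory directly with the ray ID."""
        return self.slots[ray_id]

    def complete(self, ray_id):
        """Release the slot of a completed ray so a new ray can reuse it."""
        data = self.slots[ray_id]
        self.slots[ray_id] = None
        self.free.append(ray_id)
        return data


store = RayStore(4)
a = store.add_ray((0, 0, 0), (0, 0, 1))
b = store.add_ray((1, 0, 0), (0, 1, 0))
store.complete(a)                        # ray a finished testing
c = store.add_ray((2, 0, 0), (1, 0, 0))  # reuses the slot that a vacated
```

Because the ID is the address, no separate lookup table is needed; the trade-off is that an ID is only meaningful while its ray occupies the slot.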

FIG. 9A shows an example in which a content associative memory 910 holds keys 905, each associated with different ray data.

FIG. 9B shows that slots in each of ray data 766a-766n are provided to receive ray data from test control unit 703 via interface 743. These slots can be further subdivided into multiple banks, interleaved, and/or organized by other cache structuring mechanisms that ease retrieval of data from the cache. Where rays need to be distributed for storage, the distribution can proceed based on the least significant bits of the ray ID, by a hash of the ray ID, by modulo division by the number of banks across which distribution is to occur, by round-robin queuing, or by any other distribution mechanism that can be used to spread ray data across the memories. Within any given portion, ray data can be stored based on ray ID.
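
Three of the distribution options listed above (least significant bits, modulo division, and a hash of the ray ID) can be written out as follows. The function names are hypothetical, and CRC-32 is used only as a convenient stand-in; the description does not name a particular hash.

```python
import zlib


def bank_by_low_bits(ray_id, num_banks):
    """Select a bank from the least significant bits of the ray ID.
    Only meaningful when num_banks is a power of two."""
    if num_banks <= 0 or num_banks & (num_banks - 1):
        raise ValueError("num_banks must be a positive power of two")
    return ray_id & (num_banks - 1)


def bank_by_modulo(ray_id, num_banks):
    """Select a bank by modulo division by the number of banks."""
    return ray_id % num_banks


def bank_by_hash(ray_id, num_banks):
    """Select a bank from a hash of the ray ID (CRC-32 as a stand-in)."""
    return zlib.crc32(ray_id.to_bytes(4, "little")) % num_banks


ray_ids = [1, 18, 106, 480]
low_bits = [bank_by_low_bits(r, 4) for r in ray_ids]
modulo = [bank_by_modulo(r, 3) for r in ray_ids]
hashed = [bank_by_hash(r, 4) for r in ray_ids]
```

For a power-of-two bank count, the low-bit and modulo schemes give the same answer; the hash variant is useful when ray IDs are assigned in patterns that would otherwise load a few banks unevenly.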

In summary, FIGS. 7-9B show an architecture in which rays to be tested are collected by control logic and assigned identifiers, the identifiers preferably being based on the memory locations at which the ray definition data is stored in the individual caches coupled to the different intersection test resources. Primitive intersection test results leave the test resources as they complete, and the test logic then reassigns the memory locations of those completed rays to new rays that need to be tested. Completed rays can be distributed to any of a plurality of different intersection processing/shading resources, which can in turn generate further rays to be tested. In general, a ray traverses the acceleration structure, cycling through the intersection test resources, until its closest primitive intersection is identified (or until it is determined that the ray intersects nothing other than the scene background).

Turning to FIG. 10, additional architectural aspects of a rendering system are shown. One aspect of FIG. 10 is that ray data can be stored in separate cache memories, each coupled to a processor configured for intersection testing. Another aspect is how distributor 780 can interface with ITRs 705a-705n. A further aspect concerns how shape data for testing is supplied to the testers.

Distributor 780 receives ray identifiers from mux 751 (FIG. 7) via communication link 790 (implemented in hardware, by interprocess or interthread communication, or the like). These ray IDs are passed to collection management unit 1075, which maintains the associations between ray IDs and the respective GAD elements that bound the objects to be tested next. Ray IDs can also be distributed by decisions 1013, 1014, and 1015 among queues 1021, 1022, and 1023, where they wait for a determination from collection management unit 1075 that their collection is to be tested. For example, when collection 1045 is determined to be ready for testing, its ray IDs are sent to the respective ITRs 705a-705n whose caches 1065a-1065n contain the data for each of those ray IDs. Collection management unit 1075 can also have an interface to the storage of GAD element data and/or primitive data, in order to initiate fetching of the geometric shapes needed for testing.

These shapes arrive at queue 1040 from storage 103 (FIG. 1), for example via link 112. The shapes are identified based on their association with the GAD element associated with a given collection. For example, in the case of hierarchical GAD, the shapes can be the children of a parent GAD element. Each ITR can test its rays serially against the shapes from queue 1040. Highest throughput is therefore obtained when the rays of a given collection are distributed equally among caches 1065a-1065n, and in that case collection management unit 1075 can most easily update collections based on the test results for the given ray collection. When several rays of a given collection reside in a single cache, the remaining intersection testers can stall, or they can test rays from the next collection. Some maximum amount of such out-of-order testing can be accommodated before collection test synchronization is again required.
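
The collection bookkeeping and the effect of uneven ray placement can be sketched together. This is an illustrative sketch under assumptions: `CollectionManager`, `split_by_tester`, the readiness threshold, and the modulo placement of rays in tester caches are all hypothetical choices made for the example.

```python
from collections import defaultdict


class CollectionManager:
    """Hypothetical tracker of which ray IDs await testing per GAD element."""

    def __init__(self, ready_threshold):
        self.ready_threshold = ready_threshold
        self.collections = defaultdict(list)

    def add_hit(self, gad_id, ray_id):
        """Record that ray_id hit gad_id and must be tested against its children."""
        self.collections[gad_id].append(ray_id)

    def pop_ready(self):
        """Return (gad_id, ray_ids) for the fullest collection at or above
        the threshold, or None if no collection is ready."""
        if not self.collections:
            return None
        gad_id = max(self.collections, key=lambda g: len(self.collections[g]))
        if len(self.collections[gad_id]) < self.ready_threshold:
            return None
        return gad_id, self.collections.pop(gad_id)


def split_by_tester(ray_ids, num_testers):
    """Queue each ray ID at the tester whose cache holds its data
    (assumed here to be ray_id modulo the number of testers)."""
    queues = [[] for _ in range(num_testers)]
    for rid in ray_ids:
        queues[rid % num_testers].append(rid)
    return queues


mgr = CollectionManager(ready_threshold=3)
for gad_id, ray_id in [("root", 1), ("root", 18), ("A", 5), ("root", 106)]:
    mgr.add_hit(gad_id, ray_id)

ready = mgr.pop_ready()
per_tester = split_by_tester(ready[1], 4)
still_waiting = mgr.pop_ready()
```

In this run, rays 18 and 106 land in the same tester queue, so that tester needs two passes while two others sit idle, which is the throughput loss the paragraph above describes.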

Outputs are produced at outputs 750a-750n (which can be components of link 750, FIG. 7) and are provided to decision point 751 (FIG. 7). As described above, this architecture provides for ITRs that test any kind of shape (e.g., primitives or GAD elements). In addition, decision point 751 is shown coupled to collection management unit 1075, indicating that the results of GAD intersection testing include determinations that a given ray hit a given GAD element, which cause the identified ray to be added to the collection corresponding to that GAD element. Other implementations can therefore provide GAD test results directly to collection management unit 1075. More generally, the examples described illustrate potential flows of information, and other flows can be appreciated from them.

Another aspect worth noting is that more than one ray ID of a given ray collection can be stored in any one of queues 1021, 1022, and 1023 (as illustrated by collection 1047). In such a case, the ITR for that queue can test both rays, outputting the result of the second test (or of however many subsequent tests there are) as it becomes available. Decision point 751 can wait until all results of a collection have been gathered, or "straggler" results can be propagated as needed.

In summary, FIG. 10 shows a system organization that allows packets of ray identifiers associated with one or more shapes to be distributed into queues for a plurality of test resources, each of which stores a subset of the ray data. Each test resource fetches the ray data identified by each ray identifier and tests it against a shape loaded into that test resource. Preferably, the shapes can be streamed serially and in step through all of the test resources. The shapes can be identified as a sequence of children starting at an address in main memory. FIG. 10 thus illustrates a system organization in which one shape is tested concurrently against multiple rays.

Other examples, however, provide for testing a shape successively in a series of different intersection test resources, with the shape data and the ray identifier packet moving among the intersection test resources. Keeping multiple packets in flight increases testing throughput. Several examples of this approach are described below.

FIG. 11 shows a first example of a computer architecture in which a ring bus arrangement of a plurality of computing resources 1104-1108 can be implemented. Each computing resource has access to a private L1 cache 1125a-1125n which, for any computing resource used for intersection testing, contains the ray data to be intersection tested using geometric shapes supplied to that computing resource from shape data storage 1115 in memory 340. Communication among computing resources 1104-1108 can take place over bus 1106, which can comprise a plurality of point-to-point links or any other architecture suitable for such inter-processor communication.

Where computing resources share a particular memory structure (e.g., L2 caches 1130 and 1135), those computing resources (e.g., computing resources 1107 and 1106, which share L2 cache 1130) can communicate with one another through that cache for certain purposes. In addition, a copy of the data for the rays under test in the system can be kept in ray data 1110, for distribution of subsets of it among ray data 1110a-1110n; such ray data can be transferred through L2 1130 and L2 1135, and larger portions of it can be stored in the L2 caches (described below). Shape data 115 can also reside in memory 340, and temporarily in one or more of L2 1130 and 1135 and in any of caches 1125a-1125n. However, ray data stored in these caches is protected from being overwritten by shape data. The space allocated to shapes is generally limited to what can be used by the ray packets currently identified as awaiting test and what suffices to hide the latency of fetching shape data 115, with no attempt to retain shape data for which there is no indication of when it will next be used in testing. Stated differently, for ray data it is preferable to avoid typical cache management algorithms such as least recently used replacement.
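
The cache policy described here, in which ray data is never displaced by shape data and shapes occupy only a small streaming area, can be sketched as a partitioned store. This is a minimal sketch under assumptions: `PartitionedCache`, its slot counts, and the oldest-first dropping of shapes are hypothetical choices, not details given in the description.

```python
from collections import OrderedDict


class PartitionedCache:
    """Hypothetical local cache with a pinned ray region and a small
    streaming region for shape data."""

    def __init__(self, ray_slots, shape_slots):
        self.ray_slots = ray_slots
        self.shape_slots = shape_slots
        self.rays = {}               # pinned until the ray completes
        self.shapes = OrderedDict()  # transient, oldest dropped first

    def store_ray(self, ray_id, data):
        """Store ray data; refuse rather than evict when the region is full."""
        if ray_id not in self.rays and len(self.rays) >= self.ray_slots:
            return False  # caller must wait until some ray completes
        self.rays[ray_id] = data
        return True

    def release_ray(self, ray_id):
        """Free the slot once the ray has finished intersection testing."""
        return self.rays.pop(ray_id)

    def stream_shape(self, shape_id, data):
        """Admit a shape for imminent testing; it can only displace shapes."""
        self.shapes[shape_id] = data
        while len(self.shapes) > self.shape_slots:
            self.shapes.popitem(last=False)


cache = PartitionedCache(ray_slots=2, shape_slots=2)
stored = [cache.store_ray(rid, ("origin", "direction")) for rid in (1, 18, 106)]
for shape_id in ("s0", "s1", "s2"):
    cache.stream_shape(shape_id, b"shape-bytes")
```

Unlike least recently used replacement over a single pool, streaming any number of shapes through this structure leaves the stored rays untouched.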

FIG. 11 also illustrates that, in addition to intersection testing, application and/or driver 120 can run on computing resource 1104. Further, packet process 1121 can run on computing resource 1108, and packet data 1116 can be stored in cache 1125a for use by packet process 1121. Other packet data can be stored in L2 1129. As with ray data, however, it is preferable to store packet data in the fastest memory available. The packet process performs many of the same functions as the collection and other management logic described in the previous figures: it tracks which rays have intersected which GAD elements, and selects GAD elements that are ready for testing, for example by causing rays to be tested against the children of a GAD element once a sufficient number of rays have intersected it.

In this example, because packet process 1121 is centralized, it operates by issuing packets that contain a plurality of ray identifiers together with either a reference to, or the data for, a shape to be tested for intersection with the identified rays. Each computing resource 1104-1107 that performs intersection testing receives the packet, for example serially over a plurality of point-to-point links (described below) or concurrently over a shared-bus type medium (which resembles the architecture of FIG. 10). Each computing resource 1104-1107 determines whether its localized ray data 1110a-1110n stores data for any ray identified in the packet and, if so, retrieves the data for that ray, tests it, and outputs the result.

Because results for GAD element intersections are tracked by packet process 1121, any communication mechanism that returns those results to packet process 1121 is suitable. The mechanism can be selected based on the overall architecture of the system. Some exemplary approaches are described below; they can include a separate indication for each intersection found, or each test resource can populate the circulating packet with its intersection results.

FIG. 12 shows a further example organization of computing resources 1205-1208, each associated with one of caches 1281-1284, where each cache stores ray data 1266a-1266n and packet data 1216a-1216n. Each computing resource 1205-1208 is connected to at least one other computing resource by queues 1251-1254. Ray process 1210 provides input to computing resource 1205 through queue 1250, and communicates with application/driver 1202. Output 1255 from computing resource 1208 is connected to ray process 1210, and a further output 1256 is connected to computing resource 1205. Primitive and GAD storage 103 provides computing resources 1205-1208 with read access to shape data.

Ray process 1210 receives or generates rays for testing and forms packets containing ray identifiers and ray data for the identified rays. The packets are delivered to each of computing resources 1205-1208 through queues 1250-1254. Each computing resource 1205-1208 takes a portion of the rays in a given packet (in some examples only a single ray) and stores that portion in its ray data 1266a-1266n. Other examples include sending packets destined for a particular computing resource 1205-1208, so that ray process 1210 determines which ray data is stored in which localized ray data 1266a-1266n.

Once rays have been loaded into localized storage, they are thereafter identified in packets that contain only ray IDs, without origin and direction data. Such a packet also contains either a reference to, or the data for, a shape to be tested against the rays identified in the packet. In some examples, the data forming these packets is distributed among the localized memories 1281-1284 of computing resources 1205-1208. Each of computing resources 1205-1208 thus holds a portion of the packet data for the rays under test in the system at a given time, so that the information about which rays are to be tested next against which shapes is itself distributed. Each computing resource 1205-1208 can therefore issue packets of ray IDs and shape information to begin testing the collections that are ready for test.

Each packet makes a round through the queues and computing resources, and then returns to the originating computing resource populated with the intersection test results gathered along the way. In one implementation, each computing resource 1205-1208 fetches the shape data for the packets it issues. For example, if computing resource 1205 has a packet ready for testing (e.g., a collection of rays for a given GAD element), it can fetch the shapes to be tested by virtue of that association (e.g., the children of the GAD element), create a packet carrying the data for each shape, and send each packet out through queue 1251.

Then, after the packets have traveled through the other computing resources, computing resource 1205 receives each packet that it emitted. On receipt, each packet contains the results of testing the shape in the packet (given by reference or by definition data) for intersection with those rays identified in the packet whose data is stored at the remaining computing resources 1206-1208. Computing resource 1205 can test any identified rays localized in its own ray data 1266a at any time, before or after the other computing resources perform their tests. In this way, ray definition data can be distributed among a plurality of fast memories coupled to the intersection test resources, and test results can be gathered in a distributed fashion.
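
One round of a packet through the ring can be sketched as follows. This is a simplified illustration under assumptions: the shape is taken to be a sphere, ray directions are assumed to be unit length, each resource's localized ray data is modeled as a plain dictionary, and `Packet`, `ray_hits_sphere`, and `circulate` are hypothetical names.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Packet:
    """Hypothetical packet: one shape, ray IDs only, and gathered results."""
    shape: tuple                 # sphere as (cx, cy, cz, radius)
    ray_ids: list
    hits: dict = field(default_factory=dict)  # ray_id -> hit distance


def ray_hits_sphere(origin, direction, sphere):
    """Return the nearest non-negative hit distance, or None on a miss.
    Assumes direction is unit length."""
    cx, cy, cz, radius = sphere
    oc = (origin[0] - cx, origin[1] - cy, origin[2] - cz)
    b = sum(o * d for o, d in zip(oc, direction))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - c
    if disc < 0:
        return None
    root = math.sqrt(disc)
    for t in (-b - root, -b + root):
        if t >= 0:
            return t
    return None


def circulate(packet, ring):
    """Pass the packet through each resource in ring order; each resource
    tests only the rays whose definition data it holds locally."""
    for local_rays in ring:
        for ray_id in packet.ray_ids:
            if ray_id in local_rays:
                origin, direction = local_rays[ray_id]
                t = ray_hits_sphere(origin, direction, packet.shape)
                if t is not None:
                    packet.hits[ray_id] = t
    return packet


ring = [
    {1: ((0, 0, 0), (0, 0, 1))},      # ray data local to the first resource
    {18: ((0, 3, 0), (0, 0, 1))},     # second resource
    {106: ((0, 0, 10), (0, 0, -1))},  # third resource
]
done = circulate(Packet(shape=(0, 0, 5, 1), ray_ids=[1, 18, 106]), ring)
```

Only ray IDs and the shape travel around the ring; origin and direction data never leave the resource that stores them, which matches the distribution of ray definition data described above.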

How an architecture according to FIG. 12 is implemented depends on various characteristics of the physical system in use. For example, the queues are shown as passing packets in one direction. However, passing packets in both directions can also be implemented (i.e., bidirectional queues or multiple queues). FIG. 12 also spreads packet data among the compute resources, which allows memory accesses to be distributed across more L2 caches and, potentially, across other ports to a larger memory (e.g., main memory 103).

If packet data is centralized, a packet sent in one direction with a data reference can have its data fetched by, for example, compute resource 1205, while a packet sent in the other direction with a data reference can have its data fetched by compute resource 1208. This arrangement can be generalized to provide arbitrary entry points into a ring bus architecture (unidirectional or bidirectional).

As is evident from the foregoing, the described queues can include one or more queues for inputting new rays for intersection testing into a system comprising a plurality of intersection test resources, and one or more queues interconnecting the intersection test resources. In some cases, a queue that inputs new rays may contain ray definition data (e.g., a queue of data waiting to be stored in the caches coupled to the intersection test resources). Such a queue may be implemented as a list held in the main memory that stores ray definition data. The queues that interconnect the intersection test resources for passing packets preferably contain only ray identifiers, not ray definition data.

FIG. 13 shows a portion of a potential implementation of system 1200 in which the compute resources are implemented as cores of a chip, so that compute resource 1205 is one core and compute resource 1206 is another core. Here, queue 1251 is inter-core communication. Also shown is an interposed L2 cache 1305, which can store ray data as well as shape data.

As described with reference to the previous figures, L2 cache 1305 stores portions of the scene geometry and acceleration data only to the extent that storing such data does not increase thrashing of ray data (i.e., ray data is given priority for cache storage).

FIGS. 14A-14C illustrate various relationships that queues can have in different implementations of the example system. In general, communication among compute resources need not be serial or one-to-one. For example, FIG. 14A shows a single input 1404 feeding both queues 1405 and 1406, each of which is dedicated to a respective compute unit 1407, 1408. For example, where compute units 1407 and 1408 are implemented on a single physical chip, input 1404 may be a chip-level input, and each queue 1405, 1406 may serve a particular core.

FIG. 14B shows a single input feeding multiple cores, each of which can feed a respective compute unit 1407, 1408, and each compute unit can separately send data to the opposite queue 1406, 1405. FIG. 14C shows queue 1411 receiving input 1410 and providing output to both compute units 1407 and 1408. FIGS. 14A-14C thus illustrate that a variety of queuing strategies can implement packet passing according to these aspects.

FIG. 15 shows that, where a cache hierarchy with multiple levels exists (e.g., level 1 caches 1502, 1503 and level 2 cache 1504), various arrangements of ray data can be provided. For example, ray data 1507 may include the separate subsets held as ray data 1505 and 1506, as well as other ray data that is not present in either ray data 1505 or 1506. Ray data 1505 and 1506 can change dynamically, as in the case where one queue feeds more than one compute resource (FIG. 14C), and can reflect the assignment of rays stored in ray data 1507 to one of ray data 1505 or 1506.

FIG. 16 shows in detail an example implementation of queue 1251 and the data it can store. Packets 1601a-1601n are shown, each having respective ray identifiers (1605a-1605p, 1606a-1606p, 1607a-1607p) and corresponding hit information fields (1610a-1610p, 1611a-1611p, 1612a-1612p). Packet 1601a includes data 1615a for shape 1, packet 1601b includes data 1615b for shape 2, and packet 1601n includes data 1615n for shape n. Of course, various other queuing strategies (some of which are shown in FIGS. 14A-14C) can be implemented.

The term queuing, as used here, does not impose a first-in/first-out requirement on the rays tested in any given compute resource. On average, the rays identified in any given packet will be evenly distributed among the ray stores localized to the different compute resources, so that testing of each packet is parallelized. Where the number of rays to be tested for a packet differs between compute resources, a bubble can form in which a compute resource has no rays to test for that packet. Such bubbles can be filled with other work, including intersection testing of other packets. In some examples, each compute resource can maintain state for multiple threads and switch among threads, each of which is tied to a given packet. As long as the critical data for each intersection test can be kept in registers while switching between packets, a basic throughput gain should be realized.

To summarize in part, in terms of operation of the example system, each compute resource operates in response to receiving a packet. When a packet arrives from the input queue of a particular compute resource, that resource examines the ray identifiers in the packet and determines which of the identified rays have data stored in its own memory. Stated differently, a packet can be formed from ray identifiers without advance knowledge of which compute resources have fast access to the ray data for the rays identified in the packet. Further, a compute resource does not respond by attempting to obtain ray data for every ray identified in the packet. It determines only whether its local fast memory holds ray data for any ray identified in the packet, and tests only those rays for intersection with the identified shape.

FIG. 17 illustrates aspects of how packets can be processed in an example compute resource. FIG. 17 shows packet 1601 being input to compute resource 1206. Compute resource 1206 queries its ray data using the ray identifiers from packet 1601a (for example, asking whether ray 1605a, which has ray ID 31, matches a ray ID 31 in ray data storage 1266b). The origin and direction associated with ray ID 31 are retrieved via 1290. Shape data, if it is only identified in the packet, is obtained (1715) from the memory resource 1291 that currently stores it. If shape data is provided in the packet itself, that shape data is used directly. Ray 31 is then tested for intersection with shape 1 (or with the shape defined by the retrieved data).

If the tested shape is a GAD element (1725), the effect of the intersection test is to determine a smaller subset of scene primitives that may potentially be intersected by the tested ray. A positive hit result is therefore written back (1726) into the packet at the position 1610a corresponding to the ray identifier (here, ray identifier 31). In some implementations, the sender of a packet can track which ray IDs it sent and in what order, so that, with the ordering inferred to be the same as sent, only the results need to be written back. Thus, after the packet has passed through the testers, the resource that emitted the packet can process the test results.

If, on the other hand, the tested shape is a primitive (1730), a closest primitive intersection determination unit 1731 can determine whether a detected intersection is closer than any previously detected one. If so, the intersected primitive and, optionally, the intersection distance can be stored or output with the packet. Because a given ray may be associated with multiple packets (i.e., with multiple GAD elements at once), a count can be maintained (1733) each time the ray is associated with a GAD element. The count makes it possible to determine when the ray is no longer present in any other packet that still requires testing, at which point the memory used for that ray can be freed for the input of another ray.

In summary, the data associated with each ray in local fast storage preferably includes an identification of the closest primitive intersection detected so far, comprising a parameterized distance to the intersection and a reference to the primitive. Other data associated with each ray includes a count of the GAD element ray collections in which the ray is present. The count is decremented after each such collection is tested and incremented when another collection containing the ray is created. When the count reaches zero, the primitive identified as the closest intersected primitive is determined to be the primitive intersected by that ray.
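The per-ray bookkeeping summarized above might be sketched as follows. The class name `RayRecord` and its method names are hypothetical, and the record layout is an assumption for illustration; the description requires only the closest-intersection data and the collection count.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class RayRecord:
    origin: tuple
    direction: tuple
    closest_t: float = math.inf               # parameterized distance to closest hit so far
    closest_primitive: Optional[int] = None   # reference to the closest primitive so far
    collection_count: int = 0                 # collections that currently contain this ray

    def added_to_collection(self):
        """Called when the ray is placed in another GAD element collection."""
        self.collection_count += 1

    def report_primitive_hit(self, primitive_ref, t):
        """Keep the hit only if it is closer than every earlier one."""
        if t < self.closest_t:
            self.closest_t = t
            self.closest_primitive = primitive_ref

    def collection_tested(self):
        """Called after a collection containing the ray has been tested.

        Returns True when no collection refers to the ray any more, which is
        when the closest primitive is final and the storage can be reused.
        """
        self.collection_count -= 1
        return self.collection_count == 0
```

When `collection_tested` returns `True`, the identifier of `closest_primitive` could be queued for shading and the record's memory released for a new ray.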

FIG. 18 relates to an example single instruction multiple data (SIMD) architecture, which can be used where a packet can identify the beginning of a strip of geometric shapes to be tested. In one example, the nodes of a graph of GAD elements are connected by edges to one or more other nodes, where each node represents an element of geometry acceleration data, such as an axis-aligned bounding box or a sphere. In some examples the graph is hierarchical, so that, on testing a given node, the children of that node are known to bound selections of the primitives bounded by the parent node. GAD elements ultimately bound selections of primitives.

In an implementation, a string of acceleration elements (the children of a given element) can be identified by the memory address of the first element in the string. The architecture then provides a predetermined stride length to the start of the data for the next element. A flag can be provided to mark the end of a given string of elements that are child nodes of a given node. Similarly, a strip of primitives can be identified by a starting memory address, with a known stride length to the definition of the next primitive. More specifically for triangle strips, two vertices in the sequence can each contribute to the definition of several triangles.
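By way of illustration only, walking such a string with a fixed stride and an end flag might look like the sketch below. The record layout (eight words per element, with the last word used as the end flag) and the function name `read_child_string` are assumptions made for this example and are not taken from the description.

```python
STRIDE = 8                 # words per element record (assumed layout)
FLAG_OFFSET = STRIDE - 1   # last word of each record marks the end of the string


def read_child_string(memory, first_address):
    """Collect the child elements that start at first_address.

    memory is a flat list of words. Each record is STRIDE words long; the
    walk advances by STRIDE until a record with its end flag set is read.
    Returns a list of (address, payload_words) tuples.
    """
    children = []
    address = first_address
    while address + STRIDE <= len(memory):
        record = memory[address:address + STRIDE]
        children.append((address, record[:FLAG_OFFSET]))
        if record[FLAG_OFFSET]:
            return children     # flagged record is the last child
        address += STRIDE
    raise ValueError("string ran past the end of memory without an end flag")
```

A packet then needs to carry only `first_address`; any compute resource that receives it can recover the whole string from memory.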

FIG. 18 illustrates aspects of a SIMD architecture similar to the one shown in FIG. 6. In this example, a packet 1601 is received that includes multiple ray identifiers 1605a-1605n, optionally storage space for receiving intersection test results 1610a-1610n, and shape data, which may comprise shape definition data, an identifier for a shape, or an identifier 1815a for the beginning of a strip of shapes to be tested (e.g., triangle primitives).

This example architecture is suited to the case where a small number of powerful individual processing resources (with large caches) are used for intersection testing. Here, each individual processing resource can be expected to hold in local storage, on average, a number of rays roughly equal to the number of rays that can be tested by a SIMD instruction (by contrast, FIG. 10 shows an example in which it is preferable for each cache to hold one ray for each collection). For example, if four rays can be tested simultaneously in a SIMD execution unit, it is preferable that, statistically, four rays of each packet reside in the local storage accessed by that SIMD unit. For example, where four separate processing resources are provided, each with a SIMD unit capable of testing four rays, one packet can reference about 16 rays. Alternatively, a separate packet can be provided to each processing resource having a SIMD unit, so that, for example, a packet can reference four rays where 4x SIMD units exist.

In some examples, the first compute resource 1205 to receive packet 1601 can use identifier 1815a to obtain data for the strip of shapes. Each ray referenced in packet 1601a whose data is stored in ray data 1266a is then tested in compute units 1818a-1818d. In the shape strip example, shape strip 1816 is retrieved, which includes shapes 1-4. Each shape can be streamed through each compute unit 1818a-1818d, where it is tested for intersection with the ray loaded in that unit. For each shape of the strip, the compute resource can formulate a packet (packet 1820 is shown) containing the results of testing the rays against that one shape.

Alternatively, separate bits can be provided in the results section for each ray to receive the intersection results, so that a single packet can be passed. To avoid re-fetching from slow memory, this approach is considered most suitable where multiple compute resources can share an L2 cache, or where the fetch by the first compute resource causes the shape data to be sent to the other compute resources. For example, a DMA transaction can have multiple targets, each being a different compute resource that needs to receive a given stream of shapes to be tested; this is one example of a memory transaction model suitable for some implementations. The principal consideration is to reduce fetching the same data from main memory 103 more than once.

As described previously, each intersection test resource determines which ray identifiers have ray data stored in its own ray data storage. For any such ray, the ray origin and direction are retrieved. Examples were given above in which a test resource tests a given identified ray against a sequence of one or more identified shapes. However, a processing resource may be able to test multiple shapes for intersection with a given ray concurrently, or multiple rays against one shape, or a combination of the two, without substantial additional latency. FIG. 18 shows a SIMD architecture in which, within one compute resource configured for intersection testing, each of four SIMD units can test a different ray for intersection with shapes supplied in succession. The sequence of shapes can be fetched based on a shape strip reference, which is used as an index into scene data storage 340 to begin retrieval of a series of shapes; the shapes in the series, of which there are four, are each tested in compute unit 123.

Preferably, rays are gathered into collections based on detected intersections between the rays and elements of acceleration data. Thus, in this example, where a different ray is tested in each SIMD unit against four different shapes, the compute resource containing the SIMD units can reformat the results into packets of rays, each packet referencing one shape.

Other architectures using SIMD units can instead provide for fetching a plurality of rays that have been gathered into a collection. As described above, such rays are next intersection tested against shapes related to the shape associated with the collection. For example, there may be 16 or 32 shapes connected to the shape against which the rays were collected. A first subset of these shapes can be loaded into different SIMD units, and the collected rays can be streamed through each SIMD unit (i.e., the same ray passes through each SIMD unit at the same time). Result packets can be formed independently by each SIMD unit, after which the next shapes are loaded into the SIMD units. The rays can then be recirculated through the SIMD units. This process continues until all of the related shapes have been tested against the collected rays.

FIG. 18B shows a time-based progression of compute unit 1818a for this example. The shapes are numbered 1 through q, and the rays from the collection are numbered 1 through n. At time 1, shape 1 and ray 1 are tested. At time n, shape 1 and ray n are tested. At the start of the cycle beginning at time (q - 1) * n + 1, the final shape begins testing in compute unit 1818a.
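The timing just described is consistent with one ray/shape test per time step in a compute unit, with each shape held in place while all n rays stream past it. Under that assumption (an interpretation of the example, not an explicit statement of the description), the time step for any pair can be computed as in the following sketch; the function names are hypothetical.

```python
def time_slot(shape, ray, n):
    """Time step at which `shape` is tested against `ray`.

    Shapes and rays are numbered from 1, and n is the number of rays in the
    collection. Assumes one test per time step and that a shape stays loaded
    while all n rays stream through the unit.
    """
    return (shape - 1) * n + ray


def schedule(q, n):
    """Full (time, shape, ray) progression for q shapes and n rays."""
    return [(time_slot(s, r, n), s, r)
            for s in range(1, q + 1)
            for r in range(1, n + 1)]
```

For q shapes and n rays, the last shape starts at `time_slot(q, 1, n)`, which equals (q - 1) * n + 1, and the whole progression takes q * n steps.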

FIG. 19 shows aspects of how a packet 1905 is distributed among compute resources for intersection testing, and of how the test results are ultimately coalesced in compute resource 1910, which manages the memory for the packets of rays associated with the shape identified in packet 1905. FIG. 19 shows an example system state during processing. Specifically, compute resources 1910-1914 each receive ray ID information for rays stored in memory accessible to that compute resource, test those rays for intersection with the identified shape, and output results 1915-1919. The output results include identified hits (1915, 1917, 1919). Either hit or miss can be the default behavior, so that, for example, a miss may go unmarked rather than being indicated with an explicit value, or the default value in the packet may be set to miss. After testing, compute resource 1910 gathers at least the hit information; compute resource 1910 may manage all of the packet information in the test system, or a subset of it that includes the information for the particular shape.

The example organization of memory 1966 shows a logical organization of shape references mapped to a plurality of ray IDs (ray A, ray B, and so on). It also shows that some slots in the row for Ref #1 (the reference to the shape under test) are empty. Thus, when compute resource 1910 receives hit results, it first fills the remaining empty slots of the existing Ref #1 collection; memory 1966 is then shown with ray n beginning a new packet for Ref #1. Because the packet for Ref #1 is now full, it can be determined to be ready for testing. In some examples, the child GAD elements of the shape referenced by Ref #1 are fetched, and packets are formed, each containing all of the rays associated with Ref #1. For example, there may be 32 children of Ref #1, so that 32 packets are formed, of which packets 1922-1924 are shown. In some embodiments, compute resource 1910 can fetch the data defining the child shapes and store that data in packets 1922-1924. Alternatively, a reference can be provided that allows other compute resources to fetch the data.
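One possible way to model this collection memory is sketched below. The names `CollectionMemory`, `add_hits` and `child_packets`, the fixed number of slots per row, and the dictionary-based packet are all assumptions made for illustration; the description does not prescribe a data layout.

```python
from collections import defaultdict


class CollectionMemory:
    """Rows of ray IDs keyed by shape reference, loosely modeled on memory 1966."""

    def __init__(self, slots_per_collection):
        self.slots = slots_per_collection
        self.open = defaultdict(list)  # shape ref -> partially filled row of ray IDs
        self.ready = []                # full collections: (shape ref, [ray IDs])

    def add_hits(self, shape_ref, ray_ids):
        """Fill empty slots first; overflow starts a new row for the same shape."""
        for ray_id in ray_ids:
            row = self.open[shape_ref]
            row.append(ray_id)
            if len(row) == self.slots:
                self.ready.append((shape_ref, row))  # full, so ready for testing
                self.open[shape_ref] = []            # next ray begins a new packet


def child_packets(collection, children):
    """Form one packet per child shape, each carrying all rays of the collection."""
    shape_ref, ray_ids = collection
    return [{"shape": child, "ray_ids": list(ray_ids)}
            for child in children[shape_ref]]
```

Here fullness is the only readiness criterion; an implementation could equally use other criteria to decide when a collection is sent for testing.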

In some cases, compute resource 1910 may itself store rays identified in the generated packets, and can therefore test those rays before sending the packets. In such cases, compute resource 1910 can store the shape data it has already fetched in the packets it sends. As described with reference to FIG. 12, implementations may allow such packets to be sent to one or more other compute resources (e.g., bidirectional queuing, any-to-any, and so on).

FIG. 20 illustrates some examples of how methods according to the described aspects may be implemented. A packet is sent (2005) containing shape information, ray IDs, and locations where hit information is to be written back; the hit information can be zeroed, or can be "don't care" at this point. A first test (2006) is performed for ray 1 ID and finds a hit. A 1 is therefore written into the packet, and the packet passes to a second test (2007). There, ray 3 is found to be local to the second tester and is found to miss. A 0 is therefore written (or left in place), and the hit information from test 2006 travels onward with the packet (i.e., the rays in the packet are tested serially). A third test (2008) is performed for ray 2 and finds a hit. This example shows that the rays in a packet can be tested out of the order in which they appear in the packet, and that the test order depends on which tester is best placed to access the ray data for a given ray ID. Testing continues (2009) until all ray IDs have been tested. The packet is then coalesced, meaning that only the hit information needs to be kept. This coalescing can occur at the compute resource that originated the packet. New hit results can be combined with hit results from previously existing packets (see FIG. 19). It is then determined (2025) whether the collection of rays in the packet is ready for testing (e.g., based on fullness). If not, another packet can be processed (2040). If it is ready, the child shapes of the shape associated with the packet are fetched (2030); here parent node 2041 is, for example, the shape, and its child nodes are identified at 2042. A new packet can then be generated for each child shape, carrying the ray identifiers from the packet associated with the parent.

FIGS. 21 and 22 help summarize the method aspects described above, in the context of systems that can be used to implement them. Specifically, FIG. 21 shows that a method 2100 includes storing (2105) primitives and GAD elements in a main memory, and defining (2110) rays for intersection testing using ray definition data (e.g., origin and direction information). Each ray is identifiable (2115) by an identifier. Subsets of the ray definition data are stored in localized memories associated with respective processing resources of a plurality of such resources. Rays are scheduled for testing by distributing (2125) identifiers for those rays, together with shape data, among the processing resources. Rays are tested (2130) in the processing resources that have the definition data for those rays stored locally. In some cases, the definition data for each ray may reside in a single local memory.

Indications of intersections between rays and primitives are communicated (2135) from a first subset of the computing resources to a second subset. The second subset shades (2140) the intersections. Such shading can generate new rays, whose definition data is distributed (2145) among the localized memories, preferably in place of the definition data for completed rays. These rays are then tested as described above. The subsets of computing resources may be implemented by instantiation or by allocation of computing resources, where the computing resources include threads instantiated on a multithreaded processor or core. The allocation may change over time; no static allocation between resources for intersection testing and shading is required. For example, a core running an intersection testing thread may complete a series of intersection tests, filling a memory space with a number of indications of ray/primitive intersections. The core can then switch to shading those intersections.

Some of the examples above were explained primarily in terms of testing GAD elements for intersection, where the result of such testing is to group rays against progressively smaller groupings of primitives (through the association of ray IDs with particular GAD elements). It was explained that, ultimately, a GAD element identified by testing bounds the primitives to be tested against the rays identified as part of the group associated with that GAD element. For packets containing primitives, the result of intersection testing is an indication of a ray/primitive intersection, which is usually obtained (for convenience) by tracking the closest intersection detected so far for a given ray along with the other data defining the ray.
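Tracking the closest intersection detected so far can be sketched as follows. This is an illustrative assumption about one possible bookkeeping structure; `ClosestHitTable` and its method names are hypothetical and do not appear in the specification.

```python
import math

class ClosestHitTable:
    """Per-ray record of the nearest intersection seen so far."""

    def __init__(self):
        self._best = {}  # ray ID -> (distance, primitive ID)

    def report(self, ray_id, distance, primitive_id):
        """Keep the hit only if it is closer than the current record."""
        current = self._best.get(ray_id, (math.inf, None))
        if distance < current[0]:
            self._best[ray_id] = (distance, primitive_id)
            return True
        return False

    def finish(self, ray_id):
        """The ray has been tested against the whole scene: emit and free its record."""
        return self._best.pop(ray_id, None)

table = ClosestHitTable()
table.report(7, 4.0, "tri_a")
table.report(7, 2.5, "tri_b")   # closer, replaces tri_a
table.report(7, 9.0, "tri_c")   # farther, ignored
result = table.finish(7)        # (2.5, "tri_b")
```

Because primitives may be tested in any order, only the record held at the time `finish` is called identifies the intersection to shade.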

Then, after a given ray has been tested against the entire scene, the closest detected intersection, if any, for each ray can be returned, with the ray ID, to the application, a driver, or another process that uses these results to start the shading process. As in the various embodiments of this specification, ray identifiers can be returned via a queuing strategy, so that there is no need to specify which computing resource will execute the shading code for a particular intersection, or to require that a particular intersection testing resource have its detected intersections shaded by a predetermined shading resource. In some intersection tests, barycentric coordinates are computed as part of the test, and these coordinates can be used for shading if desired. This is an example of other data that can be sent from the intersection testers to the shaders.
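As an illustration of an intersection test that yields barycentric coordinates as a by-product, the sketch below uses the well-known Möller-Trumbore ray/triangle algorithm. The specification does not name a particular algorithm, so this choice and the function name `intersect_triangle` are assumptions.

```python
def intersect_triangle(origin, direction, v0, v1, v2, eps=1e-9):
    """Moller-Trumbore test. Returns (t, u, v) on a hit, otherwise None.

    t is the distance along the ray; (u, v) are barycentric coordinates,
    so the hit point equals (1 - u - v) * v0 + u * v1 + v * v2.
    """
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:
        return None            # ray is parallel to the triangle plane
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    if t < 0.0:
        return None            # triangle lies behind the ray origin
    return t, u, v

triangle = ((0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0))
hit = intersect_triangle((0.25, 0.25, 0.0), (0.0, 0.0, 1.0), *triangle)
miss = intersect_triangle((2.0, 2.0, 0.0), (0.0, 0.0, 1.0), *triangle)
```

A tester could place `(u, v)` in the queue entry next to the ray and primitive identifiers so that a shader can interpolate vertex attributes without repeating the test.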

In general, any of the functional units, features, and other logic described herein can be implemented using a variety of computing resources. A computing resource can be a thread, a core, a processor, a fixed-function processing element, and the like. Also, other functional units, such as collection or packet management units, can be provided or implemented as processes, threads, or tasks that may be localized to one computing resource or distributed among several (e.g., multiple threads distributed among multiple physical computing resources). Such a task primarily involves identifying in-flight packets that contain intersection test results for shapes whose collections are managed by that computing resource.

Likewise, computing resources used for intersection testing can also host other processes, such as the shading processes used to shade detected intersections. For example, a processor that runs intersection tests can also run shading threads. For example, in a ring bus implementation, if the queue for a processing resource currently holds no packets for intersection testing, that processing resource can instead start a thread to shade previously identified intersections. A key point is that there is no requirement for, or general relationship between, having an intersection testing thread on a given processor and running a shading thread for the ray intersections detected by that thread. Instead, queued ray/primitive intersections provide input to the shading threads, so the mapping between intersection testing resources and shading resources can be any-to-any, and different hardware or software units may perform intersection testing and shading for the same ray.
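The any-to-any relationship can be sketched with a shared queue of detected intersections. The `Worker` class, its `step` method, and the `intersect` stub are hypothetical; the sketch is single-threaded for determinism and only shows that the resource that shades an intersection need not be the one that detected it.

```python
from collections import deque

def intersect(ray_id):
    """Hypothetical stand-in for a real intersection test; every ray hits."""
    return f"prim{ray_id}"

class Worker:
    """A processing resource that tests when packets are queued and shades otherwise."""

    def __init__(self, name, hit_queue):
        self.name = name
        self.test_queue = deque()    # ray IDs awaiting intersection testing
        self.hit_queue = hit_queue   # shared queue of (ray ID, primitive ID)
        self.shaded = []

    def step(self, intersect_fn):
        if self.test_queue:
            ray_id = self.test_queue.popleft()
            primitive_id = intersect_fn(ray_id)
            if primitive_id is not None:
                self.hit_queue.append((ray_id, primitive_id))
            return "test"
        if self.hit_queue:
            # No testing work queued: shade a previously identified intersection.
            self.shaded.append(self.hit_queue.popleft())
            return "shade"
        return "idle"

hit_queue = deque()
a = Worker("A", hit_queue)
b = Worker("B", hit_queue)
a.test_queue.extend([1, 2])

trace = []
for _ in range(2):
    trace.append(a.step(intersect))
    trace.append(b.step(intersect))
```

Here worker B shades both intersections that worker A detected, because the only coupling between the two roles is the shared queue.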

Similarly, the various queues and other interfaces that mediate communication between functional units (e.g., among intersection testing resources, and between intersection testing and shading) can be implemented in one or more memories according to any of a variety of buffering strategies, selected according to considerations relating to the physical resources used to implement them. Queues can be controlled by the source resource or by the destination resource. In other words, a destination may listen for data on a shared bus and take the data it needs, or data may be addressed to the destination through memory mapping, direct communication, or the like.

By further example, if a core supports multithreading, one thread can be dedicated to shading while another is dedicated to intersection processing. However, care should be taken to avoid the cache incoherency (thrashing) that would result from fetching textures or other shading information in place of storing ray data, which retains cache allocation priority for the intersection testing resources.

Because such architectures have the advantage of reducing cache requirements for shape data, concerns about cache coherency for that type of data are reduced. In fact, in some implementations, no particular effort needs to be made to keep particular shape data available, or to predict when shape data will be used again. Instead, when a given packet of ray IDs is ready for testing, the shape data for that packet can be fetched from the fastest memory that stores it, and the existing workload of processing other packets will generally hide the latency incurred by the fetch. After such shapes have been tested for intersection, the shape data can be allowed to be overwritten.

Any of the queues identified herein can be implemented in shared memory resources, in SRAM, as linked lists, circular buffers, sequential or strided memory locations, or in any other functional form known in the art for queues. Queues can operate to maintain the order of packets, so that a first-in packet is first out, but this is not required. In some examples, each computing resource may be given the ability to examine a specified number of packets in its queue to determine whether it would be advantageous to process them out of order. Such an implementation is more complex than an in-order system and can be provided as needed.
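The bounded look-ahead described above can be sketched as follows. The function name `pop_with_lookahead`, the `window` parameter, and the caller-supplied `prefer` predicate are assumptions made for illustration; the specification does not say how the benefit of out-of-order processing is judged.

```python
from collections import deque

def pop_with_lookahead(queue, window, prefer):
    """Remove and return one packet chosen from the first `window` entries.

    The first packet within the window that satisfies `prefer` is taken;
    if none does, the head of the queue is taken (plain FIFO behavior).
    Raises IndexError if the queue is empty.
    """
    for index in range(min(window, len(queue))):
        if prefer(queue[index]):
            packet = queue[index]
            del queue[index]
            return packet
    return queue.popleft()

# Packets are represented only by the shape they reference.
packets = deque(["A", "C", "B", "D"])
cached_shape = "B"   # assume shape B is already resident in fast memory

first = pop_with_lookahead(packets, 3, lambda shape: shape == cached_shape)
second = pop_with_lookahead(packets, 2, lambda shape: shape == "D")
```

With a window of 3 the packet for the cached shape is taken ahead of two earlier packets; with a window of 2 the preferred packet is out of reach and the head is taken instead.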

Computer-executable instructions comprise, for example, instructions and data that cause a general purpose computer, special purpose computer, or special purpose processing device to perform a certain function or group of functions. The computer-executable instructions may be, for example, binaries, intermediate-format instructions such as assembly language, or source code. Although some subject matter has been described in language specific to examples of structural features and/or method steps, the subject matter defined in the appended claims is not necessarily limited to the described features or acts. Rather, the described features and steps are disclosed as examples of components of systems and methods within the scope of the appended claims.

Various examples of computing hardware and/or software programming have been described above, as have examples of how such hardware/software can intercommunicate. These examples of hardware, or hardware configured with software, together with such communication interfaces, provide means for accomplishing the functions attributed to each of them. For example, a means for intersection testing according to some embodiments of this specification can include any of the following: (1) a plurality of independently operable computing resources, each having localized storage for ray definition data and operable to test shapes for intersection with rays in response to being provided with identifiers for those rays and with shape data.

For example, a means for managing collections of rays comprises a computing resource configured by programming, an FPGA or an ASIC, or a portion thereof, to track groups of ray identifiers and to provide information for forming packets containing the ray identifiers together with shapes, or shape data, determined by the shape associated with the group of ray identifiers.

For example, the functions described above include completing intersection testing and sending the identifiers of rays that intersected a primitive, via a queue, to computing resources configured to shade those intersections. Means for implementing this function can include a hardware queue, or a shared memory space organized as a queue or list (such as memory organized as a ring buffer or a linked list). Such means can therefore include programming and/or logic that causes ray identifiers and primitive identifiers to be obtained from the next slot, or from a specific slot or memory location, in the queue. A controller can manage the queue or memory, maintaining a next read position and a next write position for incoming and outgoing ray and primitive identifiers. Such queuing means can also be used to interface the intersection testing resources with one another when those resources pass packets of ray identifiers and shape data among themselves. Such queuing means can also be used to receive ray identifiers for new rays awaiting the start of intersection testing. Thus, the more specific queuing functions described can also be implemented by these means or their equivalents.
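A ring buffer managed through a next read position and a next write position can be sketched as below. `RingQueue` is a hypothetical name and the fixed capacity is an assumption; a hardware queue or linked list would serve equally well.

```python
class RingQueue:
    """Fixed-capacity FIFO of (ray ID, primitive ID) pairs."""

    def __init__(self, capacity):
        self._slots = [None] * capacity
        self._read = 0     # next read position
        self._write = 0    # next write position
        self._count = 0    # number of occupied slots

    def push(self, ray_id, primitive_id):
        """Store one entry; return False if the queue is full."""
        if self._count == len(self._slots):
            return False
        self._slots[self._write] = (ray_id, primitive_id)
        self._write = (self._write + 1) % len(self._slots)
        self._count += 1
        return True

    def pop(self):
        """Return the oldest entry, or None if the queue is empty."""
        if self._count == 0:
            return None
        item = self._slots[self._read]
        self._slots[self._read] = None
        self._read = (self._read + 1) % len(self._slots)
        self._count -= 1
        return item

queue = RingQueue(2)
accepted = [queue.push(1, 10), queue.push(2, 20), queue.push(3, 30)]
oldest = queue.pop()
retry = queue.push(3, 30)   # succeeds now that a slot is free
remaining = [queue.pop(), queue.pop(), queue.pop()]
```

Returning `False` on a full queue leaves the back-pressure policy to the producer, which is one of several reasonable choices.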

For example, the functions described above include shading identified intersections between rays and primitives. This function can be implemented by means comprising computing hardware configured with programming associated with the intersected primitive. The programming can cause the computing hardware to obtain data such as textures, procedural geometry modifications, and so on, in order to determine what information is needed to determine the effect of light on the primitive that was hit. The programming can cause new rays to be emitted for further intersection testing (e.g., shadow, diffraction, and reflection rays). The programming can interface with an application programming interface to emit such rays. Rays defined by the shading program can include origin and direction definition information, and a controller can determine ray identifiers for rays so defined. Fixed-function hardware can be used to implement part of this functionality. Preferably, however, programmable shading is available as needed on computing resources that can be configured according to code associated with the primitive and/or other code.

For example, other functions described above include maintaining a master list of rays being intersection tested and/or awaiting intersection testing, and distributing subsets of the master rays among the distributed cache memories associated with the means for intersection testing. Such functions can be implemented by means comprising a processor or group of processors that uses an integrated or separate memory controller to interface with a memory storing the data, under the control of programming that implements these functions. Such programming can be included at least partly in a driver associated with, or controlling, the intersection testing functionality.

Aspects of the functions and methods described and/or claimed can be implemented in a special purpose or general purpose computer, including computer hardware, as discussed in greater detail herein. Such hardware, firmware, and software can also be embodied on a video card or other external or internal computer system peripheral. Various functionality can be provided in customized FPGAs or ASICs or other configurable processors, while some functionality can be provided in a management or host processor. Such processing functionality may be used in personal computers, desktop computers, laptop computers, message processors, hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, game consoles, network PCs, minicomputers, mainframe computers, mobile telephones, PDAs, pagers, and the like.

Further, the communication links and other data flow constructs shown in the figures, such as links 112, 121, and 118 of FIG. 1 and similar links in other figures, can be implemented in a variety of ways, depending on the implementation of the functions identified. For example, if intersection testing unit 109 comprises a plurality of threads executing on one or more CPUs, then link 118 can comprise the physical memory access resources of those CPUs, together with appropriate memory controller hardware/firmware/software, to provide access to ray data storage 105. By further example, if intersection testing region 140 is located on a graphics card connected to host 140 by a PCI Express bus, then links 121 and 112 can be implemented using the PCI Express bus.

Intersection testing as described herein will generally take place within larger systems and system components. For example, processing can be distributed over networks, such as local or wide area networks, and can be implemented using peer-to-peer technologies and the like. The division of tasks can be determined based on the desired performance of the product or system, a desired price point, or some combination thereof. In embodiments implementing any of the described units at least partially in software, computer-executable instructions representing unit functionality can be stored on computer-readable media, such as magnetic or optical disks, flash memory, and USB devices, or in networks of storage devices such as NAS or SAN equipment. Other pertinent information, such as data for processing, can also be stored on such media.

Also, in some cases terminology has been used herein because it is considered to convey the salient points more clearly to persons of ordinary skill in the art, but such terminology should not be considered to implicitly limit the range of implementations encompassed by the disclosed examples and other aspects of the invention. For example, a ray is sometimes referred to as having an origin and a direction, and each of these separate items can be viewed, for understanding aspects of the disclosure, as a point in three-dimensional space and a direction vector in three-dimensional space, respectively. However, within the scope of the invention, any of a variety of other ways to represent a ray can be provided. For example, a ray direction can be represented in spherical coordinates. Also, data provided in one format can be transformed or mapped into another format while maintaining the significance of the information originally represented.
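As a small illustration of mapping one ray representation to another without losing information, the sketch below converts a unit direction vector to spherical angles and back. The convention used (polar angle measured from the +z axis, azimuth measured from the +x axis) and the function names are assumptions; the specification does not fix a convention.

```python
import math

def direction_to_spherical(direction):
    """Unit vector (x, y, z) -> (theta, phi) in radians.

    theta is the polar angle from the +z axis, phi the azimuth from the +x axis.
    """
    x, y, z = direction
    theta = math.acos(max(-1.0, min(1.0, z)))   # clamp guards against rounding
    phi = math.atan2(y, x)
    return theta, phi

def spherical_to_direction(theta, phi):
    """Inverse mapping: (theta, phi) -> unit vector (x, y, z)."""
    s = math.sin(theta)
    return (s * math.cos(phi), s * math.sin(phi), math.cos(theta))

original = (0.6, 0.0, 0.8)
theta, phi = direction_to_spherical(original)
restored = spherical_to_direction(theta, phi)
```

Two angles suffice because a unit direction has only two degrees of freedom, which can reduce the storage needed per ray.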

The embodiments of the present invention described above are provided for illustration and description only and are not intended to limit the invention to the forms described. Accordingly, it will be apparent to those skilled in the art to which the invention pertains that various changes and modifications can be made. In addition, the detailed description in this specification does not limit the scope of the invention. The scope of the invention is defined by the appended claims.

Claims

1. A system for controlling ray tracing in rendering a two-dimensional representation of a three-dimensional scene composed of primitives, the system comprising:
one or more intersection testing resources, each having access to a respective cache memory that stores a subset of a master copy of ray data;
control logic operable to assign an identifier to each ray and to control testing of each ray in an intersection testing resource that has access, in its respective cache memory, to definition data for that ray, wherein the control is effected by providing a plurality of respective ray identifiers to the intersection testing resources; and
an output queue that stores data identifying rays that have completed intersection testing by intersecting a primitive, together with data for the intersected primitive.

2. The system of claim 1, further comprising a plurality of computing resources for executing shader code routines associated with primitives,
wherein the control logic identifies a shader code routine to be executed based on data obtained from the output queue.

3. The system of claim 2, wherein execution of the shader code routines generates new rays for intersection testing,
the system further comprising an input queue to the plurality of intersection testing resources for receiving the new rays,
wherein the control logic starts intersection testing of a new ray as another ray completes intersection testing.

4. The system of any one of claims 1 to 3, wherein each intersection testing resource is configured to respond to receipt of ray identifiers for rays stored in its respective memory by testing the identified rays for intersection with a set of geometric shapes indicated by data provided with the ray identifiers.

5. The system of any one of claims 1 to 4, wherein the intersection testing is performed between rays and geometric shapes comprising acceleration structure elements selected from one or more of cutting planes of a kD-tree, axis-aligned bounding boxes, and spheres.

6. The system of any one of claims 1 to 5, wherein each intersection testing resource is configured to receive a packet of ray identifiers, to determine whether its respective memory stores definition data for any of the identified rays, and to test those rays against the geometric shapes identified in the packet.

7. The system of claim 6, wherein the plurality of intersection testing resources are configured to pass the packet successively to a next one of the plurality of intersection testing resources until every intersection testing resource that stores data for a ray in the packet has received the packet.

8. The system of any one of claims 6 to 7, wherein each intersection testing resource stores results of acceleration element tests in the packet, and stores primitive test results for each ray in the respective cache memory of that intersection testing resource until the ray completes intersection testing in the scene.

9. A method of controlling ray tracing of a scene composed of a plurality of primitives in a system having a plurality of computing resources, each computing resource coupled to a respective local memory and to a shared main memory, the main memory having greater latency than the local memories, the method comprising:
distributing data defining respective subsets of rays to be intersection tested in the scene among the local memories of the plurality of computing resources;
determining a group of rays to test for intersection with a geometric shape, the members of the group having their definition data stored collectively in a plurality of the local memories;
providing data for the geometric shape and the ray identifiers to one or more of the computing resources, such that each computing resource whose local memory stores definition data for a ray of the group receives the geometric shape data and the ray identifiers; and
receiving, from the computing resources, indications of intersections detected between rays of the group and the geometric shape,
wherein the indications result from testing each ray of the group in the one or more computing resources whose local memory stores the definition data for that ray.

10. The method of claim 9, further comprising fetching data defining the shape from the main memory, wherein providing data for the geometric shape comprises providing the shape definition data, together with the ray identifiers of the group, to the computing resources.

11. The method of any one of claims 9 to 10, wherein the indications include data on intersections between geometry acceleration elements and rays, and the group of rays is formed by collecting rays determined to intersect the same geometry acceleration element,
the method further comprising deferring further testing of geometry acceleration elements associated with that geometry acceleration element until a required number of rays has been collected.

12. The method of any one of claims 9 to 11, wherein the local memories comprise caches, the method further comprising preventing a given ray from being overwritten in the cache until that ray has completed intersection testing.

13. The method of any one of claims 9 to 12, further comprising maintaining, in each local memory, the current closest intersection detected for each ray whose definition data is stored in that local memory, and generating each indication for the receiving step in response to identifying a closest intersection between a primitive and a given ray.

14. The method of any one of claims 9 to 13, wherein the data for the geometric shape is selected from a set of references identifying one or more shapes to be tested and a set of data defining one or more shapes to be tested.

15. The method of any one of claims 9 to 14, wherein the providing step comprises queuing the ray identifiers in a first queue from which the computing resources are coupled to receive them,
and the receiving step comprises receiving the indications from a second queue.

16. The method of any one of claims 9 to 15, further comprising maintaining a master copy of the rays in the main memory.

17. A computer-readable medium storing modules comprising computer-executable code for implementing, on at least one computing resource, the method of any one of claims 9 to 16.

18. A system for rendering, using ray tracing, a two-dimensional representation of a three-dimensional scene composed of a plurality of primitives, the system comprising:
a memory storing the plurality of primitives composing the three-dimensional scene;
a plurality of intersection testing resources, each for testing at least one ray traveling in the scene for intersection with at least one of the plurality of primitives, and for outputting indications of detected intersections;
a plurality of shader resources for running shading routines for the detected ray/primitive intersections;
a first communication link for outputting the indications of detected intersections to the shader resources; and
a second communication link for sending new rays, generated by running the shading routines, to the intersection testing resources,
wherein the new rays sent to the intersection testing resources can complete intersection testing in an order different from the relative order in which they were sent.

19. The system of claim 18, further comprising a channel for passing messages among the plurality of intersection testing resources,
wherein each intersection testing resource is configured to interpret data in a received message as including a plurality of ray identifiers, and to intersection test selected rays identified in the message.

20. The system of claim 18, wherein the intersection testing resources are arranged in a ring for passing packets of ray identifiers among the intersection testing resources.

21. The system of claim 18, wherein each intersection testing resource is configured to select individual rays for testing by determining whether a cache associated with that intersection testing resource stores definition data for any of the rays identified in a message passed among the intersection testing resources.

22. The system of claim 18, wherein the plurality of intersection testing resources are implemented as threads of computer-executable instructions executing on one or more computing cores, each computing core having access to a localized cache memory storing a subset of the rays traveling in the scene.

23. The system of claim 18, wherein the memory storing the primitives composing the three-dimensional scene is implemented as a main memory for one or more computing cores, the one or more computing cores collectively execute a plurality of threads concurrently, and the plurality of threads are allocated, in a time-varying manner, between implementing the intersection testing resources and the shader resources.