TW202219890A - Sparse image sensing and processing - Google Patents


Info

Publication number
TW202219890A
Authority
TW
Taiwan
Prior art keywords
subset
pixels
pixel
image
data
Prior art date
Application number
TW109139740A
Other languages
Chinese (zh)
Inventor
Andrew Samuel Berkovich (安德魯 山謬爾 博寇維奇)
Reid Pinkham (里德 皮克翰)
Original Assignee
Facebook Technologies, LLC (美商菲絲博克科技有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Facebook Technologies, LLC (美商菲絲博克科技有限公司)
Priority to TW109139740A
Publication of TW202219890A

Landscapes

  • Image Analysis (AREA)

Abstract

In one example, an apparatus comprises: an image sensor comprising a plurality of pixel cells; a frame buffer; and a sensor compute circuit configured to: receive, from the frame buffer, a first image frame comprising first active pixels and first inactive pixels, the first active pixels being generated by a first subset of the pixel cells selected based on first programming data; perform an image-processing operation on a first subset of pixels of the first image frame, whereby a second subset of pixels of the first image frame is excluded from the image-processing operation, to generate a processing output; based on the processing output, generate second programming data; and transmit the second programming data to the image sensor to select a second subset of the pixel cells to generate second active pixels for a second image frame.

Description

Sparse Image Sensing and Processing

The present disclosure relates to an image sensor. More specifically, and without limitation, it relates to techniques for performing sparse image sensing and processing operations.

A typical image sensor includes an array of pixel cells. Each pixel cell may include a photodiode that senses light by converting photons into charge (e.g., electrons or holes). The charge converted at each pixel cell can be quantized into a digital pixel value, and an image can be generated from the array of digital pixel values.
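The charge-to-digital conversion described above amounts to a uniform quantizer. A minimal sketch (the full-well capacity and bit depth are illustrative assumptions, not values from this disclosure):

```python
def quantize_pixel(charge: float, full_well: float = 10000.0, bits: int = 8) -> int:
    """Map accumulated photocharge (in electrons) to a digital pixel value."""
    levels = (1 << bits) - 1                   # 255 for an 8-bit output
    charge = max(0.0, min(charge, full_well))  # clamp to the sensor's range
    return round(charge / full_well * levels)

# one row of charges -> digital pixel values
row = [quantize_pixel(c) for c in (0.0, 5000.0, 10000.0, 12000.0)]
```

An image frame is then simply the array of such values produced by every pixel cell.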

The images generated by the image sensor can be processed to support different applications, such as virtual-reality (VR), augmented-reality (AR), or mixed-reality (MR) applications. Image processing operations can then be performed on the images to detect an object of interest and its location in the image. Based on the detection of the object and its location in the image, a VR/AR/MR application can generate and update, for example, virtual image data for display to the user via a display, audio data for output to the user via a speaker, and so on, to provide an interactive experience to the user.

To improve the spatial and temporal resolution of imaging operations, image sensors typically include a large number of pixel cells and generate images at a high frame rate. Generating high-resolution image frames at a high frame rate, and transmitting and processing these frames, incurs substantial power consumption in both the image sensor and the image processing operations. Moreover, because typically only a small subset of the pixel cells receives light from an object of interest, much power is wasted generating, transmitting, and processing pixel data that is not useful for the object detection/tracking operation, reducing the overall efficiency of the image sensing and processing operations.

The present disclosure relates to an image sensor. More specifically, and without limitation, it relates to techniques for performing sparse image sensing and processing operations.

In one example, an apparatus is provided. The apparatus includes an image sensor, a frame buffer, and a sensor compute circuit. The image sensor includes a plurality of pixel cells and can be configured by programming data to select a subset of the pixel cells to generate active pixels. The sensor compute circuit is configured to: receive, from the frame buffer, a first image frame comprising first active pixels and first inactive pixels, the first active pixels being generated by a first subset of the pixel cells selected based on first programming data, and the first inactive pixels corresponding to a second subset of the pixel cells not selected to generate the first active pixels; perform an image processing operation on a first subset of pixels of the first image frame to generate a processing output, whereby a second subset of pixels of the first image frame is excluded from the image processing operation; based on the processing output, generate second programming data; and transmit the second programming data to the image sensor to select a second subset of the pixel cells to generate second active pixels for a second image frame.
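The closed loop that this apparatus implements — capture a sparse frame, process only the active pixels, derive new programming data, reprogram the sensor — can be sketched as follows. `FakeSensor` and the bounding-box `detect_object` are stand-ins for the pixel-cell array and the neural network, not interfaces from this disclosure:

```python
import numpy as np

class FakeSensor:
    """Stand-in for the pixel-cell array: captures only where the map enables cells."""
    def __init__(self, scene):
        self.scene = scene
    def capture(self, program_map):
        return np.where(program_map, self.scene, 0)   # inactive pixels read as 0

def detect_object(frame, margin=1):
    """Toy stand-in for the image-processing operation: box the bright pixels."""
    ys, xs = np.nonzero(frame > 128)
    return (max(ys.min() - margin, 0), ys.max() + 1 + margin,
            max(xs.min() - margin, 0), xs.max() + 1 + margin)

def compute_loop(sensor, shape, n_frames=3):
    program_map = np.ones(shape, dtype=bool)          # frame 0: all cells active
    for _ in range(n_frames):
        frame = sensor.capture(program_map)           # sparse image frame
        y0, y1, x0, x1 = detect_object(frame)         # processing output
        program_map = np.zeros(shape, dtype=bool)     # second programming data:
        program_map[y0:y1, x0:x1] = True              # enable only the ROI subset
    return program_map

scene = np.zeros((8, 8), dtype=np.uint8)
scene[3:5, 3:5] = 200                                 # bright "object of interest"
final_map = compute_loop(FakeSensor(scene), scene.shape)
```

After the loop converges, the programming map enables only the pixel cells around the detected object, so subsequent frames are sparse.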

In some aspects, the image processing operation comprises a processing operation of a neural network model to detect an object of interest in the first image frame. The first subset of pixels corresponds to the object of interest.

In some aspects, the sensor compute circuit is coupled with a host device configured to execute an application that uses the result of the detection of the object of interest. The host device is configured to provide information about the object of interest to the sensor compute circuit.

In some aspects, the sensor compute circuit comprises: a compute memory configured to store input data to a neural network layer of the neural network, weight data of the neural network layer, and intermediate output data of the neural network layer; a data processing circuit configured to perform arithmetic operations of the neural network layer on the input data and the weight data to generate the intermediate output data; and a compute controller configured to: fetch, from the compute memory, a first subset of the input data and a first subset of the weight data corresponding to the first subset of the input data, the first subset of the input data corresponding to at least some of the first active pixels; control the data processing circuit to perform the arithmetic operations on the first subset of the input data and the first subset of the weight data to generate a first subset of the intermediate output data for the first image frame, the first subset of the intermediate output data corresponding to the first subset of the input data; store the first subset of the intermediate output data for the first image frame in the compute memory; and store, in the compute memory, a predetermined value for a second subset of the intermediate output data for the first image frame, the second subset of the intermediate output data corresponding to the inactive pixels.

In some aspects, the predetermined value is stored based on resetting the compute memory before the image processing operation.

In some aspects, the compute controller is configured to: fetch the input data from the compute memory; identify the first subset of the input data from the fetched input data; and provide the identified first subset of the input data to the data processing circuit.

In some aspects, the compute controller is configured to: determine an address region of the compute memory that stores the first subset of the input data; and fetch the first subset of the input data from the compute memory.

In some aspects, the address region is determined based on at least one of: the first programming data, or information about connectivity between neural network layers of the neural network model.

In some aspects, the first active pixels include static pixels and non-static pixels. The static pixels correspond to a first subset of the first active pixels for which the degree of change in pixel values between the first image frame and a previous image frame is below a change threshold. The non-static pixels correspond to a second subset of the first active pixels for which the degree of change in pixel values between the first image frame and the previous image frame exceeds the change threshold. The compute controller is configured to fetch the first subset of the input data corresponding to the non-static pixels of the first active pixels.

In some aspects, the predetermined value is a first predetermined value. The frame buffer is configured to store a second predetermined value for each of the static pixels to signal that those pixels are static. The compute controller is configured to exclude the static pixels from the data processing circuit based on detecting that the static pixels have the second predetermined value.

In some aspects, the frame buffer is configured to store the second predetermined value for a pixel based on determining that the degree of change of the pixel across a threshold number of frames is below the change threshold.

In some aspects, the frame buffer is configured to set a pixel value of a pixel based on a leaky integrator function having a time constant, and on the time at which the pixel last experienced a degree of change greater than the change threshold.
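A leaky-integrator update of this kind might look like the sketch below; the time constant, change threshold, and the gray marker value (128) are illustrative assumptions:

```python
import math

def leaky_update(stored, incoming, frames_since_change,
                 tau=5.0, threshold=10, static_value=128):
    """Frame-buffer pixel update using a leaky integrator with time constant tau.

    A pixel that just changed is stored as-is; a pixel that has been quiet
    since its last above-threshold change decays toward the static marker.
    """
    if abs(incoming - stored) > threshold:
        return incoming, 0                        # change event: reset the clock
    alpha = math.exp(-frames_since_change / tau)  # 1 right after a change, -> 0
    decayed = static_value + (incoming - static_value) * alpha
    return round(decayed), frames_since_change + 1

value, quiet = leaky_update(200, 200, 50)         # a long-quiet pixel
```

For brevity the sketch keeps a single stored value per pixel; a real buffer would compare `incoming` against the last raw sensor value rather than the decayed value, so the decay cannot itself trigger a false change event.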

In some aspects, the compute controller is configured to: determine, based on a topology of the neural network model, a data change propagation map indicating how changes in the non-static pixels propagate through different neural network layers of the neural network model; based on the data change propagation map, determine a first address region of the compute memory from which to fetch the first subset of the input data, and a second address region of the compute memory in which to store the first subset of the intermediate output data; fetch the first subset of the input data from the first address region; and store the first subset of the intermediate output data in the second address region.
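For convolutional layers, such a data change propagation map can be computed by dilating the set of changed input positions by each layer's receptive field. A sketch under assumed 'same'-padded convolution layers (the kernel sizes are illustrative):

```python
import numpy as np

def dilate(mask, k):
    """Mark every position whose k x k neighborhood touches a changed position."""
    out = np.zeros_like(mask)
    r = k // 2
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            # np.roll wraps at the borders; fine here since the change is interior
            out |= np.roll(np.roll(mask, dy, axis=0), dx, axis=1)
    return out

def propagation_maps(changed, kernel_sizes):
    """Which intermediate outputs of each conv layer depend on changed pixels."""
    maps, mask = [], changed
    for k in kernel_sizes:            # one 'same'-padded conv layer per entry
        mask = dilate(mask, k)
        maps.append(mask.copy())
    return maps

changed = np.zeros((9, 9), dtype=bool)
changed[4, 4] = True                  # a single non-static pixel
maps = propagation_maps(changed, kernel_sizes=[3, 3])
```

The map grows with depth (one non-static pixel touches a 3x3 region after the first 3x3 layer and a 5x5 region after the second), which is exactly why the controller needs the topology to size the fetch and store address regions.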

In some aspects, the compute controller is configured to determine the change threshold based on a depth of the neural network model and a quantization precision at each neural network layer of the neural network model.

In some aspects, the change threshold is a first change threshold. The compute controller is configured to: track the degree of change in the pixel values of the first active pixels between two non-consecutive frames; and determine a third subset of the first active pixels to be non-static pixels based on the degree of change exceeding a second change threshold.

In some aspects, the image sensor is implemented in a first semiconductor substrate. The frame buffer and the sensor compute circuit are implemented in one or more second semiconductor substrates. The first semiconductor substrate and the one or more second semiconductor substrates form a stack and are housed in a single semiconductor package.

In some examples, a method is provided. The method comprises: transmitting first programming data to an image sensor comprising a plurality of pixel cells to select a first subset of the pixel cells to generate first active pixels; receiving, from a frame buffer, a first image frame comprising the first active pixels and first inactive pixels, the first inactive pixels corresponding to a second subset of the pixel cells not selected to generate the first active pixels; performing an image processing operation on a first subset of pixels of the first image frame to generate a processing output, whereby a second subset of pixels of the first image frame is excluded from the image processing operation; based on the processing output, generating second programming data; and transmitting the second programming data to the image sensor to select a second subset of the pixel cells to generate second active pixels for a second image frame.

In some aspects, the image processing operation comprises a processing operation of a neural network to detect an object of interest in the first image frame. The first subset of pixels corresponds to the object of interest.

In some aspects, the method further comprises: storing, in a compute memory, input data to a neural network layer of the neural network and weight data of the neural network layer; fetching, from the compute memory, a first subset of the input data and a first subset of the weight data corresponding to the first subset of the input data, the first subset of the input data corresponding to at least some of the first active pixels; performing, using a data processing circuit, arithmetic operations on the first subset of the input data and the first subset of the weight data to generate a first subset of intermediate output data for the first image frame, the first subset of the intermediate output data corresponding to the first subset of the input data; storing the first subset of the intermediate output data for the first image frame in the compute memory; and storing, in the compute memory, a predetermined value for a second subset of the intermediate output data for the first image frame, the second subset of the intermediate output data corresponding to the inactive pixels.

In some aspects, the first active pixels include static pixels and non-static pixels. The static pixels correspond to a first subset of the first active pixels for which the degree of change in pixel values between the first image frame and a previous image frame is below a change threshold. The non-static pixels correspond to a second subset of the first active pixels for which the degree of change in pixel values between the first image frame and the previous image frame exceeds the change threshold. The first subset of the input data corresponds to the non-static pixels of the first active pixels.

In the following description, for purposes of explanation, specific details are set forth in order to provide a thorough understanding of certain inventive examples. It will be apparent, however, that various examples may be practiced without these specific details. The figures and description are not intended to be restrictive.

As discussed above, image sensors typically include a large number of pixel cells and generate images at a high frame rate to improve the spatial and temporal resolution of imaging operations. However, generating high-resolution image frames at a high frame rate, and transmitting and processing these frames, incurs substantial power consumption in both the image sensor and the image processing operations. Moreover, because typically only a small subset of the pixel cells receives light from an object of interest, much power is wasted generating, transmitting, and processing pixel data that is not useful for the object detection/tracking operation, reducing the overall efficiency of the image sensing and processing operations.

This disclosure proposes image sensing and processing techniques that can address at least some of the problems above. In some examples, an apparatus comprises an image sensor, a frame buffer, and a compute circuit. The image sensor includes a plurality of pixel cells and can be configured by programming data to select a subset of the pixel cells to generate active pixels. The frame buffer can store a first image frame that includes at least some of the active pixels generated by a first subset of the pixel cells, selected by the image sensor based on first programming data. The first image frame further includes inactive pixels corresponding to a second subset of the pixel cells not selected to generate active pixels. The compute circuit can receive the first image frame from the frame buffer. The compute circuit can include an image processor to perform an image processing operation on a first subset of pixels of the first image frame to generate a processing output, whereby a second subset of pixels of the first image frame is excluded from the image processing operation. The compute circuit further includes a programming map generator to generate second programming data based on the processing output from the image processor, and to transmit the second programming data to the image sensor to select a second subset of the pixel cells to output pixel data for a second image frame. The first subset of pixels of the first image frame on which the image processing operation is performed may correspond to, for example, the active pixels, or to non-static pixels that experience a certain degree of change between frames.

In some examples, the apparatus can support an object detection and tracking operation based on a sparse image sensing operation. The first subset of the pixel cells can be selectively enabled to capture, as active pixels, only pixel data relevant to the tracking and detection of the object, or to transmit only the active pixels to the frame buffer, to support the sparse image sensing operation. Because only a subset of the pixel cells is enabled to generate and/or transmit active pixels, the amount of pixel data generated and transmitted for an image frame can be reduced, which can reduce power consumption at the image sensor. The sparse image sensing operation can be continuously adjusted based on the result of the object detection and tracking operation to account for relative movement of the object with respect to the image sensor, which can improve the likelihood that the active pixels include image data of the object and improve the performance of applications (e.g., VR/AR/MR applications) that rely on the object detection and tracking operation. In addition, the compute circuit performs the image processing operation only on the active pixels, or on a subset of the active pixels likely to include image data of the object, while the inactive pixels are excluded from the image processing operation, which can further reduce the power consumption of the image processing operation. All of these can improve the overall power efficiency, computational efficiency, and performance of the image sensing and processing operations.

In some examples, the image processing operation can include a neural network operation. Specifically, the image processor can include a data processing circuit to provide hardware acceleration for a neural network operation, such as a multi-layer convolutional neural network (CNN) including an input layer and an output layer. The image processor can include a compute memory to store the input image frame and a set of weights associated with each neural network layer. The set of weights can represent features of the object to be detected. The image processor can include a controller that controls the data processing circuit to fetch the input image frame data and the weights from the compute memory. The controller can control the data processing circuit to perform arithmetic operations, such as multiply-and-accumulate (MAC) operations, between the input image frame and the weights to generate intermediate output data for the input layer. The intermediate output data are post-processed based on, for example, an activation function, a pooling operation, etc., and the post-processed intermediate output data can then be stored in the compute memory. The post-processed intermediate output data can be fetched from the compute memory and provided to the next neural network layer as input. The arithmetic operations, together with the fetching and storing of intermediate output data, are repeated for all layers up to the output layer to generate the neural network output. The neural network output can indicate, for example, the likelihood of the object being present in the input image frame and the pixel location of the object in the input image frame.

The controller can configure the data processing circuit to process sparse image data in an efficient manner. For example, for the input layer, the controller can control the data processing circuit to fetch only the active pixels and the corresponding weights from the compute memory, and to perform the MAC operations only on the active pixels and the corresponding weights to generate the subset of the intermediate output corresponding to the active pixels for the input layer. The controller can also determine, based on the topology of the neural network and the connections among subsequent neural network layers, the subset of intermediate output data at each subsequent neural network layer that can be traced back to the active pixels. The controller can control the data processing circuit to perform the MAC operations so as to generate only that subset of the intermediate output data at each subsequent neural network layer. In addition, to reduce access to the compute memory, a predetermined value (e.g., zero) for the intermediate output data of each layer can be stored in the compute memory before the neural network operation; only the intermediate output data for the active pixels are then updated. All of these can reduce the power consumption of the neural network operation on the sparse image data.
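A minimal sketch of this sparsity-aware input-layer step: the output buffer is pre-filled with the predetermined value (zero), and MAC operations run only at positions whose kernel window touches an active pixel. The frame shape, 3x3 kernel, and coordinate-list interface are illustrative assumptions:

```python
import numpy as np

def sparse_conv_input_layer(frame, weights, active_coords):
    """Run the input-layer MACs only where an active pixel can affect the output."""
    h, w = frame.shape
    k = weights.shape[0]
    r = k // 2
    out = np.zeros((h, w), dtype=np.float32)    # pre-filled predetermined value
    targets = set()                             # outputs reachable from actives
    for y, x in active_coords:
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                ty, tx = y + dy, x + dx
                if 0 <= ty < h and 0 <= tx < w:
                    targets.add((ty, tx))
    for ty, tx in targets:                      # MAC only at those positions
        acc = 0.0
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                sy, sx = ty + dy, tx + dx
                if 0 <= sy < h and 0 <= sx < w:
                    acc += frame[sy, sx] * weights[dy + r, dx + r]
        out[ty, tx] = acc
    return out

frame = np.zeros((5, 5), dtype=np.float32)
frame[2, 2] = 1.0                               # a single active pixel
out = sparse_conv_input_layer(frame, np.ones((3, 3)), [(2, 2)])
```

All entries of `out` that cannot be traced back to an active pixel keep the predetermined zero and never need a memory write.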

In some examples, to further reduce power consumption and improve power and computational efficiency, the frame buffer and the compute circuit can support a temporal sparsity operation. As part of the temporal sparsity operation, static pixels and non-static pixels can be identified. The static pixels may correspond to a first part of the scene captured by the image sensor that experiences a small change (or no change) between the first image frame and the previous image frame, whereas the non-static pixels correspond to a second part of the scene that experiences a large change between those frames. A pixel can be determined to be static if its degree of change is below a threshold. In some examples, the non-static pixels can be identified from the active pixels, whereas the static pixels can be identified from both the active pixels and the inactive pixels, which remain inactive (and unchanged) between frames.
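The static/non-static split might be computed per pixel as in the sketch below (the change threshold is an assumed value; inactive pixels are treated as static since they do not change between frames):

```python
import numpy as np

def classify_pixels(frame, prev_frame, active_mask, threshold=10):
    """Split pixels into static and non-static by inter-frame change."""
    delta = np.abs(frame.astype(np.int32) - prev_frame.astype(np.int32))
    non_static = active_mask & (delta > threshold)  # only actives can be non-static
    static = ~non_static                            # inactive pixels count as static
    return static, non_static

frame = np.array([[100, 150], [100, 100]], dtype=np.uint8)
prev = np.array([[100, 100], [100, 100]], dtype=np.uint8)
active = np.array([[True, True], [False, False]])
static, non_static = classify_pixels(frame, prev, active)
# only the pixel that jumped from 100 to 150 is non-static
```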

To reduce power consumption, the data processing circuit can perform the image processing operation (e.g., the neural network operation) only on the non-static pixels of the first image frame to generate updated outputs for the non-static pixels. For the static pixels, the image processing operation can be skipped, and the outputs from the image processing operation on the previous image frame can be retained. In a case where the image processing operation comprises a neural network operation, the controller can control the data processing circuit to fetch only the non-static pixels and the corresponding weight data from the compute memory to update the subset of the intermediate output data corresponding to the non-static pixels for the input layer. The rest of the intermediate output data in the compute memory for the input layer, corresponding to the static pixels (obtained from the previous image frame) and to the inactive pixels (e.g., having a predetermined value such as zero), can be retained. The controller can also determine, based on the topology of the neural network and the connections among subsequent neural network layers, the subset of intermediate output data at each subsequent neural network layer that can be traced back to the non-static pixels, and update only that subset of the intermediate output data, to reduce access to the compute memory and reduce power consumption.

In some examples, the frame buffer can detect static pixels from the active pixels output by the image sensor, and store pixel values for those pixels to signal to the image processor that those pixels are static pixels. For example, the frame buffer can store the most recent pixel data (including active and inactive pixels) from each pixel cell of the image sensor as the first image frame. For each of the active pixels, the frame buffer can determine a degree of change of the pixel with respect to a prior frame, such as the image frame immediately preceding the first image frame. The frame buffer can set the pixel value in various ways to indicate a static pixel. For example, the frame buffer can set the pixel value for a pixel in the frame buffer based on a leaky integrator function having a time constant, and based on the number of consecutive image frames across which the pixel output by the image sensor remains static. If the pixel remains static over a large number of consecutive image frames, the pixel value of that pixel can settle at a predetermined pixel value. As another example, if the pixel remains static for a threshold number of consecutive image frames (e.g., 10), the frame buffer can set a predetermined pixel value for that pixel in the frame buffer. The predetermined pixel value can correspond to black (zero), white (255), gray (128), or any value that indicates a static pixel. In all these cases, the image processor can distinguish the static pixels from the non-static pixels based on identifying the pixel values that mark static pixels, and perform the image-processing operation only on the non-static pixels as described above.
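The threshold-count variant can be sketched as follows (an illustrative sketch only; the constants and function names are assumptions, not taken from this disclosure). A per-pixel counter tracks how many consecutive frames each pixel has stayed within the change threshold, and pixels that exceed the frame count are written into the buffer with the predetermined value:

```python
import numpy as np

GRAY = 128          # hypothetical predetermined value marking a static pixel
THRESH = 2          # hypothetical per-pixel change threshold
STATIC_FRAMES = 10  # consecutive static frames before marking, per the example

def update_frame_buffer(prev_raw, static_count, new_frame):
    """Return (buffer contents, raw frame, counters) after ingesting a frame.
    A pixel static for STATIC_FRAMES consecutive frames is replaced by GRAY."""
    changed = np.abs(new_frame.astype(int) - prev_raw.astype(int)) > THRESH
    static_count = np.where(changed, 0, static_count + 1)
    out = new_frame.copy()
    out[static_count >= STATIC_FRAMES] = GRAY  # flag static pixels for the processor
    return out, new_frame, static_count
```

The image processor can then treat any pixel equal to `GRAY` as static and skip it, matching the behavior described above.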

In some examples, the image processor can also generate additional information to facilitate the processing of the non-static pixels. For example, the image processor can determine, based on the topology of the neural network model, a data change propagation map that tracks the propagation of data changes from the input layer to the output layer of the neural network model. Based on the propagation map, as well as on the static pixels from the frame buffer, the image processor can identify the input data that are non-static for each neural network layer, and fetch only those input data for the neural network operation at each layer. In addition, the image processor can also determine, based on the topology of the neural network model, the threshold degree of change for the static/non-static pixel determination, to ensure that a pixel determined to be non-static can cause a requisite degree of change at the output layer. Moreover, the image processor can also track changes of pixels between consecutive frames as well as between non-consecutive frames. The image processor can identify pixels that exhibit small changes between consecutive frames but large changes between non-consecutive frames as non-static pixels, so that the image processor performs the image-processing operation on those pixels.
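The dual change test at the end of the paragraph, comparing against both the previous frame and an older reference frame so that slowly drifting pixels are not misclassified as static, can be sketched as follows (an illustrative sketch; thresholds and names are hypothetical):

```python
def is_non_static(history, thresh_fast=10, thresh_slow=25):
    """Flag a pixel as non-static if it changed sharply since the previous
    frame, or drifted slowly but substantially since an older reference frame.
    `history` is a sequence of that pixel's values, oldest first."""
    ref, prev, curr = int(history[0]), int(history[-2]), int(history[-1])
    return abs(curr - prev) > thresh_fast or abs(curr - ref) > thresh_slow
```

A pixel stepping by 5 per frame never trips the consecutive-frame test, but its accumulated drift against the reference frame does, so it is still processed.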

With the disclosed techniques, an image sensor can be configured to perform a sparse image-sensing operation to generate sparse images, which can reduce power consumption at the image sensor. Moreover, an image processor can be configured to perform the image-processing operations only on active and/or non-static pixels, while skipping the image-processing operations on inactive and/or static pixels, which can further reduce power consumption. Further, the selection of the pixel cells to generate the active pixels can be based on the image-processing results, to ensure that the active pixels contain the relevant information (e.g., an image of an object of interest). All these operations can improve the power and computational efficiency of the image sensor and the image processor.

The disclosed techniques may include or be implemented in conjunction with an artificial reality system. Artificial reality is a form of reality that has been adjusted in some manner before presentation to a user, which may include, for example, virtual reality (VR), augmented reality (AR), mixed reality (MR), hybrid reality, or some combination and/or derivatives thereof. Artificial reality content may include completely generated content, or generated content combined with captured (e.g., real-world) content. The artificial reality content may include video, audio, haptic feedback, or some combination thereof, any of which may be presented in a single channel or in multiple channels (such as stereo video that produces a three-dimensional effect to the viewer). Additionally, in some examples, artificial reality may also be associated with applications, products, accessories, services, or some combination thereof, that are used to, for example, create content in an artificial reality and/or are otherwise used in an artificial reality (e.g., to perform activities in an artificial reality). The artificial reality system that provides the artificial reality content may be implemented on various platforms, including a head-mounted display (HMD) connected to a host computer system, a standalone HMD, a mobile device or computing system, or any other hardware platform capable of providing artificial reality content to one or more viewers.

1A為近眼顯示器100之實例的圖式。近眼顯示器100向使用者呈現媒體。由近眼顯示器100呈現之媒體的實例包括一或多個影像、視訊及/或音訊。在一些實例中,音訊經由外部裝置(例如,揚聲器及/或頭戴式耳機)呈現,外部裝置自近眼顯示器100、控制台或此兩者接收音訊資訊,且基於音訊資訊而呈現音訊資料。近眼顯示器100大體被配置以用作虛擬實境(VR)顯示器。在一些實例中,近眼顯示器100經修改以用作擴增實境(AR)顯示器及/或混合實境(MR)顯示器。 FIG. 1A is a diagram of an example of a near-eye display 100 . The near-eye display 100 presents media to the user. Examples of media presented by near-eye display 100 include one or more images, video, and/or audio. In some examples, audio is presented via external devices (eg, speakers and/or headphones) that receive audio information from near-eye display 100, a console, or both, and present audio data based on the audio information. The near-eye display 100 is generally configured to function as a virtual reality (VR) display. In some examples, the near-eye display 100 is modified for use as an augmented reality (AR) display and/or a mixed reality (MR) display.

The near-eye display 100 includes a frame 105 and a display 110. The frame 105 is coupled to one or more optical elements. The display 110 is configured for the user to see content presented by the near-eye display 100. In some examples, the display 110 comprises a waveguide display assembly for directing light from one or more images to an eye of the user.

The near-eye display 100 further includes image sensors 120a, 120b, 120c, and 120d. Each of the image sensors 120a, 120b, 120c, and 120d may include a pixel array configured to generate image data representing different fields of view along different directions. For example, sensors 120a and 120b may be configured to provide image data representing two fields of view towards a direction A along the Z axis, whereas sensor 120c may be configured to provide image data representing a field of view towards a direction B along the X axis, and sensor 120d may be configured to provide image data representing a field of view towards a direction C along the X axis.

In some examples, sensors 120a-120d can be configured as input devices to control or influence the display content of the near-eye display 100, to provide an interactive VR/AR/MR experience to a user who wears the near-eye display 100. For example, sensors 120a-120d can generate physical image data of the physical environment in which the user is located. The physical image data can be provided to a location-tracking system to track a location and/or a path of movement of the user in the physical environment. The system can then update the image data provided to the display 110 based on, for example, the location and orientation of the user, to provide the interactive experience. In some examples, the location-tracking system may operate a simultaneous localization and mapping (SLAM) algorithm to track, as the user moves within the physical environment, a set of objects in the physical environment and within a field of view of the user. The location-tracking system can construct and update a map of the physical environment based on the set of objects, and track the location of the user within the map. By providing image data corresponding to multiple fields of view, sensors 120a-120d can provide the location-tracking system with a more holistic view of the physical environment, which can lead to more objects being included in the construction and updating of the map. With such an arrangement, the accuracy and robustness of tracking the location of the user within the physical environment can be improved.

In some examples, the near-eye display 100 may further include one or more active illuminators 130 to project light into the physical environment. The light projected can be associated with different frequency spectrums (e.g., visible light, infrared (IR) light, ultraviolet light), and can serve various purposes. For example, illuminator 130 may project light in a dark environment (or in an environment with low-intensity IR light, ultraviolet light, etc.) to assist sensors 120a-120d in capturing images of different objects within the dark environment to, for example, enable location tracking of the user. Illuminator 130 may project certain markers onto the objects within the environment, to assist the location-tracking system in identifying the objects for map construction/updating.

In some examples, illuminator 130 may also enable stereoscopic imaging. For example, one or more of sensors 120a or 120b can include both a first pixel array for visible-light sensing and a second pixel array for IR light sensing. The first pixel array can be overlaid with a color filter (e.g., a Bayer filter), with each pixel of the first pixel array being configured to measure the intensity of light associated with a particular color (e.g., one of red, green, or blue (RGB) colors). The second pixel array (for IR light sensing) can also be overlaid with a filter that allows only IR light through, with each pixel of the second pixel array being configured to measure the intensity of the IR light. The pixel arrays can generate an RGB image and an IR image of an object, with each pixel of the IR image being mapped to each pixel of the RGB image. Illuminator 130 may project a set of IR markers onto the object, whose images can be captured by the IR pixel array. Based on the distribution of the IR markers of the object as shown in the image, the system can estimate the distances of different parts of the object from the IR pixel array, and generate a stereoscopic image of the object based on the distances. Based on the stereoscopic image of the object, the system can determine, for example, the relative position of the object with respect to the user, and can update the image data provided to the display 100 based on the relative-position information, to provide the interactive experience.
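The distance estimation from the projected IR markers can be understood through the standard structured-light triangulation relation (a generic sketch, not necessarily the exact method of this disclosure; the parameter values are hypothetical): the depth Z of a marker follows from the focal length f, the projector-camera baseline B, and the marker's observed pixel offset (disparity) d.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Standard triangulation Z = f * B / d: focal length f in pixels,
    baseline B in meters, disparity d in pixels. Nearer surfaces produce
    larger disparities and therefore smaller depths."""
    return focal_px * baseline_m / disparity_px
```

For example, with an assumed focal length of 600 px and a 5 cm baseline, a 10 px marker shift corresponds to a depth of 3 m.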

As discussed above, the near-eye display 100 may operate in environments associated with a very wide range of light intensities. For example, the near-eye display 100 may operate in an indoor environment or in an outdoor environment, and/or at different times of the day. The near-eye display 100 may also operate with or without the active illuminator 130 being turned on. As a result, image sensors 120a-120d may need to have a wide dynamic range to be able to operate properly (e.g., to generate an output that correlates with the intensity of incident light) across a very wide range of light intensities associated with the different operating environments for the near-eye display 100.

1B為近眼顯示器100之另一實例的圖式。 1B說明近眼顯示器100之面朝佩戴近眼顯示器100之使用者之眼球135的一側。如 1B中所展示,近眼顯示器100可進一步包括複數個照明器140a、140b、140c、140d、140e及140f。近眼顯示器100進一步包括複數個影像感測器150a及150b。照明器140a、140b及140c可朝向方向D(其與 1A之方向A相對)發射某一頻率範圍(例如,近紅外光(near-infrared;NIR))之光。經發射光可與某一圖案相關聯,且可由使用者之左眼球反射。感測器150a可包括像素陣列以接收經反射光且產生經反射圖案之影像。類似地,照明器140d、140e及140f可發射攜載圖案之NIR光。NIR光可由使用者之右眼球反射,且可由感測器150b接收。感測器150b亦可包括像素陣列以產生經反射圖案之影像。基於來自感測器150a及150b的經反射圖案之影像,系統可判定使用者之凝視點,並基於經判定凝視點而更新提供至顯示器100之影像資料以向使用者提供互動式體驗。 FIG. 1B is a diagram of another example of a near-eye display 100 . FIG. 1B illustrates the side of the near-eye display 100 that faces the eyeball 135 of the user wearing the near-eye display 100 . As shown in FIG. IB , the near-eye display 100 may further include a plurality of illuminators 140a, 140b, 140c, 140d, 140e, and 140f. The near-eye display 100 further includes a plurality of image sensors 150a and 150b. Illuminators 140a, 140b, and 140c may emit light in a certain frequency range (eg, near-infrared (NIR)) toward direction D, which is opposite direction A of FIG. 1A . The emitted light can be associated with a pattern and can be reflected by the user's left eyeball. Sensor 150a may include an array of pixels to receive reflected light and generate an image of the reflected pattern. Similarly, illuminators 140d, 140e, and 140f may emit pattern-bearing NIR light. The NIR light may be reflected by the user's right eyeball and may be received by the sensor 150b. The sensor 150b may also include an array of pixels to generate an image of the reflected pattern. Based on the images of the reflected patterns from sensors 150a and 150b, the system can determine the user's gaze point and update the image data provided to the display 100 based on the determined gaze point to provide an interactive experience to the user.

As discussed above, to avoid damaging the eyeballs of the user, illuminators 140a, 140b, 140c, 140d, 140e, and 140f are typically configured to output light of low intensity. In a case where image sensors 150a and 150b comprise the same sensor devices as image sensors 120a-120d of FIG. 1A, the image sensors 120a-120d may need to be able to generate an output that correlates with the intensity of incident light when the intensity of the incident light is low, which may further increase the dynamic range requirement of the image sensors.

Moreover, the image sensors 120a-120d may need to be able to generate an output at high speed to track the movements of the eyeballs. For example, a user's eyeball can perform a very rapid movement (e.g., a saccade movement), in which there can be a quick jump from one eyeball position to another. To track the rapid movement of the user's eyeball, image sensors 120a-120d need to generate images of the eyeball at high speed. For example, the rate at which the image sensors generate an image frame (the frame rate) needs to at least match the speed of movement of the eyeball. The high frame rate requires a short total exposure time for all of the pixel cells involved in generating the image frame, as well as a high speed for converting the sensor outputs into digital values for image generation. Moreover, as discussed above, the image sensors also need to be able to operate in an environment with low light intensity.

2 1中所說明之近眼顯示器100之橫截面200的實例。顯示器110包括至少一個波導顯示器總成210。出射光瞳230為在使用者佩戴近眼顯示器100時使用者之單個眼球220定位於眼眶區中的位置。出於說明之目的, 2展示與眼球220及單個波導顯示器總成210相關聯之橫截面200,但第二波導顯示器用於使用者之第二眼睛。 FIG. 2 is an example of a cross-section 200 of the near-eye display 100 illustrated in FIG. 1 . Display 110 includes at least one waveguide display assembly 210 . The exit pupil 230 is where the user's single eyeball 220 is positioned in the orbital region when the user wears the near-eye display 100 . For purposes of illustration, Figure 2 shows a cross-section 200 associated with an eyeball 220 and a single waveguide display assembly 210, but with a second waveguide display for the user's second eye.

The waveguide display assembly 210 is configured to direct image light to an eyebox located at the exit pupil 230 and to the eyeball 220. The waveguide display assembly 210 may be composed of one or more materials (e.g., plastic, glass) with one or more refractive indices. In some examples, the near-eye display 100 includes one or more optical elements between the waveguide display assembly 210 and the eyeball 220.

In some examples, the waveguide display assembly 210 includes a stack of one or more waveguide displays including, but not restricted to, a stacked waveguide display, a varifocal waveguide display, etc. The stacked waveguide display is a polychromatic display (e.g., an RGB display) created by stacking waveguide displays whose respective monochromatic sources are of different colors. The stacked waveguide display is also a polychromatic display that can be projected on multiple planes (e.g., a multi-planar colored display). In some configurations, the stacked waveguide display is a monochromatic display that can be projected on multiple planes (e.g., a multi-planar monochromatic display). The varifocal waveguide display is a display that can adjust a focal position of image light emitted from the waveguide display. In alternate examples, the waveguide display assembly 210 may include the stacked waveguide display and the varifocal waveguide display.

3說明波導顯示器300之實例的等角視圖。在一些實例中,波導顯示器300為近眼顯示器100之組件(例如,波導顯示器總成210)。在一些實例中,波導顯示器300為將影像光引導至特定位置之某一其他近眼顯示器或其他系統的部分。 FIG. 3 illustrates an isometric view of an example of a waveguide display 300 . In some examples, waveguide display 300 is a component of near-eye display 100 (eg, waveguide display assembly 210). In some examples, waveguide display 300 is part of some other near-eye display or other system that directs image light to a particular location.

The waveguide display 300 includes a source assembly 310, an output waveguide 320, and a controller 330. For purposes of illustration, FIG. 3 shows the waveguide display 300 associated with a single eyeball 220, but in some examples another waveguide display, separate or partially separate from the waveguide display 300, provides image light to another eye of the user.

The source assembly 310 generates image light 355. The source assembly 310 generates and outputs the image light 355 to a coupling element 350 located on a first side 370-1 of the output waveguide 320. The output waveguide 320 is an optical waveguide that outputs expanded image light 340 to the eyeball 220 of the user. The output waveguide 320 receives the image light 355 at one or more coupling elements 350 located on the first side 370-1, and guides the received input image light 355 to a directing element 360. In some examples, the coupling element 350 couples the image light 355 from the source assembly 310 into the output waveguide 320. The coupling element 350 may be, for example, a diffraction grating, a holographic grating, one or more cascaded reflectors, one or more prismatic surface elements, and/or an array of holographic reflectors.

The directing element 360 redirects the received input image light 355 to a decoupling element 365 such that the received input image light 355 is decoupled out of the output waveguide 320 via the decoupling element 365. The directing element 360 is part of, or affixed to, the first side 370-1 of the output waveguide 320. The decoupling element 365 is part of, or affixed to, a second side 370-2 of the output waveguide 320, such that the directing element 360 is opposed to the decoupling element 365. The directing element 360 and/or the decoupling element 365 may be, for example, a diffraction grating, a holographic grating, one or more cascaded reflectors, one or more prismatic surface elements, and/or an array of holographic reflectors.

The second side 370-2 represents a plane along an x-dimension and a y-dimension. The output waveguide 320 may be composed of one or more materials that facilitate total internal reflection of the image light 355. The output waveguide 320 may be composed of, for example, silicon, plastic, glass, and/or polymers. The output waveguide 320 has a relatively small form factor. For example, the output waveguide 320 may be approximately 50 mm wide along the x-dimension, approximately 30 mm long along the y-dimension, and approximately 0.5 to 1 mm thick along the z-dimension.

The controller 330 controls scanning operations of the source assembly 310. The controller 330 determines scanning instructions for the source assembly 310. In some examples, the output waveguide 320 outputs the expanded image light 340 to the eyeball 220 of the user with a large field of view (FOV). For example, the expanded image light 340 is provided to the eyeball 220 of the user with a diagonal FOV (in x and y) of 60 degrees and/or greater, and/or 150 degrees and/or less. The output waveguide 320 is configured to provide an eyebox with a length of 20 mm or greater and/or equal to or less than 50 mm, and/or a width of 10 mm or greater and/or equal to or less than 50 mm.

Moreover, the controller 330 also controls the image light 355 generated by the source assembly 310 based on image data provided by an image sensor 370. The image sensor 370 can be located on the first side 370-1 and may include, for example, the image sensors 120a-120d of FIG. 1A. Image sensors 120a-120d can be operated to perform 2D sensing and 3D sensing of, for example, an object 372 in front of the user (e.g., facing the first side 370-1). For 2D sensing, each pixel cell of image sensors 120a-120d can be operated to generate pixel data representing an intensity of light 374 generated by a light source 376 and reflected off the object 372. For 3D sensing, each pixel cell of image sensors 120a-120d can be operated to generate pixel data representing a time-of-flight measurement for light 378 generated by an illuminator 325. For example, each pixel cell of image sensors 120a-120d can determine a first time when the illuminator 325 is enabled to project light 378, and a second time when the pixel cell detects light 378 reflected off the object 372. The difference between the first time and the second time can indicate the time of flight of light 378 between image sensors 120a-120d and the object 372, and the time-of-flight information can be used to determine a distance between image sensors 120a-120d and the object 372. Image sensors 120a-120d can be operated to perform 2D and 3D sensing at different times, and provide the 2D and 3D image data to a remote console 390 that may (or may not) be located within the waveguide display 300. The remote console can combine the 2D and 3D images to, for example, generate a 3D model of the environment in which the user is located, to track a location and/or an orientation of the user, etc. The remote console can determine the content of the images to be displayed to the user based on the information derived from the 2D and 3D images. The remote console can transmit instructions related to the determined content to the controller 330. Based on the instructions, the controller 330 can control the generation and outputting of the image light 355 by the source assembly 310, to provide an interactive experience to the user.
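The time-of-flight computation described above reduces to halving the round-trip travel time of the light pulse (a minimal sketch; the function name is illustrative, and the timestamps are assumed to come from the per-pixel measurements of the first and second times):

```python
SPEED_OF_LIGHT = 299_792_458.0  # meters per second

def tof_distance(t_emit, t_detect):
    """Sensor-to-object distance from a round-trip time-of-flight measurement.
    The light travels to the object and back, so the one-way distance is
    half of the total path length c * (t_detect - t_emit)."""
    return SPEED_OF_LIGHT * (t_detect - t_emit) / 2.0
```

For example, a round trip of 20 ns corresponds to an object roughly 3 m away.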

4說明波導顯示器300之橫截面400的實例。橫截面400包括源總成310、輸出波導320及影像感測器370。在 4之實例中,影像感測器370可包括位於第一側370-1上之一組像素單元402,以產生在使用者前方之實體環境的影像。在一些實例中,可存在插入於該組組像素單元402與實體環境之間的機械快門404及光學濾光片陣列406。機械快門404可控制該組像素單元402的曝光。在一些實例中,機械快門404可由電子快門閘代替,如下文將論述。光學濾光片陣列406可控制該組像素單元402曝光於的光之一光學波長範圍,如下文將論述。像素單元402中之每一者可對應於該影像之一個像素。儘管 4中未展示,但應理解,像素單元402中之每一者亦可與濾光片重疊以控制待由像素單元感測的光之光學波長範圍。 FIG. 4 illustrates an example of a cross-section 400 of waveguide display 300 . Cross section 400 includes source assembly 310 , output waveguide 320 and image sensor 370 . In the example of FIG. 4 , the image sensor 370 may include a set of pixel cells 402 on the first side 370-1 to generate an image of the physical environment in front of the user. In some examples, there may be a mechanical shutter 404 and an optical filter array 406 interposed between the set of pixel cells 402 and the physical environment. The mechanical shutter 404 can control the exposure of the group of pixel units 402 . In some examples, the mechanical shutter 404 may be replaced by an electronic shutter, as will be discussed below. Optical filter array 406 can control an optical wavelength range of light to which the set of pixel elements 402 are exposed, as will be discussed below. Each of pixel cells 402 may correspond to a pixel of the image. Although not shown in Figure 4 , it should be understood that each of the pixel cells 402 may also overlap a filter to control the optical wavelength range of light to be sensed by the pixel cells.

Upon receiving instructions from the remote console, the mechanical shutter 404 can open and expose the set of pixel cells 402 in an exposure period. During the exposure period, the image sensor 370 can obtain samples of light incident on the set of pixel cells 402, and generate image data based on an intensity distribution of the incident light samples detected by the set of pixel cells 402. The image sensor 370 can then provide the image data to the remote console, which determines the display content, and provides the display content information to the controller 330. The controller 330 can then determine the image light 355 based on the display content information.

The source assembly 310 generates the image light 355 in accordance with instructions from the controller 330. The source assembly 310 includes a source 410 and an optics system 415. The source 410 is a light source that generates coherent or partially coherent light. The source 410 may be, for example, a laser diode, a vertical-cavity surface-emitting laser, and/or a light-emitting diode.

The optics system 415 includes one or more optical components that condition the light from the source 410. Conditioning the light from the source 410 may include, for example, expanding, collimating, and/or adjusting orientation in accordance with instructions from the controller 330. The one or more optical components may include one or more lenses, liquid lenses, mirrors, apertures, and/or gratings. In some examples, the optics system 415 includes a liquid lens with a plurality of electrodes that allows scanning of a beam of light with a threshold value of scanning angle to shift the beam of light to a region outside the liquid lens. Light emitted from the optics system 415 (and also from the source assembly 310) is referred to as the image light 355.

The output waveguide 320 receives the image light 355. The coupling element 350 couples the image light 355 from the source assembly 310 into the output waveguide 320. In examples where the coupling element 350 is a diffraction grating, a pitch of the diffraction grating is chosen such that total internal reflection occurs in the output waveguide 320, and the image light 355 propagates internally in the output waveguide 320 (e.g., by total internal reflection) toward the decoupling element 365.

The directing element 360 redirects the image light 355 toward the decoupling element 365 for decoupling from the output waveguide 320. In examples where the directing element 360 is a diffraction grating, the pitch of the diffraction grating is chosen to cause the incident image light 355 to exit the output waveguide 320 at an angle of inclination relative to a surface of the decoupling element 365.

In some examples, the directing element 360 and/or the decoupling element 365 are structurally similar. The expanded image light 340 exiting the output waveguide 320 is expanded along one or more dimensions (e.g., may be elongated along the x-dimension). In some examples, the waveguide display 300 includes a plurality of source assemblies 310 and a plurality of output waveguides 320. Each of the source assemblies 310 emits a monochromatic image light of a specific band of wavelengths corresponding to a primary color (e.g., red, green, or blue). Each of the output waveguides 320 may be stacked together with a distance of separation to output an expanded image light 340 that is multi-colored.

5為包括近眼顯示器100之系統500之實例的方塊圖。系統500包含各自耦接至控制電路510之近眼顯示器100、成像裝置535、輸入/輸出介面540以及影像感測器120a至120d及150a至150b。系統500可被配置為頭戴式裝置、行動裝置、可佩戴式裝置等。 5 is a block diagram of an example of a system 500 including a near-eye display 100. System 500 includes near-eye display 100, imaging device 535, input/output interface 540, and image sensors 120a-120d and 150a-150b, each coupled to control circuit 510. System 500 may be configured as a head-mounted device, a mobile device, a wearable device, or the like.

The near-eye display 100 is a display that presents media to a user. Examples of media presented by the near-eye display 100 include one or more images, video, and/or audio. In some examples, the audio is presented via an external device (e.g., speakers and/or headphones) that receives audio information from the near-eye display 100 and/or the control circuit 510 and presents audio data based on the audio information to the user. In some examples, the near-eye display 100 may also act as AR eyewear. In some examples, the near-eye display 100 augments a view of a physical, real-world environment with computer-generated elements (e.g., images, video, sound).

The near-eye display 100 includes the waveguide display assembly 210, one or more position sensors 525, and/or an inertial measurement unit (IMU) 530. The waveguide display assembly 210 includes the source assembly 310, the output waveguide 320, and the controller 330.

The IMU 530 is an electronic device that generates fast calibration data based on measurement signals received from one or more of the position sensors 525, the fast calibration data indicating an estimated position of the near-eye display 100 relative to an initial position of the near-eye display 100.

The imaging device 535 may generate image data for various applications. For example, the imaging device 535 may generate image data to provide slow calibration data in accordance with calibration parameters received from the control circuit 510. The imaging device 535 may include, for example, the image sensors 120a-120d of FIG. 1A, which generate image data of the physical environment in which the user is located for performing position tracking of the user. The imaging device 535 may further include, for example, the image sensors 150a-150b of FIG. 1B for generating image data used to determine the user's gaze point, so as to identify an object of interest to the user.

The input/output interface 540 is a device that allows a user to send action requests to the control circuit 510. An action request is a request to perform a particular action. For example, an action request may be to start or end an application, or to perform a particular action within the application.

The control circuit 510 provides media to the near-eye display 100 for presentation to the user in accordance with information received from one or more of: the imaging device 535, the near-eye display 100, and the input/output interface 540. In some examples, the control circuit 510 may be housed within the system 500 configured as a head-mounted device. In some examples, the control circuit 510 may be a standalone console device communicatively coupled with the other components of the system 500. In the example shown in FIG. 5, the control circuit 510 includes an application store 545, a tracking module 550, and an engine 555.

The application store 545 stores one or more applications for execution by the control circuit 510. An application is a set of instructions that, when executed by a processor, generates content for presentation to the user. Examples of applications include: gaming applications, conferencing applications, video playback applications, or other suitable applications.

The tracking module 550 calibrates the system 500 using one or more calibration parameters, and may adjust the one or more calibration parameters to reduce error in the determination of the position of the near-eye display 100.

The tracking module 550 tracks movement of the near-eye display 100 using the slow calibration information from the imaging device 535. The tracking module 550 also uses position information from the fast calibration information to determine the position of a reference point of the near-eye display 100.

The engine 555 executes applications within the system 500 and receives position information, acceleration information, velocity information, and/or a predicted future position of the near-eye display 100 from the tracking module 550. In some examples, information received by the engine 555 may be used to generate signals (e.g., display instructions) to the waveguide display assembly 210 that determine the type of content presented to the user. For example, to provide an interactive experience, the engine 555 may determine the content to be presented to the user based on the user's location (e.g., provided by the tracking module 550), the user's gaze point (e.g., based on image data provided by the imaging device 535), or the distance between an object and the user (e.g., based on image data provided by the imaging device 535).

6A 、圖 6B 、圖 6C 6D說明影像感測器600及其操作之實例。如 6A中所展示,影像感測器600可包括像素單元陣列,包括像素單元601,且可產生對應於影像之像素的數位強度資料。像素單元601可為 4之像素單元402之部分。如 6A中所展示,像素單元601可包括光電二極體602、電子快門開關603、轉換開關604、電荷儲存裝置605、緩衝器606,及量化器607。光電二極體602可包括例如P-N二極體、P-I-N二極體、固定二極體等,而電荷儲存裝置605可為轉換開關604之浮動汲極節點。光電二極體602可在曝光週期內接收光後產生並且累積殘餘電荷。在曝光週期內藉由殘餘電荷飽和後,光電二極體602可經由轉換開關604將溢出電荷輸出至電荷儲存裝置605。電荷儲存裝置605可將溢出電荷轉換為電壓,其可藉由緩衝器606緩衝。經緩衝電壓可藉由量化器607量化以產生量測資料608,以表示例如由光電二極體602在曝光週期內接收之光的強度。 6A , 6B , 6C , and 6D illustrate examples of image sensor 600 and its operation. As shown in FIG. 6A , image sensor 600 may include an array of pixel cells, including pixel cell 601, and may generate digital intensity data corresponding to pixels of an image. Pixel unit 601 may be part of pixel unit 402 of FIG. 4 . As shown in FIG. 6A , pixel cell 601 may include photodiode 602 , electronic shutter switch 603 , transfer switch 604 , charge storage device 605 , buffer 606 , and quantizer 607 . Photodiode 602 may include, for example, a PN diode, a PIN diode, a fixed diode, etc., while charge storage device 605 may be the floating drain node of transfer switch 604 . The photodiode 602 may generate and accumulate residual charge upon receiving light during an exposure period. After being saturated by the residual charge during the exposure period, the photodiode 602 can output the overflow charge to the charge storage device 605 via the switch 604 . Charge storage device 605 can convert the overflow charge to a voltage, which can be buffered by buffer 606 . The buffered voltage can be quantized by quantizer 607 to produce measurement data 608 representing, for example, the intensity of light received by photodiode 602 during an exposure period.

The quantizer 607 may include a comparator to compare the buffered voltage against different thresholds for different quantization operations associated with different intensity ranges. For example, for a high intensity range in which the quantity of overflow charge generated by the photodiode 602 exceeds a saturation limit of the charge storage device 605, the quantizer 607 may perform a time-to-saturation (TTS) measurement operation by detecting whether the buffered voltage exceeds a static threshold representing the saturation limit, and if it does, measuring the time it takes for the buffered voltage to exceed the static threshold. The measured time may be inversely proportional to the light intensity. Moreover, for a medium intensity range in which the photodiode is saturated by the residual charge but the overflow charge remains below the saturation limit of the charge storage device 605, the quantizer 607 may perform a floating drain analog-to-digital converter (FD ADC) operation to measure the quantity of overflow charge stored in the charge storage device 605. In addition, for a low intensity range in which the photodiode is not saturated by the residual charge and no overflow charge is accumulated in the charge storage device 605, the quantizer 607 may perform a photodiode analog-to-digital converter (PD ADC) operation to measure the quantity of residual charge accumulated in the photodiode 602. The output of one of the TTS, FD ADC, or PD ADC operations may be output as the measurement data 608 to represent the intensity of the light.

6B說明像素單元601之操作的實例序列。如 6B中所展示,可基於控制電子快門開關603之AB信號的定時並且基於控制轉換開關604之TG信號的定時來定義曝光週期,該電子快門開關在經啟用時可將由光電二極體602產生的電荷轉向離開,該轉換開關可經控制以將溢出電荷且接著將殘餘電荷傳送至電荷儲存裝置605以用於讀出。舉例而言,參考 6B,可在時間T0處對AB信號撤銷確證以允許光電二極體602產生電荷。T0可標記曝光週期的開始。在曝光週期內,TG信號可將轉換開關604設定為部分接通狀態以允許光電二極體602累積電荷中之至少一些作為殘餘電荷,直至光電二極體602飽和為止,其後,溢出電荷可經傳送至電荷儲存裝置605。在時間T0與T1與之間,量化器607可執行TTS操作以判定電荷儲存裝置605處之溢出電荷是否超過飽和限制,且接著在時間T1與T2之間,量化器607可執行FD ADC操作以量測電荷儲存裝置605處之溢出電荷的數量。在時間T2與T3之間,TG信號可經確證以在完全接通狀態中偏置轉換開關604以將殘餘電荷傳送至電荷儲存裝置605。在時間T3處,TG信號可經撤銷確證以將電荷儲存裝置605與光電二極體602隔離,而AB信號可經確證以將由光電二極體602產生的電荷轉向離開。時間T3可標記曝光週期的結束。在時間T3與T4之間,量化器607可執行PD操作以量測殘餘電荷的數量。 6B illustrates an example sequence of operation of pixel cell 601. As shown in FIG. 6B , the exposure period may be defined based on the timing of the AB signal that controls the electronic shutter switch 603, which, when enabled, may be controlled by the photodiode 602, and based on the timing of the TG signal that controls the transfer switch 604. The generated charge is diverted away, and the switch can be controlled to transfer the overflow charge and then the residual charge to the charge storage device 605 for readout. For example, referring to Figure 6B , the AB signal may be deasserted at time TO to allow photodiode 602 to generate charge. T0 marks the start of an exposure period. During the exposure period, the TG signal may set the transfer switch 604 to a partially on state to allow the photodiode 602 to accumulate at least some of the charge as residual charge until the photodiode 602 is saturated, after which the overflow charge may is transferred to the charge storage device 605 . Between times T0 and T1, quantizer 607 may perform TTS operations to determine whether the overflow charge at charge storage device 605 exceeds the saturation limit, and then between times T1 and T2, quantizer 607 may perform FD ADC operations to The amount of overflow charge at the charge storage device 605 is measured. 
Between times T2 and T3, the TG signal may be asserted to bias transfer switch 604 in the fully on state to transfer residual charge to charge storage device 605. At time T3, the TG signal may be deasserted to isolate the charge storage device 605 from the photodiode 602, and the AB signal may be asserted to divert the charge generated by the photodiode 602 away. Time T3 may mark the end of the exposure period. Between times T3 and T4, the quantizer 607 may perform a PD operation to measure the amount of residual charge.

The AB and TG signals can be generated by a controller (not shown in FIG. 6A), which can be part of the pixel cell 601, to control the duration of the exposure period and the sequence of quantization operations. The controller can also detect whether the charge storage device 605 is saturated and whether the photodiode 602 is saturated, to select an output from one of the TTS, FD ADC, or PD ADC operations as the measurement data 608. For example, if the charge storage device 605 is saturated, the controller can provide the TTS output as the measurement data 608. If the charge storage device 605 is not saturated but the photodiode 602 is saturated, the controller can provide the FD ADC output as the measurement data 608. If the photodiode 602 is not saturated, the controller can provide the PD ADC output as the measurement data 608. The measurement data 608 generated within the exposure period from each pixel cell of the image sensor 600 can form an image frame. The controller can repeat the sequence of operations of FIG. 6B in subsequent exposure periods to generate subsequent image frames.
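For illustration only, the controller's output-selection logic described above can be sketched as follows; the function name, flag names, and numeric values are hypothetical and do not appear in the disclosure:

```python
def select_measurement(storage_saturated, photodiode_saturated,
                       tts_out, fd_adc_out, pd_adc_out):
    """Pick one quantization result as the measurement data 608.

    TTS covers the high-intensity range (charge storage device saturated),
    FD ADC the medium range (photodiode saturated, storage not), and
    PD ADC the low range (neither saturated).
    """
    if storage_saturated:
        return tts_out       # time-to-saturation measurement
    if photodiode_saturated:
        return fd_adc_out    # overflow charge in the charge storage device
    return pd_adc_out        # residual charge in the photodiode

# Illustrative readings for the three intensity regimes:
print(select_measurement(True, True, 12, 200, 255))    # prints 12 (TTS)
print(select_measurement(False, True, 12, 200, 255))   # prints 200 (FD ADC)
print(select_measurement(False, False, 12, 200, 255))  # prints 255 (PD ADC)
```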

The image frame data from the image sensor 600 can be transmitted to a host processor (not shown in FIG. 6A and FIG. 6B) to support different applications, such as tracking one or more objects, detecting motion (e.g., as part of a dynamic vision sensing (DVS) operation), etc. FIG. 7A through FIG. 7D illustrate examples of applications that can be supported by the image frame data from the image sensor 600. FIG. 7A illustrates an example of an object tracking operation based on image frames from the image sensor 600. As shown in FIG. 7A, an application operating at the host processor can identify, from an image frame 700 captured at time T0, a group of pixels in a region of interest (ROI) 702 corresponding to an object 704. The application can continue to track the location of the object 704 in subsequent image frames, including an image frame 710 captured at time T1, and identify the group of pixels in an ROI 712 corresponding to the object 704. The tracking of the image location of the object 704 within the image frames can be performed to support a SLAM algorithm, which can construct/update a map of the environment in which the image sensor 600 (and a mobile device including the image sensor 600, such as the near-eye display 100) is located, based on the tracked image locations of the object 704 in the scene captured by the image sensor 600.

7B說明對來自影像感測器600之影像圖框之物件偵測操作的實例。如在 7B的左側所展示,主機處理器可識別在影像圖框720中擷取之場景中之一或多個物件,諸如車輛722及人724。如在 7B的右側所展示,基於識別,主機處理器可判定像素726之群組對應於車輛722,而像素728之群組對應於人724。可執行車輛722及人724的識別以支援各種應用程式,諸如車輛722及人724為監視目標之監視應用程式、車輛722及人724用虛擬物件替換之MR應用程式、用以出於隱私降低某些影像(例如,車輛722之車牌、人724之臉)之解析度的注視點成像操作等。 7B illustrates an example of an object detection operation on an image frame from image sensor 600. As shown on the left side of FIG. 7B , the host processor may identify one or more objects in the scene captured in image frame 720, such as vehicle 722 and person 724. As shown on the right side of FIG. 7B , based on the identification, the host processor may determine that the group of pixels 726 corresponds to a vehicle 722 and the group of pixels 728 corresponds to a person 724. Recognition of vehicle 722 and person 724 can be performed to support various applications such as surveillance applications where vehicle 722 and person 724 are targeted for surveillance, MR applications where vehicle 722 and person 724 are replaced with virtual objects, to reduce certain Gaze imaging operations at the resolution of some images (eg, license plate of vehicle 722, face of person 724), etc.

7C說明對來自影像感測器600之影像圖框的眼睛追蹤操作之實例。如 7C中所展示,主機處理器可自眼球之影像730及732識別對應於瞳孔738及閃爍739之像素734及736之群組。可執行瞳孔738及閃爍739之識別以支援眼睛追蹤操作。舉例而言,基於瞳孔738及閃爍739之影像位置,該應用程式可判定使用者在不同時間之凝視方向,其可作為輸入經提供至該系統以判定例如待顯示給使用者之內容。 7C illustrates an example of an eye tracking operation on an image frame from image sensor 600. As shown in FIG. 7C , the host processor can identify the group of pixels 734 and 736 corresponding to pupil 738 and glint 739 from images 730 and 732 of the eyeball. Pupil 738 and blink 739 identification may be performed to support eye tracking operations. For example, based on the image positions of pupil 738 and blink 739, the application can determine the user's gaze direction at different times, which can be provided as input to the system to determine, for example, what to display to the user.

7D說明對來自影像感測器600之影像圖框的動態視覺感測(dynamic vision sensing;DVS)操作之實例。在DVS操作中,影像感測器600可僅輸出經歷亮度之預定程度的改變(在像素值上反映)之像素,而沒經歷改變程度之像素不會由影像感測器600輸出。可執行DVS操作以偵測物件之運動及/或縮減經輸出之像素資料量。舉例而言,參考 7D,在時間T0處擷取影像740,其含有光源之像素742之群組及人之像素744之群組。像素742及744兩者之群組可在時間T0處作為影像740之部分輸出。在時間T1處擷取影像750。對應於光源之像素742之群組之像素值在時間T0與T1之間保持相同,且像素742之群組不作為影像750之部分輸出。另一方面,人在時間T0與T1之間自站立變成行走,從而引起像素744之群組之像素值在時間T0與T1之間的改變。因而,人之像素744之群組作為影像750之部分輸出。 FIG. 7D illustrates an example of dynamic vision sensing (DVS) operations on image frames from image sensor 600 . In DVS operation, image sensor 600 may only output pixels that undergo a predetermined degree of change in luminance (reflected in pixel value), and pixels that have not undergone a degree of change will not be output by image sensor 600 . DVS operations may be performed to detect motion of objects and/or reduce the amount of pixel data output. For example, referring to FIG. 7D , an image 740 is captured at time T0 containing a group of pixels 742 for light sources and a group of pixels 744 for people. The group of both pixels 742 and 744 may be output as part of image 740 at time TO. Image 750 is captured at time T1. The pixel values of the group of pixels 742 corresponding to the light source remain the same between times T0 and T1 , and the group of pixels 742 is not output as part of the image 750 . On the other hand, a person changes from standing to walking between times T0 and T1 , causing the pixel values of the group of pixels 744 to change between times T0 and T1 . Thus, the group of people's pixels 744 is output as part of the image 750 .

7A 至圖 7D之操作中,可控制影像感測器600以執行稀疏擷取操作,其中僅選擇像素單元之子集以將所關注像素資料輸出至主機處理器。所關注像素資料可包括支援主機處理器處之特定操作所需之像素資料。舉例而言,在 7A之物件追蹤操作中,可控制影像感測器600以僅傳輸分別在影像圖框700及710中之物件704的ROI 702及712中之像素的群組。在 7B之物件偵測操作中,可控制影像感測器600以僅分別傳輸車輛722及人724之像素726及728的群組。另外,在 7C之眼睛追蹤操作中,可控制影像感測器600以僅傳輸含有瞳孔738及閃爍739之像素734及736之群組。此外,在 7D之DVS操作中,可控制影像感測器600以僅傳輸移動的人在時間T1處之像素744之群組而非靜態光源之像素742之群組。所有該些配置可允許產生並且傳輸更高解析度的影像,且不會相應地增加功率及頻寬。舉例而言,包括較多像素單元之較大像素單元陣列可包括於影像感測器600中以改善影像解析度,而當僅像素單元之子集以高解析度產生所關注像素資料並且將高解析度像素資料傳輸至主機處理器而其餘的像素單元不產生/傳輸像素資料或以低解析度產生/傳輸像素資料時可縮減提供經改善影像解析度所需之頻寬及功率。此外,雖然可操作影像感測器600以按較高圖框速率產生影像,但當每一影像僅包括呈高解析度並且由大量位元表示之像素值之小集合而其餘的像素值不被傳輸或由較少數目個位元表示時,可縮減頻寬及功率之增加。 In the operations of FIGS. 7A - 7D , image sensor 600 may be controlled to perform a sparse extraction operation in which only a subset of pixel cells are selected to output pixel data of interest to the host processor. The pixel data of interest may include pixel data required to support a particular operation at the host processor. For example, in the object tracking operation of Figure 7A , image sensor 600 may be controlled to transmit only the group of pixels in ROIs 702 and 712 of object 704 in image frames 700 and 710, respectively. In the object detection operation of FIG. 7B , image sensor 600 may be controlled to transmit only groups of pixels 726 and 728 of vehicle 722 and person 724, respectively. Additionally, in the eye tracking operation of FIG. 7C , image sensor 600 may be controlled to transmit only the group of pixels 734 and 736 containing pupil 738 and glint 739. Furthermore, in the DVS operation of Figure 7D , the image sensor 600 can be controlled to transmit only the group of pixels 744 of the moving person at time Tl rather than the group of pixels 742 of the static light source. All of these configurations may allow higher resolution images to be generated and transmitted without a corresponding increase in power and bandwidth. 
For example, larger pixel cell arrays including more pixel cells may be included in image sensor 600 to improve image resolution, while only a subset of pixel cells generate pixel data of interest at high resolution and will Transmission of high-resolution pixel data to the host processor while the remaining pixel units do not generate/transmit pixel data or generate/transmit pixel data at low resolution can reduce the bandwidth and power required to provide improved image resolution. Furthermore, although image sensor 600 can be operated to generate images at higher frame rates, when each image includes only a small set of pixel values at high resolution and represented by a large number of bits and the remaining pixel values are not When transmitted or represented by a smaller number of bits, the increase in bandwidth and power can be reduced.
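As a rough illustration of the bandwidth saving described above, the figures below are hypothetical (a 1920×1080 array at 10 bits per pixel, with 1% of the pixel cells selected as active); they are not taken from the disclosure, and addressing overhead for the sparse pixels is ignored:

```python
width, height, bits_per_pixel = 1920, 1080, 10   # hypothetical sensor parameters
full_frame_bits = width * height * bits_per_pixel

active_fraction = 0.01                            # 1% of pixel cells selected as active
sparse_bits = int(full_frame_bits * active_fraction)

print(full_frame_bits)                  # 20736000 bits per full frame
print(sparse_bits)                      # 207360 bits per sparse frame
print(full_frame_bits // sparse_bits)   # 100x reduction in transmitted data
```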

In the case of 3D sensing, the quantity of pixel data transmission can also be reduced. For example, referring to FIG. 6D, an illuminator 640 can project a pattern 642 of structured light onto an object 650. The structured light can be reflected on a surface of the object 650, and a pattern 652 of the reflected light can be captured by the image sensor 600 to generate an image. The host processor can match the pattern 652 with the pattern 642 and determine the depth of the object 650 with respect to the image sensor 600 based on the image locations of the pattern 652 in the image. For 3D sensing, only groups of pixel cells 660, 662, 664, and 666 contain relevant information (e.g., the pixel data of the pattern 652). To reduce the quantity of pixel data being transmitted, the image sensor 600 can be configured to send only the pixel data from the groups of pixel cells 660, 662, 664, and 666, or to send the pixel data from the groups of pixel cells 660, 662, 664, and 666 at a high resolution to the host processor, with the rest of the pixel data at a low resolution.
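One way to sketch the 3D-sensing case above: given the image locations at which the reflected pattern 652 is detected, only those pixel cells (plus a small surrounding margin) are marked for full-resolution readout. The coordinates, the margin, and the array size below are all illustrative assumptions, not values from the disclosure:

```python
def active_cells_for_pattern(pattern_coords, margin, width, height):
    """Return the set of (row, col) pixel cells to read out at full
    resolution: each detected pattern dot plus a surrounding margin."""
    active = set()
    for r, c in pattern_coords:
        for dr in range(-margin, margin + 1):
            for dc in range(-margin, margin + 1):
                rr, cc = r + dr, c + dc
                if 0 <= rr < height and 0 <= cc < width:
                    active.add((rr, cc))
    return active

# Two detected dots of the reflected pattern, with a 1-pixel margin
cells = active_cells_for_pattern([(10, 10), (20, 40)], margin=1,
                                 width=64, height=64)
print(len(cells))  # prints 18: two non-overlapping 3x3 neighborhoods
```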

8A 及圖 8B說明可執行稀疏擷取操作以支援 7A 至圖 7D中所說明的操作之成像系統800之實例。如 8A中所展示,成像系統800包括影像感測器802及主機處理器804。影像感測器802包括感測器計算電路806及像素單元陣列808。感測器計算電路806包括影像處理器810及程式化地圖產生器812。在一些實例中,感測器計算電路806可實施為特定應用積體電路(application specific integrated circuit;ASIC)、場可程式閘極陣列(field programmable gate array;FPGA)或執行指令以實施影像處理器810及程式化地圖產生器812之功能的硬體處理器。另外,主機處理器804包括可執行應用程式814之通用中央處理單元(central processing unit;CPU)。 Figures 8A and 8B illustrate an example of an imaging system 800 that may perform sparse extraction operations to support the operations illustrated in Figures 7A - 7D . As shown in FIG. 8A , imaging system 800 includes image sensor 802 and host processor 804. The image sensor 802 includes a sensor computing circuit 806 and a pixel unit array 808 . The sensor computing circuit 806 includes an image processor 810 and a programmed map generator 812 . In some examples, the sensor computing circuit 806 may be implemented as an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), or execute instructions to implement an image processor 810 and a hardware processor that programs the functions of the map generator 812. Additionally, the host processor 804 includes a general-purpose central processing unit (CPU) that can execute the application program 814 .

Each pixel cell, or each block of pixel cells, of the array of pixel cells 808 can be individually programmed to, for example, enable/disable the outputting of its pixel value, set the resolution of the pixel value output by the pixel cell, etc. The array of pixel cells 808 can receive, from the programming map generator 812 of the sensor compute circuit 806, a first programming signal 820, which can be in the form of a programming map containing programming data for each pixel cell. The array of pixel cells 808 can sense light from a scene and generate a first image frame 822 of the scene based on the first programming signal 820. Specifically, the array of pixel cells 808 can be controlled by the first programming signal 820 to operate in different sparsity modes, such as a full-frame mode in which the first image frame 822 includes a full image frame of pixels, and/or a sparse mode in which the first image frame 822 includes only a subset of the pixels specified by the programming map. The array of pixel cells 808 can output the first image frame 822 to both the host processor 804 and the sensor compute circuit 806. In some examples, the array of pixel cells 808 can also output first image frames 822 of different sparsity to the host processor 804 and the sensor compute circuit 806. For example, the array of pixel cells 808 can output the first image frame 822 with a full image frame of pixels back to the sensor compute circuit 806, and output the first image frame 822 with sparse pixels defined by the first programming signal 820 to the host processor 804.
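For illustration only, the per-pixel programming map described above can be sketched as a boolean mask applied at readout: in the full-frame mode every entry is enabled, and in the sparse mode only a selected subset is. The mask contents and scene values below are hypothetical:

```python
import numpy as np

def capture(scene, programming_map):
    """Read out only the pixel cells enabled by the programming map;
    disabled cells contribute inactive (zero) pixels to the frame."""
    frame = np.zeros_like(scene)
    frame[programming_map] = scene[programming_map]
    return frame

scene = np.arange(16, dtype=np.uint8).reshape(4, 4)
full_mode = np.ones((4, 4), dtype=bool)      # full-frame mode: all cells active
sparse_mode = np.zeros((4, 4), dtype=bool)
sparse_mode[1:3, 1:3] = True                 # sparse mode: a 2x2 ROI only

print(capture(scene, full_mode).sum())    # prints 120: all 16 pixels active
print(capture(scene, sparse_mode).sum())  # prints 30: only 5+6+9+10
```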

The sensor compute circuit 806 and the host processor 804, together with the image sensor 802, can form a two-tier feedback system based on the first image frame 822 to control the image sensor to generate a subsequent image frame 824. In a two-tier feedback operation, the image processor 810 of the sensor compute circuit 806 can perform an image-processing operation on the first image frame 822 to obtain a processing result, and then the programming map generator 812 can update the first programming signal 820 based on the processing result. The image-processing operation at the image processor 810 can be guided/configured based on a second programming signal 832 received from the application 814, which can generate the second programming signal 832 based on the first image frame 822. The array of pixel cells 808 can then generate the subsequent image frame 824 based on the updated first programming signal 820. Based on the subsequent image frame 824, the sensor compute circuit 806 and the host processor 804 can then update the first programming signal 820 and the second programming signal 832, respectively.

In the aforementioned two-tier feedback system, the second programming signal 832 from the host processor 804 can be in the form of a teaching/guidance signal, the result of a neural network training operation (e.g., backward propagation results), etc., to influence the image-processing operation and/or the programming map generation at the sensor compute circuit 806. The host processor 804 can generate the teaching/guidance signal based not only on the first image frame but also on other sensor data (e.g., other image frames captured by other image sensors, audio information, motion sensor outputs, inputs from the user) to determine a context of the light sensing operation of the image sensor 802, and can then determine the teaching/guidance signal accordingly. The context can include, for example, an environmental condition in which the image sensor 802 operates, the location of the image sensor 802, or any other requirements of the application 814. Given that the context typically changes at a much slower rate than the frame rate, the teaching/guidance signal can be updated based on the context at a relatively low rate (e.g., lower than the frame rate), whereas the image-processing operation and the updating of the programming map at the sensor compute circuit 806 can be performed at a relatively high rate (e.g., at the frame rate) to adapt to the images captured by the array of pixel cells 808.
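The two-tier update rates described above can be sketched schematically as follows; the frame count and the context-update period are illustrative assumptions only:

```python
def run_feedback(num_frames, context_period):
    """Simulate the two-tier loop: the sensor compute circuit updates the
    programming map every frame, while the host's teaching/guidance
    signal is refreshed only once every `context_period` frames."""
    map_updates, guidance_updates = 0, 0
    for frame in range(num_frames):
        if frame % context_period == 0:
            guidance_updates += 1   # host processor: slow, context-driven
        map_updates += 1            # sensor compute circuit: every frame
    return map_updates, guidance_updates

# 60 frames with the context refreshed every 30 frames
print(run_feedback(num_frames=60, context_period=30))  # prints (60, 2)
```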

Although FIG. 8A illustrates that the array of pixel cells 808 transmits the first image frame 822 and the second image frame 824 to both the host processor 804 and the sensor compute circuit 806, in some cases the array of pixel cells 808 can transmit image frames of different sparsity to the host processor 804 and the sensor compute circuit 806. For example, the array of pixel cells 808 can transmit the first image frame 822 and the second image frame 824 with full pixels to the image processor 810, while sparse versions of both image frames, each including a subset of pixels selected based on the first programming signal 820, are sent to the host processor 804.

8B說明用以支援 7A之物件追蹤操作的成像系統800之操作的實例。具體言之,在時間T0處,像素單元陣列808( 8B中未展示)基於指示將產生像素之全圖框的第一程式化信號820而產生第一影像圖框822,並且將第一影像圖框822傳輸至主機處理器804及影像處理器810兩者,該第一影像圖框包括場景之完整像素,該場景包括物件704。基於執行應用程式814,主機處理器804可判定將追蹤物件704。此類判定可基於例如使用者輸入、應用程式814之要求等。主機處理器804亦可處理第一影像圖框822以提取物件704之空間特徵,諸如特徵840及842。基於處理結果,主機處理器804可判定包括第一影像圖框822中之物件704(或其他物件,諸如 7C之瞳孔738及閃爍739)的像素之所關注區(region of interest;ROI)850之近似位置、大小及形狀。另外,基於來自其他感測器(例如,IMU)之其他輸出,主機處理器804亦判定影像感測器802正以某一速度相對於物件704移動,且可估計ROI 852在後續的影像圖框中之新的位置。主機處理器804接著可傳輸物件704之目標特徵(例如,特徵840及842)、ROI之資訊(例如,ROI 850之初始位置、形狀、大小)、速度等作為第二程式化信號832之部分,以對處理器810及程式化地圖產生器812進行成像。 8B illustrates an example of the operation of imaging system 800 to support the object tracking operation of FIG. 7A . Specifically, at time TO, pixel cell array 808 (not shown in FIG. 8B ) generates first image frame 822 based on first programmed signal 820 indicating a full frame of pixels to be generated, and the first image Frame 822 is transmitted to both host processor 804 and image processor 810, the first image frame including complete pixels of the scene including object 704. Based on executing application 814, host processor 804 may determine that object 704 is to be tracked. Such determinations may be based on, for example, user input, requirements of the application 814, and the like. Host processor 804 may also process first image frame 822 to extract spatial features of object 704, such as features 840 and 842. Based on the processing results, host processor 804 may determine a region of interest (ROI) 850 that includes pixels of object 704 (or other objects, such as pupil 738 and glint 739 of FIG. 7C ) in first image frame 822 approximate location, size and shape. Additionally, based on other outputs from other sensors (eg, IMUs), the host processor 804 also determines that the image sensor 802 is moving relative to the object 704 at a certain speed, and can estimate the ROI 852 in subsequent image frames new location in. 
Host processor 804 may then transmit target features of object 704 (eg, features 840 and 842), ROI information (eg, initial position, shape, size of ROI 850), velocity, etc. as part of second programming signal 832, To image the processor 810 and the stylized map generator 812.

Based on the second programming signals 832, the image processor 810 can process the first image frame 822 to detect the target image features of the object 704 and, based on the detection results, determine the precise location, size, and shape of ROI 850. The image processor 810 can then transmit ROI information 854, including the precise location, size, and shape of ROI 850 in the first image frame 822, to the programming map generator 812. Based on the ROI information 854, as well as the second programming signals 832, the programming map generator 812 can estimate the expected location, size, and shape of ROI 852 in the subsequent image frame to be captured at time T1. For example, based on the speed information included in the second programming signals 832, the programming map generator 812 can determine that ROI 850 will have moved by a distance d between times T0 and T1 to become ROI 852, and can determine the location of ROI 852 at time T1 based on the distance d. As another example, in a case where the pupil 738 and glint 739 of FIG. 7C are tracked as part of an eye-tracking operation, the programming map generator 812 can obtain information about a change in the user's gaze and determine, based on the gaze change, the expected location at time T1 of the ROI (e.g., ROI 852) that includes the pupil 738 and glint 739. The programming map generator 812 can then update the first programming signals 820 at time T1 to select the pixel cells within ROI 852 to output the pixel data of the object 704 (or of the pupil 738 and glint 739, or of other objects) for the subsequent image frame.
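The motion-based ROI prediction described above amounts to extrapolating the ROI by the distance d the object is expected to travel between frames. The following is a minimal sketch under the assumption of a constant relative speed and a rectangular ROI; the function and field names are illustrative, not from this disclosure:

```python
def predict_roi(roi, velocity, t0, t1):
    """Extrapolate a rectangular ROI (x, y, width, height) from time t0 to t1,
    assuming the sensor moves relative to the object at a constant velocity
    (vx, vy) in pixels per unit time. Size and shape are kept unchanged in
    this simple model; only the location shifts by d = v * (t1 - t0)."""
    x, y, w, h = roi
    vx, vy = velocity
    dt = t1 - t0
    return (x + vx * dt, y + vy * dt, w, h)

# ROI 850 at time T0, predicted to become ROI 852 at time T1.
roi_850 = (100, 80, 32, 32)
roi_852 = predict_roi(roi_850, velocity=(5, -2), t0=0, t1=4)
```

A real implementation would also refine the predicted size and shape (e.g., for objects approaching the sensor), which this sketch deliberately omits.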

FIG. 9A, FIG. 9B, FIG. 9C, and FIG. 9D illustrate examples of internal components of the imaging system 800 of FIG. 8A. FIG. 9A illustrates an example of the pixel cell array 808. As shown in FIG. 9A, the pixel cell array 808 can include a column controller 904, a row controller 906, and a programming signals parser 920. The column controller 904 is connected with column buses 908 (e.g., 908a, 908b, 908c, ..., 908n), whereas the row controller 906 is connected with row buses 910 (e.g., 910a, 910b, ..., 910n). One of the column controller 904 or the row controller 906 is also connected with a programming bus 912 to transmit pixel-level programming signals 926 targeted at a particular pixel cell or a group of pixel cells. Each box labeled P00, P01, P0j, etc. can represent a pixel cell or a group of pixel cells (e.g., a group of 2x2 pixel cells). Each pixel cell or group of pixel cells can be connected with one of the column buses 908, one of the row buses 910, the programming bus 912, and an output data bus to output pixel data (not shown in FIG. 9A). Each pixel cell (or each group of pixel cells) is individually addressable by a column address signal 930 on the column buses 908 provided by the column controller 904 and a row address signal 932 on the row buses 910 provided by the row controller 906, to receive the pixel-level programming signals 926 via the pixel-level programming bus 912 one at a time. The column address signals 930, the row address signals 932, and the pixel-level programming signals 926 can be generated based on the first programming signals 820 from the programming map generator 812.

In addition, FIG. 9A includes the programming signals parser 920, which can extract the pixel-level programming signals from the first programming signals 820. In some examples, the first programming signals 820 can include a programming map, which can include programming data for each pixel cell or each group of pixel cells of the pixel cell array 808. FIG. 9B illustrates an example of a pixel array programming map 940. As shown in FIG. 9B, the pixel array programming map 940 can include a two-dimensional array of pixel-level programming data, with each entry of pixel-level programming data of the two-dimensional array targeting a pixel cell or a group of pixel cells of the pixel cell array 808. For example, in a case where each entry of pixel-level programming data targets one pixel cell, and assuming the pixel cell array 808 has a width of M pixels (e.g., M columns of pixels) and a height of N pixels (e.g., N rows of pixels), the pixel array programming map 940 can also have a width of M entries (e.g., M columns of entries) and a height of N entries (e.g., N rows of entries), with each entry storing the pixel-level programming data for a corresponding pixel cell. For example, the pixel-level programming data A00 at entry (0, 0) of the pixel array programming map 940 targets the pixel cell P00 at pixel location (0, 0) of the pixel cell array 808, whereas the pixel-level programming data A01 at entry (0, 1) targets the pixel cell P01 at pixel location (0, 1). In a case where the pixel-level programming data targets groups of pixel cells, the numbers of entries along the height and the width of the pixel array programming map 940 can be scaled based on the number of pixel cells in each group.
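Conceptually, such a programming map is just an N-by-M grid of per-pixel (or per-group) programming entries. The following is a small sketch, assuming a simplified encoding in which each entry merely enables (1) or disables (0) a pixel cell; the encoding and names are illustrative, not taken from this disclosure:

```python
def make_programming_map(width, height, roi):
    """Build an N-row by M-column programming map in which entries inside
    the rectangular ROI (x, y, w, h) enable the pixel cell (1) and all
    other entries disable it (0). Entry (row, col) targets pixel cell
    P_row,col of the array, mirroring the A00 -> P00 correspondence."""
    x, y, w, h = roi
    return [[1 if (x <= col < x + w and y <= row < y + h) else 0
             for col in range(width)]
            for row in range(height)]

# An 8x6 array with a 3x2-pixel ROI enabled at (x=2, y=1).
pmap = make_programming_map(8, 6, roi=(2, 1, 3, 2))
enabled = sum(sum(row) for row in pmap)  # 3 * 2 = 6 cells enabled
```

A production map would carry richer per-entry data (resolution, precision, quantization mode, frame rate) rather than a single bit.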

The pixel array programming map 940 can be configured to support the feedback operations described in FIG. 8B. For example, the pixel-level programming data stored at each entry can individually program each pixel cell (or each group of pixel cells) to, for example, power on or off, to enable or disable the output of pixel data, to set a quantization resolution, to set the precision of the output pixel data, to select a quantization operation (e.g., one of TTS, FD ADC, or PD ADC), to set a frame rate, etc. As described above, the programming map generator 812 can generate the pixel array programming map 940 based on, for example, the prediction of one or more ROIs, in which the pixel-level programming data for pixel cells within an ROI differ from the pixel-level programming data for pixel cells outside the ROI. For example, the pixel array programming map 940 can enable a subset of the pixel cells (or groups of pixel cells) to output pixel data, whereas the rest of the pixel cells do not output pixel data. As another example, the pixel array programming map 940 can control a subset of the pixel cells to output pixel data at a higher resolution (e.g., using a larger number of bits to represent each pixel), whereas the rest of the pixel cells output pixel data at a lower resolution.

Referring back to FIG. 9A, the programming signals parser 920 can parse the pixel array programming map 940, which can be in a serial data stream, to identify the pixel-level programming data for each pixel cell (or each group of pixel cells). The identification of the pixel-level programming data can be based on, for example, a predetermined scanning pattern by which the two-dimensional pixel array programming map is converted into a serial format, as well as the order by which the pixel-level programming data are received by the programming signals parser 920 from the serial data stream. For each entry of programming data, the programming signals parser 920 can generate a column address signal 930 and a row address signal 932, and transmit the column address signal 930 and the row address signal 932 to, respectively, the column controller 904 and the row controller 906 to select a pixel cell and transmit the pixel-level programming signals 926 to the selected pixel cell (or group of pixel cells).
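The parsing step can be pictured as walking the serialized map in the known scan order and emitting an address pair per entry. The following is a minimal sketch, assuming a row-major raster scan as the predetermined scanning pattern (the function name and output format are illustrative):

```python
def parse_programming_stream(stream, width):
    """Given a serial stream of programming-data entries produced by a
    row-major raster scan of a map that is `width` entries wide, recover
    (row_address, column_address, data) triples, as a parser like 920
    would before driving the row and column controllers."""
    out = []
    for index, data in enumerate(stream):
        row, col = divmod(index, width)  # position implied by scan order
        out.append((row, col, data))
    return out

# A 2-row by 3-column map serialized row by row.
triples = parse_programming_stream([10, 11, 12, 13, 14, 15], width=3)
```

With a different scan pattern (e.g., boustrophedon), only the index-to-address mapping would change.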

FIG. 9C illustrates examples of internal components of a pixel cell 950 of the pixel cell array 808, which can include at least some of the components of the pixel cell 601 of FIG. 6A. The pixel cell 950 can include one or more photodiodes, including photodiodes 952a, 952b, etc. In some examples, one or more of the photodiodes of the pixel cell 950 can be configured to detect light of different frequency ranges. For example, the photodiode 952a can detect visible light (e.g., monochrome, or one of red, green, or blue), whereas the photodiode 952b can detect infrared light. In some examples, some or all of the photodiodes of the pixel cell 950 can detect light of the same wavelength. The pixel cell 950 further includes a switch 954 (e.g., a transistor, a controller barrier layer) to control which photodiode outputs charge for pixel data generation. In a case where the photodiodes detect light of different frequency ranges, the output of each photodiode can correspond to a pixel to support collocated 2D/3D sensing. In a case where the photodiodes detect light of the same frequency range, the outputs of the photodiodes can be combined in an analog binning operation to, for example, increase the signal-to-noise ratio (SNR) when measuring light of low intensity.

In addition, the pixel cell 950 further includes an electronic shutter switch 603, a transfer switch 604, a charge storage device 605, a buffer 606, and a quantizer 607 as shown in FIG. 6A, as well as a reset switch 951 and a memory 955. The charge storage device 605 can have a configurable capacitance to set the charge-to-voltage conversion gain. In some examples, the capacitance of the charge storage device 605 can be increased to store the overflow charge for the FD ADC operation for a medium light intensity, to reduce the likelihood of the charge storage device 605 being saturated by the overflow charge. The capacitance of the charge storage device 605 can also be reduced to increase the charge-to-voltage conversion gain for the PD ADC operation for a low light intensity. The increase in the charge-to-voltage conversion gain can reduce quantization error and increase the quantization resolution. In some examples, the capacitance of the charge storage device 605 can also be reduced during the FD ADC operation to increase the quantization resolution. The reset switch 951 can reset the charge storage device 605 prior to the capture of an image frame and/or between the FD ADC and PD ADC operations. The buffer 606 includes a current source 956, whose current can be set by a bias signal BIAS1, as well as a power gate 958, which can be controlled by a PWR_GATE signal to turn the buffer 606 on or off. The buffer 606 can be turned off as part of disabling the pixel cell 950.

In addition, the quantizer 607 includes a comparator 960 and output logic 962. The comparator 960 can compare the output of the buffer with a reference voltage (VREF) to generate an output. Depending on the quantization operation (e.g., the TTS, FD ADC, or PD ADC operation), the comparator 960 can compare the buffered voltage with different VREF voltages to generate the output, and the output is further processed by the output logic 962 to cause the memory 955 to store a value from a free-running counter or a digital ramp as the pixel output. The bias current of the comparator 960 can be controlled by a bias signal BIAS2, which can set the bandwidth of the comparator 960; the bandwidth can be set based on the frame rate to be supported by the pixel cell 950. Moreover, the gain of the comparator 960 can be controlled by a gain control signal GAIN. The gain of the comparator 960 can be set based on the quantization resolution to be supported by the pixel cell 950. The comparator 960 further includes power switches 961a and 961b, which can also be controlled by the PWR_GATE signal to turn on/off, respectively, the comparator 960 and the memory 955. The comparator 960 can be turned off as part of disabling the pixel cell 950.

In addition, the output logic 962 can select the output of one of the TTS, FD ADC, or PD ADC operations and, based on the selection, determine whether to forward the output of the comparator 960 to the memory 955 to store the value from the counter/digital ramp. The output logic 962 can include internal memory to store indications, based on the output of the comparator 960, of whether the photodiode 952 (e.g., the photodiode 952a) is saturated by the residual charge and whether the charge storage device 605 is saturated by the overflow charge. If the charge storage device 605 is saturated by the overflow charge, the output logic 962 can select the TTS output to be stored in the memory 955 and prevent the memory 955 from overwriting the TTS output with the FD ADC/PD ADC output. If the charge storage device 605 is not saturated but the photodiode 952 is saturated, the output logic 962 can select the FD ADC output to be stored in the memory 955; otherwise, the output logic 962 can select the PD ADC output to be stored in the memory 955. In some examples, instead of the counter values, the indications of whether the photodiode 952 is saturated by the residual charge and whether the charge storage device 605 is saturated by the overflow charge can be stored in the memory 955 to provide the lowest-precision pixel data.
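The selection rule applied by the output logic can be summarized as a small decision function. The following is a sketch, with boolean saturation flags standing in for the stored indications (the names are illustrative):

```python
def select_quantization_output(storage_saturated, photodiode_saturated):
    """Choose which quantization result to keep, following the priority
    described above: TTS if the charge storage device is saturated by
    overflow charge; else FD ADC if the photodiode is saturated by
    residual charge; else PD ADC."""
    if storage_saturated:
        return "TTS"
    if photodiode_saturated:
        return "FD ADC"
    return "PD ADC"
```

The ordering matters: a TTS result, once selected, must not be overwritten by a later FD ADC or PD ADC result, which is why the storage-saturation check comes first.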

In addition, the pixel cell 950 can include a pixel cell controller 970, which can include logic circuits to generate control signals such as AB, TG, BIAS1, BIAS2, GAIN, VREF, PWR_GATE, etc. The pixel cell controller 970 can also be programmed by the pixel-level programming signals 926. For example, to disable the pixel cell 950, the pixel cell controller 970 can be programmed by the pixel-level programming signals 926 to de-assert PWR_GATE to turn off the buffer 606 and the comparator 960. Moreover, to increase the quantization resolution, the pixel cell controller 970 can be programmed by the pixel-level programming signals 926 to reduce the capacitance of the charge storage device 605, to increase the gain of the comparator 960 via the GAIN signal, etc. To increase the frame rate, the pixel cell controller 970 can be programmed by the pixel-level programming signals 926 to increase the BIAS1 and BIAS2 signals to increase the bandwidth of, respectively, the buffer 606 and the comparator 960. Further, to control the precision of the pixel data output by the pixel cell 950, the pixel cell controller 970 can be programmed by the pixel-level programming signals 926 to, for example, connect only a subset of the bits (e.g., the most significant bits) of the counter to the memory 955, such that the memory 955 stores only the subset of the bits, or to store the indications stored in the output logic 962 into the memory 955 as the pixel data. In addition, the pixel cell controller 970 can be programmed by the pixel-level programming signals 926 to control the sequence and timing of the AB and TG signals to, for example, adjust the exposure period and/or select a particular quantization operation (e.g., one of TTS, FD ADC, or PD ADC) while skipping the other quantization operations, based on the operation conditions, as described above.
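The precision control described above (storing only the most significant bits of the counter value) amounts to discarding the low-order bits, i.e., a right shift. A small sketch, assuming a 10-bit counter value truncated to its top k bits (the bit widths are illustrative, not from this disclosure):

```python
def truncate_counter(value, total_bits=10, kept_bits=6):
    """Keep only the `kept_bits` most significant bits of a counter value,
    as a pixel cell might when programmed to output lower-precision data
    by connecting only the MSBs of the counter to the memory."""
    return value >> (total_bits - kept_bits)

full = 0b1011011001                       # 10-bit quantization result (729)
low_precision = truncate_counter(full)    # top 6 bits: 0b101101 = 45
```

Halving the stored bits roughly halves the memory bandwidth per pixel at the cost of coarser quantization steps.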

FIG. 9D illustrates an example of internal components of the image processor 810. As shown in FIG. 9D, the image processor 810 can include a feature extraction circuit 972 and a memory 976. The features to be extracted/detected by the image processor 810 can include, for example, spatial features and keypoints of predetermined objects (e.g., human faces, body parts, certain physical objects in a scene), temporal contrasts, etc. In some examples, the feature extraction circuit 972 can implement a machine learning model 973, such as a convolutional neural network (CNN), a recurrent neural network (RNN), etc., which can be trained to perform image feature operations on the input image frames (e.g., the first image frame 822) generated by the pixel cell array 808. In some examples, the feature extraction circuit 972 can also include a comparison circuit 975 to compare the pixel data against thresholds to identify pixels having a predetermined temporal contrast. The feature extraction circuit 972 can include other circuits, such as a digital signal processor (DSP), a linear solver unit, a microcontroller, arithmetic circuits, etc., to perform the feature extraction operations. The image processor 810 can receive the target features/thresholds, machine learning parameters (e.g., weights, back-propagation gradients), or other configuration parameters as part of the second programming signals 832 from the host processor 804 to support the feature extraction operations and/or training operations of the machine learning model 973. As a result of the feature extraction operations, the feature extraction circuit 972 can output, for example, the pixel locations of the detected features in the input image frame, which can then be fed to the programming map generator 812 to generate the pixel array programming map 940.

In addition, the memory 976 can provide on-chip memory to store the pixel data of the input image frames, various configuration data for the feature extraction operations, as well as the outputs (e.g., pixel locations) of the feature extraction circuit 972. In some examples, the current input image frame provided to the feature extraction circuit 972 may include only sparse pixel data rather than a full frame of pixel data. In such a case, the memory 976 can also store the pixel data of prior input image frames, which can be fed to the feature extraction circuit 972 and combined with the current input image frame to generate a reconstructed full frame of pixel data. The feature extraction circuit 972 can then perform the feature extraction operations based on the reconstructed full frame of pixel data. The memory 976 can include, for example, spin tunneling random access memory (STRAM), non-volatile random access memory (NVRAM), etc. In some examples, the image processor 810 can also include an interface to an off-chip memory (e.g., dynamic random access memory) to support the feature extraction operations at the feature extraction circuit 972.

The feature extraction circuit 972 can employ various techniques to perform the feature extraction operations. In one example, the feature extraction circuit 972 can use the machine learning model 973, such as a CNN, to perform convolution operations between blocks of pixel data and a filter. The filter can include a set of weights representing the target feature to be extracted. As part of the convolution operation, the filter is superimposed on a portion of a block of pixel data at a particular stride location, and the sum of the products of each element of the filter and each pixel within that portion can be determined. As the filter is shifted around within the block of pixels, the distribution of the sums of products with respect to the different stride locations can be determined as the convolution outputs. The convolution outputs can indicate, for example, the probability that a particular pixel captures the target feature, the probability that the pixel belongs to a target object, etc. Based on the probabilities, the feature extraction circuit 972 can output the pixel locations of the pixels determined to be likely to include the target feature or to be part of the target object. The pixel locations can then be output as part of the ROI information 854 of FIG. 8B to adjust the sparse capture operations of the pixel cell array 808, as described above.
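The convolution step described above — superimposing the filter at each stride location and summing the elementwise products — can be sketched directly. The following toy example uses a hypothetical vertical-edge filter; the filter weights and pixel values are illustrative only:

```python
def convolve2d_valid(block, filt):
    """Slide `filt` over `block` with stride 1 (valid positions only) and
    return, for each stride location, the sum of products of the filter
    weights and the underlying pixels."""
    bh, bw = len(block), len(block[0])
    fh, fw = len(filt), len(filt[0])
    out = []
    for r in range(bh - fh + 1):
        row = []
        for c in range(bw - fw + 1):
            s = sum(filt[i][j] * block[r + i][c + j]
                    for i in range(fh) for j in range(fw))
            row.append(s)
        out.append(row)
    return out

# A filter responding strongly where the pixel block changes left-to-right.
block = [[0, 0, 9, 9],
         [0, 0, 9, 9],
         [0, 0, 9, 9]]
filt = [[-1, 1],
        [-1, 1]]
response = convolve2d_valid(block, filt)
```

In a CNN the raw sums would typically pass through a nonlinearity and further layers before being interpreted as probabilities; hardware implementations stream the pixel block rather than buffering it whole.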

The filter weights of the convolution operations can be obtained from a training process, which can be performed offline, online, or by a combination of both. In an offline training process, the weights can be pre-stored in the memory 976 prior to the feature extraction operations. The weights can be obtained from a training process based on a training data set that covers a range of image data expected to be processed by the image processor 810. The training data set can be stored in a cloud environment, and the training can also be performed in the cloud environment as an offline training process. The weights obtained from the offline training process can be common to all image processors 810 of different imaging systems 800.

In an online training process, the weights used by the image processor 810 can be obtained while the image processor 810 receives image data of the actual objects to be detected. An example application can be eye tracking (e.g., based on images of an eye captured by the image sensor). As part of the online training process, the image processor 810 can operate in a training mode in which it receives pixel data of the user's eye while the user is asked to look at specific targets or locations in space. Through the training process, the image processor 810 can adjust the weights to maximize the likelihood of correctly identifying the user's eye. In such a case, the weights used by the image processor 810 of a particular imaging system 800 can be different from the weights used by the image processor 810 of another imaging system 800, as the weights are optimized for a specific user and/or for specific operation conditions. In some examples, the weights used by the image processor 810 can be obtained by a combination of offline and online training processes. For example, the weights used by the first neural network layer can be generic weights used to extract generic features of an object, whereas the weights of the upper neural network layers can be trained in an online training process to become specific to the user and/or to the specific operation conditions.

In addition, to support a dynamic vision sensing (DVS) operation, the feature extraction circuit 972 can use the comparison circuit 975 to compare the pixels of the input image frame with the corresponding pixels of a prior image frame stored in the memory 976 to obtain the temporal contrasts for those pixels. The comparison circuit 975 can also compare the temporal contrasts against a target threshold (received as part of the second programming signals 832) to output the pixel locations of the pixels having temporal contrasts at (or exceeding) the predetermined threshold.
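The DVS comparison described above can be sketched as a per-pixel frame difference checked against a threshold. A minimal illustration, assuming the temporal contrast is the absolute intensity change between consecutive frames (the exact contrast metric is an assumption):

```python
def temporal_contrast_pixels(current, previous, threshold):
    """Return the (row, col) locations whose absolute change between the
    current and prior frames meets or exceeds the threshold, as the
    comparison circuit would for a DVS operation."""
    locations = []
    for r, (cur_row, prev_row) in enumerate(zip(current, previous)):
        for c, (cur, prev) in enumerate(zip(cur_row, prev_row)):
            if abs(cur - prev) >= threshold:
                locations.append((r, c))
    return locations

prev = [[10, 10], [10, 10]]
cur = [[10, 40], [12, 10]]
moving = temporal_contrast_pixels(cur, prev, threshold=20)  # only (0, 1) changed enough
```

Event-based sensors often use a logarithmic contrast metric instead of an absolute difference; the thresholding structure is the same.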

The feature extraction operations at the feature extraction circuit 972 can be configured based on the second programming signals 832. For example, the host processor 804 can encode the target features to be extracted as filter weights and supply the filter weights to the CNN model to perform the convolution operations. In addition, the host processor 804 can set the temporal contrast thresholds for the DVS operation and send the thresholds as part of the second programming signals 832. The pixel locations can then be output as part of the ROI information 854 of FIG. 8B to adjust the sparse capture operations of the pixel cell array 808, as described above.

Besides the target features and thresholds, the host processor 804 can influence the feature extraction operations at the feature extraction circuit 972 based on other configuration parameters included in the second programming signals 832. For example, the host processor 804 can be part of an online training operation, and can determine back-propagation gradients based on a training operation involving images received from a single imaging system 800 or from multiple imaging systems 800. The host processor 804 can then provide the back-propagation gradients back to each imaging system 800 as part of the second programming signals 832, to adjust the weights locally at each imaging system. As another example, the host processor 804 can provide intermediate results of the image processing operations (such as the outputs of lower-level neural network layers) as part of the second programming signals 832 to the feature extraction circuit 972, which can then use those outputs to perform the neural network computations at the higher-level neural network layers. As another example, the host processor 804 can provide, as feedback, the predicted accuracy of the image processing operations performed by the neural network, which allows the neural network of the feature extraction circuit 972 to update its weights to improve the predicted accuracy of the image processing operations.

As another example, the host processor 804 can provide the location of an initial ROI (e.g., ROI 850 of FIG. 8B). The image processor 810 can perform the feature extraction operations (e.g., convolution operations, dynamic sensing operations) in a two-step process. For example, the image processor 810 can first perform the feature extraction operations on the pixels identified by the initial ROI. If the extraction results indicate that the initial ROI is off (e.g., the identified pixels do not resemble the shape of the target object), the image processor 810 can use the initial ROI as a baseline to search, in a second step, for additional pixels that are likely to include the target features. At the end of the second step, the image processor 810 can determine refined pixel locations to provide a refined ROI.

In addition, the host processor 804 can also perform an evaluation of the feature extraction operations and provide the evaluation results back to the feature extraction circuit 972. The host processor 804 can provide the evaluation results as feedback to influence the feature extraction operations at the feature extraction circuit 972. The evaluation results can include, for example, an indication (and/or a percentage) of whether the sparse pixels output by the pixel cell array 808 contain the data needed by the application 814. In a case where the sparse pixels are output based on an ROI defined in the first programming signals 820 as a result of the feature extraction operations, the feature extraction circuit 972 can adjust the ROI and/or the feature extraction operations based on the evaluation results. For example, in the case of an object tracking/detection operation, the host processor 804 can evaluate whether the sparse pixels in the image frame output by the pixel cell array 808 contain all the pixels of the target object, and provide the evaluation results back to the feature extraction circuit 972. The feature extraction circuit 972 can then adjust, for example, the selection of the pixels on which the feature extraction operations are performed, based on the evaluation results. In a case where the evaluation results indicate that the sparse pixels do not contain all the pixels of the target object, the feature extraction circuit 972 can expand the ROI to process more pixels, or even abandon the ROI and process all the pixels of the input image frame to extract/detect the target features.
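The fallback described above — widening the ROI when the sparse pixels miss part of the target — can be summarized as a simple coverage-driven adjustment. A sketch under illustrative assumptions (the coverage metric, margin, and names are hypothetical, not from this disclosure):

```python
def adjust_roi(roi, coverage, frame_size, margin=8, min_coverage=1.0):
    """If the fraction of target-object pixels captured inside the
    rectangular ROI (x, y, w, h) falls below min_coverage, expand the ROI
    by `margin` on every side, clamped to the frame. A caller could fall
    back to processing the full frame if coverage stays inadequate even
    at the maximum ROI size."""
    if coverage >= min_coverage:
        return roi
    x, y, w, h = roi
    fw, fh = frame_size
    nx, ny = max(0, x - margin), max(0, y - margin)
    nw = min(fw - nx, w + 2 * margin)
    nh = min(fh - ny, h + 2 * margin)
    return (nx, ny, nw, nh)
```

Repeated expansion converges to the full frame, matching the "abandon the ROI" endpoint in the text.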

Image frame data from image sensor 600 may be transmitted to a host processor (not shown in FIGS. 6A and 6B) to support different applications, such as tracking one or more objects, detecting motion (e.g., as part of a dynamic vision sensing (DVS) operation), etc. FIGS. 7A-7D illustrate examples of applications that can be supported by the image frame data from image sensor 600. FIG. 7A illustrates an example of an object tracking operation based on image frames from image sensor 600. As shown in FIG. 7A, an application operating at the host processor can identify, from an image frame 700 captured at time T0, a group of pixels in a region of interest (ROI) 702 corresponding to an object 704. The application can continue to track the location of object 704 in subsequent image frames, including an image frame 710 captured at time T1, and identify the group of pixels in an ROI 712 corresponding to object 704. The tracking of the image location of object 704 within the image frames can be performed to support a SLAM algorithm, which can construct/update a map of the environment in which image sensor 600 (and a mobile device that includes image sensor 600, such as near-eye display 100) is located, based on the tracked image locations of object 704 in the scenes captured by image sensor 600.

FIG. 7B illustrates an example of an object detection operation on image frames from image sensor 600. As shown on the left of FIG. 7B, the host processor may identify one or more objects in a scene captured in an image frame 720, such as a vehicle 722 and a person 724. As shown on the right of FIG. 7B, based on the identification, the host processor may determine that a group of pixels 726 corresponds to vehicle 722, whereas a group of pixels 728 corresponds to person 724. The identification of vehicle 722 and person 724 can be performed to support various applications, such as a surveillance application in which vehicle 722 and person 724 are surveillance targets, a mixed reality (MR) application in which vehicle 722 and person 724 are replaced with virtual objects, a foveated imaging operation that reduces the resolution of certain images (e.g., the license plate of vehicle 722, the face of person 724) for privacy, etc.

FIG. 7C illustrates an example of an eye tracking operation on image frames from image sensor 600. As shown in FIG. 7C, the host processor may identify, from images 730 and 732 of an eyeball, groups of pixels 734 and 736 corresponding to a pupil 738 and a glint 739. The identification of pupil 738 and glint 739 can be performed to support the eye tracking operation. For example, based on the image locations of pupil 738 and glint 739, the application can determine the gaze direction of the user at different times, which can be provided as an input to the system to determine, for example, the content to be displayed to the user.

FIG. 7D illustrates an example of a dynamic vision sensing (DVS) operation on image frames from image sensor 600. In a DVS operation, image sensor 600 may output only pixels that experience a predetermined degree of change in brightness (reflected in the pixel values), whereas pixels that do not experience that degree of change are not output by image sensor 600. The DVS operation can be performed to detect motion of an object and/or to reduce the volume of pixel data being output. For example, referring to FIG. 7D, at time T0 an image 740 is captured, which contains a group of pixels 742 of a light source and a group of pixels 744 of a person. Both groups of pixels 742 and 744 are output as part of image 740 at time T0. At time T1, an image 750 is captured. The pixel values of the group of pixels 742 corresponding to the light source remain the same between times T0 and T1, and the group of pixels 742 is not output as part of image 750. The person, on the other hand, changes from standing to walking between times T0 and T1, which causes the pixel values of the group of pixels 744 to change between times T0 and T1. As a result, the group of pixels 744 of the person is output as part of image 750.
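A DVS-style sparse output can be sketched as follows; the frames and the change threshold are illustrative assumptions:

```python
# Hedged sketch of DVS-style sparse output: only pixels whose value
# changed by at least `threshold` between consecutive frames are
# emitted, keyed by their (row, col) position.
def dvs_output(prev_frame, curr_frame, threshold):
    """Return {(row, col): value} for pixels whose change meets the threshold."""
    out = {}
    for r, (prev_row, curr_row) in enumerate(zip(prev_frame, curr_frame)):
        for c, (p, q) in enumerate(zip(prev_row, curr_row)):
            if abs(q - p) >= threshold:
                out[(r, c)] = q
    return out

prev = [[10, 10, 200],
        [10, 10, 200]]
curr = [[10, 80, 200],   # the middle column brightened (a "moving" object)
        [10, 85, 200]]
sparse = dvs_output(prev, curr, threshold=20)
```

Only the changed pixels survive; the static light-source-like column (value 200) is dropped, mirroring how pixels 742 are excluded from image 750.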

7A 至圖 7D之操作中,可控制影像感測器600以執行稀疏擷取操作,其中僅選擇像素單元之子集以將所關注像素資料輸出至主機處理器。所關注像素資料可包括支援主機處理器處之特定操作所需之像素資料。舉例而言,在 7A之物件追蹤操作中,可控制影像感測器600以僅傳輸分別在影像圖框700及710中之物件704的ROI 702及712中之像素的群組。在 7B之物件偵測操作中,可控制影像感測器600以僅分別傳輸車輛722及人724之像素726及728的群組。另外,在 7C之眼睛追蹤操作中,可控制影像感測器600以僅傳輸含有瞳孔738及閃爍739之像素734及736之群組。此外,在 7D之DVS操作中,可控制影像感測器600以僅傳輸移動的人在時間T1處之像素744之群組而非靜態光源之像素742之群組。所有該些配置可允許產生並且傳輸更高解析度的影像,且不會相應地增加功率及頻寬。舉例而言,包括較多像素單元之較大像素單元陣列可包括於影像感測器600中以改善影像解析度,而當僅像素單元之子集以高解析度產生所關注像素資料並且將高解析度像素資料傳輸至主機處理器而其餘的像素單元不產生/傳輸像素資料或以低解析度產生/傳輸像素資料時可縮減提供經改善影像解析度所需之頻寬及功率。此外,雖然可操作影像感測器600以按較高圖框速率產生影像,但當每一影像僅包括呈高解析度並且由大量位元表示之像素值之小集合而其餘的像素值不被傳輸或由較少數目個位元表示時,可縮減頻寬及功率之增加。 In the operations of FIGS. 7A - 7D , image sensor 600 may be controlled to perform a sparse extraction operation in which only a subset of pixel cells are selected to output pixel data of interest to the host processor. The pixel data of interest may include pixel data required to support a particular operation at the host processor. For example, in the object tracking operation of Figure 7A , image sensor 600 may be controlled to transmit only the group of pixels in ROIs 702 and 712 of object 704 in image frames 700 and 710, respectively. In the object detection operation of FIG. 7B , image sensor 600 may be controlled to transmit only groups of pixels 726 and 728 of vehicle 722 and person 724, respectively. Additionally, in the eye tracking operation of FIG. 7C , image sensor 600 may be controlled to transmit only the group of pixels 734 and 736 containing pupil 738 and glint 739. Furthermore, in the DVS operation of Figure 7D , the image sensor 600 can be controlled to transmit only the group of pixels 744 of the moving person at time Tl rather than the group of pixels 742 of the static light source. All of these configurations may allow higher resolution images to be generated and transmitted without a corresponding increase in power and bandwidth. 
For example, larger pixel cell arrays including more pixel cells may be included in image sensor 600 to improve image resolution, while only a subset of pixel cells generate pixel data of interest at high resolution and will Transmission of high-resolution pixel data to the host processor while the remaining pixel units do not generate/transmit pixel data or generate/transmit pixel data at low resolution can reduce the bandwidth and power required to provide improved image resolution. Furthermore, although image sensor 600 can be operated to generate images at higher frame rates, when each image includes only a small set of pixel values at high resolution and represented by a large number of bits and the remaining pixel values are not When transmitted or represented by a smaller number of bits, the increase in bandwidth and power can be reduced.
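The bandwidth argument above can be made concrete with a back-of-the-envelope calculation. All numbers here (sensor size, bits per pixel, ROI fraction) are illustrative assumptions, not figures from this description:

```python
# Compare output size of a full frame versus a sparse frame where only
# an ROI covering 2% of the pixel cells is transmitted (inactive pixels
# are not transmitted at all). Numbers are illustrative assumptions.
def frame_bits(num_pixels, bits_per_pixel):
    return num_pixels * bits_per_pixel

total_pixels = 1920 * 1080                  # assumed sensor resolution
full = frame_bits(total_pixels, bits_per_pixel=10)
active_pixels = total_pixels * 2 // 100     # assumed ROI: 2% of cells
sparse = frame_bits(active_pixels, bits_per_pixel=10)
savings = 1 - sparse / full                 # fraction of bandwidth saved
```

Under these assumptions the sparse frame needs 2% of the full-frame bits, a 98% reduction, which is what lets resolution or frame rate grow without a matching bandwidth increase.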

The volume of pixel data transmitted can also be reduced in the case of 3D sensing. For example, referring to FIG. 6D, an illuminator 640 can project a pattern 642 of structured light onto an object 650. The structured light can be reflected off the surface of object 650, and a pattern 652 of the reflected light can be captured by image sensor 600 to generate an image. The host processor can match pattern 652 against pattern 642 and determine the depth of object 650 with respect to image sensor 600 based on the image locations of pattern 652 in the image. For 3D sensing, only the groups of pixel cells 660, 662, 664, and 666 contain the relevant information (e.g., the pixel data of pattern 652). To reduce the volume of pixel data being transmitted, image sensor 600 can be configured to send only the pixel data from the groups of pixel cells 660, 662, 664, and 666 to the host processor, or to send the pixel data from the groups of pixel cells 660, 662, 664, and 666 at a high resolution while the rest of the pixel data is at a low resolution.
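The depth computation itself is left abstract above. A common structured-light triangulation model, assumed here purely for illustration, recovers depth from the disparity between a matched element of projected pattern 642 and observed pattern 652:

```python
# Illustrative (assumed) triangulation model: once a dot of reflected
# pattern 652 is matched to its counterpart in projected pattern 642,
# depth follows from the observed disparity, the focal length, and the
# illuminator-to-sensor baseline. Not a formula stated in this text.
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Depth in meters; focal length and disparity in pixels, baseline in meters."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px

# A matched dot shifted by 50 px, with a 500 px focal length and a
# 5 cm baseline between illuminator 640 and image sensor 600:
z = depth_from_disparity(focal_length_px=500, baseline_m=0.05, disparity_px=50)
```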

FIGS. 8A and 8B illustrate an example of an imaging system 800 that can perform sparse capture operations to support the operations illustrated in FIGS. 7A-7D. As shown in FIG. 8A, imaging system 800 includes an image sensor 802 and a host processor 804. Image sensor 802 includes a sensor compute circuit 806, a pixel cell array 808, and a frame buffer 809. Sensor compute circuit 806 includes an image processor 810 and a programming map generator 812. In some examples, sensor compute circuit 806 can be implemented as an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a hardware processor that executes instructions to implement the functions of image processor 810 and programming map generator 812. Frame buffer 809 can include memory to store the image frames output by pixel cell array 808 and to provide the image frames to sensor compute circuit 806 for processing. Frame buffer 809 can include on-chip memory integrated on the same wafer as sensor compute circuit 806 (e.g., static random-access memory (SRAM)), or off-chip memory (e.g., resistive random-access memory (ReRAM), dynamic random-access memory (DRAM)). In addition, host processor 804 includes a general-purpose central processing unit (CPU) that can execute an application 814.

Each pixel cell, or each block of pixel cells, in pixel cell array 808 can be individually programmed to, for example, enable/disable the output of a pixel value, set the resolution of the pixel value output by the pixel cell, etc. Pixel cell array 808 can receive first programming signals 820 from programming map generator 812 of sensor compute circuit 806; the first programming signals can be in the form of a programming map that contains programming data for each pixel cell. Pixel cell array 808 can sense light from a scene based on first programming signals 820 and generate a first image frame 822 of the scene. Specifically, pixel cell array 808 can be controlled by first programming signals 820 to operate in different sparsity modes, such as a full-frame mode, in which first image frame 822 includes a full image frame of pixels, and/or a sparse mode, in which first image frame 822 includes only a subset of the pixels specified by the programming map. Pixel cell array 808 can output first image frame 822 to both host processor 804 and sensor compute circuit 806. In some examples, pixel cell array 808 can also output first image frames 822 of different pixel sparsity to host processor 804 and sensor compute circuit 806. For example, pixel cell array 808 can output a first image frame 822 having a full image frame of pixels back to sensor compute circuit 806, and output a first image frame 822 having the sparse pixels defined by first programming signals 820 to host processor 804.

In addition to generating first programming signals 820, sensor compute circuit 806 can also generate global signals that are sent to each pixel cell of pixel cell array 808. The global signals can include, for example, threshold voltages used for quantization operations in the TTS, FD ADC, and PD ADC operations (e.g., a global voltage ramp for the FD ADC and PD ADC operations, a flat voltage for the TTS operation, etc.), as well as global control signals such as the AB and TG signals of FIG. 6B.

Sensor compute circuit 806 and host processor 804, together with image sensor 802, can form a two-tier feedback system based on first image frame 822 to control the image sensor in generating a subsequent image frame 824. In a two-tier feedback operation, image processor 810 of sensor compute circuit 806 can perform an image-processing operation on first image frame 822 to obtain a processing result, and programming map generator 812 can then update first programming signals 820 based on the processing result. The image-processing operation at image processor 810 can be guided/configured based on second programming signals 832 received from application 814, which can generate second programming signals 832 based on first image frame 822. Pixel cell array 808 can then generate subsequent image frame 824 based on the updated first programming signals 820. Host processor 804 and sensor compute circuit 806 can then update, respectively, second programming signals 832 and first programming signals 820 based on subsequent image frame 824.

In the aforementioned two-tier feedback system, second programming signals 832 from host processor 804 can be in the form of teaching/guidance signals, the results of a neural network training operation (e.g., backpropagation results), etc., to influence the image-processing operation and/or the programming map generation at sensor compute circuit 806. Host processor 804 can generate the teaching/guidance signals based not only on the first image frame but also on other sensor data (e.g., other image frames captured by other image sensors, audio information, motion sensor outputs, inputs from the user) to determine the context of the light-sensing operation of image sensor 802, and then determine the teaching/guidance signals accordingly. The context may include, for example, the environmental conditions in which image sensor 802 operates, the location of image sensor 802, or any other requirements of application 814. Given that the context typically changes at a much lower rate than the frame rate, the teaching/guidance signals can be updated at a relatively low rate (e.g., below the frame rate) based on the context, whereas the image-processing operation and the updating of the programming map at sensor compute circuit 806 can proceed at a relatively high rate (e.g., at the frame rate) to adapt to the images captured by pixel cell array 808.

Although FIG. 8A illustrates pixel cell array 808 transmitting first image frame 822 and second image frame 824 to both host processor 804 and sensor compute circuit 806, in some cases pixel cell array 808 can transmit image frames of different sparsity to host processor 804 and sensor compute circuit 806. For example, pixel cell array 808 can transmit first image frame 822 and second image frame 824, each having full pixels, to image processor 810, while sparse versions of both image frames, each including subsets of pixels selected based on first programming signals 820, are sent to host processor 804.

FIG. 8B illustrates an example of the operation of imaging system 800 to support the object tracking operation of FIG. 7A. Specifically, at time T0, pixel cell array 808 (not shown in FIG. 8B) generates first image frame 822 based on first programming signals 820 indicating that a full frame of pixels is to be generated, and transmits first image frame 822, which includes the full pixels of a scene including object 704, to both host processor 804 and image processor 810. Based on executing application 814, host processor 804 can determine that object 704 is to be tracked. Such a determination can be based on, for example, a user input, a requirement of application 814, etc. Host processor 804 can also process first image frame 822 to extract spatial features of object 704, such as features 840 and 842. Based on the processing result, host processor 804 can determine the approximate location, size, and shape of an ROI 850 that includes the pixels of object 704 (or of other objects, such as pupil 738 and glint 739 of FIG. 7C) in first image frame 822. In addition, based on other outputs from other sensors (e.g., an IMU), host processor 804 can also determine that image sensor 802 is moving relative to object 704 at a certain speed, and can estimate the new location of an ROI 852 in a subsequent image frame. Host processor 804 can then transmit the target features of object 704 (e.g., features 840 and 842), information about the ROI (e.g., the initial location, shape, and size of ROI 850), the speed, etc., as part of second programming signals 832 to image processor 810 and programming map generator 812.

Based on second programming signals 832, image processor 810 can process first image frame 822 to detect the target image features of object 704, and determine the precise location, size, and shape of ROI 850 based on the detection result. Image processor 810 can then transmit ROI information 854, which includes the precise location, size, and shape of ROI 850 in first image frame 822, to programming map generator 812. Based on ROI information 854, as well as second programming signals 832, programming map generator 812 can estimate the expected location, size, and shape of ROI 852 in the subsequent image frame to be captured at time T1. For example, based on the speed information included in second programming signals 832, programming map generator 812 can determine that ROI 850 will move by a distance d between times T0 and T1 to become ROI 852, and can determine the location of ROI 852 at time T1 based on the distance d. As another example, in a case where pupil 738 and glint 739 of FIG. 7C are being tracked as part of an eye tracking operation, programming map generator 812 can obtain information about a change in the user's gaze and determine, based on the gaze change, the expected location at time T1 of the ROI that includes pupil 738 and glint 739 (e.g., ROI 852). Programming map generator 812 can then update first programming signals 820 for time T1 to select the pixel cells within ROI 852 to output the pixel data of object 704 (or of pupil 738 and glint 739, or of other objects) for the subsequent image frame.
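The ROI prediction step can be sketched as follows; the ROI representation, the velocity units, and the numbers are illustrative assumptions:

```python
# Hedged sketch of ROI prediction: given the precise ROI found in the
# frame at T0 and the relative speed reported by the host (e.g. derived
# from an IMU), estimate where the ROI will be at T1.
def predict_roi(roi, velocity, dt):
    """roi = (x, y, w, h); velocity = (vx, vy) in pixels/second; dt in seconds."""
    x, y, w, h = roi
    vx, vy = velocity
    # Shift the ROI by distance d = velocity * dt along each axis;
    # size and shape are kept unchanged in this simple model.
    return (x + vx * dt, y + vy * dt, w, h)

roi_850 = (100, 60, 40, 40)                 # precise ROI at time T0
roi_852 = predict_roi(roi_850, velocity=(30, -10), dt=0.1)
```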

FIGS. 9A, 9B, and 9C illustrate examples of internal components of pixel cell array 808 of FIG. 8A. As shown in FIG. 9A, pixel cell array 808 can include a row controller 904, a column controller 906, and a programming signal parser 920. Row controller 904 is connected with row buses 908 (e.g., 908a, 908b, 908c, ..., 908n), whereas column controller 906 is connected with column buses 910 (e.g., 910a, 910b, ..., 910n). One of row controller 904 or column controller 906 is also connected with a programming bus 912 to transmit pixel-level programming signals 926 targeted at a particular pixel cell or group of pixel cells. Each box labeled P00, P01, P0j, etc. can represent a pixel cell or a group of pixel cells (e.g., a group of 2x2 pixel cells). Each pixel cell or group of pixel cells can be connected with one of row buses 908, one of column buses 910, programming bus 912, and an output data bus to output pixel data (not shown in FIG. 9A). Each pixel cell (or each group of pixel cells) is individually addressable by row address signals 930 on row buses 908 provided by row controller 904, and by column address signals 932 on column buses 910 provided by column controller 906, to receive pixel-level programming signals 926 one at a time via programming bus 912. Row address signals 930, column address signals 932, and pixel-level programming signals 926 can be generated based on first programming signals 820 from programming map generator 812.

In addition, FIG. 9A includes programming signal parser 920, which can extract the pixel-level programming signals from first programming signals 820. In some examples, first programming signals 820 can include a programming map, which can include programming data for each pixel cell, or each group of pixel cells, of pixel cell array 808. FIG. 9B illustrates an example of a pixel array programming map 940. As shown in FIG. 9B, pixel array programming map 940 can include a two-dimensional array of pixel-level programming data, in which each entry of pixel-level programming data of the two-dimensional array targets a pixel cell, or a group of pixel cells, of pixel cell array 808. For example, in a case where each entry of pixel-level programming data targets a pixel cell, and assuming pixel cell array 808 has a width of M pixels (e.g., M columns of pixels) and a height of N pixels (e.g., N rows of pixels), pixel array programming map 940 can also have a width of M entries (e.g., M columns of entries) and a height of N entries (e.g., N rows of entries), with each entry storing the pixel-level programming data for a corresponding pixel cell. For example, pixel-level programming data A00 at entry (0, 0) of pixel array programming map 940 targets pixel cell P00 at pixel location (0, 0) of pixel cell array 808, whereas pixel-level programming data A01 at entry (0, 1) of pixel array programming map 940 targets pixel cell P01 at pixel location (0, 1) of pixel cell array 808. In a case where the pixel-level programming data targets groups of pixel cells, the numbers of entries of pixel array programming map 940 along the height and the width can be scaled based on the number of pixel cells in each group.
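A minimal sketch of building such a map for an ROI follows; the 0/1 entry values are illustrative stand-ins for the per-cell programming data:

```python
# Hedged sketch of a pixel-array programming map like map 940: one
# entry per pixel cell, with entries inside the ROI enabling output (1)
# and all other entries disabling it (0).
def make_programming_map(width, height, roi):
    """roi = (x, y, w, h) in pixel coordinates; returns N rows of M entries."""
    x, y, w, h = roi
    return [[1 if x <= col < x + w and y <= row < y + h else 0
             for col in range(width)]
            for row in range(height)]

# An 8x4 array with a 3x2 ROI whose top-left corner is at (2, 1):
pmap = make_programming_map(width=8, height=4, roi=(2, 1, 3, 2))
enabled = sum(sum(row) for row in pmap)   # number of active pixel cells
```

In practice each entry could carry richer data (resolution, quantization mode, frame rate), but the targeting scheme — entry (i, j) programs cell P(i, j) — is the same.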

Pixel array programming map 940 can be configured to support the feedback operations described in FIG. 8B. For example, the pixel-level programming data stored at each entry can individually program each pixel cell (or each group of pixel cells) to, for example, power on or off, enable or disable the output of pixel data, set the quantization resolution, set the precision of the output pixel data, select the quantization operation (e.g., one of TTS, FD ADC, or PD ADC), set the frame rate, etc. As described above, programming map generator 812 can generate pixel array programming map 940 based on, for example, the prediction of one or more ROIs, in which the pixel-level programming data for pixel cells within an ROI differ from the pixel-level programming data for pixel cells outside the ROI. For example, pixel array programming map 940 can enable a subset of the pixel cells (or groups of pixel cells) to output pixel data while the rest of the pixel cells do not output pixel data. As another example, pixel array programming map 940 can control a subset of the pixel cells to output pixel data at a higher resolution (e.g., using a larger number of bits to represent each pixel), whereas the rest of the pixel cells output pixel data at a lower resolution.

Referring back to FIG. 9A, programming signal parser 920 can parse pixel array programming map 940, which can be in the form of a serial data stream, to identify the pixel-level programming data for each pixel cell (or each group of pixel cells). The identification of the pixel-level programming data can be based on, for example, a predetermined scanning pattern by which the two-dimensional pixel array programming map is converted into the serial format, as well as the order in which the pixel-level programming data are received from the serial data stream by programming signal parser 920. For each entry of programming data, programming signal parser 920 can generate row address signals 930 and column address signals 932, and transmit row address signals 930 and column address signals 932 to row controller 904 and column controller 906, respectively, to select a pixel cell and transmit pixel-level programming signals 926 to the selected pixel cell (or group of pixel cells).
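The parsing step can be sketched as follows, assuming (for illustration) a row-major scanning pattern for the serial stream:

```python
# Hedged sketch of the parser's job: the 2-D programming map arrives as
# a serial stream in a known scan order (row-major is assumed here),
# and each entry is turned back into (row address, column address,
# programming data) for delivery over the row/column/programming buses.
def parse_serial_map(stream, width):
    for index, data in enumerate(stream):
        row, col = divmod(index, width)   # recover the 2-D position
        yield row, col, data              # row addr 930, col addr 932, data

serial = [0, 1, 0, 1,      # a 2x4 programming map flattened row-major
          0, 0, 1, 0]
entries = list(parse_serial_map(serial, width=4))
```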

FIG. 9C illustrates example internal components of a pixel cell 950 of pixel cell array 808, which can include at least some of the components of pixel cell 601 of FIG. 6A. Pixel cell 950 can include one or more photodiodes, including photodiodes 952a, 952b, etc., each of which can be configured to detect light of a different frequency range. For example, photodiode 952a can detect visible light (e.g., monochrome, or one of red, green, or blue), whereas photodiode 952b can detect infrared light. Pixel cell 950 further includes a switch 954 (e.g., a transistor, a controller barrier layer) to control which photodiode outputs charge for pixel data generation.

In addition, pixel cell 950 further includes electronic shutter switch 603, transfer switch 604, charge storage device 605, buffer 606, and quantizer 607 as shown in FIG. 6A, as well as a memory 955. In some examples, pixel cell 950 can include a separate transfer switch 604 and/or a separate charge storage device 605 for each photodiode. Charge storage device 605 can have a configurable capacitance to set the charge-to-voltage conversion gain. In some examples, the capacitance of charge storage device 605 can be increased to store overflow charge for the FD ADC operation for a medium light intensity, to reduce the likelihood of charge storage device 605 being saturated by the overflow charge. The capacitance of charge storage device 605 can also be reduced to increase the charge-to-voltage conversion gain for the PD ADC operation for a low light intensity. The increase in the charge-to-voltage conversion gain can reduce quantization error and increase quantization resolution. In some examples, the capacitance of charge storage device 605 can also be reduced during the FD ADC operation to increase quantization resolution. Buffer 606 includes a current source 956, whose current can be set by a bias signal BIAS1, as well as a power gate 958, which can be controlled by a PWR_GATE signal to turn buffer 606 on and off. Buffer 606 can be turned off as part of disabling pixel cell 950.
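The configurable conversion gain follows from the basic charge-to-voltage relation V = Q / C; the charge and capacitance values below are illustrative only:

```python
# V = Q / C: for a fixed collected charge, a smaller capacitance at
# charge storage device 605 develops a larger voltage, i.e. a higher
# charge-to-voltage conversion gain. Values are illustrative.
def charge_to_voltage(charge_coulombs, capacitance_farads):
    return charge_coulombs / capacitance_farads

q = 1.6e-15                                  # ~10,000 electrons of charge
v_low_gain = charge_to_voltage(q, 4e-15)     # larger C: lower gain (FD ADC)
v_high_gain = charge_to_voltage(q, 1e-15)    # smaller C: higher gain (PD ADC)
```

The higher-gain setting spreads the same small charge over a larger voltage range, which is why it reduces quantization error for low light intensities.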

In addition, quantizer 607 includes a comparator 960 and output logic 962. Comparator 960 can compare the output of the buffer with a reference voltage (VREF) to generate an output. Depending on the quantization operation (e.g., TTS, FD ADC, or PD ADC operation), comparator 960 can compare the buffered voltage with different VREF voltages to generate the output, which is further processed by output logic 962 to cause memory 955 to store a value from a free-running counter as the pixel output. The bias current of comparator 960 can be controlled by a bias signal BIAS2, which can set the bandwidth of comparator 960; the bandwidth can be set based on the frame rate to be supported by pixel cell 950. Moreover, the gain of comparator 960 can be controlled by a gain control signal GAIN. The gain of comparator 960 can be set based on the quantization resolution to be supported by pixel cell 950. Comparator 960 further includes a power switch 961, which can also be controlled by the PWR_GATE signal to turn comparator 960 on/off. Comparator 960 can be turned off as part of disabling pixel cell 950.

In addition, output logic 962 can select the output of one of the TTS, FD ADC, or PD ADC operations and, based on that selection, determine whether to forward the output of comparator 960 to memory 955 to store the value from the counter. Output logic 962 can include internal memory to store indications, based on the output of comparator 960, of whether photodiode 952 (e.g., photodiode 952a) is saturated by the residual charge and whether charge storage device 605 is saturated by the overflow charge. If charge storage device 605 is saturated by the overflow charge, output logic 962 can select the TTS output to be stored in memory 955 and prevent memory 955 from overwriting the TTS output with the FD ADC/PD ADC output. If charge storage device 605 is not saturated but photodiode 952 is saturated, output logic 962 can select the FD ADC output to be stored in memory 955; otherwise, output logic 962 can select the PD ADC output to be stored in memory 955. In some examples, instead of the counter values, the indications of whether photodiode 952 is saturated by the residual charge and whether charge storage device 605 is saturated by the overflow charge can be stored in memory 955 to provide the lowest-precision pixel data.
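The selection performed by output logic 962 can be modeled with the following sketch (an illustration only; the function and variable names are assumptions, not part of the disclosure):

```python
def select_quantization_output(storage_saturated, photodiode_saturated,
                               tts_out, fd_adc_out, pd_adc_out):
    """Model of output logic 962: choose which quantization result is
    stored in memory 955 based on the two saturation indications."""
    if storage_saturated:        # charge storage device 605 saturated by overflow charge
        return tts_out           # keep TTS output; FD/PD ADC must not overwrite it
    if photodiode_saturated:     # photodiode 952 saturated by residual charge
        return fd_adc_out
    return pd_adc_out
```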

In addition, pixel cell 950 can include a pixel cell controller 970, which can include logic circuits to generate control signals such as AB, TG, BIAS1, BIAS2, GAIN, VREF, and PWR_GATE. Pixel cell controller 970 can also be programmed by pixel-level programming signals 926. For example, to disable pixel cell 950, pixel cell controller 970 can be programmed by pixel-level programming signals 926 to de-assert PWR_GATE, which turns off buffer 606 and comparator 960. Moreover, to increase the quantization resolution, pixel cell controller 970 can be programmed by pixel-level programming signals 926 to reduce the capacitance of charge storage device 605, to increase the gain of comparator 960 via the GAIN signal, etc. To increase the frame rate, pixel cell controller 970 can be programmed by pixel-level programming signals 926 to increase the BIAS1 and BIAS2 signals, which increases the bandwidths of buffer 606 and comparator 960, respectively. Furthermore, to control the precision of the pixel data output by pixel cell 950, pixel cell controller 970 can be programmed by pixel-level programming signals 926 to, for example, connect only a subset of the counter bits (e.g., the most significant bits) to memory 955 so that memory 955 stores only that subset of the bits, or to store the indications held in output logic 962 into memory 955 as the pixel data. In addition, pixel cell controller 970 can be programmed by pixel-level programming signals 926 to control the sequence and timing of the AB and TG signals to, for example, adjust the exposure period and/or select a particular quantization operation (e.g., one of TTS, FD ADC, or PD ADC) while skipping the other quantization operations based on the operating condition, as described above.
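The pixel-level programming described above can be sketched as a small programming word mapped to the control signals of FIG. 9C (an illustrative model; the field names, widths, and the assumption of an 8-bit counter are not part of the disclosure):

```python
from dataclasses import dataclass

@dataclass
class PixelProgramming:
    """Illustrative pixel-level programming word interpreted by pixel
    cell controller 970 (field names and defaults are assumptions)."""
    enabled: bool = True   # drives PWR_GATE for buffer 606 and comparator 960
    bias1: int = 4         # bandwidth of buffer 606 (frame rate)
    bias2: int = 4         # bandwidth of comparator 960 (frame rate)
    gain: int = 1          # gain of comparator 960 (quantization resolution)
    output_bits: int = 8   # number of counter MSBs connected to memory 955

def to_control_signals(p: PixelProgramming) -> dict:
    """Derive control signal settings from one programming word,
    assuming an 8-bit free-running counter."""
    return {
        "PWR_GATE": p.enabled,
        "BIAS1": p.bias1,
        "BIAS2": p.bias2,
        "GAIN": p.gain,
        # mask that keeps only the programmed number of most significant bits
        "MEM_MASK": (0xFF << (8 - p.output_bits)) & 0xFF,
    }
```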

FIG. 10A, FIG. 10B, and FIG. 10C illustrate examples of internal components of image processor 810. As shown in FIG. 10A, image processor 810 can include a computing memory 1002, a controller 1004, and a data processing circuit 1006. Computing memory 1002 can store the pixels of an image frame (such as image frames 822/824 of FIG. 8A) to be processed by image processor 810. Controller 1004 can receive image processing configuration parameters as part of second programming signals 832 from host processor 804. Controller 1004 can then control data processing circuit 1006 to fetch the image frame from computing memory 1002 and perform the image processing operation based on the configuration parameters. For example, in a case where the image processing operation is to detect an object of interest and track its location in the image frame, second programming signals 832 can include image features of the object of interest. Data processing circuit 1006 can generate an image processing output 1008 that indicates, for example, the pixel location of the object in the image frame.

In some examples, data processing circuit 1006 can implement a machine learning model, such as a convolutional neural network (CNN) model, to perform the object detection and tracking operations. FIG. 10B illustrates an example architecture of a CNN 1020 that can be implemented by data processing circuit 1006. Referring to FIG. 10B, CNN 1020 can include four main operations: (1) convolution; (2) processing by an activation function (e.g., ReLU); (3) pooling or sub-sampling; and (4) classification (fully connected layer). These operations can be the basic building blocks of every convolutional neural network. Different CNNs can have different combinations of these four main operations.

An image to be classified, such as input image 1022, can be represented by a matrix of pixel values. Input image 1022 can include multiple channels, each channel representing a certain component of the image. For example, an image from a digital camera can have a red channel, a green channel, and a blue channel. Each channel can be represented by a 2-D matrix of pixels having pixel values in the range of 0 to 255 (i.e., 8 bits). A grayscale image can have only one channel. In the following description, the processing of a single image channel using CNN 1020 is described. The other channels can be processed similarly.

10B中所展示,可使用第一權重陣列(在 10B中經標記為[W 0])藉由第一卷積層(例如,輸入層)1024處理輸入影像1022。第一卷積層1024可包括多個節點,其中每一節點經指派以將輸入影像1022之像素與第一權重陣列中之對應的權重相乘。作為卷積運算之部分,在乘加(MAC)運算中,輸入影像1022之像素區塊可與第一權重陣列相乘以產生乘積,且該些乘積接著經累加以產生總和。每一總和接著可藉由激發函數後處理以產生中間輸出。激發函數可模擬神經網路中之線性感知器的行為。激發函數可包括線性函數或非線性函數(例如,ReLU,softmax)。中間輸出可形成中間輸出張量1026。第一權重陣列可用於例如自輸入影像1022提取某些基本特徵(例如,邊緣),並且中間輸出張量1026可將基本特徵之分佈表示為基本特徵圖。中間輸出張量1026可經傳遞至合併層1028,其中中間輸出張量1026可藉由合併層1028經次取樣或下取樣以產生中間輸出張量1030。 As shown in FIG . 10B , the input image 1022 may be processed by a first convolutional layer (eg, input layer) 1024 using a first weight array (labeled [W 0 ] in FIG. 10B ). The first convolutional layer 1024 may include a plurality of nodes, where each node is assigned to multiply a pixel of the input image 1022 with a corresponding weight in the first weight array. As part of the convolution operation, in a multiply-add (MAC) operation, a block of pixels of the input image 1022 may be multiplied by a first array of weights to generate products, and the products are then accumulated to generate a sum. Each summation can then be post-processed by an excitation function to produce an intermediate output. The excitation function simulates the behavior of a linear perceptron in a neural network. The excitation function may include a linear function or a nonlinear function (eg, ReLU, softmax). The intermediate outputs may form an intermediate output tensor 1026 . The first weight array can be used, for example, to extract certain basic features (eg, edges) from the input image 1022, and the intermediate output tensor 1026 can represent the distribution of basic features as a basic feature map. Intermediate output tensors 1026 may be passed to binning layer 1028 , where intermediate output tensors 1026 may be subsampled or downsampled by binning layer 1028 to produce intermediate output tensors 1030 .

Intermediate output tensor 1030 can be processed by a second convolution layer 1032 using a second weight array (labeled [W1] in FIG. 10B). The second weight array can be used to, for example, identify patterns of features specific to an object, such as a hand, from intermediate output tensor 1030. As part of the convolution operation, blocks of pixels of tensor 1030 can be multiplied with the second weight array to generate products, and the products can be accumulated to generate sums. Each sum can then also be processed by an activation function to generate an intermediate output, and the intermediate outputs can form an intermediate output tensor 1034. Intermediate output tensor 1034 can represent a distribution of features representing a hand. Intermediate output tensor 1034 can be passed to a pooling layer 1036, where intermediate output tensor 1034 can be subsampled or down-sampled to generate an intermediate output tensor 1038.

Intermediate output tensor 1038 can then be passed through a fully connected layer 1040, which can include a multi-layer perceptron (MLP). Fully connected layer 1040 can perform a classification operation based on intermediate output tensor 1038 to, for example, classify whether the object in image 1022 represents a hand, the likely pixel location of the hand in image 1022, etc. Fully connected layer 1040 can multiply intermediate output tensor 1038 with a third weight array (labeled [W2] in FIG. 10B) to generate sums, and the sums can be processed by an activation function to generate a neural network output 1042. Neural network output 1042 can indicate, for example, whether an object of interest (e.g., a hand) is present in the image frame, as well as its pixel location and size.
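The classification step of fully connected layer 1040 can be sketched as a matrix-vector product followed by an activation (here softmax, one of the activation functions mentioned above; the shapes are illustrative):

```python
import numpy as np

def fully_connected(features, w2):
    """Multiply the flattened intermediate output tensor by weight
    array [W2] and apply a softmax activation to obtain classification
    scores (e.g., object present at each candidate location)."""
    z = w2 @ features.ravel()
    e = np.exp(z - z.max())   # numerically stable softmax
    return e / e.sum()
```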

FIG. 10C illustrates an example of the internal components of data processing circuit 1006 and their operation in implementing CNN 1020. As shown in FIG. 10C, data processing circuit 1006 can include an array of arithmetic circuits 1050. Each arithmetic circuit, such as 1050a-1050f, can include a multiplier 1054 to multiply an input data element (represented by "i") with a weight data element (represented by "w") to generate a local partial sum. The input data element can correspond to, for example, a pixel in the image frame, while the weight data element can be a corresponding weight in the weight matrix (e.g., [W0], [W1], [W2]) of a neural network layer. Each arithmetic circuit 1050 can also include an adder 1052 to add the local partial sum to an input partial sum (labeled "p_in") received from a neighboring arithmetic circuit and generate an output partial sum (labeled "p_out"), which is then input to another neighboring arithmetic circuit. For example, arithmetic circuit 1050a can receive an input partial sum from arithmetic circuit 1050b, add its local partial sum to the input partial sum to generate an output partial sum, and provide the output partial sum to arithmetic circuit 1050c. As such, each arithmetic circuit generates a local partial sum, and the local partial sums are accumulated across the array of arithmetic circuits to form an intermediate output. Data processing circuit 1006 further includes a post-processing circuit 1056 to perform post-processing (e.g., activation function processing, pooling) on the intermediate outputs. In some examples, data processing circuit 1006 can include other types of circuits, such as look-up tables, to implement multiplier 1054 and post-processing circuit 1056.
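The accumulation of partial sums across a chain of arithmetic circuits 1050 can be modeled as follows (an illustrative sketch; each loop iteration plays the role of one circuit's multiplier 1054 and adder 1052):

```python
def arithmetic_chain(inputs, weights, p_in=0):
    """Each stage multiplies its input data element (i) with its weight
    data element (w), adds the local product to the incoming partial
    sum (p_in), and passes the result on as p_out."""
    p_out = p_in
    for i, w in zip(inputs, weights):
        p_out = p_out + i * w   # local partial sum accumulated into the chain
    return p_out
```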

To perform the convolution and post-processing operations of first convolution layer 1024 (the input layer) and pooling layer 1028, controller 1004 (not shown in FIG. 10C) can control data processing circuit 1006 to fetch input data from computing memory 1002 based on a predetermined mapping between the input data and the arithmetic circuits according to CNN 1020. For example, a first group of arithmetic circuits, including arithmetic circuits 1050a, 1050b, and 1050c, can fetch a group of input pixels 1064 from computing memory 1002, while a second group of arithmetic circuits, including arithmetic circuits 1050d, 1050e, and 1050f, can fetch a group of input pixels 1066 from computing memory 1002. Each group of arithmetic circuits can perform the convolution operation between the group of input pixels and the weight array based on the multiply-and-accumulate operations to generate intermediate outputs, as described above. The intermediate outputs can then be post-processed by post-processing circuit 1056, and the post-processed outputs can be stored back into computing memory 1002. For example, an intermediate output 1068 can be generated from the convolution and post-processing of the group of input pixels 1064, while an intermediate output 1070 can be generated from the convolution and post-processing of the group of input pixels 1066. After the operations of first convolution layer 1024 and pooling layer 1028 are complete and the intermediate outputs are stored in computing memory 1002, controller 1004 can control the array of arithmetic circuits 1050 to fetch the intermediate outputs to perform the convolution and post-processing operations of second convolution layer 1032 and pooling layer 1036, to generate a second set of intermediate outputs and store them in computing memory 1002. The controller can then control the array of arithmetic circuits to fetch the second set of intermediate outputs, based on the topology of fully connected layer 1040, to generate neural network output 1042.

8A中所描述,影像處理器810可自圖框緩衝器809接收包括主動像素及非主動像素之稀疏影像。主動像素可對應於一或多個所關注物件,而非主動像素可不含有影像資訊(例如,具有全黑顏色或其他預定顏色)。 11A說明儲存於影像處理器810之計算記憶體1002中的稀疏影像1100之實例。如 11A中所展示,為了支援追蹤個體之頭部及手之應用程式,稀疏影像1100可包括:主動像素1102之第一群組,其包括個體之頭部的像素;主動像素1104之第二群組,其包括個體之左手的像素;及主動像素1106之第三群組,其包括個體之右手的像素。可基於第一程式化信號820藉由像素單元陣列808產生並且傳輸主動像素之群組。包括像素1108之群組的稀疏影像1100之其餘像素為非主動的並且不含有影像資訊。每一非主動像素可具有零或另一值的像素值以指示像素為非主動的。在一些實例中,可在接收影像圖框的主動像素之前重置圖框緩衝器809之記憶體裝置。在將主動像素寫入至圖框緩衝器809之對應的記憶體裝置中時,未接收主動像素之其餘記憶體裝置可保留其重置狀態(例如,邏輯零)並且變為非主動像素。表示影像圖框之主動像素及非主動像素的像素值接著可儲存於影像處理器810之計算記憶體1002中。 As depicted in FIG. 8A , image processor 810 may receive from frame buffer 809 a sparse image including active pixels and inactive pixels. Active pixels may correspond to one or more objects of interest, while non-active pixels may contain no image information (eg, have an all-black color or other predetermined color). 11A illustrates an example of a sparse image 1100 stored in computing memory 1002 of image processor 810. As shown in FIG. 11A , to support applications that track the head and hands of an individual, the sparse image 1100 may include: a first group of active pixels 1102, which includes pixels of the individual's head; a second group of active pixels 1104 A group that includes pixels of the individual's left hand; and a third group of active pixels 1106 that includes pixels of the individual's right hand. Groups of active pixels can be generated and transmitted by the pixel cell array 808 based on the first programming signal 820 . The remaining pixels of the sparse image 1100 comprising the group of pixels 1108 are inactive and contain no image information. Each inactive pixel may have a pixel value of zero or another value to indicate that the pixel is inactive. In some examples, the memory device of frame buffer 809 may be reset prior to receiving active pixels of an image frame. 
The remaining memory devices that did not receive active pixels may retain their reset state (eg, logic zero) and become inactive pixels while the active pixels are being written to the corresponding memory devices of frame buffer 809 . The pixel values representing the active and inactive pixels of the image frame may then be stored in the computational memory 1002 of the image processor 810 .

Although the generation and transmission of sparse image 1100 can reduce the power consumption of pixel cell array 808, data processing circuit 1006 can still consume a lot of power if it performs the processing operations (e.g., convolution operations) on each and every pixel of sparse image 1100. On the other hand, as shown in FIG. 11A, given that only a small subset of the pixels are active pixels containing image data while the majority of the pixels are inactive pixels, having data processing circuit 1006 perform the processing operations on the inactive pixels does not generate information useful for detecting and locating the object of interest. A huge amount of power is thus wasted in generating the unusable information, which can degrade the overall power and computation efficiencies of the image processing operation.

Referring to FIG. 11B, to improve the overall power and computation efficiencies, controller 1004 can include a sparse data handling circuit 1110. In some examples, sparse data handling circuit 1110 can fetch groups of input data (e.g., pixels, intermediate outputs, etc.) from memory 1002 for a neural network layer, and detect a subset of the groups of input data in which the entire group has inactive values (e.g., zeros) or otherwise contains no image information. Sparse data handling circuit 1110 can exclude those groups of inactive input data from data processing circuit 1006, so that data processing circuit 1006 does not generate intermediate outputs for those groups of inactive input data and does not write them back to computing memory 1002. On the other hand, the groups of input data that include active values (e.g., non-zero values) representing image information can be forwarded by sparse data handling circuit 1110 to data processing circuit 1006, which can then process the groups of active input data to generate intermediate outputs and write them back to computing memory 1002.
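The behavior of sparse data handling circuit 1110 can be sketched as follows (illustrative Python; a group whose elements are all zero is skipped entirely, and its output slot keeps the memory's reset value):

```python
import numpy as np

def sparse_forward(input_groups, weights):
    """Skip groups of input data that are entirely inactive (all zeros);
    only active groups are forwarded to the MAC operation. Skipped
    groups keep the reset value (zero) as their intermediate output."""
    intermediate_outputs = []
    for group in input_groups:
        if not np.any(group):                        # entire group inactive
            intermediate_outputs.append(0.0)         # memory retains reset state
        else:                                        # active group: forward to MAC
            intermediate_outputs.append(float(np.sum(group * weights)))
    return intermediate_outputs
```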

In some examples, sparse data handling circuit 1110 can also receive information about the sparsity of the image frame stored in memory 1002. The sparsity information can be based on, for example, the programming map information from programming map generator 812, or on the neural network model topology as described below. Sparse data handling circuit 1110 can determine the memory addresses of memory 1002 that store active pixel data, and fetch the active pixel data from those memory addresses.

In a case where computing memory 1002 is reset/re-initialized (e.g., to logical zeros) between different image frames, the memory devices of computing memory 1002 that are assigned to store the intermediate outputs for the groups of inactive input data (assigned based on the mapping between the inputs and outputs of the neural network layer) can retain their initialized/reset state and are not accessed for the groups of inactive input data. Meanwhile, the memory devices of computing memory 1002 that are assigned to store the intermediate outputs for the groups of active input data can be updated by data processing circuit 1006. Such arrangements can reduce the access to computing memory 1002 for the processing of the sparse image data, which can further reduce the power consumption of computing memory 1002 and of image processor 810 as a whole.

For example, referring back to FIG. 10C, sparse data handling circuit 1110 can detect that the group of input pixels 1064 is entirely inactive and contains no image information, while the group of input pixels 1066 contains active pixels and image information. Sparse data handling circuit 1110 can withhold the group of input pixels 1064 and the corresponding weights from the first group of arithmetic circuits (including arithmetic circuits 1050a-1050c), or otherwise disable the first group of arithmetic circuits, so that no intermediate output is written back to computing memory 1002 for the group of input pixels 1064. The intermediate output 1068 for the group of input pixels 1064 can retain its reset value (e.g., a logical zero) at the end of the processing for first convolution layer 1024. On the other hand, sparse data handling circuit 1110 can provide the group of input pixels 1066 and the corresponding weights to the second group of arithmetic circuits (including arithmetic circuits 1050d-1050f) to generate the intermediate output 1070, which can then be written into computing memory 1002. Sparse data handling circuit 1110 can also repeat the sparse data handling for other neural network layers, based on detecting groups of inactive input data and excluding them from data processing circuit 1006, to prevent data processing circuit 1006 from performing computations and writing intermediate outputs into computing memory 1002 for those groups of inactive input data.

In addition, data processing circuit 1006 can also include a bypass mechanism to reduce the power consumption associated with the processing of inactive/zero input data within a group of active input data forwarded by sparse data handling circuit 1110. Specifically, referring to FIG. 11C, arithmetic circuit 1050a can include a disable circuit 1120 and a multiplexer 1122. When one or more of the input data element (i) or the weight data element (w) is zero, the multiplication product will be zero. To avoid arithmetic circuit 1050a wasting power computing the zero, disable circuit 1120 can disable adder 1052 and multiplier 1054 (e.g., by cutting off their power supply) upon detecting that one or more of the input data element (i) or the weight data element (w) is zero. Moreover, as the product will be zero, multiplexer 1122 can be controlled to pass the input partial sum (p_in) directly through as the output partial sum (p_out).
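The bypass of FIG. 11C can be modeled with the following sketch (an illustration only):

```python
def mac_with_zero_bypass(i, w, p_in):
    """When the input data element (i) or the weight data element (w)
    is zero, the multiplier and adder are disabled and multiplexer 1122
    passes the input partial sum straight through as p_out."""
    if i == 0 or w == 0:
        return p_in           # bypass: no multiply, no add
    return p_in + i * w       # normal multiplier 1054 / adder 1052 path
```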

In some examples, to further reduce power consumption and improve power and computation efficiencies, image sensor 802 can support a temporal sparsity operation. As part of the temporal sparsity operation, static pixels and non-static pixels among the active pixels can be identified. Image processor 810 can be configured to perform the image processing operations only on the non-static pixels, while sparse data handling circuit 1110 can exclude the static pixels, as well as the inactive pixels, from data processing circuit 1006 to further reduce power consumption and improve power and computation efficiencies.

FIG. 12A illustrates an example of a group of active pixels 1200 including static and non-static pixels. As shown in FIG. 12A, the group of active pixels 1200 is captured in two image frames, one at time T0 and one at time T1. The group of active pixels 1200 can include an object of interest to be tracked (e.g., a person's head). The group of active pixels 1200 can also include subsets of pixels 1202 and 1204 (which include the pixels of the eyes and the mouth in FIG. 12A) that experience changes between times T0 and T1, while the rest of the active pixels 1200 remain static between times T0 and T1. A pixel can be determined to be static if its degree of change is below a threshold.
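The static/non-static determination described above can be sketched as a per-pixel threshold on the frame-to-frame change (illustrative Python; the threshold value is an assumption):

```python
import numpy as np

def find_static_pixels(frame_t0, frame_t1, change_threshold=10):
    """A pixel is static if its degree of change between the frames
    captured at times T0 and T1 stays below the threshold."""
    change = np.abs(frame_t1.astype(int) - frame_t0.astype(int))
    return change < change_threshold   # True marks a static pixel
```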

In some examples, frame buffer 809 can detect static pixels from the active pixels output by the image sensor, and store, for those pixels, pixel values that signal to the image processor circuit that the pixels are static pixels. In some examples, frame buffer 809 can also detect static pixels from the pixels stored in the frame buffer, some of which can correspond to inactive pixels for which the image sensor provides no data and which therefore remain static. FIG. 12B includes example internal components of frame buffer 809 to support the signaling of static pixels. As shown in FIG. 12B, frame buffer 809 can include a pixel update module 1212, a buffer memory 1214, and a pixel update tracking table 1216. Specifically, pixel update module 1212 can receive the most recent image frame, including active and inactive pixels, from pixel cell array 808, and overwrite the pixels of the prior image frame in buffer memory 1214 with the most recent image frame. For each pixel in buffer memory 1214, pixel update module 1212 can determine the degree of change of the pixel with respect to the prior image frame, and determine whether the pixel is static or non-static based on whether the degree of change exceeds a threshold. Pixel update module 1212 can also update, in pixel update tracking table 1216 for each pixel, the last frame time at which the pixel was updated (and was thus regarded as non-static). After pixel update module 1212 updates a pixel with the pixel value from a prior frame generated by pixel cell array 808, pixel update module 1212 can track, based on the information from pixel update tracking table 1216, the number of frames for which the pixel has remained static (the static frame time), and set the pixel value of the pixel based on the static frame time.

FIG. 12C illustrates example techniques by which pixel update module 1212 can set the pixel values of static pixels. As shown in table 1220 on the left of FIG. 12C, pixel update module 1212 can set the pixel value of a static pixel based on a leaky integrator function with a time constant C, as follows:

P = S0 + S × e^(−t/C) (Equation 1)

where t represents the number of frames for which the pixel has remained static.

In Equation 1, P represents the pixel value set by pixel update module 1212, S0 represents a predetermined pixel value used to represent a static pixel, and S represents the difference between the original pixel value obtained from the previous frame and S0. The original pixel value is obtained from the previous frame in which the pixel last experienced a degree of change exceeding the change threshold and was therefore updated in the frame buffer. The pixel value of a static pixel is initially set to S0 + S and decreases as the pixel remains static. Eventually, if the pixel remains static over an extended number of frames, its pixel value settles at S0.
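The decay of Equation 1 can be sketched numerically. The exponential form shown here is a reconstruction from the surrounding description of a leaky integrator (initial value S0 + S, settling at S0 with time constant C); the sample values are assumptions for illustration.

```python
import math

def static_pixel_value(s0, s, t, c):
    # Leaky-integrator decay of Equation 1 (as reconstructed here):
    # the stored value starts at S0 + S at t = 0 and decays toward the
    # sentinel value S0 with time constant C as the static frame time t grows.
    return s0 + s * math.exp(-t / c)
```

With, say, S0 = 128 (gray), S = 64, and C = 4 frames, the stored value starts at 192 and drifts toward 128 the longer the pixel stays static.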

As another example, as shown in table 1222 on the right of FIG. 12C, pixel update module 1212 can set the pixel value of a static pixel based on a step function. Specifically, for up to a threshold number of frames, denoted T_th, pixel update module 1212 keeps the pixel value of the static pixel at S0 + S. Once the threshold number of frames is exceeded, pixel update module 1212 sets the pixel value to S0.
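The step-function variant is even simpler. As before, the concrete values are illustrative assumptions; only the hold-then-snap behavior comes from the description of table 1222.

```python
def static_pixel_value_step(s0, s, static_frames, t_th):
    # Step-function variant: hold the value S0 + S while the pixel has stayed
    # static for at most T_th frames, then snap to the sentinel value S0.
    return s0 + s if static_frames <= t_th else s0
```

With S0 = 128, S = 64, and T_th = 10, the stored value stays at 192 through frame 10 of inactivity and becomes 128 from frame 11 onward.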

The predetermined pixel value S0 can correspond to black (zero), white (255), gray (128), or any other value that indicates a static pixel. In all of these cases, the image processor can distinguish static pixels from non-static pixels by recognizing the pixel value that marks static pixels, and perform the image-processing operation only on the non-static pixels, as described above.

In some examples, controller 1004 of image processor 810 can include additional components to further improve the handling of static pixels. As shown in FIG. 13A, controller 1004 can include a neural network operation controller 1302 and a data propagation controller 1304, which can include sparse data handling circuit 1110 of FIG. 11B. Neural network operation controller 1302 can determine, for each neural network layer, the operations to be performed, including the fetching of input data, the storing of intermediate output data, and the arithmetic operations, and can generate control signals 1306 that reflect those operations. Data propagation controller 1304 can carry out the fetching of the input data and the storing of the intermediate output data based on control signals 1306.

Specifically, neural network operation controller 1302 can have topology information of the neural network model implemented by data processing circuit 1006, including, for example, the input/output connectivity within each neural network layer and between adjacent layers, the size of each neural network layer, the quantization operations and other post-processing operations (e.g., activation function processing, pooling operations) at each layer, the receptive field of the neural network, and so on. Neural network operation controller 1302 can generate control signals 1306 to control the fetching of input data, the storing of intermediate output data, and the arithmetic operations based on the topology information. For example, based on the connectivity information, neural network operation controller 1302 can include, for each neural network layer, a mapping between the addresses of the input data and the addresses of the intermediate outputs in computation memory 1002 as part of control signals 1306, thereby allowing data propagation controller 1304 to fetch the input data from, and store the intermediate output data at, the correct memory locations within computation memory 1002.

In addition, neural network operation controller 1302 can include additional information in control signals 1306 to facilitate the static-pixel handling operations. For example, based on the topology information of the neural network model and the distribution of active and inactive pixels, neural network operation controller 1302 can determine a data change propagation map 1310, which indicates how changes in pixels (between image frames) propagate through the different layers of the neural network, and provide the map to data propagation controller 1304. Based on the identified static pixels and the data change propagation map, data propagation controller 1304 (and sparse data handling circuit 1110) can selectively fetch the input data determined to be non-static into data processing circuit 1006 to generate a subset of the intermediate output data, and store that subset of the intermediate output data in computation memory 1002. Meanwhile, the intermediate output data corresponding to static pixels (generated from the previous frame) are retained in computation memory 1002 and are not updated.

FIG. 13B illustrates an example operation of identifying the non-static inputs/outputs of different neural network layers based on data change propagation map 1310. In FIG. 13B, black regions 1314a, 1314b, 1314c, 1314d, and 1314n can correspond to the active data address regions of computation memory 1002 that store non-static/active pixel data and the non-static intermediate outputs at each neural network layer, whereas the white regions can correspond to the address regions of computation memory 1002 that store static/inactive pixel data and static intermediate outputs. Based on data change propagation map 1310, data propagation controller 1304 can determine the data address regions of computation memory 1002 that store, or will store, non-static/active pixel data, and fetch pixel data and/or intermediate output data only from those regions into data processing circuit 1006 to perform the neural network computations (e.g., multiply-and-accumulate operations) for each neural network layer.
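The region-restricted fetching can be sketched as follows. This is a loose software analogue of the hardware behavior described above; the map representation (a list of half-open address ranges per layer) and the function name are assumptions for illustration.

```python
# Hypothetical sketch of how data propagation controller 1304 might apply one
# layer's entry of a data change propagation map: only memory words inside the
# active (non-static) address regions are fetched; everything else is skipped.

def fetch_active_data(memory, active_regions):
    """Return (address, value) pairs for the active regions of one layer.

    active_regions is a list of half-open [start, end) address ranges,
    analogous to the black regions 1314a-1314n of FIG. 13B.
    """
    fetched = []
    for start, end in active_regions:
        for addr in range(start, end):
            fetched.append((addr, memory[addr]))
    return fetched
```

If the map marks addresses 2-3 and 7 as active, only those three words are read; a layer whose map is empty triggers no memory accesses at all.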

Referring back to FIG. 13A, neural network operation controller 1302 can also determine a change threshold 1320 for use in determining whether a pixel is static. Change threshold 1320 can be determined based on the topology of the neural network model (such as the depth of the model) and the quantization and pooling operations at each layer. Specifically, although changes in the pixel data can propagate through the different layers of the neural network model, the degree of change in the intermediate outputs and in the output of the model typically diminishes at the higher neural network layers, especially where the input and output data are heavily quantized (e.g., represented by a very small number of bits). Therefore, for a given neural network model with a certain number of layers and a certain quantization scheme, neural network operation controller 1302 can determine a change threshold 1320 for the pixel data such that pixels deemed non-static produce at least some degree of change at the output of the neural network model. In some examples, because of, for example, the different pooling operations performed at different neural network layers, the different quantization precisions at different layers, and the different sparsity distributions of the input data for different layers, neural network operation controller 1302 can also determine different change thresholds 1320 for different neural network layers, to ensure that the non-static input data selected based on the change thresholds produce a meaningful change in the output data of each neural network layer.

In some examples, data propagation controller 1304 can include a residual handling circuit 1316 to track both the changes in pixels between consecutive image frames and the changes in pixels between non-consecutive image frames separated by a sequence of other image frames (e.g., 10). Residual handling circuit 1316 can handle the situation in which a pixel is determined to be static because its changes between consecutive image frames are small, yet the change in the pixel between non-consecutive frames is large enough that the intermediate outputs of the neural network need to be updated. This situation is illustrated in FIG. 13C. As shown in FIG. 13C, among frames 1, 2, 3, and 4, only pixels 1330 and 1332 change between frames. Between consecutive frames, the changes in pixel value (0.2 to 0.3) can be small, and those pixels can be determined to be static pixels. But between frame 4 and frame 1, the changes, from 0.5 to 0.7, are significant, and they are reflected in the significant difference between the convolution outputs of frames 1 and 4 (0.119 versus 0.646). To handle this situation, residual handling circuit 1316 can determine whether a pixel is static based not only on the change in the pixel between two consecutive frames but also on the change between two non-consecutive frames separated by a predetermined number of frames. If the pixel exhibits a small change between two consecutive frames but a large change between non-consecutive frames, residual handling circuit 1316 can determine that the pixel is a non-static pixel and allow data processing circuit 1006 to perform the image-processing operation on the pixel.
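The two-reference check described above can be sketched in a few lines. The threshold value and the scalar-pixel framing are illustrative assumptions; only the "consecutive OR non-consecutive" rule comes from the description of residual handling circuit 1316.

```python
def is_non_static(prev, curr, ref, threshold):
    # Residual-handling check sketched from the FIG. 13C scenario: a pixel is
    # non-static if it changed enough since the previous frame OR since an
    # older reference frame a predetermined number of frames back.
    return abs(curr - prev) > threshold or abs(curr - ref) > threshold
```

With a threshold of 0.15, a pixel drifting 0.6 → 0.7 passes the consecutive-frame check (small change) but is still flagged non-static because it has moved 0.2 away from an older reference value of 0.5.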

FIGS. 14A and 14B illustrate examples of the physical arrangement of image sensor 802. As shown in FIG. 14A, image sensor 802 can include a semiconductor substrate 1400, which includes some of the components of pixel cell array 808, such as the photodiodes of the pixel cells, and one or more semiconductor substrates 1402, which include the processing circuits of pixel cell array 808, such as buffer 606, quantizer 607, and memory 955, as well as sensor compute circuit 806. In some examples, the one or more semiconductor substrates 1402 include a semiconductor substrate 1402a and a semiconductor substrate 1402b. Semiconductor substrate 1402a can include the processing circuits of pixel cell array 808, while semiconductor substrate 1402b can include sensor compute circuit 806. Semiconductor substrate 1400 and the one or more semiconductor substrates 1402 can be housed within a semiconductor package to form a chip.

In some examples, semiconductor substrate 1400 and the one or more semiconductor substrates 1402 can form a stack along a vertical direction (e.g., represented by the z-axis), with vertical interconnects 1404 and 1406 providing electrical connections among the substrates. Such an arrangement can shorten the routing distance of the electrical connections between pixel cell array 808 and sensor compute circuit 806, which can increase the speed of transmission of data (especially pixel data) from pixel cell array 808 to sensor compute circuit 806 and reduce the power required for the transmission. In some examples, image sensor 802 can include an array of memory devices (e.g., SRAM, RRAM, etc.) formed on or between the semiconductor substrates to provide frame buffer 809 and computation memory 1002.

FIG. 14B illustrates an example of the details of the stack structure of image sensor 802. As shown in FIG. 14B, semiconductor substrate 1400 can include a backside surface 1408, configured as a light-receiving surface and including the photodiode of each pixel cell, and a frontside surface 1410, on which transfer transistor 604 and charge storage device 605 (e.g., the floating drain of transfer transistor 604) are implemented, while the processing circuits of the pixel cells, including buffer 606, quantizer 607, memory 955, etc., are implemented below frontside surface 1412 of semiconductor substrate 1402a. Frontside surface 1410 of semiconductor substrate 1400 can be electrically connected with frontside surface 1012 of semiconductor substrate 1402a by vertical interconnects 1404, which include chip-to-chip copper bonds. The chip-to-chip copper bonds can provide, for example, pixel interconnects between transfer transistor 604 of each pixel cell and buffer 606 of each pixel cell.

In addition, imaging sensor 800 further includes through-vertical interconnects between pixel cell array 808 and sensor compute circuit 806, such as through-silicon vias (TSVs), micro-TSVs, copper-copper bumps, and the like. The vertical interconnects can be located on shoulder regions 1420 and 1422 of the stack and pass through semiconductor substrates 1402a and 1402b. The vertical interconnects can be configured to transmit, for example, first programming signals 820 and image frames (e.g., first image frame 822). The vertical interconnects can support, for example, the transmission of full frames of pixel data (e.g., 1920 pixels × 1080 pixels) at a normal frame rate (e.g., 60 frames/second) from pixel cell array 808 to image processor 810 to perform the image feature extraction operation.
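A quick back-of-the-envelope check of the full-frame figures quoted above: a 1920 × 1080 frame at 60 frames/second, assuming (hypothetically, since the patent text does not state a bit depth here) 10 bits per pixel.

```python
# Raw throughput the vertical interconnects would need to sustain for
# full-frame transfer at the quoted resolution and frame rate.
# BITS_PER_PIXEL is an assumption for illustration, not from the patent.

WIDTH, HEIGHT, FPS, BITS_PER_PIXEL = 1920, 1080, 60, 10

pixels_per_second = WIDTH * HEIGHT * FPS          # ~124.4 million pixels/s
bits_per_second = pixels_per_second * BITS_PER_PIXEL  # ~1.24 Gbit/s at 10 bpp
```

This is the worst case; the sparse sensing operation described throughout the disclosure exists precisely so that, most of the time, far fewer pixels than this cross the interconnects.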

FIG. 15 illustrates a method 1500 of operating an image sensor, such as image sensor 802 of FIG. 8A. Method 1500 can be performed by, for example, the various components of image sensor 802, including sensor compute circuit 806, pixel cell array 808, and frame buffer 809. Sensor compute circuit 806 further includes image processor 810 and programming map generator 812. In some examples, the image sensor is implemented in a first semiconductor substrate, the frame buffer and the sensor compute circuit are implemented in one or more second semiconductor substrates, and the first semiconductor substrate and the one or more second semiconductor substrates form a stack and are housed in a single semiconductor package.

In step 1502, programming map generator 812 transmits first programming data to the image sensor, which comprises a plurality of pixel cells, to select a first subset of the pixel cells to generate first active pixels.

In some examples, the first programming data can be generated based on a sparse image sensing operation, to support object detection and tracking operations at host processor 804. The first subset of the pixel cells can be selectively enabled to capture, as active pixels, only the pixel data relevant to the tracking and detection of the object, or to transmit only the active pixels to the frame buffer, to support the sparse image sensing operation. The first programming data can be generated based on, for example, feedback data from host processor 804.

In step 1504, sensor compute circuit 806 receives, from frame buffer 809, a first image frame that contains at least some of the active pixels generated by the first subset of the pixel cells, the pixel cells being selected by the image sensor based on the first programming data. The first image frame further contains inactive pixels corresponding to a second subset of the pixel cells not selected to generate active pixels. In some examples, as described in FIGS. 12A through 12C, frame buffer 809 can also overwrite some of the pixels in the first image frame with a predetermined value to indicate that those pixels are static pixels and have not experienced a threshold degree of change over multiple image frames.

Specifically, the frame buffer can detect static pixels among the active pixels output by the image sensor and store pixel values for those pixels to signal to the image processor that they are static pixels. For example, the frame buffer can store the most recent pixel data (including both active and inactive pixels) from each pixel cell of the image sensor as the first image frame. For each of the active pixels, the frame buffer can determine the degree of change of the pixel with respect to a previous frame, such as the image frame immediately preceding the first image frame. The frame buffer can set the pixel values in various ways to indicate static pixels. For example, the frame buffer can set the pixel value of a pixel in the frame buffer based on a leaky integrator function having a time constant and on the number of consecutive image frames across which the pixel output by the image sensor remains static. If the pixel remains static for a large number of consecutive image frames, its pixel value can settle at a predetermined pixel value. As another example, if the pixel remains static for a threshold number of consecutive image frames (e.g., 10), the frame buffer can set the predetermined pixel value for the pixel in the frame buffer. The predetermined pixel value can correspond to black (zero), white (255), gray (128), or any other value that indicates a static pixel.

In step 1506, image processor 810 performs an image-processing operation on a first subset of the pixels of the first image frame to generate a processing output, with a second subset of the pixels of the first image frame excluded from the image-processing operation. The first subset of pixels of the first image frame on which the image-processing operation is performed can correspond to, for example, the active pixels, the non-static pixels that experience a certain degree of change between frames, and so on. For example, the first subset of pixels can correspond to an object of interest being tracked/detected by the object detection and tracking operations at host processor 804.

In some examples, the image-processing operation can include a neural network operation. Specifically, referring to FIG. 10A, image processor 810 can include data processing circuit 1006 to provide hardware acceleration for the neural network operation, such as a multi-layer convolutional neural network (CNN) including an input layer and an output layer. The image processor can include computation memory 1002 to store the input image frame and a set of weights associated with each neural network layer. The set of weights can represent the features of the object to be detected. The image processor can also include controller 1004 to control the data processing circuit to fetch the input image frame data and the weights from the computation memory. The controller can control the data processing circuit to perform arithmetic operations, such as multiply-and-accumulate (MAC) operations, between the input image frame and the weights to generate intermediate output data for the input layer. The intermediate output data are post-processed based on, for example, activation functions, pooling operations, etc., and the post-processed intermediate output data can then be stored in the computation memory. The post-processed intermediate output data can be fetched from the computation memory and provided to the next neural network layer as inputs. The arithmetic operations, as well as the fetching and storing of the intermediate output data, are repeated for all of the layers up to the output layer to generate the neural network output. The neural network output can indicate, for example, the likelihood that the object is present in the input image frame, and the pixel locations of the object in the input image frame.

The controller can configure the data processing circuit to process sparse image data in an efficient manner. For example, for the input layer, the controller can control the data processing circuit to fetch only the first subset of pixels and the corresponding weights from the computation memory, and to perform the MAC operations only on the active pixels and the corresponding weights to generate the subset of the intermediate outputs that corresponds to the active pixels for the input layer. The controller can also determine, based on the topology of the neural network and the connections among subsequent neural network layers, the subset of the intermediate output data at each subsequent layer that can be traced back to the active pixels. The controller can control the data processing circuit to perform the MAC operations so as to generate only that subset of the intermediate output data at each subsequent neural network layer. In addition, to reduce accesses to the computation memory, a predetermined value (e.g., zero) for the intermediate output data of each layer can be stored in the computation memory before the neural network operation, and only the intermediate output data for the active pixels are updated. All of these operations can reduce the power consumption of the neural network operation on the sparse image data.
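The pre-fill-then-update-sparsely strategy above can be sketched in one dimension. This is an illustrative stand-in for a layer's MAC stage, not the hardware design: the elementwise product, the flat lists, and the function name are assumptions.

```python
# Minimal sketch of the sparse MAC strategy: intermediate outputs are
# pre-filled with a predetermined value (zero), and products are computed
# and written only for the active input positions, so inactive positions
# incur neither arithmetic nor memory-write traffic.

def sparse_dot_products(pixels, weights, active_indices):
    """1-D stand-in for one layer: out[i] = pixels[i] * weights[i], active only."""
    out = [0] * len(pixels)                 # predetermined value for inactive outputs
    for i in active_indices:
        out[i] = pixels[i] * weights[i]     # MAC performed only for active pixels
    return out
```

With four pixels of which only indices 1 and 3 are active, two multiplies are performed instead of four, and the other outputs keep their pre-stored zeros.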

In some examples, to further reduce power consumption and improve power and computational efficiency, the data processing circuit can perform the image-processing operation (e.g., the neural network operation) only on the non-static pixels of the first image frame to generate updated outputs for the non-static pixels. For the static pixels (which can include the inactive pixels), the image-processing operation can be skipped, while the outputs from the image-processing operation on the previous image frame can be retained. Where the image-processing operation includes a neural network operation, the controller can control the data processing circuit to fetch only the non-static pixels and the corresponding weight data from the computation memory to update the subset of the intermediate output data that corresponds to the non-static pixels for the input layer. The rest of the intermediate output data in the computation memory for the input layer, which corresponds to the static pixels (obtained from the previous image frame) and to the inactive pixels (e.g., having a predetermined value such as zero), can be retained. The controller can also determine, based on the topology of the neural network and the connections among subsequent neural network layers, the subset of the intermediate output data at each subsequent layer that can be traced back to the non-static pixels, and update only that subset of the intermediate output data, to reduce accesses to the computation memory and reduce power consumption.

In some examples, the image processor can also generate additional information to facilitate the processing of the non-static pixels. For example, the image processor can determine, based on the topology of the model, a data change propagation map that tracks the propagation of data changes from the input layer to the output layer of the neural network model. Based on the propagation map, as well as the static pixels identified by the frame buffer, the image processor can identify the input data that are non-static for each neural network layer, and fetch only those input data for the neural network operation at each layer. In addition, the image processor can determine, based on the topology of the neural network model, the threshold degree of change used for the static/non-static pixel determination, to ensure that pixels determined to be non-static can cause the required degree of change at the output layer. The image processor can also track the changes in pixels between consecutive frames as well as between non-consecutive frames. The image processor can identify, as non-static pixels, pixels that exhibit small changes between consecutive frames but large changes between non-consecutive frames, so that the image processor can perform the image-processing operation on those pixels.

In step 1508, programming map generator 812 generates second programming data based on the processing output. The second programming data can reflect, for example, the movement of the object, a change in the part of the object being tracked, etc., based on the processing output.

In step 1510, programming map generator 812 transmits the second programming data to the image sensor to generate second active pixels for a second image frame.

Some portions of this description describe examples of the present disclosure in terms of algorithms and symbolic representations of operations on information. These algorithmic descriptions and representations are commonly used by those skilled in the data processing arts to convey the substance of their work effectively to others skilled in the art. These operations, while described functionally, computationally, or logically, are understood to be implemented by computer programs or equivalent electrical circuits, microcode, or the like. Furthermore, it has also proven convenient at times, without loss of generality, to refer to these arrangements of operations as modules. The described operations and their associated modules can be embodied in software, firmware, and/or hardware.

The described steps, operations, or processes can be performed or implemented with one or more hardware or software modules, alone or in combination with other devices. In some examples, a software module is implemented with a computer program product comprising a computer-readable medium containing computer program code, which can be executed by a computer processor to perform any or all of the steps, operations, or processes described.

Examples of the present disclosure can also relate to an apparatus for performing the described operations. The apparatus can be specially constructed for the required purposes, and/or it can comprise a general-purpose computing device selectively activated or reconfigured by a computer program stored in the computer. Such a computer program can be stored in a non-transitory, tangible computer-readable storage medium, or any type of medium suitable for storing electronic instructions, which can be coupled to a computer system bus. Furthermore, any computing system referred to in this specification can include a single processor or can be an architecture employing multiple processors for increased computing capability.

本發明之實例亦可關於由本文中所描述的計算程序產生之產品。此產品可包含產生於計算程序之資訊,且可包括本文中所描述之電腦程式產品或其他資料組合的任何實例,其中該資訊儲存於非暫時性有形電腦可讀取儲存媒體上。Examples of the present invention may also relate to products produced by the computational programs described herein. Such a product may include information generated by a computing program, and may include any instance of the computer program product or other combination of data described herein, where the information is stored on a non-transitory tangible computer-readable storage medium.

用於本說明書中之語言主要出於可讀性及指導性之目的而加以選擇,且其可能尚未經選擇以劃定或限定本發明標的。因此,意欲本發明之範圍不受此詳細描述限制,而實際上由關於基於此處之應用發佈的任何申請專利範圍限制。因此,實例之揭露內容意欲說明但不限制在以下申請專利範圍中闡述的本發明之範圍。The language used in this specification has been principally selected for readability and instructional purposes, and it may not have been selected to delineate or define the inventive subject matter. Therefore, it is intended that the scope of the present invention be limited not by this detailed description, but rather by the scope of any claims issued with respect to applications based herein. Accordingly, the disclosure of examples is intended to illustrate, but not to limit, the scope of the invention set forth in the following claims.

100:近眼顯示器 105:框架 110:顯示器 120a:影像感測器 120b:影像感測器 120c:影像感測器 120d:影像感測器 130:主動照明器 135:眼球 140a:照明器 140b:照明器 140c:照明器 140d:照明器 140e:照明器 140f:照明器 150a:影像感測器 150b:影像感測器 200:橫截面 210:波導顯示器總成 220:眼球 230:出射光瞳 300:波導顯示器 310:源總成 320:輸出波導 325:照明器 330:控制器 340:影像光 350:耦接元件 355:影像光 360:引導元件 365:解耦元件 370:影像感測器 370-1:第一側 370-2:第二側 372:物件 374:光 376:光源 378:光 390:遠端控制台 400:橫截面 402:像素單元 404:機械快門 406:光學濾光片陣列 410:源 415:光學系統 500:系統 510:控制電路 525:位置感測器 530:慣性量測單元(IMU) 535:成像裝置 540:輸入/輸出介面 545:應用程式儲存器 550:追蹤模組 555:引擎 600:影像感測器 601:像素單元 602:光電二極體 603:電子快門開關 604:轉換開關 605:電荷儲存裝置 606:緩衝器 607:量化器 608:量測資料 640:照明器 642:結構化光之圖案 650:物件 652:反射光之圖案 660:像素單元 662:像素單元 664:像素單元 666:像素單元 700:影像圖框 702:所關注區(ROI) 704:物件 710:影像圖框 712:ROI 720:影像圖框 722:車輛 724:人 726:像素 728:像素 730:影像 732:影像 734:像素 736:像素 738:瞳孔 739:閃爍 740:影像 742:像素 744:像素 750:影像 800:成像系統 802:影像感測器 804:主機處理器 806:感測器計算電路 808:像素單元陣列 809:圖框緩衝器 810:影像處理器 812:程式化地圖產生器 814:應用程式 820:第一程式化信號 822:第一影像圖框 824:影像圖框 832:第二程式化信號 840:特徵 842:特徵 850:所關注區(ROI) 852:ROI資訊 854:ROI資訊 904:行控制器 906:列控制器 908a:行匯流排 908b:行匯流排 908c:行匯流排 908n:行匯流排 910a:列匯流排 910b:列匯流排 910n:列匯流排 912:程式化匯流排 920:程式化信號剖析器 926:像素級程式化信號 930:行位址信號 932:列位址信號 940:像素陣列程式化地圖 950:像素單元 951:重置開關 952:光電二極體 952a:光電二極體 952b:光電二極體 954:開關 955:記憶體 956:電流源 958:功率閘極 960:比較器 961:電源開關 961a:電源開關 961b:電源開關 962:輸出邏輯 970:像素單元控制器 972:特徵提取電路 973:機器學習模型 975:比較電路 976:記憶體 1002:計算記憶體 1004:控制器 1006:資料處理電路 1012:前側表面 1020:CNN 1022:輸入影像 1024:第一輸入卷積層 1026:中間輸出張量 1028:合併層 1030:中間輸出張量 1032:第二卷積層 1034:中間輸出張量 1036:合併層 1038:中間輸出張量 1040:全連接層 1042:神經網路輸出 1050:算術電路 1050a:算術電路 1050b:算術電路 1050c:算術電路 1050d:算術電路 1050e:算術電路 1050f:算術電路 1052:加法器 1054:倍增器 1056:後處理電路 1064:輸入像素 1066:輸入像素 1068:中間輸出 1070:中間輸出 1100:稀疏影像 1102:主動像素 1104:主動像素 1106:主動像素 1108:像素 1110:稀疏資料處置電路 1120:停用電路 1122:多工器 1200:主動像素 1202:像素 1204:緩衝器 1212:像素更新模組 1214:緩衝器記憶體 1216:像素更新追蹤表 1220:表 1222:表 1302:神經網路操作控制器 1304:資料傳播控制器 1306:控制信號 1310:資料改變傳播地圖 1314a:黑色區 1314b:黑色區 1314c:黑色區 1314d:黑色區 1314n:黑色區 1316:殘差處置電路 1320:改變臨限值 1330:像素 1332:像素 1400:半導體基板 1402a:半導體基板 1402b:半導體基板 1404:豎直互連件 
1406:豎直互連件 1408:背側表面 1410:前側表面 1412:前側表面 1420:肩部區 1422:肩部區 1500:方法 1502:步驟 1504:步驟 1506:步驟 1508:步驟 1510:步驟 A:方向 A 00:像素級程式化資料 AB:控制信號 BIAS1:偏置信號 BIAS2:偏置信號 B:方向 C:時間常數 D:方向 GAIN:增益控制信號 i:輸入資料元素 P 00:框/像素單元 P 01:框/像素單元 P 0j:框/像素單元 p_in:輸入部分總和 p_out:輸出部分總和 PWR_GATE:控制信號 S:預定像素值 S0:差 T0:時間 T1:時間 T2:時間 T3:時間 T4:時間 TG:控制信號 VREF:參考電壓/控制信號 [W 0]:權重矩陣 [W 1]:第二權重陣列 [W 2]:第三權重陣列 w:權重資料元素 100: Near Eye Display 105: Frame 110: Display 120a: Image Sensor 120b: Image Sensor 120c: Image Sensor 120d: Image Sensor 130: Active Illuminator 135: Eyeball 140a: Illuminator 140b: Illuminator 140c: Illuminator 140d: Illuminator 140e: Illuminator 140f: Illuminator 150a: Image Sensor 150b: Image Sensor 200: Cross Section 210: Waveguide Display Assembly 220: Eyeball 230: Exit Pupil 300: Waveguide Display 310: source assembly 320: output waveguide 325: illuminator 330: controller 340: image light 350: coupling element 355: image light 360: guiding element 365: decoupling element 370: image sensor 370-1: th Side 370-2: Second Side 372: Object 374: Light 376: Light Source 378: Light 390: Remote Console 400: Cross Section 402: Pixel Unit 404: Mechanical Shutter 406: Optical Filter Array 410: Source 415 : Optical System 500: System 510: Control Circuit 525: Position Sensor 530: Inertial Measurement Unit (IMU) 535: Imaging Device 540: Input/Output Interface 545: Application Storage 550: Tracking Module 555: Engine 600 : image sensor 601 : pixel unit 602 : photodiode 603 : electronic shutter switch 604 : switch 605 : charge storage device 606 : buffer 607 : quantizer 608 : measurement data 640 : illuminator 642 : structured Pattern of light 650: Object 652: Pattern of reflected light 660: Pixel unit 662: Pixel unit 664: Pixel unit 666: Pixel unit 700: Image frame 702: Region of interest (ROI) 704: Object 710: Image frame 712 :ROI 720:ImageFrame722:Vehicle 724:People726:Pixel728:Pixel730:Image732:Image734:Pixel736:Pixel738:Pupil739:Flicker740:Image742:Pixel744:Pixel750:Image800 : Imaging System 802 
: Image Sensor 804 : Host Processor 806 : Sensor Computing Circuit 808 : Pixel Cell Array 809 : Frame Buffer 810 : Image Processor 812 : Programmable Map Generator 814 : Application 820 : First Stylized Signal 822: First Image Frame 824: Image Frame 832: Second Stylized Signal 840: Feature 842: Feature 850: Region of Interest (ROI) 852: ROI Information 854: ROI Information 904: Line Control Controller 906:Column Controller 908a:Row Bus 908b:Row Bus 908c:Row Bus 908n:Row Bus 910a:Column Bus 910b:Column Bus 910n:Column Bus 912:Programmed Bus 920:Program Programmable Signal Parser 926: Pixel-Level Programmable Signals 930: Row Address Signals 932: Column Address Signals 940: Pixel Array Stylized Map 950: Pixel Cell 951: Reset Switch 952: Photodiode 952a: Photodiode 952b: Photodiode 954: Switch 955 : Memory 956: Current Source 958: Power Gate 960: Comparator 961: Power Switch 961a: Power Switch 961b: Power Switch 962: Output Logic 970: Pixel Cell Controller 972: Feature Extraction Circuit 973: Machine Learning Model 975: Comparison Circuit 976: Memory 1002: Computational Memory 1004: Controller 1006: Data Processing Circuit 1012: Front Surface 1020: CNN 1022: Input Image 1024: First Input Convolutional Layer 1026: Intermediate Output Tensor 1028: Merging Layer 1030: Intermediate output tensor 1032: Second convolutional layer 1034: Intermediate output tensor 1036: Merging layer 1038: Intermediate output tensor 1040: Fully connected layer 1042: Neural network output 1050: Arithmetic circuit 1050a: Arithmetic circuit 1050b: Arithmetic circuit 1050c : Arithmetic circuit 1050d: Arithmetic circuit 1050e: Arithmetic circuit 1050f: Arithmetic circuit 1052: Adder 1054: Multiplier 1056: Post-processing circuit 1064: Input pixel 1066: Input pixel 1068: Intermediate output 1070: Intermediate output 1100: Sparse image 1102: Active Pixel 1104: Active Pixel 1106: Active Pixel 1108: Pixel 1110: Sparse Data Handling Circuit 1120: Disable Circuit 1122: Multiplexer 1200: Active Pixel 
1202: Pixel 1204: Buffer 1212: Pixel Update Module 1214: Buffer memory 1216: pixel update tracking table 1220: table 1222: table 1302: neural network operation controller 1304: data propagation controller 1306: control signal 1310: data change propagation map 1314a: black area 1314b: black area 1314c: black area 1314d: Black area 1314n: Black area 1316: Residual processing circuit 1320: Change threshold value 1330: Pixel 1332: Pixel 1400: Semiconductor substrate 1402a: Semiconductor substrate 1402b: Semiconductor substrate 1404: Vertical interconnect 1406: Vertical interconnect Connector 1408: Back Surface 1410: Front Surface 1412: Front Surface 1420: Shoulder Region 1422: Shoulder Region 1500: Method 1502: Step 1504: Step 1506: Step 1508: Step 1510: Step A: Direction A 00 : Pixel Stage programming data AB: Control signal BIAS1: Bias signal BIAS2: Bias signal B: Direction C: Time constant D: Direction GAIN: Gain control signal i: Input data element P 00 : Box/pixel unit P 01 : Box/ Pixel unit P 0j : frame/pixel unit p_in: input partial sum p_out: output partial sum PWR_GATE: control signal S: predetermined pixel value S0: difference T0: time T1: time T2: time T3: time T4: time TG: control signal VREF: reference voltage/control signal [ W 0 ]: weight matrix [W 1 ]: second weight array [W 2 ]: third weight array w: weight data element

參考以下諸圖描述說明性實例。Illustrative examples are described with reference to the following figures.

[ 1A] [ 1B]為近眼顯示器之實例的圖式。 [ FIG. 1A] and [ FIG. 1B] are diagrams of examples of near-eye displays.

[ 2]為近眼顯示器之橫截面的實例。 [ FIG. 2] is an example of a cross section of a near-eye display.

[ 3]說明具有單個源總成之波導顯示器之實例的等角視圖。 [ FIG. 3] An isometric view illustrating an example of a waveguide display with a single source assembly.

[ 4]說明波導顯示器之實例的橫截面。 [ FIG. 4] A cross section illustrating an example of a waveguide display.

[ 5]係包括近眼顯示器之系統之實例的方塊圖。 [ FIG. 5] is a block diagram of an example of a system including a near-eye display.

[ 6A] [ 6B]說明影像感測器及其操作之實例。 [ FIG. 6A] and [ FIG. 6B] illustrate an example of an image sensor and its operation.

[ 7A] [ 7B] [ 7C] [ 7D]說明 6A 及圖 6B之影像感測器的輸出所支援之應用程式的實例。 [ FIG. 7A] , [ FIG. 7B] , [ FIG. 7C] , and [ FIG. 7D] illustrate examples of applications supported by the output of the image sensor of FIGS. 6A and 6B .

[ 8A ] [ 8B ]說明用以支援 7A 至圖 7D中所說明的操作之成像系統的實例。 [ FIG. 8A ] and [ FIG. 8B ] illustrate an example of an imaging system to support the operations illustrated in FIGS. 7A - 7D .

[ 9A] [ 9B] [ 9C]說明 8A 及圖 8B之成像系統的實例內部組件及其操作。 [ FIG. 9A] , [ FIG. 9B] , and [ FIG. 9C] illustrate example internal components and operations of the imaging system of FIGS. 8A and 8B .

[ 10A] [ 10B] [ 10C]說明 8A 及圖 8B之影像處理器的實例內部組件及其操作。 [ FIG. 10A] , [ FIG. 10B] , and [ FIG. 10C] illustrate example internal components and operations of the image processor of FIGS. 8A and 8B .

[ 11A] [ 11B] [ 11C]說明 10A 至圖 10C之影像處理器的實例內部組件及其操作。 [ FIG. 11A] , [ FIG. 11B] , and [ FIG. 11C] illustrate example internal components and operations of the image processor of FIGS. 10A to 10C .

[ 12A] [ 12B] [ 12C]說明 8A 8B之圖框緩衝器的實例內部組件及其操作。 [ FIG. 12A] , [ FIG. 12B] and [ FIG. 12C] illustrate example internal components of the frame buffer of FIGS. 8A and 8B and their operation.

[ 13A] [ 13B] [ 13C]說明 10A 至圖 10C之影像處理器的實例內部組件及其操作。 [ FIG. 13A] , [ FIG. 13B] , and [ FIG. 13C] illustrate example internal components and operations of the image processor of FIGS. 10A - 10C .

[ 14A] [ 14B]說明 8A 至圖 13C之影像感測器的實體配置之實例。 [ FIG. 14A] and [ FIG. 14B] illustrate an example of a physical configuration of the image sensor of FIGS. 8A to 13C .

[ 15]說明操作影像感測器之實例程序之流程圖。 [ FIG. 15] A flowchart illustrating an example procedure for operating an image sensor.

諸圖僅出於說明之目的描繪本發明之實例。熟習此項技術者將易於自以下描述認識到,在不脫離本發明之原理或所宣稱之優點的情況下,可使用說明的結構及方法之替代性實例。The figures depict examples of the invention for purposes of illustration only. Those skilled in the art will readily recognize from the following description that alternative examples of the structures and methods illustrated may be employed without departing from the principles or claimed advantages of the invention.

在附圖中,類似組件及/或特徵可具有相同元件符號。此外,可藉由在元件符號之後加上破折號及在類似組件之間進行區分之第二標記來區分相同類型之各種組件。若在說明書中僅使用第一元件符號,則描述適用於具有相同第一元件符號的類似組件中之任一者而與第二元件符號無關。In the drawings, similar components and/or features may have the same reference numerals. In addition, various components of the same type may be distinguished by adding a dash after the reference symbol and a second designation to distinguish between similar components. If only the first reference number is used in the specification, the description applies to any of the similar components having the same first reference number regardless of the second reference number.

800:成像系統 800: Imaging System

802:影像感測器 802: Image Sensor

804:主機處理器 804: Host processor

806:感測器計算電路 806: Sensor calculation circuit

808:像素單元陣列 808: Pixel cell array

809:圖框緩衝器 809: Frame buffer

810:影像處理器 810: Image Processor

812:程式化地圖產生器 812: Stylized Map Generator

814:應用程式 814: Application

820:第一程式化信號 820: First programmed signal

822:第一影像圖框 822: First image frame

824:影像圖框 824: Image frame

832:第二程式化信號 832: Second programmed signal

Claims (20)

一種設備,其包含: 一影像感測器,其包含複數個像素單元,該影像感測器可由程式化資料進行配置以選擇該些像素單元之一子集以產生主動像素; 一圖框緩衝器;及 一感測器計算電路,其被配置以: 自該圖框緩衝器接收一第一影像圖框,其包含第一主動像素及第一非主動像素,該些第一主動像素係由基於第一程式化資料所選擇之該些像素單元之一第一子集產生,該些第一非主動像素對應於未被選擇用於產生該些第一主動像素之該些像素單元之一第二子集; 對該第一影像圖框之像素之一第一子集執行一影像處理操作,以產生一處理輸出,其中該第一影像圖框之像素之一第二子集被排除在該影像處理操作之外; 基於該處理輸出,產生第二程式化資料;及 將該第二程式化資料傳輸至該影像感測器以選擇該些像素單元之一第二子集以產生用於一第二影像圖框之第二主動像素。 An apparatus comprising: an image sensor including a plurality of pixel cells, the image sensor being configurable by programming data to select a subset of the pixel cells to generate active pixels; a frame buffer; and a sensor compute circuit configured to: receive, from the frame buffer, a first image frame including first active pixels and first inactive pixels, the first active pixels being generated by a first subset of the pixel cells selected based on first programming data, the first inactive pixels corresponding to a second subset of the pixel cells not selected for generating the first active pixels; perform an image processing operation on a first subset of pixels of the first image frame to generate a processing output, wherein a second subset of pixels of the first image frame is excluded from the image processing operation; generate second programming data based on the processing output; and transmit the second programming data to the image sensor to select a second subset of the pixel cells to generate second active pixels for a second image frame. 如請求項1之設備,其中該影像處理操作包含一神經網路模型之一處理操作,以偵測該第一影像圖框中之一所關注物件;且 其中像素之該第一子集對應於該所關注物件。 The apparatus of claim 1, wherein the image processing operation includes a processing operation of a neural network model to detect an object of interest in the first image frame; and wherein the first subset of pixels corresponds to the object of interest.
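The closed loop recited in claims 1 and 2 can be sketched as follows. This is an illustrative Python model only: the `sensor` and `detect_roi` callables, the zero value standing in for inactive pixels, and the rectangular region of interest are assumptions for the sketch, not part of the disclosure.

```python
import numpy as np

def sparse_sensing_step(sensor, programming_map, detect_roi):
    """One iteration of the sensing/processing feedback loop (illustrative).

    sensor: callable returning a full-resolution frame (2-D array); assumed.
    programming_map: boolean array marking which pixel cells are selected.
    detect_roi: image-processing operation run on the active pixels only;
                returns ((r0, r1), (c0, c1)) bounds of a region of interest.
    """
    frame = sensor()
    # Only the selected subset of pixel cells contributes active pixels;
    # unselected cells yield inactive pixels (modeled here as zeros).
    active_frame = np.where(programming_map, frame, 0)
    # The processing operation considers only the active subset.
    (r0, r1), (c0, c1) = detect_roi(active_frame, programming_map)
    # Second programming data: select only the pixel cells inside the
    # region of interest for the next frame.
    next_map = np.zeros_like(programming_map)
    next_map[r0:r1, c0:c1] = True
    return active_frame, next_map
```

In use, the returned `next_map` would be transmitted back to the image sensor as the second programming data, closing the loop for the second image frame.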
如請求項2之設備,其中該感測器計算電路與被配置用以執行一應用程式之一主機裝置耦接,該應用程式使用該所關注物件之偵測的結果;且 其中該感測器計算電路被配置以自該主機裝置接收關於該所關注物件的資訊。 The apparatus of claim 2, wherein the sensor compute circuit is coupled to a host device configured to execute an application that uses a result of the detection of the object of interest; and wherein the sensor compute circuit is configured to receive information about the object of interest from the host device. 如請求項2之設備,其中該感測器計算電路包含: 一計算記憶體,其被配置以儲存:至該神經網路的神經網路層的輸入資料、該神經網路層之權重資料以及該神經網路層之中間輸出資料; 一資料處理電路,其被配置以對該輸入資料及該權重資料執行該神經網路層之算術運算以產生該中間輸出資料;及 一計算控制器,其被配置以: 自該計算記憶體提取該輸入資料之一第一子集以及對應於該輸入資料之該第一子集的該權重資料之一第一子集,該輸入資料之該第一子集對應於該些第一主動像素中之至少一些; 控制該資料處理電路以對該輸入資料之該第一子集以及該權重資料之該第一子集執行該些算術運算以產生用於該第一影像圖框之該中間輸出資料之一第一子集,該中間輸出資料之該第一子集對應於該輸入資料之該第一子集; 將用於該第一影像圖框之該中間輸出資料之該第一子集儲存在該計算記憶體中;及 將用於該第一影像圖框之該中間輸出資料之一第二子集的一預定值儲存在該計算記憶體中,該中間輸出資料之該第二子集對應於該些非主動像素。 The apparatus of claim 2, wherein the sensor compute circuit comprises: a computing memory configured to store input data to a neural network layer of the neural network, weight data of the neural network layer, and intermediate output data of the neural network layer; a data processing circuit configured to perform arithmetic operations of the neural network layer on the input data and the weight data to generate the intermediate output data; and a computing controller configured to: fetch, from the computing memory, a first subset of the input data and a first subset of the weight data corresponding to the first subset of the input data, the first subset of the input data corresponding to at least some of the first active pixels; control the data processing circuit to perform the arithmetic operations on the first subset of the input data and the first subset of the weight data to generate a first subset of the intermediate output data for the first image frame, the first subset of the intermediate output data corresponding to the first subset of the input data; store the first subset of the intermediate output data
for the first image frame in the computing memory; and store, in the computing memory, a predetermined value for a second subset of the intermediate output data of the first image frame, the second subset of the intermediate output data corresponding to the inactive pixels.

如請求項4之設備,其中基於在該影像處理操作之前重置該計算記憶體來儲存該預定值。The apparatus of claim 4, wherein the predetermined value is stored based on resetting the computing memory prior to the image processing operation.

如請求項4之設備,其中該計算控制器被配置以: 自該計算記憶體提取該輸入資料; 自經提取輸入資料識別該輸入資料之該第一子集;及 將該輸入資料之經識別第一子集提供至該計算控制器。 The apparatus of claim 4, wherein the computing controller is configured to: extract the input data from the computing memory; identify the first subset of the input data from the extracted input data; and provide the identified first subset of the input data to the computing controller.

如請求項4之設備,其中該計算控制器被配置以: 判定儲存該輸入資料之該第一子集之該計算記憶體的一位址區;及 自該計算記憶體提取該輸入資料之該第一子集。 The apparatus of claim 4, wherein the computing controller is configured to: determine an address region of the computing memory that stores the first subset of the input data; and fetch the first subset of the input data from the computing memory.

如請求項7之設備,其中該位址區係基於以下各者中之至少一者而判定:該第一程式化資料,或關於該神經網路模型之神經網路層之間的連接性之資訊。The apparatus of claim 7, wherein the address region is determined based on at least one of: the first programming data, or information about connectivity between neural network layers of the neural network model.
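The sparse layer computation of claims 4 and 5 can be sketched as follows. A hypothetical pointwise (1×1) layer stands in for the neural network layer, and the array shapes are assumptions; the point illustrated is that only the active subset is fetched and computed, while inactive positions keep a predetermined value as if the computing memory had been reset beforehand.

```python
import numpy as np

def sparse_layer(inputs, weights, active_mask, predetermined=0.0):
    """Compute a pointwise layer output only at active positions (sketch).

    inputs: (H, W, Cin) array; weights: (Cin, Cout); active_mask: (H, W) bool.
    """
    h, w, _ = inputs.shape
    cout = weights.shape[1]
    # "Reset" the computing memory: every output location starts at the
    # predetermined value, so skipped (inactive) positions simply keep it.
    out = np.full((h, w, cout), predetermined, dtype=float)
    # Fetch only the first subset of the input data (active positions)
    # and run the arithmetic operations on that subset alone.
    ys, xs = np.nonzero(active_mask)
    out[ys, xs] = inputs[ys, xs] @ weights
    return out
```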
如請求項4之設備,其中: 該些第一主動像素包括靜態像素及非靜態像素; 該些靜態像素對應於該些第一主動像素之一第一子集,對於該些第一主動像素之該第一子集,該第一影像圖框與一先前影像圖框之間的像素值之改變程度低於一改變臨限值; 該些非靜態像素對應於該些第一主動像素之一第二子集,對於該些第一主動像素之該第二子集,該第一影像圖框與該先前影像圖框之間的該些像素值之改變程度高於該改變臨限值;且 該計算控制器被配置以提取對應於該些第一主動像素之該些非靜態像素的該輸入資料之該第一子集。 The apparatus of claim 4, wherein: the first active pixels include static pixels and non-static pixels; the static pixels correspond to a first subset of the first active pixels for which a degree of change of pixel values between the first image frame and a previous image frame is below a change threshold; the non-static pixels correspond to a second subset of the first active pixels for which the degree of change of the pixel values between the first image frame and the previous image frame is above the change threshold; and the computing controller is configured to fetch the first subset of the input data corresponding to the non-static pixels of the first active pixels.

如請求項9之設備,其中該預定值為一第一預定值; 其中該圖框緩衝器被配置以儲存用於該些靜態像素中之每一者之一第二預定值以標示該些靜態像素;且 其中該計算控制器被配置以基於偵測到該些靜態像素具有該第二預定值而將該些靜態像素排除在該資料處理電路之外。 The apparatus of claim 9, wherein the predetermined value is a first predetermined value; wherein the frame buffer is configured to store a second predetermined value for each of the static pixels to mark the static pixels; and wherein the computing controller is configured to exclude the static pixels from the data processing circuit based on detecting that the static pixels have the second predetermined value.

如請求項10之設備,其中該圖框緩衝器被配置以基於判定橫跨一臨限數目個圖框之一像素之改變程度低於該改變臨限值而儲存用於該像素之該第二預定值。The apparatus of claim 10, wherein the frame buffer is configured to store the second predetermined value for a pixel based on determining that the degree of change of the pixel across a threshold number of frames is below the change threshold.
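The static/non-static split of claims 9 and 10 can be sketched as follows. Absolute per-pixel difference is assumed as the "degree of change"; the claims do not fix a particular metric.

```python
import numpy as np

def classify_pixels(curr, prev, change_threshold):
    """Split active pixels into static and non-static sets (sketch).

    A pixel whose change versus the previous frame stays below the
    threshold is static and can be skipped by the data processing circuit;
    the remaining pixels are non-static and are processed.
    """
    change = np.abs(curr.astype(np.int32) - prev.astype(np.int32))
    non_static = change >= change_threshold
    return ~non_static, non_static
```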
如請求項10之設備,其中該圖框緩衝器被配置以基於具有一時間常數之一漏泄積分器函數且基於一像素上次經歷大於該改變臨限值之一改變程度的時間來設定更新該像素之一像素值。The apparatus of claim 10, wherein the frame buffer is configured to set an updated pixel value of a pixel based on a leaky integrator function having a time constant and based on when the pixel last experienced a degree of change greater than the change threshold.

如請求項9之設備,其中該計算控制器被配置以: 基於該神經網路模型之一拓樸結構來判定一資料改變傳播地圖,該資料改變傳播地圖指示該些非靜態像素之改變如何經由該神經網路模型之不同神經網路層傳播; 基於該資料改變傳播地圖,判定該計算記憶體之一第一位址區,用以提取該輸入資料之該第一子集,且判定該計算記憶體之一第二位址區,用以儲存該中間輸出資料之該第一子集; 自該第一位址區提取該輸入資料之該第一子集;及 在該第二位址區儲存該中間輸出資料之該第一子集。 The apparatus of claim 9, wherein the computing controller is configured to: determine a data change propagation map based on a topology of the neural network model, the data change propagation map indicating how changes to the non-static pixels propagate through different neural network layers of the neural network model; based on the data change propagation map, determine a first address region of the computing memory from which to fetch the first subset of the input data, and a second address region of the computing memory in which to store the first subset of the intermediate output data; fetch the first subset of the input data from the first address region; and store the first subset of the intermediate output data in the second address region.

如請求項9之設備,其中該計算控制器被配置以基於該神經網路模型之一深度及該神經網路模型之每一神經網路層處之一量化精確度來判定該改變臨限值。The apparatus of claim 9, wherein the computing controller is configured to determine the change threshold based on a depth of the neural network model and a quantization accuracy at each neural network layer of the neural network model.
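The leaky-integrator update of claim 12 admits a simple sketch. The exponential form and its parameters below are assumptions: the claim only requires a time constant and a dependence on when the pixel last experienced a change above the threshold.

```python
import math

def leaky_update(stored, incoming, frames_since_change, time_constant):
    """Leaky-integrator frame-buffer update (illustrative sketch).

    The stored pixel value decays toward the incoming value at a rate set
    by the time constant and by how many frames have elapsed since the
    pixel last changed significantly.
    """
    alpha = math.exp(-frames_since_change / time_constant)
    return alpha * stored + (1.0 - alpha) * incoming
```

A pixel that just changed (`frames_since_change == 0`) keeps its stored value; a pixel that has been quiet for many time constants converges to the incoming value.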
如請求項9之設備,其中該改變臨限值為一第一改變臨限值;且 其中該計算控制器被配置以: 追蹤兩個非連續圖框之間的該些第一主動像素之該些像素值的該改變程度;及 基於該改變程度超過一第二改變臨限值而將該些第一主動像素之一第三子集判定為非靜態像素。 The apparatus of claim 9, wherein the change threshold is a first change threshold; and wherein the computing controller is configured to: track the degree of change of the pixel values of the first active pixels between two non-consecutive frames; and determine a third subset of the first active pixels to be non-static pixels based on the degree of change exceeding a second change threshold.

如請求項1之設備,其中該影像感測器實施於一第一半導體基板中; 其中該圖框緩衝器及該感測器計算電路實施於一或多個第二半導體基板中;且 其中該第一半導體基板及該一或多個第二半導體基板形成一堆疊並且容納於一單個半導體封裝中。 The apparatus of claim 1, wherein the image sensor is implemented in a first semiconductor substrate; wherein the frame buffer and the sensor compute circuit are implemented in one or more second semiconductor substrates; and wherein the first semiconductor substrate and the one or more second semiconductor substrates form a stack and are housed in a single semiconductor package.
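Claim 15's second threshold over non-consecutive frames can be sketched as follows, again assuming absolute difference as the degree of change. The purpose is to catch slowly drifting pixels whose frame-to-frame change never crosses the first threshold.

```python
import numpy as np

def reclassify_slow_movers(frame_a, frame_b, second_threshold):
    """Flag pixels as non-static based on two non-consecutive frames (sketch).

    Pixels whose accumulated change between the two frames exceeds the
    second change threshold are reclassified as non-static, even if their
    per-frame change stayed below the first threshold.
    """
    accumulated = np.abs(frame_b.astype(np.int32) - frame_a.astype(np.int32))
    return accumulated > second_threshold
```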
一種方法,其包含: 將第一程式化資料傳輸至包含複數個像素單元之一影像感測器,以選擇該些像素單元之一第一子集以產生第一主動像素; 自一圖框緩衝器接收包含該些第一主動像素及第一非主動像素之一第一影像圖框,該些第一非主動像素對應於未被選擇用於產生該些第一主動像素的該些像素單元之一第二子集; 對該第一影像圖框之像素之一第一子集執行一影像處理操作,以產生一處理輸出,其中該第一影像圖框之像素之一第二子集被排除在該影像處理操作之外; 基於該處理輸出,產生第二程式化資料;及 將該第二程式化資料傳輸至該影像感測器以選擇該些像素單元之一第二子集以產生用於一第二影像圖框之第二主動像素。 A method comprising: transmitting first programming data to an image sensor including a plurality of pixel cells to select a first subset of the pixel cells to generate first active pixels; receiving, from a frame buffer, a first image frame including the first active pixels and first inactive pixels, the first inactive pixels corresponding to a second subset of the pixel cells not selected for generating the first active pixels; performing an image processing operation on a first subset of pixels of the first image frame to generate a processing output, wherein a second subset of pixels of the first image frame is excluded from the image processing operation; generating second programming data based on the processing output; and transmitting the second programming data to the image sensor to select a second subset of the pixel cells to generate second active pixels for a second image frame. 如請求項17之方法,其中該影像處理操作包含一神經網路之一處理操作以偵測該第一影像圖框中之一所關注物件;且 其中像素之該第一子集對應於該所關注物件。 The method of claim 17, wherein the image processing operation includes a processing operation of a neural network to detect an object of interest in the first image frame; and wherein the first subset of pixels corresponds to the object of interest.
如請求項18之方法,其進一步包含: 在一計算記憶體中儲存至該神經網路的神經網路層之輸入資料、該神經網路層之權重資料; 自該計算記憶體提取該輸入資料之一第一子集以及對應於該輸入資料之該第一子集的該權重資料之一第一子集,該輸入資料之該第一子集對應於該些第一主動像素中之至少一些; 使用一資料處理電路對該輸入資料之該第一子集以及該權重資料之該第一子集執行算術運算以產生用於該第一影像圖框之中間輸出資料之一第一子集,該中間輸出資料之該第一子集對應於該輸入資料之該第一子集; 在該計算記憶體中儲存用於該第一影像圖框之該中間輸出資料之該第一子集;及 在該計算記憶體中儲存用於該第一影像圖框之該中間輸出資料之一第二子集的一預定值,該中間輸出資料之該第二子集對應於該些非主動像素。 The method of claim 18, further comprising: storing, in a computing memory, input data to a neural network layer of the neural network and weight data of the neural network layer; fetching, from the computing memory, a first subset of the input data and a first subset of the weight data corresponding to the first subset of the input data, the first subset of the input data corresponding to at least some of the first active pixels; performing, using a data processing circuit, arithmetic operations on the first subset of the input data and the first subset of the weight data to generate a first subset of intermediate output data for the first image frame, the first subset of the intermediate output data corresponding to the first subset of the input data; storing the first subset of the intermediate output data for the first image frame in the computing memory; and storing, in the computing memory, a predetermined value for a second subset of the intermediate output data of the first image frame, the second subset of the intermediate output data corresponding to the inactive pixels.
如請求項19之方法,其中: 該些第一主動像素包括靜態像素及非靜態像素; 該些靜態像素對應於該些第一主動像素之一第一子集,對於該些第一主動像素之該第一子集,該第一影像圖框與一先前影像圖框之間的像素值之改變程度低於一改變臨限值; 該些非靜態像素對應於該些第一主動像素之一第二子集,對於該些第一主動像素之該第二子集,該第一影像圖框與該先前影像圖框之間的該些像素值之改變程度高於該改變臨限值;且 該輸入資料之該第一子集對應於該些第一主動像素之該些非靜態像素。 The method of claim 19, wherein: the first active pixels include static pixels and non-static pixels; the static pixels correspond to a first subset of the first active pixels for which a degree of change of pixel values between the first image frame and a previous image frame is below a change threshold; the non-static pixels correspond to a second subset of the first active pixels for which the degree of change of the pixel values between the first image frame and the previous image frame is above the change threshold; and the first subset of the input data corresponds to the non-static pixels of the first active pixels.
Patent history: application TW109139740A, "Sparse image sensing and processing", filed 2020-11-13 (priority date 2020-11-13); published in Taiwan as TW202219890A on 2022-05-16.