CN117834830A - Image processor, processing method, storage medium, and augmented reality display device - Google Patents


Info

Publication number
CN117834830A
CN117834830A
Authority
CN
China
Prior art keywords
image
resolution
gaze
processing
processor
Legal status
Pending
Application number
CN202211185033.9A
Other languages
Chinese (zh)
Inventor
贾韬
王超昊
乐诚瑞
Current Assignee
Universal Gravitation Ningbo Electronic Technology Co., Ltd.
Original Assignee
Universal Gravitation Ningbo Electronic Technology Co., Ltd.
Application filed by Universal Gravitation Ningbo Electronic Technology Co., Ltd.
Priority to CN202211185033.9A
Priority to PCT/CN2023/106705 (published as WO2024066661A1)
Publication of CN117834830A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/0464: Convolutional networks [CNN, ConvNet]
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00: Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/10: Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N 13/106: Processing image signals
    • H04N 13/122: Improving the 3D impression of stereoscopic images by modifying image signal contents, e.g. by filtering or adding monoscopic depth cues

Abstract

The invention provides an image processor, an image processing method, a storage medium, and an augmented reality display device. The image processor is configured to: acquire an eye movement signal of a user and a first image to be processed; determine a gaze region in the first image based on the eye movement signal; and perform super-resolution processing on the gaze region to obtain a second image in which the local resolution of the gaze region reaches a target resolution. With this configuration, the image processor can determine the gaze region from the user's eye movement signal and perform targeted super-resolution processing on it, thereby meeting the user's high-resolution display requirement for the gaze region while reducing the latency and hardware requirements of high-resolution display.

Description

Image processor, processing method, storage medium, and augmented reality display device
Technical Field
The present invention relates to extended reality (XR) display technology, and more particularly, to an image processor, an image processing method, a computer-readable storage medium, and an augmented reality display device.
Background
Extended Reality (XR) display technology refers to technology that uses a computer to combine the real and the virtual, creating a virtual environment capable of human-computer interaction. It includes, but is not limited to, Augmented Reality (AR), Virtual Reality (VR), and Mixed Reality (MR) display technologies. By combining these three visual interaction technologies, XR display technology can give the experiencer a sense of immersion with seamless transitions between the virtual world and the real world.
For the high-resolution display requirements of the XR field, the super-resolution technology of existing XR devices cannot provide accurate and efficient super-resolution processing for the user's gaze region. To overcome this drawback of the prior art, there is a need in the art for an extended reality display technology that satisfies the user's high-resolution display requirement for the gaze region while reducing the latency and hardware requirements of high-resolution display.
Disclosure of Invention
The following presents a simplified summary of one or more aspects in order to provide a basic understanding of such aspects. This summary is not an extensive overview of all contemplated aspects; it is intended neither to identify key or critical elements of all aspects nor to delineate the scope of any or all aspects. Its sole purpose is to present some concepts of one or more aspects in a simplified form as a prelude to the more detailed description that is presented later.
In order to overcome the above drawbacks of the prior art, the present invention provides an image processor, an image processing method, a computer-readable storage medium, and an augmented reality display device, which can determine a gaze region according to an eye movement signal of a user and perform targeted super-resolution processing on the gaze region, thereby meeting the user's high-resolution display requirement for the gaze region while reducing the latency and hardware requirements of high-resolution display.
Specifically, the image processor provided according to the first aspect of the present invention is configured to: acquire an eye movement signal of a user and a first image to be processed; determine a gaze region in the first image based on the eye movement signal; and perform super-resolution processing on the gaze region to obtain a second image in which the local resolution of the gaze region reaches a target resolution.
Further, in some embodiments of the invention, the image processor is coupled to an eye tracker, and the step of determining a gaze region in the first image based on the eye movement signal comprises: performing an eye tracking calculation on the eye movement signal acquired from the eye tracker to determine a gaze point position of the user in the first image; updating gaze point information according to the gaze point position to construct an updated compression model; and determining coordinates of a plurality of partitions relative to the gaze point position according to compression parameters of the updated compression model, wherein at least one of the plurality of partitions includes the gaze region.
Further, in some embodiments of the present invention, the step of performing super-resolution processing on the gaze region to obtain a second image in which the local resolution of the gaze region reaches a target resolution comprises: determining an up-sampling magnification of at least one gaze region containing and/or adjacent to the gaze point position according to compression parameters of the updated compression model; and performing corresponding super-resolution processing on each gaze region of the first image according to the coordinates and up-sampling magnification of each gaze region to obtain the second image.
Further, in some embodiments of the invention, the image processor is further configured to: determine a non-gaze region in the first image based on the eye movement signal; and compress the non-gaze region to obtain a second image in which the local resolution of the gaze region reaches the target resolution and the local resolution of the non-gaze region is less than the target resolution.
Further, in some embodiments of the present invention, the step of compressing the non-gaze region to obtain a second image in which the local resolution of the gaze region reaches the target resolution and the local resolution of the non-gaze region is less than the target resolution comprises: determining a down-sampling magnification of at least one non-gaze region far from the gaze point position according to compression parameters of the updated compression model; and performing corresponding compression processing on each non-gaze region of the first image according to the coordinates and down-sampling magnification of each non-gaze region to obtain the second image.
Further, in some embodiments of the invention, the plurality of partitions have the same and/or different up-sampling magnifications, and/or the plurality of partitions have the same and/or different down-sampling magnifications.
Further, in some embodiments of the invention, the image processor is further configured to: extract image features from the first image via a pre-trained feature extraction model; determine a corresponding up-sampling convolution kernel or down-sampling convolution kernel according to the extracted image features; and perform corresponding super-resolution processing on each gaze region of the first image according to the coordinates, up-sampling magnification, and up-sampling convolution kernel of each gaze region, or perform corresponding compression processing on each non-gaze region of the first image according to the coordinates, down-sampling magnification, and down-sampling convolution kernel of each non-gaze region, so as to obtain the second image.
Further, in some embodiments of the invention, the image processor is further configured to: identify a first resolution of the first image; in response to a recognition result that the first resolution is lower than the target resolution, perform the super-resolution processing on the gaze region to obtain a second image in which the local resolution of the gaze region reaches the target resolution; and in response to a recognition result that the first resolution is higher than the target resolution, perform the compression processing on the non-gaze region to obtain a second image in which the local resolution of the gaze region reaches the target resolution and the local resolution of the non-gaze region is less than the target resolution.
Further, in some embodiments of the invention, the image processor includes a display pipeline. The display pipeline has a software processing unit and at least one image processing hardening unit integrated therein and is configured to: perform, according to the eye movement signal, the super-resolution processing on the first image using first software configured in the software processing unit and the at least one image processing hardening unit, and perform the compression processing on the first image using second software configured in the software processing unit and the at least one image processing hardening unit, so as to obtain the second image.
Further, in some embodiments of the present invention, third software is further configured in the software processing unit, and the display pipeline is further configured to: perform distortion correction processing on the first image using the third software configured in the software processing unit and the at least one image processing hardening unit.
Further, in some embodiments of the present invention, the at least one image processing hardening unit is selected from at least one of a cache memory, a weighted summation circuit, a mean calculation circuit, a filter circuit, and a pixel position relationship mapping circuit.
Further, in some embodiments of the present invention, the cache data stored in the cache memory includes real-time image data of at least one pixel in the first image and/or the second image of the current frame, and/or historical image data of at least one pixel in the first image and/or the second image of a previous frame.
Further, in some embodiments of the invention, the display pipeline is connected to a main processor of an augmented reality display device, and the step of acquiring the eye movement signal of the user and the first image to be processed comprises: acquiring, via the main processor, a real and/or virtual image that has undergone gaze point rendering compression, wherein the gaze point rendering compression is implemented based on the eye movement signal.
Further, the image processing method provided according to the second aspect of the present invention includes the following steps: acquiring an eye movement signal of a user and a first image to be processed; determining a gaze region in the first image based on the eye movement signal; and performing super-resolution processing on the gaze region to obtain a second image in which the local resolution of the gaze region reaches a target resolution.
Further, the computer-readable storage medium provided according to the third aspect of the present invention has computer instructions stored thereon. The computer instructions, when executed by a processor, implement the image processing method provided by the second aspect of the present invention.
In addition, the augmented reality display device provided according to the fourth aspect of the present invention includes an eye tracker, a main processor, a coprocessor, and a display terminal. The eye tracker is used to collect eye movement signals of a user. The main processor outputs a first image with or without gaze point rendering compression, wherein the gaze point rendering compression is implemented based on the eye movement signal. The coprocessor may be the image processor provided in the first aspect of the present invention, connected to the eye tracker and the main processor, respectively, to acquire the first image and the eye movement signal. The display terminal is connected to the image processor to acquire and display a second image super-resolved by the image processor.
Drawings
The above features and advantages of the present invention will be better understood after reading the detailed description of embodiments of the present disclosure in conjunction with the following drawings. In the drawings, the components are not necessarily to scale and components having similar related features or characteristics may have the same or similar reference numerals.
Fig. 1 illustrates a flow diagram of an image processing method provided in accordance with some embodiments of the present invention.
Fig. 2 illustrates a schematic diagram of an image partial compression process provided in accordance with some embodiments of the invention.
Fig. 3 illustrates a schematic diagram of an augmented reality display device provided in accordance with some embodiments of the present invention.
Fig. 4 illustrates a flow diagram of image recompression provided in accordance with some embodiments of the present invention.
Detailed Description
Further advantages and effects of the present invention will become apparent to those skilled in the art from the disclosure of the present specification, which describes embodiments of the invention by way of specific examples. While the description of the invention will be presented in connection with preferred embodiments, this is not intended to limit the features of the invention to those embodiments. Rather, the purpose of describing the invention in connection with the embodiments is to cover alternatives or modifications that may extend from the claims of the invention. The following description contains many specific details in order to provide a thorough understanding of the present invention, but the invention may also be practiced without these details. Furthermore, some specific details are omitted from the description in order to avoid obscuring the invention.
In the description of the present invention, it should be noted that, unless explicitly specified and limited otherwise, the terms "mounted," "connected," and "coupled" are to be construed broadly: a connection may be fixed, detachable, or integral; it may be mechanical or electrical; and it may be direct, indirect through an intermediate medium, or internal communication between two elements. The specific meaning of these terms in the present invention will be understood by those of ordinary skill in the art according to the specific circumstances.
In addition, the terms "upper", "lower", "left", "right", "top", "bottom", "horizontal", and "vertical" as used in the following description should be understood as referring to the orientations depicted in the associated drawings. These relative terms are used for convenience only and are not intended to limit the invention to manufacture or operation in a particular orientation.
It will be understood that, although the terms "first," "second," "third," etc. may be used herein to describe various elements, regions, layers, and/or sections, these elements, regions, layers, and/or sections should not be limited by these terms, which are merely used to distinguish different elements, regions, layers, and/or sections. Accordingly, a first element, region, layer, and/or section discussed below could be termed a second element, region, layer, and/or section without departing from some embodiments of the present invention.
As described above, for the high-resolution display requirements of the XR field, existing XR devices generally perform super-resolution processing of different degrees on multiple regions of a low-definition picture and then stitch and nest the regions together, thereby improving the equivalent resolution of part of the picture. However, conventional super-resolution technology is currently implemented mainly through artificial intelligence (AI) model training on natural picture samples and targets mainly still images or mobile phone videos: on the one hand, it does not pursue low latency or hardware implementation, and on the other hand, it is not combined with the actual application scenarios of XR display, so it cannot provide accurate and efficient super-resolution processing for the user's gaze region.
In order to overcome the above drawbacks of the prior art, the present invention provides an image processor, an image processing method, a computer-readable storage medium, and an augmented reality display device, which can determine a gaze region according to an eye movement signal of a user and perform targeted super-resolution processing on the gaze region, thereby meeting the user's high-resolution display requirement for the gaze region while reducing the latency and hardware requirements of high-resolution display.
In some non-limiting embodiments, the image processing method provided by the second aspect of the present invention may be implemented via the image processor provided by the first aspect of the present invention. Specifically, the image processor may be configured with, or connected to, a processing unit and a storage unit for software programs. The storage unit includes, but is not limited to, the computer-readable storage medium provided in the third aspect of the present invention, on which computer instructions are stored. The processing unit is connected to the storage unit and configured to execute the computer instructions stored on the storage unit to implement the image processing method provided in the second aspect of the present invention.
Referring to fig. 1, fig. 1 is a flow chart illustrating an image processing method according to some embodiments of the invention.
As shown in fig. 1, in performing image processing, the image processor may first acquire an eye movement signal of a user and a low-definition image to be processed (i.e., a first image), and determine, according to the eye movement signal, at least one gaze region in the first image containing or near the gaze point position.
In particular, the image processor may be connected to an external eye tracker to acquire the user eye movement signals it collects. The image processor may then perform an eye tracking calculation on the eye movement signal acquired from the eye tracker to determine the user's gaze point position in the first image, and update the gaze point information according to the gaze point position to construct an updated compression model. Still further, the image processor may divide the first image into a plurality of partitions relative to the gaze point position according to compression parameters of the updated compression model, and determine the coordinate range of each partition. Here, the plurality of partitions relative to the gaze point position include at least one gaze region containing or near the gaze point position, and preferably also at least one non-gaze region far from the gaze point position. The compression model may be an artificial intelligence model trained on a large number of super-resolution-processed image samples from an XR display device; its technical details are described later.
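To make the partitioning step concrete, the following is a minimal Python sketch, assuming a pinhole-style mapping from eyeball deflection angles to pixel coordinates and square nested partitions; the function names and the 20 pixels-per-degree constant are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def gaze_point_from_eye_signal(yaw_deg, pitch_deg, width, height, ppd=20.0):
    """Map an eyeball deflection angle (degrees) to a pixel position in the
    first image, assuming the image center corresponds to a straight-ahead
    gaze and a uniform sampling of `ppd` pixels per degree."""
    x = width / 2 + yaw_deg * ppd
    y = height / 2 - pitch_deg * ppd
    return int(np.clip(x, 0, width - 1)), int(np.clip(y, 0, height - 1))

def make_partitions(gaze_xy, width, height, radii=(128, 256, 512)):
    """Return nested rectangles (x0, y0, x1, y1) centered on the gaze point:
    the innermost plays the role of the gaze region, the outer ones the
    non-gaze regions, mirroring the partition coordinates described above."""
    gx, gy = gaze_xy
    return [(max(gx - r, 0), max(gy - r, 0),
             min(gx + r, width), min(gy + r, height)) for r in radii]
```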
With continued reference to fig. 1, after determining at least one gaze region, the image processor may perform local super-resolution processing on each pixel in the gaze region according to the coordinate range of the gaze region, so as to obtain a high-definition image (i.e., a second image) whose local resolution in the gaze region reaches a target resolution (e.g., 40 pixels/degree).
Specifically, in performing the super-resolution processing, the image processor may determine the up-sampling magnification of at least one gaze region containing and/or adjacent to the gaze point position according to the compression parameters of the updated compression model, and then perform super-resolution processing of the corresponding magnification on each gaze region of the first image according to the coordinate range and up-sampling magnification of each gaze region, so as to obtain the second image whose local resolution reaches the target resolution. Preferably, the gaze regions containing and/or adjacent to the gaze point position may have the same and/or different up-sampling magnifications depending on their distance from the gaze point position, so as to achieve a display effect in which resolution gradually increases toward the gaze point.
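As an illustration only, the sketch below continues the Python example above. A simple integer-factor nearest-neighbour upsample stands in for the learned super-resolution (the patented scheme would instead apply a convolution kernel selected by the compression model), and the magnification schedule that rises toward the gaze point is an assumption.

```python
import numpy as np

def upsample_region(image, box, factor):
    """Stand-in for the learned super-resolution: an integer-factor
    nearest-neighbour upsample of one gaze region of an HxWxC image."""
    x0, y0, x1, y1 = box
    region = image[y0:y1, x0:x1]
    return np.repeat(np.repeat(region, factor, axis=0), factor, axis=1)

def upsampling_magnification(partition_index, base=2):
    """Illustrative schedule: the innermost partition (index 0) gets the
    highest magnification, so the effective resolution rises toward the
    gaze point; outer partitions fall back to 1 (no upsampling)."""
    return max(base * (2 - partition_index), 1)
```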
Further, for the at least one non-gaze region far from the gaze point position, the image processor may additionally perform compression processing to obtain a second image in which the local resolution of the gaze region reaches the target resolution and the local resolution of the non-gaze region is less than the target resolution.
Specifically, in compressing the non-gaze region, the image processor may determine the down-sampling magnification of at least one non-gaze region far from the gaze point position according to the compression parameters of the updated compression model, and then perform compression processing of the corresponding magnification on each non-gaze region of the first image according to the coordinate range and down-sampling magnification of each non-gaze region, so as to obtain a second image whose local resolution in the non-gaze region is less than the target resolution. Preferably, the non-gaze regions far from the gaze point position may have the same and/or different down-sampling magnifications depending on their distance from the gaze point position, so as to achieve a display effect in which resolution gradually decreases with distance from the gaze point.
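A complementary sketch of the compression path, again only a stand-in under stated assumptions: mean (box-filter) pooling low-pass filters the region before decimation, which matches the anti-aliasing intent described below, though the patent's actual filters and kernels come from the compression model.

```python
import numpy as np

def downsample_region(image, box, factor):
    """Stand-in for the non-gaze compression: mean pooling by `factor`,
    which low-pass filters before decimation and thus limits aliasing."""
    x0, y0, x1, y1 = box
    region = image[y0:y1, x0:x1].astype(np.float32)
    h, w = region.shape[:2]
    h, w = h - h % factor, w - w % factor      # crop to a multiple of factor
    region = region[:h, :w]
    pooled = region.reshape(h // factor, factor,
                            w // factor, factor, -1).mean(axis=(1, 3))
    return pooled.astype(np.uint8)
```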
Still further, in some embodiments of the present invention, upon acquiring a first image to be processed, the image processor may first identify the first resolution of the first image. In response to a recognition result that the first resolution is lower than the target resolution (e.g., 40 pixels/degree), the image processor may super-resolve the gaze region of the first image to obtain a second image whose local resolution in the gaze region reaches the target resolution. Conversely, in response to a recognition result that the first resolution is higher than the target resolution, the image processor may compress the non-gaze region of the first image to obtain a second image in which the local resolution of the gaze region reaches the target resolution and the local resolution of the non-gaze region is less than the target resolution.
Specifically, the compression model may consist of a feature extraction model and a plurality of candidate convolutional neural network models. In performing the image processing, the image processor may first extract image features from the first image via the pre-trained feature extraction model. Then, in response to a recognition result that the first resolution of the first image is lower than the target resolution (e.g., 40 pixels/degree), the image processor may select an up-sampling convolution kernel corresponding to the up-sampling magnification and/or the pixel mapping relationship according to the extracted image features, and perform corresponding super-resolution processing on each gaze region of the first image according to the coordinate range, up-sampling magnification, and up-sampling convolution kernel of each gaze region, so as to obtain a second image whose local resolution in the gaze region reaches the target resolution while preserving the image features as much as possible. Conversely, in response to a recognition result that the first resolution is higher than the target resolution, the image processor may select a down-sampling convolution kernel corresponding to the down-sampling magnification and/or the pixel mapping relationship according to the extracted image features, and perform corresponding low-pass filtering and compression processing on each non-gaze region of the first image according to the coordinates, down-sampling magnification, and down-sampling convolution kernel of each non-gaze region, so as to obtain a second image in which the local resolution of the gaze region reaches the target resolution and the local resolution of the non-gaze region is less than the target resolution, thereby avoiding blurring and aliasing of the image.
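The following sketch illustrates the kernel-selection branch under stated assumptions: a hand-crafted gradient-energy feature replaces the patent's pre-trained feature extraction model, a two-entry kernel bank replaces its candidate convolutional neural network models, and the names and the 25.0 threshold are hypothetical. It reuses upsample_region and downsample_region from the sketches above; stitching the processed pieces back into one second image is omitted.

```python
import numpy as np

# Hypothetical two-entry kernel bank; the patent instead selects kernels
# belonging to trained candidate CNN models.
KERNEL_BANK = {
    "smooth": np.full((3, 3), 1.0 / 9.0, np.float32),   # averaging kernel
    "detail": np.array([[0, -1, 0],
                        [-1, 5, -1],
                        [0, -1, 0]], np.float32),       # sharpening kernel
}

def pick_kernel(region):
    """Choose a kernel from the gradient energy of the region, standing in
    for feature extraction by a pre-trained model; a full pipeline would
    convolve the resampled region with the chosen kernel."""
    gray = region.astype(np.float32).mean(axis=-1)
    gy, gx = np.gradient(gray)
    energy = float(np.mean(gx * gx + gy * gy))
    return KERNEL_BANK["detail"] if energy > 25.0 else KERNEL_BANK["smooth"]

def recompress(image, gaze_box, non_gaze_boxes, first_ppd, target_ppd=40.0):
    """Dispatch on the identified first resolution (pixels per degree):
    super-resolve the gaze region when below target, otherwise compress
    the non-gaze regions."""
    if first_ppd < target_ppd:
        return {"gaze": upsample_region(image, gaze_box, factor=2)}
    return {"non_gaze": [downsample_region(image, b, factor=2)
                         for b in non_gaze_boxes]}
```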
By using a pre-trained feature extraction model to extract image features and selecting the corresponding convolution kernels according to the extracted features for the image recompression processing (i.e., the up-sampling and down-sampling processing), the invention can effectively preserve the image features of the first image while avoiding blurring and aliasing.
As shown in fig. 2, compared with the conventional super-resolution technology of existing XR devices, the invention determines the gaze region according to the user's eye movement signal and performs targeted super-resolution processing on it, and can therefore meet the user's high-resolution display requirement for the gaze region more accurately. In addition, by performing local super-resolution processing on the gaze region, the invention can omit super-resolution processing for the other non-gaze regions, reduce their up-sampling magnification, or apply down-sampling compression to them, thereby reducing the latency of high-resolution display and the hardware requirements of the image processor.
Furthermore, in some embodiments of the present invention, a display pipeline (Display Pipeline) may also be configured in the image processor. The display pipeline has a software processing unit and at least one image processing hardening unit integrated therein and is configured to perform (re)compression processing on the first image using them, so as to obtain a second image that meets the user's high-resolution display requirement for the gaze region while reducing the latency and hardware requirements of high-resolution display.
Referring specifically to fig. 3, fig. 3 is a schematic diagram illustrating an augmented reality display device according to some embodiments of the present invention.
In the embodiment shown in fig. 3, the augmented reality display device according to the fourth aspect of the present invention may be configured with a coprocessor 10, an eye tracker 20, a main processor 30, a display driving chip 40, and a display terminal 50. The coprocessor 10 may optionally be the image processor provided in the first aspect of the invention, in which a display pipeline is arranged. The display pipeline has a software processing unit and at least one image processing hardening unit integrated therein and is configured to run different software programs on the same image processing hardening units to carry out image processing flows such as super-resolution and compression, thereby improving the efficiency of data storage, processing, and transmission, meeting the user's high-resolution display requirement for the gaze region with limited software and hardware resources, and reducing the latency and hardware requirements of high-resolution display. The eye tracker 20 is used to acquire eye movement signals such as the user's gaze position and gaze direction. The main processor 30 is selected from common image processors such as a Central Processing Unit (CPU) or a Graphics Processing Unit (GPU) and is used to output the first image to be processed by the coprocessor 10. The display driving chip 40 may be integrated with the display terminal 50 and is used to acquire the second image processed by the coprocessor 10 and display it through a pixel array circuit and an OLED/LED/LCD pixel array configured on the display terminal 50.
In particular, the software processing units configured in the display pipeline of the coprocessor 10 include, but are not limited to, up-sampling units and/or down-sampling units. The image processing hardening unit configured in the display pipeline may be selected from at least one of a transistor-level cache memory, a weighted summation circuit, a mean calculation circuit, a filter circuit, and a pixel position relationship mapping circuit, for caching pixel cache data of the first image of the current frame and/or several preceding history frames, and/or performing weighted summation, mean calculation, filtering, and/or pixel-position-mapping hardened calculations on the pixel cache data.
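A minimal software model of this multiplexing, assuming nothing about the actual circuit design: one weighted-sum primitive and a small pixel cache (current frame plus one history frame) shared by the up-sampling, down-sampling, and distortion-correction paths; PixelCache, weighted_sum, and mean_value are illustrative names.

```python
import numpy as np

class PixelCache:
    """Models the cache memory: pixel data of the current frame plus one
    history frame, keyed by (x, y) coordinates."""
    def __init__(self):
        self.current, self.history = {}, {}

    def fetch(self, xy):
        # Prefer the current frame; fall back to the history frame.
        return self.current.get(xy, self.history.get(xy, 0.0))

def weighted_sum(samples, weights):
    """Models the weighted-summation circuit: one multiply-accumulate
    pass over a list of cached pixel samples."""
    return float(np.dot(np.asarray(samples, np.float32),
                        np.asarray(weights, np.float32)))

def mean_value(samples):
    """Models the mean calculation circuit as a special case of the
    weighted sum with uniform weights."""
    return weighted_sum(samples, [1.0 / len(samples)] * len(samples))
```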
In the process of image processing, the display pipeline may first acquire the user's eye movement signal and the first image to be processed. Here, the eye movement signal includes, but is not limited to, data such as the user's eyeball deflection angle, gaze position, and gaze direction. The display pipeline may be directly connected to the eye tracker 20 and acquire the eye movement signal from it directly, or it may be indirectly connected to the eye tracker 20 via an image processing unit such as the GPU of the main processor 30 and acquire the user's eye movement signal indirectly and synchronously via the main processor 30. Further, for Mixed Reality (MR) display embodiments, the first image may be an original virtual rendered image generated by an image processing unit such as the GPU of the main processor 30.
Further, in some preferred embodiments, the main processor 30 may also be connected to the eye tracker 20 and configured to obtain the user's eye movement signal from the eye tracker 20, perform gaze point rendering compression on the generated original virtual rendered image according to the eye movement signal, and send the gaze-point-rendered compressed image to the coprocessor 10, thereby reducing the data transmission load and data processing load of the overall architecture.
Still further, in some embodiments as shown in fig. 3, the coprocessor 10 may preferably also be configured with a display driver software and/or firmware computing platform, which is connected to the eye tracker 20 and the display pipeline, respectively. In super-resolving the first image, the coprocessor 10 may first obtain the user's eye movement signal from the eye tracker 20 via the display driver software and/or firmware computing platform and perform the eye tracking calculation to determine the gaze point position. The display driver software and/or firmware computing platform may then update the gaze point information in the system according to the gaze point position to construct the updated compression model, and transmit its compression parameters to the display pipeline for super-resolution processing of the first image.
Referring to fig. 3 and fig. 4 in combination, fig. 4 is a flow chart illustrating image recompression according to some embodiments of the invention.
In the embodiments shown in fig. 3 and fig. 4, in response to acquiring the first image and the eye movement signal (or compression parameters associated with the user's eye movement signal), the display pipeline may perform super-resolution processing on the first image according to the eye movement signal using an up-sampling unit (i.e., the first software) configured therein and at least one image processing hardening unit integrated therein. Specifically, in response to acquiring the compression parameters associated with the user's eye movement signal and the first image to be processed, the up-sampling unit may first divide the first image into a plurality of partitions based on the user's gaze point position according to the compression parameters of the updated compression model, and determine the coordinate range of each partition and the up-sampling magnification of each gaze region containing or adjacent to the gaze point position. The up-sampling unit may then obtain, according to the coordinates of each pixel in each partition, the processing parameters of the up-sampling operation from the corresponding memories, and obtain the pixel cache data required for the super-resolution processing from the corresponding cache memories. Next, the up-sampling unit may sequentially input the acquired processing parameters and the pixel cache data of each pixel into one or more of the weighted summation circuit, the mean calculation circuit, the filter circuit, and the pixel position relationship mapping circuit, and perform weighted summation, averaging, filtering, and/or pixel-position-mapping hardened calculations to obtain a first hardened calculation result based on the up-sampling processing. Finally, the up-sampling unit may obtain the up-sampled image through software operations such as data arrangement and assignment.
Furthermore, in some embodiments, a down-sampling unit (i.e., the second software) may also be arranged in the display pipeline of the coprocessor 10, so as to recompress the first image together with the up-sampling unit. Specifically, in the recompression of the first image, in response to acquiring the compression parameters associated with the user's eye movement signal and the first image to be processed, the down-sampling unit may first divide the first image into a plurality of partitions based on the user's gaze point position according to the compression parameters of the updated compression model, and determine the coordinate range of each partition and the down-sampling magnification of each non-gaze region far from the gaze point position. The down-sampling unit may then obtain the processing parameters of the down-sampling operation from the corresponding memories according to the coordinates of the pixels in each partition, and obtain the required pixel cache data from the corresponding cache memories. Here, the pixel cache data includes, but is not limited to, cache data of at least one nearby pixel of the current frame and cache data of these nearby pixels in at least one preceding history frame. Next, the down-sampling unit may sequentially input the acquired processing parameters and the pixel cache data of each pixel into one or more of the weighted summation circuit, the mean calculation circuit, the filter circuit, and the pixel position relationship mapping circuit, and perform weighted summation, averaging, filtering, and/or pixel-position-mapping hardened calculations to obtain a second hardened calculation result based on the compression processing. Finally, the down-sampling unit may combine the second hardened calculation result with the first hardened calculation result and obtain the recompressed image shown in fig. 2 through software operations such as data arrangement and assignment.
As shown in fig. 2, after the recompression based on eye movement information, at least one gaze region of the first image containing or near the gaze point position is super-resolved with the same and/or different up-sampling magnifications, so as to achieve in these regions of interest an equivalent resolution of 40 pixels per degree, comparable to a 4K display. At the same time, at least one non-gaze region of the first image far from the gaze point position is compressed with the same and/or different down-sampling magnifications, thereby reducing the data processing load and/or data transmission load for these non-attended regions.
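As a rough sanity check on that figure (the viewing geometry here is an assumption, not stated in the patent): a 4K panel 3840 pixels wide viewed across a horizontal field of view of about 96 degrees gives 3840 / 96 = 40 pixels per degree, so a gaze region super-resolved to 40 pixels/degree matches the angular pixel density that such a display would provide at the gaze point.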
Furthermore, in some embodiments of the present invention, the display pipeline of the coprocessor 10 may preferably also be configured with an image distortion correction unit (i.e., the third software) for performing distortion correction processing on the acquired first image in cooperation with at least one image processing hardening unit integrated in the display pipeline. Specifically, in response to acquiring the first image to be processed, the image distortion correction unit may first determine the pixel data and processing parameters required for the distortion correction; in some embodiments, these pixel data may preferably be determined from the user's eye movement signal. The image distortion correction unit may then obtain the processing parameters from the corresponding memories, obtain the pixel cache data required for the distortion correction from the corresponding cache memories, sequentially input the processing parameters and pixel cache data into one or more of the weighted summation circuit, the mean calculation circuit, the filter circuit, and the pixel position relationship mapping circuit, and perform weighted summation, averaging, filtering, and/or pixel-position-mapping hardened calculations to obtain the corresponding hardened calculation result. Finally, the image distortion correction unit may obtain the distortion-corrected image through software operations such as data arrangement and assignment.
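To show how the same primitives serve the distortion correction path, here is a sketch of one output pixel, reusing the hypothetical PixelCache and weighted_sum helpers from the earlier sketch; the bilinear inverse lens mapping is an assumed example of a pixel position relationship.

```python
def correct_pixel(cache, inverse_map, out_xy):
    """One output pixel of distortion correction: the pixel position
    relationship mapping supplies a fractional source coordinate, and the
    weighted-summation primitive blends the four cached neighbours with
    bilinear weights."""
    sx, sy = inverse_map(out_xy)                 # assumed inverse lens map
    x0, y0 = int(sx), int(sy)
    fx, fy = sx - x0, sy - y0
    samples = [cache.fetch((x0, y0)),     cache.fetch((x0 + 1, y0)),
               cache.fetch((x0, y0 + 1)), cache.fetch((x0 + 1, y0 + 1))]
    weights = [(1 - fx) * (1 - fy), fx * (1 - fy),
               (1 - fx) * fy,       fx * fy]
    return weighted_sum(samples, weights)
```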
Compared with a pure software scheme for distortion correction and image compression, which must compute and cache the pixel cache data of every relevant pixel frame by frame and partition by partition, the present invention, by designing and multiplexing the cache memory, the weighted summation circuit, the mean calculation circuit, the filter circuit, and the pixel position relationship mapping circuit, effectively improves the reuse rate of pixel cache data across frames and partitions and reduces the data processing load of the software units, thereby reducing the overall data processing and transmission load of the coprocessor 10. This helps to meet the user's high-resolution display requirement for the gaze region with limited software and hardware resources and to reduce the latency of high-resolution display.
In addition, after the second image has undergone the distortion correction processing and recompression processing of the coprocessor 10, the coprocessor 10 may transmit it to the display terminal 50 for high-resolution display of the augmented reality image.
In particular, for embodiments in which the main processor 30 sends the gaze-point-rendered compressed first image to the coprocessor 10, a decompression module may also preferably be integrated in the display pipeline of the coprocessor 10. The display driver software and/or firmware computing platform is connected to the decompression module, and the software processing unit and the at least one image processing hardening unit arranged in the display pipeline are also connected to the display terminal 50 via the decompression module. In this way, in transmitting the second image to the display terminal 50, the decompression module configured in the coprocessor 10 may first acquire the second image that has undergone distortion correction and compression processing by the software processing unit and the at least one image processing hardening unit integrated in the display pipeline, then acquire the compression parameters of the updated compression model through the display driver software and/or firmware computing platform, and decompress the second image according to those compression parameters to obtain a third image. The coprocessor 10 may then transmit the decompressed third image to the display terminal 50 for high-resolution display of the augmented reality image.
Further, for embodiments in which the main processor 30 sends the gaze-point-rendered compressed first image to the coprocessor 10, the above decompression module may also preferably be configured on the display terminal 50 or its driving chip 40. Taking a decompression module configured on the display driving chip 40 as an example, the decompression module may be connected to the display pipeline and to the display driver software and/or firmware computing platform of the coprocessor 10, respectively. In transmitting the second image to the display terminal 50, the decompression module on the display driving chip 40 may first acquire the second image that has undergone distortion correction and compression processing by the software processing unit and the at least one image processing hardening unit integrated in the display pipeline, then acquire the user's eye movement signal, the gaze point position, and/or the compression parameters of the updated compression model through the display driver software and/or firmware computing platform, and decompress the second image accordingly to obtain a third image. The display driving chip 40 may then directly drive the pixel array circuit of the display terminal 50, such as an OLED/LED/LCD display screen, according to the decompressed third image, so as to perform high-resolution display of the augmented reality image on the display terminal 50.
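The display-side reconstruction can be sketched as below, under the assumption that the partition coordinates and down-sampling factors (the compression parameters) travel with the frame; decompress and its arguments are illustrative names.

```python
import numpy as np

def decompress(pieces, boxes, factors, out_shape):
    """Sketch of the decompression module: each compressed non-gaze piece
    is upsampled back by its own factor and pasted at its partition
    coordinates, so the pixel array receives a full-resolution frame."""
    frame = np.zeros(out_shape, np.uint8)
    for piece, (x0, y0, x1, y1), f in zip(pieces, boxes, factors):
        up = np.repeat(np.repeat(piece, f, axis=0), f, axis=1)
        h = min(up.shape[0], y1 - y0)
        w = min(up.shape[1], x1 - x0)
        frame[y0:y0 + h, x0:x0 + w] = up[:h, :w]
    return frame
```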
Still further, in some embodiments, the decompression module may preferably be configured in the pixel array circuit of the display terminal 50 itself. In this way, the image data transmitted throughout the architecture of the augmented reality display device remains gaze-point-rendering compressed, which can further reduce the data transmission load and data processing load of the overall architecture. The operating principle of a decompression module arranged in the pixel array circuit of the display terminal 50 is similar to that of the embodiment arranged on the display driving chip 40 and is not repeated here.
It will be appreciated by those skilled in the art that the above system architecture of the augmented reality display device using the coprocessor 10 is merely a non-limiting embodiment provided by the present invention, intended to clearly illustrate the main concept of the invention and to provide specific solutions for public implementation, not to limit the scope of the invention.
Optionally, in other embodiments, the image processor provided in the first aspect of the present invention may be integrated, through software programs and hardware units, into a main processing unit such as the Central Processing Unit (CPU) or Graphics Processing Unit (GPU) of the augmented reality display device provided in the fourth aspect of the present invention, so as to achieve the same technical effects, which are not described in detail here.
While, for purposes of simplicity of explanation, the methodologies are shown and described as a series of acts, it is to be understood and appreciated that the methodologies are not limited by the order of acts, as some acts may, in accordance with one or more embodiments, occur in different orders and/or concurrently with other acts shown and described herein or not shown and described herein.
Those of skill in the art would understand that information, signals, and data may be represented using any of a variety of different technologies and techniques. For example, data, instructions, commands, information, signals, bits, symbols, and chips that may be referenced throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or particles, optical fields or particles, or any combination thereof.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The various illustrative logical blocks, modules, and circuits described in connection with the embodiments disclosed herein may be implemented or performed with a general-purpose processor, an NPU (a processor for accelerating AI network model computation), a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC. The ASIC may reside in a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a user terminal.
In one or more exemplary embodiments, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software as a computer program product, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media include both computer storage media and communication media, including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. Any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a web site, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the spirit or scope of the disclosure. Thus, the disclosure is not intended to be limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (16)

1. An image processor, the image processor configured to:
acquiring an eye movement signal of a user and a first image to be processed;
determining a gaze region in the first image based on the eye movement signal; and
performing super-resolution processing on the gaze region to obtain a second image in which the local resolution of the gaze region reaches a target resolution.
2. The image processor of claim 1, wherein the image processor is coupled to an eye tracker, and the step of determining the gaze region in the first image based on the eye movement signal comprises:
performing an eye tracking calculation on the eye movement signal acquired from the eye tracker to determine a gaze point position of the user in the first image;
updating gaze point information according to the gaze point position to construct an updated compression model; and
determining coordinates of a plurality of partitions relative to the gaze point position according to compression parameters of the updated compression model, wherein the plurality of partitions include at least one gaze region.
3. The image processor of claim 2, wherein the step of performing super-resolution processing on the gaze region to obtain a second image in which the local resolution of the gaze region reaches a target resolution comprises:
determining an up-sampling magnification of at least one gaze region containing and/or adjacent to the gaze point position according to the compression parameters of the updated compression model; and
performing corresponding super-resolution processing on each gaze region of the first image according to the coordinates and the up-sampling magnification of each gaze region to obtain the second image.
4. The image processor of claim 1 or 2, wherein the image processor is further configured to:
determining a non-gaze region in the first image based on the eye movement signal; and
compressing the non-gaze region to obtain a second image in which the local resolution of the gaze region reaches the target resolution and the local resolution of the non-gaze region is less than the target resolution.
5. The image processor of claim 4, wherein the step of compressing the non-gaze region to obtain a second image in which the local resolution of the gaze region reaches the target resolution and the local resolution of the non-gaze region is less than the target resolution comprises:
determining a down-sampling magnification of at least one non-gaze region far from the gaze point position according to the compression parameters of the updated compression model; and
performing corresponding compression processing on each non-gaze region of the first image according to the coordinates and the down-sampling magnification of each non-gaze region to obtain the second image.
6. The image processor of claim 3 or 5, wherein the plurality of partitions have the same and/or different up-sampling magnifications, and/or the plurality of partitions have the same and/or different down-sampling magnifications.
7. The image processor of claim 3 or 5, wherein the image processor is further configured to:
extracting image features from the first image via a pre-trained feature extraction model;
determining a corresponding up-sampling convolution kernel or down-sampling convolution kernel according to the extracted image features; and
performing corresponding super-resolution processing on each gaze region of the first image according to the coordinates, the up-sampling magnification, and the up-sampling convolution kernel of each gaze region, or performing corresponding compression processing on each non-gaze region of the first image according to the coordinates, the down-sampling magnification, and the down-sampling convolution kernel of each non-gaze region, to obtain the second image.
8. The image processor of claim 4, wherein the image processor is further configured to:
identifying a first resolution of the first image;
in response to recognizing that the first resolution is lower than the target resolution, performing super-resolution processing on the gaze region to obtain a second image in which the local resolution of the gaze region reaches the target resolution; and
in response to recognizing that the first resolution is higher than the target resolution, performing compression processing on the non-gaze region to obtain a second image in which the local resolution of the gaze region reaches the target resolution and the local resolution of the non-gaze region is less than the target resolution.
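The branch in claim 8 amounts to routing each frame by comparing the recognised resolution with the target, roughly as below; super_resolve and compress stand for the claim-1 and claim-4 paths and are hypothetical callables here:

    def route_frame(first_image, first_res, target_res, super_resolve, compress):
        """Dispatch one frame according to the recognised resolution."""
        if first_res < target_res:
            return super_resolve(first_image)  # raise gaze region to target
        if first_res > target_res:
            return compress(first_image)       # shrink non-gaze regions
        return first_image                     # already at target resolution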
9. The image processor of claim 4, comprising a display pipeline, wherein a software processing unit and at least one hardened (fixed-function) image processing unit are integrated in the display pipeline, and the display pipeline is configured to:
perform, according to the eye movement signal, the super-resolution processing on the first image using first software configured in the software processing unit together with the at least one hardened image processing unit, and perform the compression processing on the first image using second software configured in the software processing unit together with the at least one hardened image processing unit, so as to obtain the second image.
10. The image processor of claim 9, wherein third software is further configured in the software processing unit, and the display pipeline is further configured to:
perform distortion correction processing on the first image using the third software configured in the software processing unit together with the at least one hardened image processing unit.
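One way to picture claims 9 and 10 is a dispatcher in which three pieces of software reuse a shared pool of hardened operators; the operator names below echo the claim-11 list but are otherwise assumptions:

    class DisplayPipeline:
        """Toy model of claims 9-10: three software stages share the same
        hardened (fixed-function) helpers; all names are illustrative."""

        def __init__(self, hardened_ops):
            # e.g. {"weighted_sum": fn, "mean": fn, "pixel_map": fn}
            self.ops = hardened_ops

        def super_resolution(self, image, gaze_regions):    # "first software"
            return self.ops["weighted_sum"](image, gaze_regions)

        def compression(self, image, non_gaze_regions):     # "second software"
            return self.ops["mean"](image, non_gaze_regions)

        def distortion_correction(self, image):             # "third software"
            return self.ops["pixel_map"](image)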
11. The image processor of claim 9 or 10, wherein the at least one hardened image processing unit is selected from at least one of: a cache memory, a weighted-sum circuit, a mean-calculation circuit, a filter circuit, and a pixel-position mapping circuit.
12. The image processor of claim 11, wherein the cache data stored in the cache memory includes real-time image data of at least one pixel in the first image and/or the second image of the current frame and/or historical image data of at least one pixel in the first image and/or the second image of the previous frame.
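A sketch of the claim-12 cache, assuming a scanline ring buffer for the real-time data of the current frame and a single previous-frame reference for the historical data:

    from collections import deque

    class PixelCache:
        """Holds a few live scanlines of the current frame plus the
        previous frame as history; sizes are assumptions."""

        def __init__(self, lines_kept=4):
            self.current_lines = deque(maxlen=lines_kept)  # real-time image data
            self.previous_frame = None                     # historical image data

        def push_line(self, line):
            self.current_lines.append(line)

        def end_of_frame(self, frame):
            self.previous_frame = frame  # becomes next frame's history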
13. The image processor of claim 9, wherein the display pipeline is connected to a main processor of an augmented reality display device, and wherein the step of acquiring the eye movement signal of the user and the first image to be processed comprises:
acquiring, via the main processor, a real and/or virtual image compressed by gaze-point rendering, wherein the gaze-point rendering compression is implemented based on the eye movement signal.
14. An image processing method, characterized by comprising the steps of:
acquiring an eye movement signal of a user and a first image to be processed;
determining a gaze region in the first image according to the eye movement signal; and
performing super-resolution processing on the gaze region to obtain a second image in which the local resolution of the gaze region reaches the target resolution.
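Pulling the pieces together, the method of claim 14 might look like the following, reusing the partition_for_gaze and upsample_region sketches above and assuming the eye movement signal has already been decoded into a gaze point:

    def image_processing_method(gaze_xy, first_image):
        """End-to-end flow of claim 14, built from the earlier sketches."""
        h, w = first_image.shape[:2]
        patches = []
        for region in partition_for_gaze(w, h, gaze_xy):
            if region["gaze"]:
                # super-resolve only the gaze regions toward the target resolution
                patches.append((region["coords"],
                                upsample_region(first_image, region["coords"], ratio=2)))
        # a compositor would stitch these patches into the second image here
        return patches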
15. A computer-readable storage medium having stored thereon computer instructions which, when executed by a processor, implement the image processing method of claim 14.
16. An augmented reality display device, comprising:
an eye tracker configured to collect an eye movement signal of a user;
a main processor configured to output a first image with or without gaze-point rendering compression, wherein the gaze-point rendering compression is implemented based on the eye movement signal;
the image processor of any one of claims 1-13, wherein the image processor is connected to the eye tracker and the main processor, respectively, so as to acquire the first image and the eye movement signal; and
a display terminal connected to the image processor so as to acquire and display a second image that has undergone the super-resolution processing of the image processor.
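Finally, the device of claim 16 is essentially a wiring diagram. A toy composition follows, with every interface name assumed rather than taken from the claims:

    class AugmentedRealityDevice:
        """Claim-16 wiring sketch: component hookup only."""

        def __init__(self, eye_tracker, main_processor, image_processor, display):
            self.eye_tracker = eye_tracker
            self.main_processor = main_processor
            self.image_processor = image_processor
            self.display = display

        def refresh(self):
            signal = self.eye_tracker.sample()
            first = self.main_processor.render(signal)  # with/without foveated compression
            second = self.image_processor.process(signal, first)
            self.display.show(second)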

Priority Applications (2)

Application Number   Priority Date  Filing Date  Title
CN202211185033.9A    2022-09-27     2022-09-27   Image processor, processing method, storage medium, and augmented reality display device
PCT/CN2023/106705    2022-09-27     2023-07-11   Image processor, image processing method, storage medium and extended reality display apparatus

Applications Claiming Priority (1)

Application Number   Priority Date  Filing Date  Title
CN202211185033.9A    2022-09-27     2022-09-27   Image processor, processing method, storage medium, and augmented reality display device

Publications (1)

Publication Number   Publication Date
CN117834830A         2024-04-05

Family ID: 90475881

Family Applications (1)

Application Number   Priority Date  Filing Date  Title
CN202211185033.9A    2022-09-27     2022-09-27   Image processor, processing method, storage medium, and augmented reality display device

Country Status (2)

Country   Link
CN        CN117834830A
WO        WO2024066661A1

Family Cites Families (7)

* Cited by examiner, † Cited by third party

Publication Number   Priority Date  Publication Date  Assignee                                               Title
KR20210147837A *     2020-05-28     2021-12-07        Samsung Electronics Co., Ltd.                          Electronic apparatus and operating method thereof
CN114092323A *       2020-06-29     2022-02-25        Guangdong OPPO Mobile Telecommunications Corp., Ltd.   Image processing method, image processing device, storage medium and electronic equipment
CN114071150B *       2020-07-31     2023-06-16        BOE Technology Group Co., Ltd.                         Image compression method and device, image display method and device and medium
CN112102172A *       2020-09-21     2020-12-18        BOE Technology Group Co., Ltd.                         Image processing method, image processing apparatus, display system, and storage medium
CN112308780A *       2020-10-30     2021-02-02        Beijing Zitiao Network Technology Co., Ltd.            Image processing method, device, equipment and storage medium
CN112866576B *       2021-01-18     2022-05-17        Guangdong OPPO Mobile Telecommunications Corp., Ltd.   Image preview method, storage medium and display device
CN114418857A *       2022-01-25     2022-04-29        Guangdong OPPO Mobile Telecommunications Corp., Ltd.   Image display method and device, head-mounted display equipment and storage medium

Also Published As

Publication Number   Publication Date
WO2024066661A1       2024-04-04

Similar Documents

Publication Publication Date Title
CN110060313B (en) Image artifact correction method and system
WO2021164731A1 (en) Image enhancement method and image enhancement apparatus
Weiss et al. Volumetric isosurface rendering with deep learning-based super-resolution
US10650283B2 (en) Electronic apparatus and control method thereof
US8379955B2 (en) Visualizing a 3D volume dataset of an image at any position or orientation from within or outside
US8483515B2 (en) Image processing method, image processor, integrated circuit, and recording medium
JP2018537748A (en) Light field rendering of images with variable computational complexity
CN112801904B (en) Hybrid degraded image enhancement method based on convolutional neural network
CN107767357B (en) Depth image super-resolution method based on multi-direction dictionary
US11074671B2 (en) Electronic apparatus and control method thereof
US8891906B2 (en) Pixel-adaptive interpolation algorithm for image upscaling
JP2012515982A (en) Smoothed local histogram filter for computer graphics
KR20190011212A (en) Method of and data processing system for providing an output surface
CN115205435B (en) Improved texture mapping method and device based on Markov random field
CN114494022B (en) Model training method, super-resolution reconstruction method, device, equipment and medium
CN107065164B (en) Image presentation method and device
CN111612792B (en) VRDS 4D medical image-based Ai endoscope analysis method and product
CN113313728B (en) Intracranial artery segmentation method and system
WO2024032331A9 (en) Image processing method and apparatus, electronic device, and storage medium
CN111696034B (en) Image processing method and device and electronic equipment
CN113132800A (en) Video processing method and device, video player, electronic equipment and readable medium
CN117834830A (en) Image processor, processing method, storage medium, and augmented reality display device
US9035945B1 (en) Spatial derivative-based ray tracing for volume rendering
CN115619678A (en) Image deformation correction method and device, computer equipment and storage medium
CN117834829A (en) Image processor, processing method, storage medium, and augmented reality display device

Legal Events

Date Code Title Description
PB01  Publication