CN117940902A - Intelligent scheduler - Google Patents

Info

Publication number
CN117940902A
Authority
CN
China
Prior art keywords
tasks
scheduler
schedule
computing device
task
Legal status
Pending
Application number
CN202280062283.5A
Other languages
Chinese (zh)
Inventor
A·卡南
V·M·达格涅尼
R·德赛
R·S·帕蒂尔
Current Assignee
Apple Inc
Original Assignee
Apple Inc
Application filed by Apple Inc
Priority claimed from PCT/US2022/044430 (WO2023049287A1)
Publication of CN117940902A

Abstract

Techniques related to kernel task scheduling are disclosed. In various embodiments, a computing device receives, at a first scheduler, a computational graph defining interrelationships of a set of tasks to be performed by the computing device. In some embodiments, the set of tasks is performed to provide an extended reality (XR) experience to a user. The first scheduler determines a schedule for implementing the set of tasks based on the interrelationships defined in the computational graph and issues instructions for causing a second scheduler of the computing device to schedule execution of the set of tasks according to the determined schedule.

Description

Intelligent scheduler
Background
Technical Field
The present disclosure relates generally to computing devices, and more particularly to scheduling tasks performed on computing devices.
Description of related Art
Modern operating systems typically support multitasking, in which multiple tasks execute concurrently within a given period of time. To facilitate multitasking, the operating system kernel may include a scheduler that dynamically allocates resources among tasks. For example, two threads may compete for execution in a Central Processing Unit (CPU) pipeline. To share this resource, the kernel scheduler may initially allocate a first time block to the first thread for execution, then perform a context switch to initiate execution of the second thread during a second time block. This switching may occur periodically so that both threads can make use of the resource. Because not all tasks are created equal, the kernel scheduler may support multiple execution priorities, with threads executing important (and potentially time-sensitive) tasks assigned higher execution priorities in order to receive preferential scheduling.
Drawings
FIG. 1 is a block diagram illustrating an example of a computing device configured to schedule tasks for execution using an intelligent scheduler.
FIG. 2 is a block diagram illustrating an example of a program stack including an intelligent scheduler.
FIG. 3 is a block diagram illustrating an example of a system health monitor associated with an intelligent scheduler.
Fig. 4A-4D are block diagrams illustrating examples of a computational graph analyzer associated with an intelligent scheduler.
Figs. 5A and 5B are block diagrams illustrating examples of an executor associated with an intelligent scheduler.
Figs. 6A-6E are flowcharts illustrating examples of methods for intelligently scheduling tasks.
FIG. 7 is a block diagram illustrating an example of components that may be included in a computing device.
The present disclosure includes references to "one embodiment" or "an embodiment". The appearances of the phrase "in one embodiment" or "in an embodiment" are not necessarily referring to the same embodiment. The particular features, structures, or characteristics may be combined in any suitable manner consistent with the present disclosure.
Within this disclosure, different entities (which may variously be referred to as "units," "circuits," other components, etc.) may be described or claimed as "configured to" perform one or more tasks or operations. This formulation, an entity configured to perform one or more tasks, is used herein to refer to structure (i.e., something physical, such as an electronic circuit). More specifically, this formulation is used to indicate that the structure is arranged to perform the one or more tasks during operation. A structure can be said to be "configured to" perform some task even if the structure is not currently being operated. A "neural network engine configured to implement a neural network" is intended to cover, for example, circuitry that performs this function during operation, even if the circuitry in question is not currently being used (e.g., is not connected to a power supply). Thus, an entity described or recited as "configured to" perform some task refers to something physical, such as a device, circuitry, memory storing executable program instructions, etc., used to implement the task. This phrase is not used herein to refer to something intangible. Accordingly, the "configured to" construct is not used herein to refer to a software entity such as an Application Programming Interface (API).
The term "configured to" is not intended to mean "configurable to". For example, an un-programmed FPGA may not be considered "configured to" perform a particular function, although it may be "configured to" perform that function and may be "configured to" perform that function after programming.
The expression "configured to" perform one or more tasks in the appended claims is expressly intended to not refer to 35u.s.c. ≡112 (f) for that claim element. Accordingly, none of the claims in the present application as filed are intended to be interpreted as having a device-plus-function element. If applicants want to refer to section 112 (f) during an application, then it will use the "means for performing function" structure to express the elements of the claims.
As used herein, the terms "first," "second," and the like, serve as labels for nouns following them, and do not imply any sort of ordering (e.g., spatial, temporal, logical, etc.), unless explicitly indicated. For example, in a processor having eight processing cores, the terms "first" and "second" processing cores may be used to refer to any two of the eight processing cores. In other words, the "first" processing core and the "second" processing core are not limited to, for example, processing core 0 and processing core 1.
As used herein, the term "based on" is used to describe one or more factors that affect a determination. The term does not exclude that there may be additional factors that may influence the determination. That is, the determination may be based on specified factors alone or on specified factors and other unspecified factors. Consider the phrase "determine a based on B". This phrase specifies that B is a factor for determining a or that B affects a. This phrase does not preclude the determination of a from being based on some other factor, such as C. The phrase is also intended to cover embodiments in which a is determined based only on B. As used herein, the phrase "based on" is therefore synonymous with the phrase "based at least in part on".
A physical environment refers to a physical world in which people can sense and/or interact without the assistance of an electronic system. The physical environment may include physical features, such as physical surfaces or physical objects. For example, the physical environment may correspond to a physical park that includes physical trees, physical buildings, and physical people. People can directly sense and/or interact with a physical environment, such as by visual, tactile, auditory, gustatory, and olfactory.
In contrast, an extended reality (XR) environment (or computer-generated reality (CGR) environment) refers to a wholly or partially simulated environment that people sense and/or interact with via an electronic system. For example, the XR environment may include Augmented Reality (AR) content, Mixed Reality (MR) content, Virtual Reality (VR) content, and the like. With an XR system, a subset of a person's physical motions, or representations thereof, are tracked, and in response, one or more characteristics of one or more virtual objects simulated in the XR environment are adjusted in a manner that comports with at least one law of physics. For example, an XR system may detect a person's head movements and, in response, adjust the graphical content and acoustic field presented to the person in a manner similar to how such views and sounds would change in a physical environment. As another example, the XR system may detect movement of the electronic device presenting the XR environment (e.g., a mobile phone, tablet computer, laptop computer, etc.) and, in response, adjust the graphical content and acoustic field presented to the person in a manner similar to how such views and sounds would change in a physical environment. In some situations (e.g., for accessibility reasons), the XR system may adjust characteristics of graphical content in the XR environment in response to representations of physical motions (e.g., voice commands).
A person may sense and/or interact with an XR object using any one of their senses, including sight, sound, and touch. For example, a person may sense and/or interact with audio objects that create a 3D or spatial audio environment providing the perception of point audio sources in 3D space. As another example, audio objects may enable audio transparency that selectively incorporates ambient sounds from the physical environment with or without computer-generated audio. In some XR environments, a person may sense and/or interact only with audio objects.
Examples of XR include virtual reality and mixed reality.
A Virtual Reality (VR) environment refers to a simulated environment designed to be based entirely on computer-generated sensory input for one or more sensations. The VR environment includes a plurality of virtual objects that a person can sense and/or interact with. For example, computer-generated images of trees, buildings, and avatars representing people are examples of virtual objects. A person may sense and/or interact with virtual objects in the VR environment through a simulation of the presence of the person within the computer-generated environment and/or through a simulation of a subset of the physical movements of the person within the computer-generated environment.
A Mixed Reality (MR) environment refers to a simulated environment designed to incorporate sensory inputs from the physical environment, or representations thereof, in addition to including computer-generated sensory inputs (e.g., virtual objects). On a virtuality continuum, a mixed reality environment is anywhere between, but not including, a wholly physical environment at one end and a virtual reality environment at the other end.
In some MR environments, computer-generated sensory inputs may respond to changes in sensory inputs from the physical environment. Additionally, some electronic systems for presenting an MR environment may track position and/or orientation with respect to the physical environment to enable virtual objects to interact with real objects (i.e., physical objects from the physical environment or representations thereof). For example, a system may account for movements so that a virtual tree appears stationary with respect to the physical ground.
Examples of mixed reality include augmented reality and augmented virtuality.
An Augmented Reality (AR) environment refers to a simulated environment in which one or more virtual objects are superimposed over a physical environment or a representation thereof. For example, an electronic system for presenting an AR environment may have a transparent or translucent display through which a person may directly view the physical environment. The system may be configured to present virtual objects on the transparent or translucent display so that the person, using the system, perceives the virtual objects superimposed over the physical environment. Alternatively, the system may have an opaque display and one or more imaging sensors that capture images or video of the physical environment, which are representations of the physical environment. The system composites the images or video with virtual objects and presents the composition on the opaque display. The person, using the system, indirectly views the physical environment by way of the images or video of the physical environment and perceives the virtual objects superimposed over the physical environment. As used herein, video of the physical environment shown on an opaque display is called "pass-through video," meaning the system uses one or more image sensors to capture images of the physical environment and uses those images in presenting the AR environment on the opaque display. Further alternatively, the system may have a projection system that projects virtual objects into the physical environment, for example, as a hologram or on a physical surface, so that the person, using the system, perceives the virtual objects superimposed over the physical environment.
An augmented reality environment also refers to a simulated environment in which a representation of a physical environment is transformed by computer-generated sensory information. For example, in providing pass-through video, a system may transform one or more sensor images to impose a select perspective (e.g., viewpoint) different from the perspective captured by the imaging sensors. As another example, a representation of a physical environment may be transformed by graphically modifying (e.g., enlarging) portions thereof, such that the modified portions are representative of, but not photorealistic versions of, the originally captured images. As a further example, a representation of a physical environment may be transformed by graphically eliminating or obfuscating portions thereof.
An augmented virtuality (AV) environment refers to a simulated environment in which a virtual or computer-generated environment incorporates one or more sensory inputs from the physical environment. The sensory inputs may be representations of one or more characteristics of the physical environment. For example, an AV park may have virtual trees and virtual buildings, but people's faces are photorealistically reproduced from images taken of physical people. As another example, a virtual object may adopt the shape or color of a physical object imaged by one or more imaging sensors. As a further example, a virtual object may adopt shadows consistent with the position of the sun in the physical environment.
There are many different types of electronic systems that enable a person to sense and/or interact with various XR environments. Examples include head-mounted systems, projection-based systems, head-up displays (HUDs), vehicle windshields integrated with display capabilities, windows integrated with display capabilities, displays formed as lenses designed for placement on a person's eyes (e.g., similar to contact lenses), headphones/earphones, speaker arrays, input systems (e.g., wearable or handheld controllers with or without haptic feedback), smartphones, tablet computers, and desktop/laptop computers. The head-mounted system may have an integrated opaque display and one or more speakers. Alternatively, the head-mounted system may be configured to accept an external opaque display (e.g., a smart phone). The head-mounted system may incorporate one or more imaging sensors for capturing images or video of the physical environment and/or one or more microphones for capturing audio of the physical environment. The head-mounted system may have a transparent or translucent display instead of an opaque display. The transparent or translucent display may have a medium through which light representing an image is directed to the eyes of a person. The display may utilize digital light projection, OLED, LED, uLED, liquid crystal on silicon, laser scanning light sources, or any combination of these techniques. The medium may be an optical waveguide, a holographic medium, an optical combiner, an optical reflector, or any combination thereof. In some implementations, the transparent or translucent display may be configured to selectively become opaque. Projection-based systems may employ retinal projection techniques that project a graphical image onto a person's retina. The projection system may also be configured to project the virtual object into the physical environment, for example as a hologram or on a physical surface.
Detailed Description
By its basic nature, a kernel scheduler is relatively agnostic to the tasks being performed. For example, while the scheduler may be told to schedule threads at a particular execution priority, the kernel scheduler may not know that one scheduled thread depends on the output of another. The kernel scheduler also does not know that a particular task being performed by a thread has a particular time constraint, and that failing to meet that constraint may adversely affect the user experience. The kernel scheduler is likewise unaware of the underlying health of the computing device, such as that the device's processor is about to reach its thermal limit due to a heavy execution load. This lack of insight can thus result in the kernel scheduler producing less efficient schedules.
Such inefficient scheduling can be particularly problematic when a user is attempting to view content, such as extended reality (XR) content, for example, via a Head-Mounted Display (HMD). Generating an immersive XR experience can be computationally demanding and power hungry, often pushing a computing device to its limits. Even the small amounts of delay and jitter that occur when the strict timing constraints of some tasks cannot be met can completely disrupt the user's experience and, in some cases, even lead to dizziness and nausea. Fortunately, tasks of this nature tend to be highly deterministic, which allows for more intelligent task scheduling.
This disclosure describes embodiments in which a computing device uses another scheduler that works in conjunction with a kernel scheduler to schedule tasks more intelligently. As will be described in more detail below, a first scheduler executing at the application layer of a computing device may receive tasks from various processes attempting to perform those tasks. The first scheduler may analyze a computational graph defining the interrelationships of the tasks, along with various other information related to the tasks. The first scheduler may then determine a schedule for implementing the tasks based on this analysis and issue instructions for causing a second scheduler, executing at a kernel layer of the computing device, to schedule execution of the tasks according to the determined schedule. In some implementations, the first scheduler may also act as a global scheduler that coordinates scheduling with additional schedulers associated with other resources (e.g., those associated with a graphics processing unit, a neural engine, etc.). In various embodiments, the first scheduler also monitors various health information and may dynamically adjust the schedule based on that information. In some embodiments, when a health issue arises, the first scheduler may also notify the processes that provided the tasks and allow those processes to determine how the tasks should be handled. In some cases, precautions can be taken to avoid reaching the thermal and power limits of the computing device, which can otherwise trigger abrupt system pullbacks that cause delays and jitter, for example, while XR content is being presented. Being able to schedule tasks more intelligently in this way may result in an improved user experience for activities that are particularly computationally intensive but carry a degree of determinism.
Turning now to FIG. 1, a block diagram of a computing device 10 configured to implement intelligent scheduling is depicted. The computing device 10 may correspond to (or be included in) any of a variety of computing devices, such as telephones, tablet computers, laptop computers, desktop computers, watches, internet of things (IoT) devices, and the like. In some embodiments discussed below, the computing device 10 may be a head mounted display, such as headphones, helmets, goggles, glasses, a phone inserted into a housing, or the like. In the illustrated embodiment, computing device 10 includes an application 110, a resource 120, a kernel 130, and an intelligent scheduler 140. As shown, the kernel 130 also includes a kernel scheduler 132. In some embodiments, computing device 10 may be implemented in a different manner than shown. For example, as will be described in connection with fig. 2, one or more additional software components may reside between the application 110 and the scheduler 140. In some embodiments, the scheduler 140 may not reside in the application layer 102, and so on. Various examples of other hardware components that may be included in computing device 10 are discussed below with respect to fig. 7.
In various embodiments, the applications 110 are programs having various tasks 112 that use various shared resources 120. In some embodiments, these tasks 112 are performed to provide an extended reality (XR) experience, which may utilize AR-, MR-, or VR-generated content. As one example, an application 110 may provide a co-presence experience in which multiple users interact with one another in a shared XR environment using their respective devices. As another example, an application 110 may support the streaming of various content (such as movies, live sporting events, concerts, etc.). As yet another example, an application 110 may include a gaming application that places a user in an XR environment in which the user can interact with computer-generated objects. Although various embodiments are described herein in which tasks 112 may be performed with respect to the generation of XR content, in other embodiments the techniques described herein may be applicable to other situations where improved scheduling is desired. As shown in FIG. 1, in some cases an application 110 may have both time-sensitive tasks 112 and non-time-sensitive tasks 114. Time-sensitive tasks 112 may include those that directly affect a user interface, provide real-time content, and so on. For example, in embodiments in which computing device 10 is an HMD, tasks 112 associated with head tracking may be time sensitive, in that a user looking left or right directly affects what content is displayed on the user interface, and delays in tracking the user's head movement may introduce jitter into the content being displayed. In contrast, an application 110 requesting a thread to download a user's email in the background may be requesting a time-insensitive task 114. As noted above, in some cases a particular task 112 has a large amount of determinism because it may need to be repeated periodically and have consistent, known dependencies. Continuing the head-tracking example, head tracking may involve a number of repeated visual odometry tasks 112 that consume a series of video frames and sensor data over time to determine the changing position and orientation of computing device 10 in the physical environment.
In various embodiments, resources 120 are the various resources used to perform tasks 112. Accordingly, resources 120 may include hardware resources such as Central Processing Units (CPUs), Graphics Processing Units (GPUs), neural engine circuits, secure elements, Digital Signal Processors (DSPs), Application-Specific Integrated Circuits (ASICs), Image Signal Processors (ISPs), displays, Network Interface Cards (NICs), non-volatile and volatile memory, cameras, input/output devices, sensors, and the like. Resources 120 may also include software resources such as memory buffers, threads, applications, operating system services, and the like. Resources 120 may further include specific data sets used by tasks 112. In some implementations, resources 120 may also be located on a device other than computing device 10.
In various embodiments, kernel 130 is a component of the operating system of computing device 10 that is executable to manage aspects of the computing device, including one or more of resources 120. As shown in FIG. 1, the kernel 130 resides in the kernel layer 104/kernel space 104 along with the kernel scheduler 132, in contrast to the applications 110 and intelligent scheduler 140, which reside in the application layer 102/application space 102. As used herein, the term "application layer" (or user layer) refers to a class of programs that execute in a mode with restricted privileges. For example, in an x86 processor, this mode is referred to as ring 3. In this mode, a program may be prohibited from executing particular instructions defined by the Instruction Set Architecture (ISA). The processor may also block direct access to particular hardware and/or restrict a program's accesses to "application space," which refers to regions of memory allocated to programs executing in the application mode. Exemplary programs executing in the application mode may include, for example, word processing applications, web browsers, mail clients, or other user applications. Most applications typically reside in the application layer 102 for security reasons. In contrast, the term "kernel layer" (or system layer) refers to a class of programs that the processor executes with unrestricted privileges. For example, in an x86 processor, this mode is referred to as ring 0. The kernel layer is typically reserved for programs responsible for system management, such as operating system kernels, boot loaders, drivers, hypervisors, and the like. The term "kernel space" refers to restricted regions of memory that are accessible only by programs executing in the kernel layer; in some embodiments, kernel layer programs may also be restricted from accessing application space regions of memory. To facilitate management of resources 120, kernel 130 may include one or more kernel schedulers 132.
In various embodiments, the kernel scheduler 132 is a scheduler, executing at the kernel layer, that handles the scheduling of processes based on their respective execution priorities. As used herein, the term "process" is to be interpreted according to its understood meaning in the art and includes an instance of a computer program being executed. Accordingly, the term "process" may refer to an application having multiple threads or to a single thread of execution. In various implementations, kernel 130 may assign a separate Process Identifier (PID) to each process, which may assist scheduler 132 in distinguishing the processes being scheduled. As used herein, the term "execution priority" is to be interpreted according to its understood meaning in the art and includes a value assigned to a process that controls the frequency and/or duration with which the process/thread is scheduled for execution. For example, an execution priority in a Unix™-based system may correspond to a priority value (PR) and/or a niceness value (NI). Although depicted as a single scheduler, in some embodiments scheduler 132 may be one of multiple schedulers 132, as will be discussed with reference to FIG. 2. For example, kernel scheduler 132 may handle scheduling processes for execution on the CPU of device 10, while a separate scheduler 132 handles scheduling with respect to the GPU of device 10. As noted above, the kernel scheduler 132 may be relatively agnostic to the tasks 112 being performed by the processes it schedules. Scheduler 132 may be aware of a process's execution priority and place the process into the appropriate scheduling queue for execution; however, scheduler 132 may not know that an earlier-scheduled process will wait on the output of a later-scheduled process, which, moreover, may be placed in a lower execution-priority queue.
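As a concrete illustration of these execution-priority values, the following minimal POSIX sketch (ours, not taken from this disclosure) reads and then raises the calling process's niceness value, which the kernel scheduler consults when apportioning CPU time:

```cpp
// Illustrative sketch: reading and raising a process's niceness value (NI),
// one form of execution priority consulted by the kernel scheduler.
#include <sys/resource.h>
#include <cerrno>
#include <cstdio>

int main() {
    // getpriority() can legitimately return -1, so errno disambiguates.
    errno = 0;
    int ni = getpriority(PRIO_PROCESS, 0); // 0 = the calling process
    if (ni == -1 && errno != 0) { std::perror("getpriority"); return 1; }
    std::printf("current niceness (NI): %d\n", ni);

    // Raise the niceness, i.e., lower this process's execution priority.
    // Lowering niceness below 0 typically requires elevated privileges.
    if (setpriority(PRIO_PROCESS, 0, ni + 5) != 0) {
        std::perror("setpriority");
        return 1;
    }
    std::printf("new niceness (NI): %d\n", getpriority(PRIO_PROCESS, 0));
    return 0;
}
```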
In various embodiments, the intelligent scheduler 140 is an application-layer scheduler executable to determine a more informed schedule 144 for executing tasks 112. As shown, scheduler 140 may receive the time-sensitive tasks 112 and analyze a computational graph 142 that identifies the interrelationships of the tasks 112 to be performed. For a given requested task 112, such a computational graph 142 may include graph nodes that specify both: 1) any tasks 112 that provide inputs to be used in executing the given task 112, and 2) any tasks 112 that should receive the given task's output once it completes. As will be discussed, the computational graph 142 may also include additional useful information, such as the resources 120 needed to perform a task 112, the timing constraints of a task 112, and the like. In various embodiments, a process requesting execution of tasks 112 may identify the interrelationships of the tasks 112 to scheduler 140 and, in some embodiments, provide a portion of the computational graph 142 to scheduler 140. Based on this analyzed information, scheduler 140 may determine a schedule 144 indicating how tasks 112 should be performed in order to improve performance and resource usage. In various embodiments, scheduler 140 may focus on identifying a critical path for executing a set of tasks 112 and attempt to schedule the tasks 112 along that path in a manner that ensures the timing constraints of those tasks 112 can be fully met. To reduce the number of tasks 112 considered by scheduler 140, less time-sensitive tasks 114 may be handled independently of scheduler 140 and thus scheduled by kernel scheduler 132 in the conventional manner.
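To make the critical-path idea concrete, the following hedged sketch computes each task's earliest finish time along its longest dependency chain. It assumes tasks carry estimated durations and uses one conventional approach (longest path in a DAG via topological order); the disclosure does not prescribe a particular algorithm:

```cpp
// Illustrative critical-path analysis over a task graph (assumed durations).
#include <algorithm>
#include <queue>
#include <vector>

struct Task {
    double durationMs;           // estimated execution time (assumption)
    std::vector<int> successors; // tasks that consume this task's output
};

// Returns, per task, the earliest finish time along the longest (critical)
// dependency chain leading to and through it.
std::vector<double> earliestFinishTimes(const std::vector<Task>& tasks) {
    const int n = static_cast<int>(tasks.size());
    std::vector<int> indegree(n, 0);
    for (const Task& t : tasks)
        for (int s : t.successors) indegree[s]++;

    std::queue<int> ready; // tasks whose prerequisites are all processed
    for (int i = 0; i < n; i++)
        if (indegree[i] == 0) ready.push(i);

    std::vector<double> finish(n, 0.0);
    while (!ready.empty()) {
        int u = ready.front(); ready.pop();
        finish[u] += tasks[u].durationMs; // own duration after latest input
        for (int s : tasks[u].successors) {
            finish[s] = std::max(finish[s], finish[u]);
            if (--indegree[s] == 0) ready.push(s);
        }
    }
    return finish; // the max entry marks the end of the critical path
}
```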
Since power and compute availability may vary over time, in various embodiments scheduler 140 also receives system health information 146, as shown in FIG. 1. As will be discussed below with respect to FIG. 3, this information 146 may include various performance, power, and thermal statistics that scheduler 140 can use to determine which resources 120 are available to perform tasks 112 and how their availability may change over time. Accordingly, scheduler 140 may use health information 146 when initially determining schedule 144. Later, in response to health information 146 indicating that the health of computing device 10 has changed, scheduler 140 may modify schedule 144. As will be discussed below with respect to FIG. 4D, if scheduler 140 determines from health information 146 that it may soon be unable to meet particular timing constraints, scheduler 140 may notify the processes requesting execution of tasks 112 and allow those processes to decide how to handle the degrading health of computing device 10 before it crosses a problematic threshold, which may entail those processes modifying the set of tasks 112 they are asking scheduler 140 to handle. In so doing, scheduler 140 may implement a gentler pullback than would occur if, for example, the CPU abruptly reduced its clock frequency upon reaching its thermal limit; scheduler 140 can instead warn processes in advance of such a pullback occurring.
To enable tasks 112 to receive prioritized scheduling, in various embodiments scheduler 140 issues prioritized scheduling instructions 148 for causing kernel scheduler 132 to schedule execution of tasks 112 according to the determined schedule 144. In some implementations, scheduler 140 issues instructions 148 to kernel 130 requesting threads with particular execution priorities in accordance with the determined schedule 144. In one embodiment in which kernel 130 implements an Application Programming Interface (API) compatible with the Portable Operating System Interface (POSIX), instructions 148 may include pthread system calls requesting the creation of threads. In response, kernel 130 may dispatch the requested thread or threads to the application layer 102, where scheduler 140 may provide tasks 112 to the dispatched threads according to the determined schedule. Kernel scheduler 132 may then schedule the dispatched threads to execute tasks 112 at the requested execution priorities. In some embodiments, scheduler 140 tracks the execution of tasks 112 as they are being performed and, upon determining based on this tracking that a task 112 is ready to execute, enqueues the task 112 in one or more ready queues. A dispatched thread may then dequeue a task 112 from a ready queue and execute the dequeued task 112. In various embodiments, scheduler 140 may be granted an entitlement allowing it to request threads at execution priorities that other processes (such as those requesting execution of tasks 112) cannot. Thus, if such processes want their tasks executed at the higher priorities accessible to scheduler 140, they may need to interface with the intelligent scheduler 140. As shown in FIG. 1, processes (such as applications 110) may still be able to request threads from kernel 130 independently of scheduler 140; however, those threads may only be able to perform tasks 114 at the lower execution priorities available to the requesting processes.
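The ready-queue arrangement just described might be sketched as follows; the class and function names are our illustrative assumptions, not this disclosure's implementation. The application-layer scheduler enqueues a task 112 once its inputs are available, and threads dispatched by kernel 130 block on the queue and run tasks in schedule order:

```cpp
// Illustrative ready-queue sketch: the scheduler enqueues ready tasks;
// dispatched threads dequeue and execute them.
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>

class ReadyQueue {
public:
    void enqueue(std::function<void()> task) {
        { std::lock_guard<std::mutex> lk(m_); q_.push(std::move(task)); }
        cv_.notify_one();
    }
    // Blocks until a task is ready; returns false once shut down and drained.
    bool dequeue(std::function<void()>& task) {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [&] { return done_ || !q_.empty(); });
        if (q_.empty()) return false;
        task = std::move(q_.front()); q_.pop();
        return true;
    }
    void shutdown() {
        { std::lock_guard<std::mutex> lk(m_); done_ = true; }
        cv_.notify_all();
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> q_;
    bool done_ = false;
};

// Body run by each dispatched thread: drain tasks in schedule order.
void worker(ReadyQueue& rq) {
    std::function<void()> task;
    while (rq.dequeue(task)) task();
}
```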
Although the tasks 112 are shown in FIG. 1 as being provided by the application 110, the tasks 112 may originate from one or more middle layers located between the application 110 and the scheduler 140. The program stack including these additional layers will now be discussed.
Turning now to FIG. 2, a block diagram of a program stack 200 including a scheduler 140 is depicted. In the illustrated embodiment, the program stack 200 includes an application 110 at a fourth layer L4, a system framework 210 at a third layer L3, a reality algorithm 220 at a second layer L2, a scheduler 140 at a first layer L1, and a driver/firmware 240 at a zeroth layer L0. As shown, layers L1-L4 are application layers 102, as opposed to layer L0, which is kernel layer 104. In some embodiments, program stack 200 may be implemented in a different manner than shown. For example, layer L2 may include components other than the reality algorithm 220, the application 110 may be directly connected with the scheduler 140, and so on.
In various embodiments, the system frameworks 210 provide various functionality that may be requested by applications 110 via one or more Application Programming Interfaces (APIs), without an application 110 having to directly incorporate program instructions for that functionality. For example, an application 110 that wants to use object classification on video frames recorded by computing device 10 may implement this functionality using the corresponding system framework 210, without the developer of application 110 having to write program instructions for, say, creating a neural network classifier or the like. As another example, associated with the algorithms 220 discussed below, an application 110 providing a co-presence session between two users may want to track the head movements, eye movements, and hand movements of a given user so that the corresponding avatar mimics the user's movements in the co-presence experience.
In various embodiments, the reality algorithms 220 are program instructions implementing the underlying algorithms that support the functionality provided by the system frameworks 210. As a few examples shown in FIG. 2, the reality algorithms 220 include a visual-inertial odometry (VIO) algorithm 220, a gaze tracking algorithm 220, and a hand tracking algorithm 220. The VIO algorithm 220 may attempt to determine the orientation of computing device 10 using camera sensors and Inertial Measurement Unit (IMU) sensors, which may be resources 120. In embodiments in which computing device 10 is an HMD, this orientation may correspond to the orientation of the user's head/pose. The gaze algorithm 220 may track the position and movement of the user's eyes using one or more eye tracking sensors (e.g., an IR camera with an IR illumination source, which may be resources 120). In some embodiments, the gaze algorithm 220 may also track other components of the user's face, such as the user's mouth/jaw, eyebrows, and so on. The hand algorithm 220 may track the position, movement, and pose of the user's hands, fingers, and/or arms using one or more hand sensors (e.g., an IR camera with IR illumination). Implementing various ones of these algorithms 220 may entail performing various repeated, deterministic tasks 112, which can be expressed in computational graphs 142. In some embodiments, the algorithms 220 may provide not only tasks 112 to scheduler 140 but also portions of the computational graph 142.
In various embodiments, drivers/firmware 240 are the various components in kernel layer 104 that manage resources 120. As shown, these components 240 may include display drivers, GPU drivers for graphics operations, neural engine drivers for performing machine learning operations, kernel 130, network interface drivers, and sensor drivers such as those for the sensors discussed below with respect to FIG. 7. Notably, in the illustrated embodiment, one or more of the components 240 may include their own respective schedulers 132 in addition to kernel scheduler 132A, such as a scheduler 132B for a neural engine and a scheduler 132C for a GPU. As noted above, although a single kernel-layer scheduler is depicted in FIG. 1, in some embodiments scheduler 140 may issue instructions 148 for causing multiple schedulers 132A-C to schedule execution of tasks 112 by resources 120 according to the determined schedule 144. Thus, in some embodiments, scheduler 140 may be described as a global scheduler over the other, resource-specific schedulers 132.
As shown in FIG. 2 and described in more detail below, scheduler 140 may include a system health monitor 231, a graph analyzer 232, and an executor 233; in other embodiments, scheduler 140 may include other components. As will be described in connection with FIG. 3, the system health monitor 231 may monitor ongoing changes in the power and performance capabilities of computing device 10 to proactively determine system health, which scheduler 140 may use in scheduling tasks 112. As will be described in connection with FIGS. 4A-4D, the computational graph analyzer 232 may take a holistic view of the tasks 112 being requested, analyzing health information, the computational graph 142, and other metadata to generate the schedule 144 for the tasks 112. As will be described in connection with FIGS. 5A and 5B, the executor 233 may consume the schedule 144 determined by graph analyzer 232 and facilitate its implementation.
In various embodiments, data protocol 236 is an interface protocol used by scheduler 140 to communicate with lower and/or higher layers of program stack 200, and by components of those layers to communicate with scheduler 140.
Turning now to FIG. 3, a block diagram of the system health monitor 231 is depicted. As noted above, in some embodiments the system health monitor 231 may monitor and report various statistics indicative of the underlying health of computing device 10. In the illustrated embodiment, health monitor 231 receives deadline tracking information 302, performance statistics 304, power statistics 306, and thermal statistics 308, and outputs health telemetry 322. The health monitor 231 may process this information 146 in a computing availability block 310 and a power availability block 320. In some embodiments, the system health monitor 231 may be implemented in a different manner than shown, such as including different components and/or having different inputs or outputs.
In various embodiments, the deadline tracking information 302 includes various information regarding the ability of scheduler 140 to meet various deadline/timing constraints. In some embodiments, this information 302 may include the specified timing constraints of particular tasks 112. For example, information 302 may include information about monitored audio and video deadlines, e.g., that a particular video task 112 needs to complete within 100 ms. As will be discussed in connection with FIG. 4B, this information 302 may be obtained from information included in the computational graph 142. In some embodiments, information 302 may include historical information regarding past executions of tasks 112, for example, that a particular set of tasks 112 has historically taken 100 ms to complete. In some embodiments, the information may include count values indicating how frequently particular timing constraints are met (or missed).
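A minimal sketch of what such a per-task deadline record might look like follows; the layout and names are our assumptions, as the disclosure only describes the kinds of information tracked:

```cpp
// Illustrative per-task deadline record: counts of met/missed timing
// constraints plus a bounded history of recent completion times.
#include <cstdint>
#include <deque>

struct DeadlineRecord {
    uint32_t deadlineMs = 0;       // specified timing constraint
    uint64_t metCount = 0;         // completions within the deadline
    uint64_t missedCount = 0;      // completions past the deadline
    std::deque<uint32_t> recentMs; // sliding window of completion times

    void record(uint32_t completionMs) {
        (completionMs <= deadlineMs ? metCount : missedCount)++;
        recentMs.push_back(completionMs);
        if (recentMs.size() > 64) recentMs.pop_front(); // bound the history
    }

    double missRate() const {
        uint64_t total = metCount + missedCount;
        return total ? static_cast<double>(missedCount) / total : 0.0;
    }
};
```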
In various embodiments, performance statistics 304 include various statistics related to the performance of computing device 10. Statistics 304 may thus include current utilization information for various resources 120, such as an indication that the CPU is experiencing 60% utilization, the current performance states (p-states) of the processor cores, Dynamic Voltage and Frequency Management (DVFM) information, and so on. Statistics 304 may also include indications of the space currently available in non-volatile memory, the page swap rate, the swap space size, etc. Statistics 304 may also identify network interface information such as network latency, bandwidth, and the like.
In various embodiments, power statistics 306 include various information related to the power consumption of computing device 10. In some embodiments, statistics 306 identify the current wattage being consumed by resources 120 (or by computing device 10). In instances where computing device 10 is running on battery power, statistics 306 may identify the battery's current charge level and total capacity. In instances where computing device 10 has a plugged-in power supply, statistics 306 may identify that plugged-in condition.
In various embodiments, the thermal statistics 308 include various temperature information collected by one or more temperature sensors located in the computing device. In some embodiments, these sensors may be located within integrated circuits of computing device 10 (such as on a processor chip). Computing device 10 may also include one or more temperature sensors to collect temperatures external to computing device 10. For example, in one embodiment in which computing device 10 is an HMD, device 10 may include one or more skin temperature sensors to detect the temperature where device 10 contacts the user's skin.
In various embodiments, computing availability 310 comprises program instructions executable to determine the availability of resources 120 for performing tasks 112 based on health information 146. In some embodiments, computing availability 310 may consider not only the current information 146 but also the prior history of information 146 in order to infer how the health of computing device 10 may change and affect which resources 120 will be available in the future. In the illustrated embodiment, computing availability 310 may convey this information as health telemetry 322 to the computational graph analyzer 232, which may consider it when determining how to schedule tasks 112.
In various embodiments, power availability 320 comprises program instructions executable to determine the availability of power for performing tasks 112 based on health information 146. Similar to computing availability 310, power availability 320 may consider not only the current information 146 but also the prior history of information 146 to infer how the power consumption of computing device 10 may change and affect how much power will be available for executing tasks 112 in the future. In the illustrated embodiment, power availability 320 may include this information in the health telemetry 322 sent to the computational graph analyzer 232.
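For illustration, one simple way to infer availability from ongoing telemetry is to smooth recent samples and derive a headroom estimate. This sketch is our assumption of one possible approach, not the monitor's actual logic:

```cpp
// Illustrative availability inference: exponential moving averages over
// utilization and temperature samples yield a crude compute-headroom value.
struct AvailabilityEstimator {
    double alpha = 0.2;          // EMA smoothing factor (assumed)
    double emaUtilization = 0.0; // fraction of CPU in use, 0..1
    double emaTempC = 0.0;       // smoothed die temperature

    void addSample(double utilization, double tempC) {
        emaUtilization = alpha * utilization + (1.0 - alpha) * emaUtilization;
        emaTempC       = alpha * tempC       + (1.0 - alpha) * emaTempC;
    }

    // Scale the unused utilization by how far the device sits below a
    // hypothetical thermal ceiling; 0 means no headroom remains.
    double computeHeadroom(double thermalLimitC) const {
        double thermalMargin = (thermalLimitC - emaTempC) / thermalLimitC;
        if (thermalMargin < 0.0) thermalMargin = 0.0;
        return (1.0 - emaUtilization) * thermalMargin;
    }
};
```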
Turning now to FIG. 4A, a block diagram of the graph analyzer 232 is depicted. As described above, graph analyzer 232 may analyze the computational graph 142 of tasks 112 and generate a corresponding schedule 144 to be carried out by the executor 233. In the illustrated embodiment, graph analyzer 232 receives the computational graph 142, use cases 402, system modeling 404, and health telemetry 322, and outputs the schedule 144. In other embodiments, graph analyzer 232 may be implemented differently than shown.
The computational graph 142 is discussed in more detail below with respect to FIG. 4B. As noted above, the computational graph 142 may define the interrelationships between tasks 112, including identifying the inputs and outputs of tasks 112. As will be discussed below, the computational graph 142 may include various additional information about tasks 112, such as their time constraints, computational affinities, and the like.
In various embodiments, use case 402 identifies the overall context in which tasks 112 are being performed. In some implementations, use case 402 may identify a particular XR experience associated with tasks 112, such as a co-presence experience, a game, streaming XR content, and the like. In some embodiments, use case 402 identifies the processes requesting tasks 112, such as by including the PIDs of the VIO algorithm 220, the gaze algorithm 220, and so on. In some embodiments, use case 402 identifies the overall application 110 that uses the results received from executing tasks 112.
In various embodiments, the system modeling 404 includes information about the underlying resources 120 available to perform tasks 112. For example, system modeling 404 may identify the number of processors included in computing device 10, the processor types, voltages and operating frequencies, and so on. System modeling 404 may identify the types of memory and their storage capacities. System modeling 404 may identify the types of network interfaces supported by computing device 10, such as Wi-Fi, Bluetooth, etc. In some embodiments, modeling 404 identifies the presence of particular hardware, such as a secure element, a biometric authentication sensor, a Hardware Security Module (HSM), a secure processor, and the like. In the illustrated embodiment, system modeling 404 is provided by kernel 130, but in other embodiments this information may be obtained from other sources.
As just discussed, health telemetry 322 may include information aggregated from various sources in computing device 10 and indicative of the health of computing device 10. The graph analyzer 232 may evaluate the health telemetry 322 to determine which resources 120 are currently available and may be available in the future. As will be discussed in connection with fig. 4D, when system health deteriorates, graph analyzer 232 may use health telemetry 322 to determine when to provide feedback to a process requesting performance of task 112.
Based on this received information 142, 402, 404, and 322, the graph analyzer 232 may determine a schedule 144 for implementing the task 112. In various embodiments, the determination includes determining which resources 120 to assign to the task 112, which may be assessed based on the computational affinity defined in the computational graph 142, the resources 120 owned by the computing device as identified from the system modeling 404, and the current availability of those resources as determined from the health telemetry 322. In various embodiments, determining the schedule 144 also includes determining, for some tasks 112, when those tasks 112 should be performed. As will be discussed in connection with fig. 4C, this may entail arranging tasks 112 in channels corresponding to available resources 120 based on the interrelationships of the tasks determined from computational graph 142, and determining whether the layout is capable of satisfying timing constraints associated with tasks 112. In various implementations, determining the schedule 144 may also include determining an execution priority for executing the particular task 112, which may be based on the importance of the task and the relevance to the critical timing path within the computational graph 142. As new tasks 112 are requested and the health of computing device 10 changes as identified from health telemetry 322, graph analyzer 232 may continuously update schedule 144.
In some cases, graph analyzer 232 may be able to adequately determine how to schedule a particular task 112 ahead of when the task 112 needs to be performed and while schedule 144 is being determined. For example, analyzer 232 may determine from the information accompanying the computational graph 142 that a particular task needs to be performed at some regular interval, immediately after some other task 112 responsible for its inputs, etc. In other cases, however, graph analyzer 232 may not be able to adequately determine how to schedule a particular task 112 until closer to run time, and thus may determine to schedule the task 112 dynamically. For example, execution of a particular task 112 may be predicated on the irregular occurrence of some event (such as a particular user input). In some embodiments, graph analyzer 232 may generate schedule 144 to include scheduling information 412 for the tasks 112 it determines should be statically scheduled and scheduling information 414 for the tasks 112 it determines should be dynamically scheduled later; in other embodiments, this information may be conveyed separately. Execution of statically scheduled tasks 112 is described in more detail below with respect to FIG. 5A. Execution of dynamically scheduled tasks 112 is described below with respect to FIG. 5B.
Turning now to FIG. 4B, a block diagram of a computational graph 142 is depicted. As described and illustrated above, the computational graph 142 is a graph data structure having multiple graph nodes 420 corresponding to the tasks 112 being considered for scheduling. In the illustrated embodiment, each node 420 may identify the particular task 112 associated with that node and that task's relationships to the tasks 112 associated with other nodes 420 by identifying the resources 422A consumed by the particular task 112 and the resources 422B produced by the particular task 112 (which may be consumed by other tasks 112).
As one example, an application 110 may request, via a framework 210, use of an object classification algorithm 220 to classify objects captured by cameras in the user's surroundings. The object classification algorithm 220 may request performance of an initial object detection task 112, in which objects are detected in video frames and bounding boxes are placed around them for subsequent analysis. The object classification algorithm 220 may then request performance of an image cropping task 112, in which content outside the bounding boxes is removed from the frames to produce cropped frames, and an object classification task 112, in which the cropped frames are analyzed to identify a classification for the objects in them, e.g., that the user is looking at a pair of shoes. In the computational graph 142, each task 112 may be assigned a respective node 420. With respect to the node 420 for the image cropping task 112, the video frames with bounding boxes from the object detection task 112 may be identified as input resources 422A, and the cropped frames for the object classification task 112 may be identified as output resources 422B. The node 420 for the object classification task 112 may then identify the cropped frames as inputs. Based on these relationships, graph analyzer 232 can thus determine that the object classification task 112 should be scheduled after the image cropping task 112, which in turn should be scheduled after the object detection task 112.
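The producer/consumer relationships just described are enough to derive task ordering mechanically. The following sketch (identifiers are our illustrative assumptions) builds dependency edges by matching each node's consumed resources 422A against other nodes' produced resources 422B:

```cpp
// Illustrative edge derivation: u -> v whenever v consumes a resource
// that u produces, as in the detect -> crop -> classify example above.
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

struct GraphNode {
    std::string task;                  // e.g., "objectDetection" (assumed)
    std::vector<std::string> consumes; // input resources (422A)
    std::vector<std::string> produces; // output resources (422B)
};

std::vector<std::pair<int, int>> deriveEdges(const std::vector<GraphNode>& nodes) {
    // Map each resource name to the node that produces it.
    std::unordered_map<std::string, int> producer;
    for (int i = 0; i < static_cast<int>(nodes.size()); i++)
        for (const auto& r : nodes[i].produces) producer[r] = i;

    // For each consumed resource with a known producer, add an edge.
    std::vector<std::pair<int, int>> edges;
    for (int v = 0; v < static_cast<int>(nodes.size()); v++)
        for (const auto& r : nodes[v].consumes) {
            auto it = producer.find(r);
            if (it != producer.end()) edges.emplace_back(it->second, v);
        }
    return edges;
}
```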
As shown, each graph node 420 may include other task metadata 430 for a given task 112, such as a type 431, time constraints 432, an energy profile 433, computational affinities 434, desired network connections 435, security requirements 436, and a task chain 437. In some embodiments, more (or fewer) items of metadata 430 may be defined for a given node 420. In addition, the metadata defined for one graph node 420 may differ from that defined in another graph node 420.
In various embodiments, type 431 identifies the type of task 112 associated with a particular node 420. For example, node 420A may indicate that its type 431 is object detection, while node 420C may indicate that its type 431 is object classification.
In various embodiments, the time constraint 432 identifies a maximum permissible delay for performing a given task 112. For example, the constraint 432 specified in node 420C may indicate that the object classification task 112 should complete within 200 ms. The graph analyzer 232 may thus analyze time constraints 432 to determine when and how tasks 112 should be scheduled. In the event that analyzer 232 determines that a particular time constraint 432 cannot be met, graph analyzer 232 may notify the requesting process, as will be discussed below with respect to FIG. 4D.
In various embodiments, the energy profile 433 indicates an expected energy consumption for performing a given task 112. For example, the profile 433 for node 420A may indicate that object detection is a less energy-intensive task 112, while the profile 433 for node 420C may indicate that object classification is a more energy-intensive task 112. Accordingly, graph analyzer 232 may analyze energy profiles 433 to determine how best to schedule tasks 112 to conserve power while still meeting time constraints 432.
In various embodiments, the computational affinity 434 indicates a particular resource 120 desired for processing the task 112. For example, node 420C may specify hardware (or software) implementing a neural network classifier operable to perform the object classification task 112. In some cases, an affinity 434 may be a more general specification (e.g., general-purpose hardware for implementing neural networks) or a more specific one (e.g., specialized hardware specifically designed to implement Convolutional Neural Networks (CNNs) for object classification). Other examples of affinities 434 may include identifying GPU resources 120 for performing three-dimensional rendering tasks 112, identifying a secure element holding a user's payment credentials for performing the user's payment transaction tasks 112, and so on. Analyzer 232 may thus evaluate computational affinities 434 against the available resources 120 when generating the schedule 144.
In various embodiments, the desired network connection 435 indicates desired characteristics of a network connection associated with a given task 112. These characteristics may include the type of network connection (e.g., Wi-Fi, Bluetooth, etc.), a desired bandwidth for the connection, and/or a desired latency for the connection. For example, a task 112 requiring high bandwidth (e.g., streaming media content to computing device 10) may indicate a desire for a higher-bandwidth connection. Analyzer 232 may thus attempt to match the characteristics identified in a desired network connection 435 against those available based on health telemetry 322.
In various embodiments, the security requirements 436 indicate requirements for performing a given task 112 in a secure manner. For example, given that video frames collected by computing device 10 may include sensitive content, each of nodes 420 may specify a requirement 436 that its task 112 be performed in a secure manner. Graph analyzer 232 may accordingly assign tasks 112 having such requirements to resources 120 that can ensure the secure handling of data. Other examples of sensitive content may include keychain data, passwords, credit card information, biometric data, user preferences, and other forms of personal information. For example, if a particular task 112 is being performed using a cryptographic key, a security requirement 436 may be set to ensure that secure hardware, such as a secure element, a Hardware Security Module (HSM), a secure processor, etc., is used to handle the key.
In various embodiments, the task chain 437 indicates that two or more tasks 112 should be grouped together when they are executed. For example, the task chain 437 for node 420A may indicate that its task 112 should be performed on the same resource 120 as the task 112 associated with node 420B. Thus, in some embodiments, the graph analyzer 232 may be restricted from assigning chained tasks 112 to different resources 120 or different channels, as will be discussed in connection with FIG. 4C.
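By way of illustration only, the following sketch shows one way the graph-node metadata described above (time constraints 432, energy profiles 433, computational affinities 434, desired network connections 435, security requirements 436, and task chains 437) might be represented in code. All names, types, and values here are hypothetical; the disclosure does not prescribe any particular encoding.

```python
# Hypothetical sketch of a computational-graph node carrying scheduling
# metadata; element numbers in comments refer to the description above.
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional

class Affinity(Enum):
    CPU = "cpu"
    GPU = "gpu"
    NEURAL_ENGINE = "neural_engine"    # e.g., hardware implementing a CNN
    SECURE_ELEMENT = "secure_element"

@dataclass
class GraphNode:
    task_id: str
    inputs: list[str]                  # tasks whose outputs this task consumes
    outputs: list[str]                 # tasks that consume this task's output
    time_constraint_ms: Optional[float] = None   # max allowable delay (432)
    energy_cost: Optional[float] = None          # expected consumption (433)
    compute_affinity: Optional[Affinity] = None  # desired resource (434)
    min_bandwidth_mbps: Optional[float] = None   # desired connection (435)
    secure: bool = False                         # security requirement (436)
    chained_with: list[str] = field(default_factory=list)  # task chain (437)

# Example: an object-classification node that must complete within 200 ms
# on neural-network hardware and handles sensitive frame content.
classify = GraphNode(
    task_id="object_classification",
    inputs=["object_detection"],
    outputs=["scene_understanding"],
    time_constraint_ms=200.0,
    compute_affinity=Affinity.NEURAL_ENGINE,
    secure=True,
)
```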
Turning now to FIG. 4C, a block diagram of the static scheduling information 412 included in the schedule 144 is depicted. In the particular example depicted in FIG. 4C, scheduling information 412 has been determined for the tasks 112 associated with the gaze algorithm 220, hand algorithm 220, and VIO algorithm 220 discussed above in connection with FIG. 2. Additionally, in this example, computing device 10 has four available CPU cores (shown as CPUs 0-3) that may be used to perform these tasks 112. In the illustrated embodiment, the static scheduling information 412 includes resource assignments 440, timing assignments 450, and execution priority assignments 460. In other embodiments, the information 412 may include more (or less) information, the information 412 may vary from task 112 to task 112, and so on.
In various embodiments, the resource assignment 440 indicates what resources 120 should be used when performing a task 112. For example, in FIG. 4C, the tasks 112 of the gaze algorithm 220, hand algorithm 220, and VIO algorithm 220 are assigned to CPUs 0-2, respectively. In some implementations, resource assignments 440 are made based on the computational affinities 434 included in the graph nodes 420 of the computational graph 142. The resource assignments 440 can also be based on a resource's availability to provide a particular timing assignment 450.
In various implementations, the timing assignment 450 indicates when a particular task 112 should be performed. In some embodiments, the timing assignment 450 may include precise timing information indicating when the task 112 starts, how long it runs, and/or how frequently it runs. In some implementations, the timing assignment 450 may indicate the number of clock cycles that a particular task 112 will receive within a given interval. In some implementations, the timing assignment 450 can indicate a relationship determining when execution of a task 112 occurs. For example, the assignment 450 for one task 112 may indicate that the task should occur after execution of another task 112 that produces output consumed by the one task 112. When determining the timing assignments 450, the graph analyzer 232 can treat the resource assignments 440 as channels of available resources that vary over time and lay the tasks 112 out on these channels.
In various embodiments, the execution priority assignment 460 indicates the execution priority to be used by a thread executing the task 112. In some embodiments, an assignment 460 may be a value understood by the kernel scheduler 132, such as a priority value (PR) and/or a nice value (NI). In other embodiments, an assignment 460 may be a quality of service (QoS) class associated with a range of such values. For example, kernel 130 may support a background QoS class associated with the lowest execution priorities and a user-interactive (UI) QoS class associated with the highest execution priorities. In the example depicted in FIG. 4C, the tasks 112 for the algorithms 220 have all been assigned to the UI QoS class to ensure that these tasks are performed by threads assigned to the highest execution priority group.
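As a purely illustrative complement to FIG. 4C, the sketch below represents each entry of the static scheduling information 412 as a record combining a resource assignment 440, a timing assignment 450, and an execution priority assignment 460. The field names and numbers are invented for the example.

```python
# Hypothetical static-schedule entries for the gaze, hand, and VIO tasks,
# one CPU channel per task, all at the highest QoS class.
from dataclasses import dataclass

@dataclass
class StaticScheduleEntry:
    task_id: str
    resource: str            # resource assignment 440, e.g., "cpu0".."cpu3"
    start_offset_ms: float   # timing assignment 450: when to start,
    budget_ms: float         #   how long to run,
    period_ms: float         #   and how frequently to run
    qos_class: str           # execution priority assignment 460

schedule_412 = [
    StaticScheduleEntry("gaze",  "cpu0", 0.0, 2.0, 8.3, "user-interactive"),
    StaticScheduleEntry("hands", "cpu1", 0.0, 3.0, 8.3, "user-interactive"),
    StaticScheduleEntry("vio",   "cpu2", 1.0, 4.0, 8.3, "user-interactive"),
]
```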
Although not depicted in fig. 4C, the dynamic scheduling information 414 may include some of the same information as the static scheduling information 412. For example, in addition to identifying a particular task 112, the information 414 may also indicate a resource assignment 440 and an execution priority assignment 460 for that task 112. As will be discussed in connection with FIG. 5B, the information 414 may also include interrelationship information collected from the computational graph 142 such that the executor 233 may know the order in which to perform the tasks 112 based on their dependencies.
Turning now to FIG. 4D, a block diagram of the graph analyzer 232 implementing a feedback loop 470 is depicted. As discussed above, the graph analyzer 232 may receive health telemetry 322 indicating the current health of the computing device 10 in order to determine the schedule 144 and ensure that various timing constraints can be met. In some cases, however, the analyzer 232 may determine that it is unable (or no longer able) to meet these constraints and may employ the feedback loop 470.
As shown in FIG. 4D, the feedback loop 470 may include the graph analyzer 232 (or, more generally, the scheduler 140) receiving an initial computational graph 142A (or portions of one) from a process requesting execution of tasks 112 (such as an algorithm 220). Based on the health telemetry 322, the graph analyzer 232 may determine whether a schedule 144 meeting the one or more timing constraints can be determined, or whether an existing schedule 144 can be modified in a manner that meets the one or more timing constraints. In response to determining that at least one of the timing constraints cannot be met, the graph analyzer 232 may notify the process requesting the tasks 112 by sending it a feedback notification 472. In some implementations, the notification 472 can identify the time constraint that cannot be complied with based on the current health of the computing device. The notification 472 may also indicate an underlying cause, such as the number of available CPU cores having dropped because one of the cores reached its thermal limit, a lack of battery power, and so on. The notification 472 may also indicate a desired power consumption for the requestor, which may be expressed as a mode (e.g., a lower power mode or a higher power mode), an amount (e.g., 20 mW), and so forth.
In response to receiving a notification 472, the requesting process may determine to alter its tasks 112 and provide the altered tasks to the graph analyzer 232. In some embodiments, these altered sets of tasks 112 may be predetermined. For example, a requesting process may maintain one computational graph 142B for a set of tasks 112 to be performed when device 10 is operating in a low-power/constrained-power mode and another computational graph 142B for a set of tasks 112 to be performed when device 10 is operating in a high-power/unconstrained-power mode. In another embodiment, the requesting process may employ a machine learning algorithm to dynamically determine a set of tasks 112 to execute based on the received notification 472. In some implementations, the process receiving the notification 472 can determine a solution to the notification 472 on its own; in other embodiments, however, the requesting process coordinates with one or more other processes that provide tasks 112 to the scheduler 140 to determine a solution for handling the notification 472. Such a solution may entail contacting other processes to determine, for example, that not only should the notified process change its requested tasks 112, but that the other processes should also change theirs. For example, a reality algorithm 220 may need to contact other processes higher in the program stack 200 to indicate that it is no longer able to deliver a particular quality of service and to request further input from those processes.
Based on the feedback notification 472 provided, in the illustrated embodiment, the graph analyzer 232 receives a modified computational graph 142B (or a portion of one) that specifies an updated set of tasks 112. The graph analyzer 232 may then determine a new schedule 144 based on the correlations defined in the updated computational graph (if such a schedule can be determined). As noted above, this feedback loop 470 may allow a more intelligent solution to be determined for handling poor health of the computing device 10 before, for example, the computing device 10 reaches its thermal limits and relies on hardware to initiate an abrupt throttling to protect the device 10.
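The feedback loop 470 can also be summarized in code. The sketch below is a toy model only: the feasibility test, the thermal-throttling heuristic, and every name in it are assumptions made for illustration, not the disclosed implementation.

```python
# Toy sketch of feedback loop 470: if health telemetry makes a constraint
# unmeetable, notify the requestor and retry with the modified graph it
# returns. All names and numbers are hypothetical.

def try_build_schedule(graph, telemetry):
    """Feasibility check: every task must fit its deadline after applying a
    slowdown factor derived from device temperature."""
    slowdown = 2.0 if telemetry["temp_c"] > 80 else 1.0
    plan = {}
    for task, info in graph.items():
        estimate = info["runtime_ms"] * slowdown
        if estimate > info["deadline_ms"]:
            return None, task            # infeasible; report offending task
        plan[task] = estimate
    return plan, None

def schedule_with_feedback(graph, telemetry, request_modified_graph):
    plan, offender = try_build_schedule(graph, telemetry)
    if plan is not None:
        return plan
    # Feedback notification (compare 472): which constraint failed, and why.
    notice = {"task": offender,
              "cause": f"throttled at {telemetry['temp_c']} C"}
    modified_graph = request_modified_graph(notice)   # e.g., graph 142B
    plan, _ = try_build_schedule(modified_graph, telemetry)
    return plan

# Example: on notification, the process swaps in a cheaper classifier.
graph = {"classify": {"runtime_ms": 150, "deadline_ms": 200}}
plan = schedule_with_feedback(
    graph, {"temp_c": 85},
    lambda notice: {"classify_lite": {"runtime_ms": 60, "deadline_ms": 200}})
print(plan)   # {'classify_lite': 120.0}
```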
Turning now to FIG. 5A, a block diagram of an executor 233 executing a static schedule 500A is depicted. As described above, the executor 233 may be responsible for implementing the schedule 144 determined by the graph analyzer 232.
In various embodiments, static scheduling 500A begins with the executor 233 receiving the schedule 144 and evaluating the included static scheduling information 412 to determine what resources 120 are assigned. Based on this evaluation, in the illustrated embodiment, the executor 233 may issue prioritized scheduling instructions 148 requesting that the kernel 130 provide one or more high-priority worker threads 510 to execute tasks 112. In some cases, the kernel 130 may provide a thread 510 to execute a single task 112, such as thread 510B, which executes only task C as shown. In other cases, the kernel 130 may dispatch a thread 510 that remains available to the executor 233 for executing multiple tasks, such as thread 510A, which processes tasks A, B, and D as shown. In such embodiments, having a dispatched thread 510 execute multiple separate tasks 112 may use system resources more efficiently because threads take time to create and destroy, as well as time to context switch.
While execution of the threads 510 is scheduled by the kernel scheduler 132, the executor 233 may cause the tasks 112 to be executed according to the schedule 144 by controlling which tasks 112 are provided to the threads 510 and when. For example, the executor 233 may delay providing a task 112 to a thread 510 until the time frame identified in the schedule 144 in which the task 112 should be executed. As another example, if a first task 112 depends on the output of a second task 112, the executor 233 may not provide the first task 112 to a thread 510 until the second task 112 is complete. To ensure that tasks 112 are performed according to the schedule 144, the executor 233 may track the execution of tasks 112 so that it can determine when tasks 112 are completed. In some embodiments, the executor 233 may also track various metrics to facilitate determining subsequent schedules 144, such as tracking how long a particular task 112 has historically taken to perform. In the illustrated embodiment, this tracking may be handled by an execution monitor of the executor 233, which may submit its own tasks 112 to the worker threads 510. When a task 112 is completed by a thread 510, the executor 233 may also collect the result and feed it to subsequent tasks 112 that rely on it as input, or the executor 233 may send the collected result back to the process requesting execution of the task 112.
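The thread-reuse pattern described above can be sketched briefly. In the sketch below, a queue-based hand-off stands in for the executor's task delivery, and all names are illustrative assumptions rather than the disclosed mechanism; note how task C is withheld until task A's output exists.

```python
# Sketch: long-lived worker threads (compare threads 510) are fed tasks by
# an executor instead of a thread being created per task.
import queue
import threading

work = queue.Queue()

def worker():
    while True:
        task = work.get()        # block until the executor releases a task
        task()                   # run the task on this reused thread
        work.task_done()

# Two reusable high-priority workers (analogous to threads 510A/510B).
for _ in range(2):
    threading.Thread(target=worker, daemon=True).start()

results = {}
def task_a(): results["A"] = "detected objects"
def task_c(): results["C"] = "classified: " + results["A"]  # consumes A

work.put(task_a)
work.join()                      # withhold task C until task A completes
work.put(task_c)
work.join()
print(results["C"])
```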
Turning now to FIG. 5B, a block diagram of the executor 233 performing dynamic scheduling 500B is depicted. In the illustrated embodiment, the executor 233 uses a dependency array 520, a waiter counter 530, a shared view pool 540, one or more ready queues 550, and one or more thread pools 560. In other embodiments, the executor 233 may perform dynamic scheduling differently than shown.
In various embodiments, the dependency array 520 is a data structure used to track dependencies of tasks 112. For example, as shown, array 520 may indicate that a set of tasks B-D are dependent on task A, as task A may produce results on which tasks B-D depend. In some embodiments, array 520 may be implemented using a bitmap data structure.
In various embodiments, the waiter counter 530 is a data structure used to track the number of tasks 112 that a given task 112 is currently waiting on before the given task is ready to be executed. Continuing with the example above, tasks B-D are each assigned a count of 1 because each is waiting on a single task (task A) to complete. In some implementations, the waiter counts 530 may be implemented using an integer array.
In various embodiments, the shared view pool 540 is a data structure used to temporarily store the inputs required by tasks 112 until those tasks can be executed by a thread 510. In some embodiments, threads 510 in a thread pool 560 may access the pool 540 in order to retrieve needed data. When tasks 112 complete, their outputs may be placed into the pool 540 so that they can be used by subsequent tasks 112 or returned as results to an application 110. For example, as shown, the output of task A is available for access; the outputs of tasks B-D, however, are not yet available because these tasks have not completed. In some embodiments, the shared view pool 540 provides a central location for storing the data needed by tasks 112 and reduces the number of outstanding copies of that data in memory.
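One possible shape for these three bookkeeping structures, and for the update performed when a task completes, is sketched below; an actual implementation might instead use the bitmaps and integer arrays noted above. All names are hypothetical.

```python
# Dependency map (compare 520), waiter counts (compare 530), and a shared
# pool of produced outputs (compare 540).
dependents = {"A": ["B", "C", "D"]}               # B-D consume A's output
waiter_counts = {"A": 0, "B": 1, "C": 1, "D": 1}  # unmet inputs per task
shared_pool = {}                                  # task_id -> output

def complete(task_id, output, ready):
    """Record a finished task and collect newly unblocked tasks."""
    shared_pool[task_id] = output
    for dep in dependents.get(task_id, []):
        waiter_counts[dep] -= 1
        if waiter_counts[dep] == 0:
            ready.append(dep)

ready = []
complete("A", "frame features", ready)
print(ready)        # ['B', 'C', 'D']: each was waiting only on task A
```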
In various embodiments, ready queue 550 is a queue that identifies what tasks 112 are ready to be performed. In some implementations, the queue 550 includes various information needed to perform the task 112, such as program instructions to be executed by the thread 510, execution priority assignment 460, location of input data, and the like. In some implementations, a given queue 550 may be associated with a particular execution priority based on the task 112 being queued in the queue 550.
In various embodiments, a thread pool 560 is a collection of threads 510 dispatched by the kernel 130 to continuously execute tasks 112 for the executor 233. In the illustrated embodiment, the executor 233 may initially issue prioritized scheduling instructions 148 to the kernel 130 to cause dispatch of the threads 510 in the pools 560. In the example depicted in FIG. 5B, the executor 233 specifically requests a first pool 560A of threads 510 with real-time execution priority and a second pool 560B of threads 510 with best-effort execution priority. In such an example, tasks 112 with more time-sensitive constraints may be assigned to the real-time thread pool 560A, while tasks 112 with less time-sensitive constraints may be assigned to the best-effort pool 560B.
In the illustrated embodiment, dynamic scheduling 500B may begin with the executor 233 processing dynamic scheduling information 414 from a received schedule 144. This processing may include, for example, initializing entries in the dependency array 520, the waiter counts 530, and the shared view pool 540. The executor 233 may also issue instructions 148 for the thread pools 560 and enqueue any ready tasks 112 in the ready queues 550. Once dispatched, threads 510 may begin dequeuing and executing tasks 112 and loading the results into the shared pool 540. As tasks 112 execute, the executor 233 may update the waiter counts 530 and place newly ready tasks 112 into the queues 550. Dynamic scheduling 500B may continue as additional dynamic scheduling information 414 is received from the graph analyzer 232 when the schedule 144 is updated or a new schedule 144 is created.
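Tying these pieces together, the following minimal sketch models priority-specific ready queues drained by dedicated thread pools; it is an illustration under assumed names, not the disclosed executor.

```python
# Ready queues (compare 550) drained by a real-time pool (compare 560A) and
# a best-effort pool (compare 560B), one worker thread each for brevity.
import queue
import threading

realtime_q = queue.Queue()
best_effort_q = queue.Queue()

def enqueue_ready(task, time_sensitive):
    (realtime_q if time_sensitive else best_effort_q).put(task)

def pool_worker(q):
    while True:
        run = q.get()
        run()               # execute; a full system would then update
        q.task_done()       # waiter counts and enqueue newly ready tasks

threading.Thread(target=pool_worker, args=(realtime_q,), daemon=True).start()
threading.Thread(target=pool_worker, args=(best_effort_q,), daemon=True).start()

enqueue_ready(lambda: print("gaze update"), time_sensitive=True)
enqueue_ready(lambda: print("log compaction"), time_sensitive=False)
realtime_q.join()
best_effort_q.join()
```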
Turning now to fig. 6A, a flow chart of a method 600 is shown. Method 600 is one embodiment of a method that may be performed by an application layer scheduler (such as scheduler 140) that facilitates scheduling of tasks for a process. In many cases, performing method 600 may allow for more intelligent scheduling of tasks.
In step 605, a first scheduler (e.g., scheduler 140) executing at an application layer (e.g., application layer 102) of a computing device receives a computational graph (e.g., computational graph 142) defining interrelationships of a set of tasks (e.g., tasks 112). In some implementations, the set of tasks may be performed by the computing device to provide an augmented reality (XR) experience to a user. In some embodiments, the first scheduler receives timing constraint information (e.g., time constraints 432) identifying one or more timing constraints of one or more tasks in the set of tasks from a process (e.g., application 110, system framework 210, reality algorithm 220, etc.) requesting execution of the set of tasks. In such embodiments, the first scheduler determines whether a schedule that satisfies the one or more timing constraints can be determined and notifies the process (e.g., via feedback notification 472) in response to determining that at least one of the timing constraints cannot be satisfied. In some embodiments, the computational graph is received from the process, and in response to the notification, the first scheduler receives an updated computational graph (e.g., computational graph 142B) that specifies an updated set of tasks. In some embodiments, the computational graph includes a graph node (e.g., a graph node 420) that, for a first task of the set of tasks: 1) designates a second task of the set of tasks as providing an input to be used in executing the first task, and 2) designates a third task of the set of tasks as receiving an output produced by executing the first task.
In step 610, the first scheduler determines a schedule (e.g., schedule 144) for implementing the set of tasks based on the correlations defined in the computational graph. In various implementations, the first scheduler receives health information (e.g., system health information 146) aggregated from one or more sensors in the computing device and indicating a current health of the computing device. In such an embodiment, the first scheduler determines a schedule based on the health information, and in response to the health information indicating a change in a current health of the computing device, the first scheduler modifies the schedule based on the health information. In some implementations, the health information includes thermal information (e.g., thermal statistics 308) indicating one or more temperatures measured relative to the computing device and power consumption information (e.g., power statistics 306) indicating the power being consumed by the computing device.
In step 615, the first scheduler issues one or more instructions (e.g., instructions 148) for causing a second scheduler (e.g., scheduler 132) executing at a kernel layer (e.g., kernel layer 104) of the computing device to schedule execution of the set of tasks according to the determined schedule. In various embodiments, a first scheduler issues one or more instructions (e.g., instructions 148) to a kernel of a computing device requesting one or more threads (e.g., threads 510) having a particular execution priority according to a determined schedule, and tasks of the set of tasks are provided to the one or more threads according to the determined schedule, execution of the one or more threads being scheduled by a second scheduler. In some embodiments, a process requesting to perform the set of tasks cannot request a thread with a particular execution priority. In various embodiments, the first scheduler tracks execution of the set of tasks (e.g., using elements 520-550) and, when it is determined that the tasks are ready for execution based on the tracking, enqueues the tasks in the set of tasks in a ready queue (e.g., ready queue 550). In such embodiments, the providing comprises: the one or more threads dequeue tasks from the ready queue to perform dequeued tasks. In various embodiments, the second scheduler is one of a plurality of schedulers (e.g., schedulers 132A-C) executing at the kernel layer and associated with the plurality of resources, and the first scheduler issues instructions for causing the scheduler of the plurality of schedulers to schedule execution of the task by the plurality of resources according to the determined schedule. In some embodiments, the plurality of resources includes a Central Processing Unit (CPU) and a Graphics Processing Unit (GPU).
In some embodiments, the method 600 further comprises: the kernel receiving, from a process, a request for a thread to execute a task (e.g., a task 114) in a manner that is independent of using the first scheduler and dispatching the requested thread at a lower execution priority than the particular execution priority.
Turning now to fig. 6B, a flow chart of a method 620 is depicted. Method 620 is one embodiment of a method that may be performed by an executing process coupled to an application layer scheduler, such as application 110 or reality algorithm 220. In many cases, execution method 620 may allow tasks of an application to be more efficiently scheduled and executed.
In step 625, the process provides a set of tasks (e.g., tasks 112) to a first scheduler (e.g., scheduler 140) executing at an application layer (e.g., application layer 102) of the computing device, including identifying interrelationships of the set of tasks to the first scheduler. In various embodiments, the interrelationships can be used by the first scheduler to determine a schedule (e.g., schedule 144) for implementing the set of tasks. In some embodiments, the set of tasks is performed to generate augmented reality (XR) content. In some embodiments, the process provides nodes (e.g., nodes 420) of the computational graph (e.g., computational graph 142) analyzed by the first scheduler, the nodes including a first node that, for a first task of the set of tasks, identifies a second task of the set of tasks as providing an input for the first task and identifies a third task of the set of tasks as receiving an output from the first task. In some embodiments, the process provides one or more time constraints (e.g., time constraints 432) for performing the set of tasks, which can be used by the first scheduler to determine the schedule.
In step 630, the process receives results from executing the set of tasks according to the schedule, the results generated by threads (e.g., threads 510) scheduled by a second scheduler (e.g., kernel scheduler 132) executing at a kernel layer (e.g., kernel layer 104) of the computing device. In various embodiments, the process receives a notification (e.g., feedback notification 472) from the first scheduler indicating that at least one of the time constraints cannot be complied with based on the current health of the computing device. In response to the notification, the process determines to alter the set of tasks and provides the altered set of tasks (e.g., as modified computational graph 142B) to the first scheduler. In some embodiments, this determination comprises coordinating with one or more other processes (e.g., other applications 110, algorithms 220, etc.) that provide tasks to the first scheduler for execution.
Turning now to fig. 6C, a flow chart of a method 640 is depicted. Method 640 is one embodiment of a method that may be performed by a kernel (such as kernel 130) coupled to an application layer scheduler. In many cases, performing method 640 may allow for more efficient scheduling of tasks.
In step 645, the kernel receives one or more instructions (e.g., instructions 148) from a first scheduler (e.g., scheduler 140) executing at an application layer (e.g., application layer 102) of the computing device to facilitate scheduling execution of a set of tasks (e.g., tasks 112) according to a schedule (e.g., schedule 144). In various embodiments, the first scheduler determines the schedule by analyzing a computational graph (e.g., computational graph 142) defining the interrelationships of the set of tasks. In some embodiments, the set of tasks is performed to provide a visual environment to the user. In some embodiments, the kernel provides health information (e.g., system health information 146) identifying a current health of the computing device, and the first scheduler determines a schedule based on the health information. In some embodiments, the kernel provides system information (e.g., system modeling 404) about one or more hardware resources of the computing device available to perform the task to the first scheduler, and the first scheduler determines a schedule based on the system information.
In step 650, a second scheduler (e.g., kernel scheduler 132) executing at the kernel layer (e.g., kernel layer 104) schedules execution of the set of tasks according to the determined schedule. In some embodiments, the one or more instructions include instructions requesting one or more threads (e.g., thread 510) at a particular execution priority to facilitate execution of tasks in the set of tasks, and the method 640 includes the kernel dispatching the one or more threads to the application layer to execute tasks in the set of tasks. In various embodiments, the scheduling includes the second scheduler scheduling execution of the dispatched one or more threads. In some embodiments, the kernel does not allow a process that provides the set of tasks to the first scheduler to request use of a particular execution priority.
Turning now to fig. 6D, a flow chart of a method 660 is depicted. Method 660 is one embodiment of a method that may be performed by a global scheduler (such as scheduler 140) that facilitates scheduling of tasks for a process. In many cases, execution method 660 may allow for more intelligent scheduling of tasks.
In step 665, the global scheduler of the computing device receives a set of related tasks (e.g., tasks 112) to be performed using heterogeneous resources (e.g., resources 120) of the computing device to provide an augmented reality (XR) experience to a user. In some embodiments, the global scheduler receives a computational graph (e.g., graph 142) defining the interrelationships of the set of related tasks.
In step 670, the global scheduler determines a schedule (e.g., schedule 144) for implementing the set of tasks based on the interrelationships between the tasks in the set of tasks. In some implementations, the schedule is generated based on the received computational graph identifying a first task as receiving an output of a second task. In some embodiments, step 670 includes tracking execution of the set of tasks and queuing tasks in the set of tasks into a ready queue (e.g., ready queue 550) when the tasks are determined to be ready for execution based on the tracking. In some embodiments, the global scheduler receives health information (e.g., health telemetry 322) indicating a current health of the computing device, and in response to the health information indicating a change in the current health of the computing device, the global scheduler modifies the schedule based on the health information. In some embodiments, the global scheduler determines whether a schedule satisfying one or more timing constraints can be determined and requests (e.g., via feedback notification 472) that a process providing one or more tasks of the set of tasks provide a different one or more tasks.
In step 675, the global scheduler issues instructions (e.g., instructions 148) to a set of resource-specific schedulers (e.g., schedulers 132) for scheduling execution of the set of tasks according to the determined schedule. In such embodiments, a resource-specific scheduler in the set is executable to schedule tasks for a corresponding one of the heterogeneous resources. In some implementations, one scheduler of the set of resource-specific schedulers includes a scheduler for a Graphics Processing Unit (GPU) (e.g., scheduler 132C). In some embodiments, one scheduler of the set of resource-specific schedulers includes a scheduler (e.g., scheduler 132B) for a neural engine configured to implement one or more neural networks. In various embodiments, the set of resource-specific schedulers assigns execution priorities to the set of tasks, and the execution priorities are available only for tasks scheduled by the global scheduler.
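As a minimal sketch of this fan-out, using invented class and method names, a global scheduler might route each scheduled task to a per-resource scheduler as follows.

```python
# Hypothetical fan-out from a global scheduler to resource-specific
# schedulers for a CPU, a GPU, and a neural engine.
class ResourceScheduler:
    def __init__(self, name):
        self.name = name
        self.queue = []          # stands in for kernel-layer scheduling

    def submit(self, task_id, priority):
        self.queue.append((priority, task_id))

resource_schedulers = {
    "cpu": ResourceScheduler("cpu"),
    "gpu": ResourceScheduler("gpu"),
    "ane": ResourceScheduler("ane"),   # neural engine
}

def issue_instructions(schedule):
    """Route each task in the determined schedule to its resource."""
    for task_id, (resource, priority) in schedule.items():
        resource_schedulers[resource].submit(task_id, priority)

issue_instructions({
    "render": ("gpu", "real-time"),
    "classify": ("ane", "real-time"),
    "telemetry": ("cpu", "best-effort"),
})
print(resource_schedulers["gpu"].queue)   # [('real-time', 'render')]
```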
Turning now to fig. 6E, a flow chart of a method 680 is depicted. Method 680 is one embodiment of a method that may be performed by a scheduler (such as scheduler 140) using health telemetry data. In many cases, performing method 680 may allow tasks to be scheduled more intelligently.
In step 682, the scheduler receives a set of related tasks (e.g., task 112) to be performed by the computing device to provide an augmented reality (XR) experience to the user. In some embodiments, the scheduler receives current health telemetry data that includes one or more power consumption statistics (e.g., power statistics 306) that indicate a current power consumption of the computing device. In some embodiments, the scheduler receives current health telemetry data that includes one or more thermal statistics (e.g., thermal statistics 308) that indicate one or more temperatures measured by one or more sensors of the computing device. In some embodiments, the scheduler receives current health telemetry data that includes one or more performance statistics (e.g., performance statistics 304) that indicate a current utilization of resources for performing one or more tasks in the set of tasks.
In step 684, the scheduler generates a schedule (e.g., schedule 144) for implementing the set of tasks based on current health telemetry data (e.g., health telemetry 322) indicating the ability of the computing device to execute the set of tasks. In some embodiments, the scheduler receives a computational graph (e.g., graph 142) defining the interrelationships of the set of related tasks and generates the schedule based on the received computational graph.
In step 686, after generating the schedule, the scheduler determines that the schedule can no longer be implemented based on the change in the current health telemetry data. In some embodiments, the determining is based on the one or more power consumption statistics. In some embodiments, the determining is based on the one or more thermal statistics. In some embodiments, the determining is based on the one or more performance statistics. In some embodiments, the scheduler determines respective availability of a plurality of resources for performing the set of tasks based on the current health telemetry data, and determines that the schedule can no longer be implemented based on the determined availability of one or more of the resources. In some embodiments, the scheduler determines availability of the computing device to provide power to perform the set of tasks based on the current health telemetry data, and determines that the schedule can no longer be implemented based on the determined availability. In some embodiments, the computational graph defines one or more timing constraints for the set of related tasks, and the determination is based on the defined one or more timing constraints.
In response to the determination, the scheduler sends a request (e.g., feedback notification 472) for a modified set of tasks to account for the change in current health telemetry data in step 688. In various embodiments, in response to sending the request, the scheduler receives a modified portion of the computational graph from a process requesting execution of one or more tasks of the set of tasks. In some implementations, the scheduler determines the modified schedule based on the modified portion of the computational graph.
Turning now to FIG. 7, a block diagram of components within computing device 10 is depicted. In the illustrated embodiment, the computing device 10 is depicted as a head-mounted display (HMD) 700 configured to be worn on the head and to display content to a user, such as an XR view 702 of an XR environment. For example, HMD 700 may be a headset, helmet, goggles, glasses, a phone inserted into a housing, or the like worn by a user. However, as described above, computing device 10 may correspond in other embodiments to other devices that may not present XR content. In the illustrated embodiment, HMD 700 includes world sensors 704, user sensors 706, display system 710, controller 720, memory 730, secure element 740, and network interface 750. In some embodiments, HMD 700 may be implemented differently than shown. For example, HMD 700 may include multiple network interfaces 750, HMD 700 may not include a secure element 740, etc.
In various embodiments, the world sensor 704 is a sensor configured to collect various information about the environment in which the user wears the HMD 700. In some implementations, the world sensor 704 can include one or more visible light cameras that capture video information of the user environment. This information may also be used, for example, to provide an XR view 702 of the real environment, detect objects and surfaces in the environment, provide depth information for objects and surfaces in the real environment, provide position (e.g., position and orientation) and motion (e.g., direction and speed) information for a user in the real environment, and so forth. In some implementations, the HMD 700 may include left and right cameras located on a front surface of the HMD 700 at positions substantially in front of each of the user's eyes. In other implementations, more or fewer cameras may be used in HMD 700 and may be positioned at other locations.
In some implementations, the world sensors 704 may include one or more ranging sensors (e.g., infrared (IR) sensors with an IR illumination source, or light detection and ranging (LIDAR) emitters and receivers/detectors) that, for example, capture depth information or range information for objects and surfaces in the user's environment. The range information may be used, for example, in conjunction with frames captured by a camera to detect and recognize objects and surfaces in the real-world environment and to determine the positions, distances, and velocities of the objects and surfaces relative to the user's current position and motion. The range information may also be used to position virtual representations of real-world objects to be composited into the XR environment at correct depths. In some embodiments, the range information may be used to detect the possibility of collisions with real-world objects and surfaces so as to redirect a user's walking. In some implementations, the world sensors 704 may include one or more light sensors (e.g., on the front and top of HMD 700) that capture lighting information (e.g., direction, color, and intensity) in the user's physical environment. This information may be used, for example, to alter the brightness and/or color of the display system in HMD 700.
In various embodiments, the user sensor 706 is a sensor configured to collect various information about a user wearing the HMD 700. In some implementations, the user sensors 706 can include one or more head pose sensors (e.g., IR or RGB cameras) that can capture information regarding the position and/or motion of the user and/or the user's head. The information collected by the head pose sensor may be used, for example, to determine how to render and display view 702 and content within the view of the XR environment. For example, the different views 702 of the environment may be rendered based at least in part on the position of the user's head, whether the user is currently traversing the environment, and so on. As another example, the enhanced location and/or motion information may be used to compose virtual content into a scene in a fixed location relative to a background view of the environment. In some implementations, there may be two head pose sensors located on the front or top surface of HMD 700; however, in other embodiments, more (or fewer) head pose sensors may be used and may be positioned at other locations.
In some implementations, the user sensors 706 can include one or more eye tracking sensors (e.g., an IR camera with an IR illumination source) that can be used to track the position and movement of the user's eyes. In some implementations, the information collected by the eye tracking sensor may be used to adjust the rendering of the image to be displayed and/or may be used to adjust the display of the image by the display system of the HMD 700 based on the direction and angle of the user's eye gaze. In some embodiments, the information collected by the eye tracking sensor may be used to match the direction of the eyes of the user's avatar with the direction of the user's eyes. In some implementations, the brightness of the displayed image can be adjusted based on the user's pupil dilation determined by the eye tracking sensor. In some implementations, the user sensors 706 can include one or more eyebrow sensors (e.g., an IR camera with IR illumination) that track the expression of the user's eyebrows/forehead. In some implementations, the user sensors 706 can include one or more mandibular tracking sensors (e.g., an IR camera with IR illumination) that track the user's mouth/jaw expression. For example, in some embodiments, the expressions of the eyebrows, mouth, jaw, and eyes captured by the sensor 706 may be used to simulate expressions on the user's avatar in a co-presence experience and/or may be used to selectively render and synthesize virtual content for viewing based at least in part on the user's reaction to content displayed by the HMD 700.
In some implementations, the user sensors 706 can include one or more hand sensors (e.g., an IR camera with IR illumination) that track the position, movement, and pose of the user's hands, fingers, and/or arms. For example, in some embodiments, the detected position, movement, and gesture of the user's hands, fingers, and/or arms may be used to simulate the movement of the user's avatar's hands, fingers, and/or arms in a co-presence experience. As another example, the detected hand and finger gestures of the user may be used to determine user interactions with virtual content in the virtual space, including, but not limited to, manipulating gestures of virtual objects, interacting with virtual user interface elements displayed in the virtual space, and the like.
In some embodiments, the system framework 210 and reality algorithms 220 may enable applications to make use of the world sensors 704 and/or the user sensors 706.
In various embodiments, the display system 710 is configured to display the rendered frames to a user. The display 710 may implement any of various types of display technologies. For example, as discussed above, the display system 710 may include a near-eye display that presents left and right images to create the effect of the three-dimensional view 702. In some embodiments, the near-eye display may use Digital Light Processing (DLP), a Liquid Crystal Display (LCD), liquid crystal on silicon (LCoS), or Light Emitting Diodes (LEDs). As another example, the display system 710 may include a direct retinal projector that directly scans frames including left and right images, pixel by pixel, to the user's eye via a reflective surface (e.g., a mirror lens). To create a three-dimensional effect in view 702, objects at different depths or distances in the two images are shifted to the left or right as a function of triangulation of the distances, with closer objects being shifted much more than more distant objects. The display system 710 may support any medium, such as an optical waveguide, a holographic medium, an optical combiner, an optical reflector, or any combination thereof. In some embodiments, the display system 710 may be transparent or translucent and configured to selectively become opaque.
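The left/right image shift described above follows the standard stereo disparity relation, disparity = baseline × focal_length / depth, so nearer objects shift more. This relation is general optics rather than anything specific to this disclosure, and the numbers below are purely illustrative.

```python
# Illustrative stereo disparity: horizontal pixel shift between the left
# and right eye images for an object at a given depth. The baseline and
# focal length values are assumptions chosen for the example.
def disparity_px(depth_m, baseline_m=0.063, focal_px=1400.0):
    return baseline_m * focal_px / depth_m

print(round(disparity_px(0.5)))   # near object: ~176 px shift
print(round(disparity_px(5.0)))   # far object: ~18 px shift
```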
In various implementations, the controller 720 includes circuitry configured to facilitate operation of the HMD 700. Accordingly, controller 720 may include one or more processors configured to execute program instructions (such as program instructions of application 110, kernel 130, intelligent scheduler 140, etc.) to cause HMD 700 to perform the various operations described herein. The processors may be CPUs configured to implement any suitable instruction set architecture and may be configured to execute instructions defined in that instruction set architecture. For example, in various embodiments, controller 720 may include general-purpose or embedded processors implementing any of a variety of instruction set architectures (ISAs), such as the ARM, x86, PowerPC, SPARC, RISC, or MIPS ISA, or any other suitable ISA. In a multiprocessor system, the processors may commonly, but not necessarily, implement the same ISA. Controller 720 may employ any microarchitecture, including scalar, superscalar, pipelined, superpipelined, out-of-order, in-order, speculative, non-speculative, etc., or combinations thereof. Controller 720 may include circuitry to implement microcode techniques. Controller 720 may include one or more levels of cache, which may take any size and any configuration (set associative, direct mapped, etc.).
In some implementations, controller 720 may include a GPU, which may include any suitable graphics processing circuitry. In general, a GPU may be configured to render objects to be displayed into a frame buffer (e.g., a frame buffer that includes pixel data for an entire frame). The GPU may include one or more graphics processors that may execute graphics software to perform some or all of the graphics operations or hardware acceleration of certain graphics operations. In some implementations, the controller 720 may include one or more other components for processing and rendering video and/or images, such as an Image Signal Processor (ISP), encoder/decoder (codec), and the like. In some embodiments, controller 720 may be implemented as a system on a chip (SOC).
In various embodiments, memory 730 is a non-transitory computer-readable medium configured to store data and program instructions (such as those of application 110, kernel 130, intelligent scheduler 140, etc.) executed by processors in controller 720. Memory 730 may include any type of volatile memory, such as dynamic random access memory (DRAM), synchronous DRAM (SDRAM), double data rate (DDR, DDR2, DDR3, etc.) SDRAM (including mobile versions of SDRAM such as mDDR3, etc., or low-power versions of SDRAM such as LPDDR2, etc.), RAMBUS DRAM (RDRAM), static RAM (SRAM), etc. Memory 730 may also be any type of non-volatile memory, such as NAND flash memory, NOR flash memory, nano RAM (NRAM), magnetoresistive RAM (MRAM), phase-change RAM (PRAM), racetrack memory, memristor memory, etc. In some implementations, one or more memory devices may be coupled to a circuit board to form memory modules such as single inline memory modules (SIMMs), dual inline memory modules (DIMMs), etc. Alternatively, the devices may be mounted with an integrated circuit implementing the system in a chip-on-chip configuration, a package-on-package configuration, or a multi-chip module configuration.
In various embodiments, the Secure Element (SE) 740 is a secure circuit configured to perform various secure operations on the HMD 700. As used herein, the term "secure circuitry" refers to circuitry that protects isolated internal resources from direct access by external circuitry, such as controller 720. The internal resource may be a memory that stores sensitive data such as personal information (e.g., biometric information, credit card information, etc.), encryption keys, a random number generator seed, etc. The internal resource may also be circuitry that performs services/operations associated with sensitive data, such as encryption, decryption, generation of digital signatures, etc. For example, SE 740 may maintain one or more cryptographic keys for encrypting data stored in memory 730 in order to improve security of HMD 700. As another example, the secure element 740 may also maintain one or more cryptographic keys to establish a secure connection, authenticate the HMD 700 or a user of the HMD 700, and so forth. For another example, SE 740 may maintain biometric data of a user and be configured to perform biometric authentication by comparing the maintained biometric data with biometric data collected by one or more user sensors 706. As used herein, "biometric data" refers to data that uniquely identifies a user (at least to a high degree of accuracy) among others based on the physical or behavioral characteristics of the user, such as fingerprint data, voice recognition data, facial data, iris scan data, and the like.
In various embodiments, network interface 750 includes one or more interfaces configured to communicate with external entities. The network interface 750 may support any suitable wireless technology, such as Long-Term Evolution (LTE), etc., or any suitable wired technology, such as Ethernet, Fibre Channel, Universal Serial Bus (USB), etc. In some implementations, interface 750 may implement a proprietary wireless communication technology (e.g., a 90 gigahertz (GHz) wireless technology) that provides a highly directional wireless connection. In some implementations, HMD 700 may select between different available network interfaces based on the connectivity of the interfaces as well as the particular user experience being delivered by HMD 700. For example, if a particular user experience requires a large amount of bandwidth, HMD 700 may select a radio supporting the proprietary wireless technology when communicating wirelessly to stream higher quality content. If, however, the user is merely streaming a lower quality movie, a lower bandwidth interface may be sufficient and may be selected by HMD 700. In some implementations, HMD 700 may use compression to communicate in situations where bandwidth is limited.
***
Although specific embodiments have been described above, these embodiments are not intended to limit the scope of the disclosure, even where only a single embodiment is described with respect to a particular feature. The characteristic examples provided in this disclosure are intended to be illustrative, not limiting, unless stated differently. The above description is intended to cover such alternatives, modifications, and equivalents as will be apparent to those skilled in the art having the benefit of this disclosure.
The scope of the present disclosure includes any feature or combination of features disclosed herein (either explicitly or implicitly) or any generalisation thereof, whether or not it mitigates any or all of the problems addressed herein. Accordingly, new claims may be formulated to any such combination of features during prosecution of the present patent application (or of a patent application claiming priority thereto). In particular, with reference to the appended claims, features from dependent claims may be combined with those of the independent claims and features from respective independent claims may be combined in any appropriate manner and not merely in the specific combinations enumerated in the appended claims.

Claims (39)

1. A non-transitory computer readable medium having stored therein program instructions executable by a computing device to perform operations comprising:
Receiving, at a first scheduler executing at an application layer of the computing device, a computational graph defining interrelationships of a set of tasks to be performed by the computing device to provide an augmented reality (XR) experience to a user;
Determining, by the first scheduler, a schedule for implementing the set of tasks based on the correlations defined in the computational graph; and
One or more instructions are issued by the first scheduler for causing a second scheduler executing at a kernel layer of the computing device to schedule execution of the set of tasks according to the determined schedule.
2. The computer-readable medium of claim 1, wherein the operations further comprise:
Receiving, by the first scheduler, health information aggregated from one or more sensors in the computing device and indicative of a current health of the computing device;
determining, by the first scheduler, the schedule based on the health information; and
In response to the health information indicating the current health change of the computing device, the first scheduler modifies the schedule based on the health information.
3. The computer-readable medium of claim 2, wherein the health information includes thermal information indicative of one or more temperatures measured relative to the computing device; and
Wherein the health information includes power consumption information indicating power being consumed by the computing device.
4. The computer-readable medium of claim 1, wherein the operations further comprise:
Receiving, by the first scheduler, timing constraint information identifying one or more timing constraints for one or more tasks in the set of tasks from a process requesting execution of the set of tasks; and
Wherein determining the schedule comprises:
determining, by the first scheduler, whether a schedule satisfying the one or more timing constraints can be determined; and
The process is notified by the first scheduler in response to determining that at least one of the timing constraints cannot be met.
5. The computer-readable medium of claim 4, wherein the computational graph is received from the process; and
Wherein the determining comprises:
In response to the notification, the first scheduler receives an updated computational graph specifying an updated set of tasks; and
The schedule is determined by the first scheduler based on correlations defined in the updated computational graph.
6. The computer-readable medium of claim 1, wherein the issuing comprises:
Issuing, by the first scheduler, one or more instructions to a kernel of the computing device for requesting one or more threads having a particular execution priority according to the determined schedule, wherein a process requesting execution of the set of tasks cannot request threads having the particular execution priority; and
Providing, by the first scheduler, tasks of the set of tasks to the one or more threads according to the determined schedule, wherein execution of the one or more threads is scheduled by the second scheduler.
7. The computer-readable medium of claim 6, wherein the issuing comprises:
tracking, by the first scheduler, execution of the set of tasks; and
When it is determined that the task is ready to execute based on the tracking, queuing, by the first scheduler, tasks of the set of tasks in a ready queue; and
Wherein the providing comprises: the one or more threads dequeue tasks from the ready queue to perform the dequeued tasks.
8. The computer-readable medium of claim 6, wherein the operations further comprise:
Receiving, by the kernel from the process, a request for a thread to perform a task in a manner independent of using the first scheduler; and
The requested thread is dispatched by the kernel at a lower execution priority than the particular execution priority.
9. The computer-readable medium of claim 1, wherein the computational graph comprises a graph node that, for a first task of the set of tasks: 1) designates a second task of the set of tasks to provide an input to be used in executing the first task, and 2) designates a third task of the set of tasks to receive an output resulting from executing the first task.
10. The computer-readable medium of claim 1, wherein the second scheduler is one of a plurality of schedulers executing at a kernel layer and associated with a plurality of resources, wherein the plurality of resources comprise a Central Processing Unit (CPU) and a Graphics Processing Unit (GPU), and wherein the operations further comprise:
Instructions for causing a scheduler of the plurality of schedulers to schedule execution of tasks by the plurality of resources according to the determined schedule are issued by the first scheduler.
11. A non-transitory computer readable medium having stored therein program instructions executable by a computing device to perform operations comprising:
Providing a set of tasks to a first scheduler executing at an application layer of the computing device to generate augmented reality (XR) content, wherein the providing comprises:
Identifying, to the first scheduler, a correlation of the set of tasks, wherein the correlation can be used by the first scheduler to determine a schedule for implementing the set of tasks; and
Results from executing the set of tasks according to the schedule are received, wherein the results are generated by threads scheduled by a second scheduler executing at a kernel layer of the computing device.
12. The computer-readable medium of claim 11, wherein the providing comprises:
Providing nodes of a computational graph analyzed by the first scheduler, wherein the nodes include a first node that, for a first task of the set of tasks: identifies a second task of the set of tasks as providing an input for the first task, and identifies a third task of the set of tasks as receiving an output from the first task.
13. The computer-readable medium of claim 11, wherein the providing comprises:
One or more time constraints for performing the set of tasks are provided, wherein the time constraints are usable by the first scheduler to determine the schedule.
14. The computer-readable medium of claim 13, wherein the operations further comprise:
receive, from the first scheduler, a notification indicating that at least one of the time constraints cannot be complied with based on a current health of the computing device;
in response to the notification, determining to alter the set of tasks; and
The modified set of tasks is provided to the first scheduler.
15. The computer-readable medium of claim 14, wherein the determining comprises: in coordination with one or more other processes providing tasks to the first scheduler for execution.
16. A method, comprising:
Receiving, at a kernel of a computing device, one or more instructions from a first scheduler executing at an application layer of the computing device for facilitating scheduling execution of a set of tasks according to a schedule to provide a visual environment to a user, wherein the first scheduler determines the schedule by analyzing a computational graph defining interrelationships of the set of tasks; and
The execution of the set of tasks is scheduled by a second scheduler executing at a kernel layer of the computing device according to the determined schedule.
17. The method of claim 16, wherein the one or more instructions comprise instructions requesting one or more threads at a particular execution priority to facilitate execution of tasks in the set of tasks, and wherein the method further comprises:
dispatching, by the kernel, the requested one or more threads to the application layer to perform tasks of the set of tasks, wherein the scheduling comprises: the second scheduler schedules execution of the assigned one or more threads.
18. The method of claim 17, wherein the kernel does not allow a process providing the set of tasks to the first scheduler to request use of the particular execution priority.
19. The method of claim 16, further comprising:
Health information identifying a current health of the computing device is provided by the kernel, wherein the first scheduler determines the schedule based on the health information.
20. The method of claim 16, further comprising:
System information regarding one or more hardware resources of the computing device available to perform tasks is provided by the kernel to the first scheduler, wherein the first scheduler determines the schedule based on the system information.
21. A non-transitory computer readable medium having stored therein program instructions executable by a computing device to perform operations comprising:
Receiving, at a global scheduler of the computing device, a set of related tasks to be performed using heterogeneous resources of the computing device to provide an augmented reality (XR) experience to a user;
Determining, by the global scheduler, a schedule for implementing the set of tasks based on correlations between tasks in the set of tasks; and
Issuing, by the global scheduler, instructions to a set of resource-specific schedulers for scheduling execution of the set of tasks according to the determined schedule, wherein a resource-specific scheduler in the set is capable of being executed to schedule tasks for corresponding ones of the heterogeneous resources.
22. The computer-readable medium of claim 21, wherein one scheduler of the set of resource-specific schedulers comprises a scheduler for a Graphics Processing Unit (GPU).
23. The computer-readable medium of claim 21, wherein one scheduler of the set of resource-specific schedulers comprises a scheduler for a neural engine configured to implement one or more neural networks.
24. The computer-readable medium of claim 21, wherein the operations further comprise:
receiving a computational graph defining the interrelationships of the set of related tasks; and
Wherein the schedule is generated based on the received computational graph identifying a first task as receiving an output of a second task.
25. The computer-readable medium of claim 21, wherein determining the schedule comprises:
tracking, by the global scheduler, execution of the set of tasks; and
queuing, by the global scheduler, a task of the set of tasks into a ready queue in response to determining, based on the tracking, that the task is ready for execution.
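
Claims 24 and 25 together describe dependency-driven dispatch: the schedule comes from a computational graph in which one task consumes another's output, and a task enters the ready queue only once every task it depends on has completed. The following Swift sketch illustrates that tracking loop; it is not part of the claims, and all names are hypothetical.

struct TaskNode {
    let id: String
    let dependsOn: Set<String>   // producers of this task's inputs, per the graph
}

final class GlobalScheduler {
    private var completed = Set<String>()
    private var pending: [TaskNode]
    private(set) var readyQueue: [String] = []

    init(graph: [TaskNode]) {
        pending = graph
        promoteReadyTasks()      // tasks with no dependencies are ready at once
    }

    /// Called as each task finishes; newly unblocked tasks move to the ready queue.
    func markCompleted(_ id: String) {
        completed.insert(id)
        promoteReadyTasks()
    }

    private func promoteReadyTasks() {
        let ready = pending.filter { $0.dependsOn.isSubset(of: completed) }
        pending.removeAll { $0.dependsOn.isSubset(of: completed) }
        readyQueue.append(contentsOf: ready.map(\.id))
    }
}

let graph = [TaskNode(id: "trackHands", dependsOn: []),
             TaskNode(id: "renderFrame", dependsOn: ["trackHands"])]
let global = GlobalScheduler(graph: graph)   // readyQueue == ["trackHands"]
global.markCompleted("trackHands")           // readyQueue gains "renderFrame"
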
26. The computer-readable medium of claim 21, wherein the operations further comprise:
receiving, by the global scheduler, health information indicating a current health of the computing device; and
in response to the health information indicating a change in the current health of the computing device, modifying, by the global scheduler, the schedule based on the health information.
27. The computer-readable medium of claim 26, wherein the operations further comprise:
determining, by the global scheduler, that a schedule satisfying the one or more timing constraints cannot be determined; and
requesting, by the global scheduler, that a process providing one or more tasks of the set of tasks provide a different one or more tasks.
28. The computer-readable medium of claim 21, wherein the operations further comprise:
assigning, by the set of resource-specific schedulers, an execution priority to the set of tasks, wherein the execution priority is available only for tasks scheduled by the global scheduler.
29. The computer-readable medium of claim 21, wherein the set of resource-specific schedulers comprises one or more schedulers executing at a kernel layer; and
wherein the global scheduler executes at an application layer.
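
Claims 21 through 23 and 29 sketch a fan-out architecture: an application-layer global scheduler determines the overall schedule and then issues per-resource instructions to resource-specific schedulers, such as a GPU scheduler or a neural-engine scheduler. The Swift fragment below models that routing step only; it is not part of the claims, and the resource names, deadlines, and protocol are hypothetical.

enum Resource: Hashable { case cpu, gpu, neuralEngine }

protocol ResourceScheduler {
    func schedule(taskID: String, deadlineSeconds: Double)
}

struct PrintingScheduler: ResourceScheduler {
    let name: String
    func schedule(taskID: String, deadlineSeconds: Double) {
        print("[\(name)] \(taskID) due in \(deadlineSeconds)s")
    }
}

struct GlobalDispatcher {
    let schedulers: [Resource: ResourceScheduler]

    /// Routes each entry of an already-determined schedule to the scheduler
    /// for the heterogeneous resource that will execute it.
    func issue(_ schedule: [(taskID: String, resource: Resource, deadlineSeconds: Double)]) {
        for entry in schedule {
            schedulers[entry.resource]?.schedule(taskID: entry.taskID,
                                                 deadlineSeconds: entry.deadlineSeconds)
        }
    }
}

let dispatcher = GlobalDispatcher(schedulers: [
    .gpu: PrintingScheduler(name: "GPU"),
    .neuralEngine: PrintingScheduler(name: "NeuralEngine"),
])
dispatcher.issue([(taskID: "renderFrame", resource: .gpu, deadlineSeconds: 0.008),
                  (taskID: "handPose", resource: .neuralEngine, deadlineSeconds: 0.016)])
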
30. A non-transitory computer-readable medium having stored therein program instructions executable by a computing device to perform operations comprising:
receiving, at a scheduler, a set of related tasks to be performed by the computing device to provide an extended reality (XR) experience to a user;
generating, by the scheduler, a schedule for implementing the set of tasks based on current health telemetry data indicating an ability of the computing device to execute the set of tasks;
after generating the schedule, determining, by the scheduler, that the schedule can no longer be implemented based on a change in the current health telemetry data; and
in response to the determination, sending, by the scheduler, a request for a modified set of tasks to account for the change in the current health telemetry data.
31. The computer-readable medium of claim 30, wherein the operations further comprise:
receiving current health telemetry data comprising one or more power consumption statistics indicative of a current power consumption of the computing device; and
wherein the determining is based on the one or more power consumption statistics.
32. The computer-readable medium of claim 30, wherein the operations further comprise:
receiving current health telemetry data comprising one or more thermal statistics indicative of one or more temperatures measured by one or more sensors of the computing device; and
wherein the determining is based on the one or more thermal statistics.
33. The computer-readable medium of claim 30, wherein the operations further comprise:
receiving current health telemetry data comprising one or more performance statistics indicative of a current utilization of resources for performing one or more tasks of the set of tasks; and
wherein the determining is based on the one or more performance statistics.
34. The computer-readable medium of claim 30, wherein the operations further comprise:
determining respective availabilities of a plurality of resources for performing the set of tasks based on the current health telemetry data; and
wherein determining that the schedule can no longer be implemented is based on the determined availability of one or more of the resources.
35. The computer-readable medium of claim 30, wherein the operations further comprise:
determining an availability of the computing device to provide power to perform the set of tasks based on the current health telemetry data; and
wherein determining that the schedule can no longer be implemented is based on the determined availability.
36. The computer-readable medium of claim 30, wherein the operations further comprise:
receiving a computational graph defining the interrelationships of the set of related tasks; and
wherein the schedule is generated based on the received computational graph.
37. The computer-readable medium of claim 36, wherein the computational graph defines one or more timing constraints for the set of related tasks; and
wherein the determining is based on the one or more defined timing constraints.
38. The computer-readable medium of claim 36, wherein the operations further comprise:
receiving, in response to sending the request, a modified portion of the computational graph from a process requesting execution of one or more tasks of the set of tasks.
39. The computer-readable medium of claim 38, wherein the operations further comprise:
determining, by the scheduler, a modified schedule based on the modified portion of the computational graph.
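
Claims 30 through 35 describe a feasibility loop: the scheduler builds a schedule from current health telemetry and later, when the telemetry changes, decides the schedule can no longer be implemented and asks the task-providing process for a modified task set. The Swift sketch below illustrates one plausible form of that check; it is not part of the claims, and the statistics, thresholds, and names are all hypothetical.

struct HealthTelemetry {
    var powerDrawWatts: Double     // power consumption statistic (claim 31)
    var socTemperatureC: Double    // thermal statistic (claim 32)
    var gpuUtilization: Double     // performance statistic (claim 33), in 0...1
}

struct TelemetryAwareScheduler {
    let powerBudgetWatts = 6.0
    let thermalLimitC = 45.0

    /// True if the current schedule remains implementable under `telemetry`.
    func scheduleStillFeasible(given telemetry: HealthTelemetry) -> Bool {
        telemetry.powerDrawWatts < powerBudgetWatts
            && telemetry.socTemperatureC < thermalLimitC
            && telemetry.gpuUtilization < 0.95
    }

    /// Per claim 30, infeasibility triggers a request for a modified task set.
    func onTelemetryUpdate(_ telemetry: HealthTelemetry,
                           requestModifiedTasks: () -> Void) {
        if !scheduleStillFeasible(given: telemetry) {
            requestModifiedTasks()
        }
    }
}

let watcher = TelemetryAwareScheduler()
watcher.onTelemetryUpdate(HealthTelemetry(powerDrawWatts: 7.2,
                                          socTemperatureC: 41,
                                          gpuUtilization: 0.6)) {
    print("requesting modified task set")   // over the power budget
}
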
CN202280062283.5A 2021-09-23 2022-09-22 Intelligent dispatcher Pending CN117940902A (en)

Applications Claiming Priority (4)

Application Number                  Priority Date  Filing Date  Title
US202163247567P                     2021-09-23     2021-09-23
US63/247,564                        2021-09-23
US63/247,567                        2021-09-23
PCT/US2022/044430 (WO2023049287A1)  2021-09-23     2022-09-22   Intelligent scheduler

Publications (1)

Publication Number  Publication Date
CN117940902A (en)   2024-04-26

Family

ID=90765212

Family Applications (1)

Application Number  Status   Publication        Priority Date  Filing Date  Title
CN202280062283.5A   Pending  CN117940902A (en)  2021-09-23     2022-09-22   Intelligent dispatcher

Country Status (1)

Country Link
CN (1) CN117940902A (en)

Legal Events

Date Code Title Description
PB01  Publication
SE01  Entry into force of request for substantive examination