CN110494848A - Task processing method, equipment and machine readable storage medium - Google Patents


Info

Publication number
CN110494848A
CN110494848A (application CN201880012037.2A)
Authority
CN
China
Prior art keywords
task
processed
processing
graphics processor
queue
Prior art date
Legal status
Pending
Application number
CN201880012037.2A
Other languages
Chinese (zh)
Inventor
李庆
夏昌奇
张晓炜
Current Assignee
SZ DJI Technology Co Ltd
Original Assignee
SZ DJI Technology Co Ltd
Application filed by SZ DJI Technology Co Ltd
Publication of CN110494848A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 - Arrangements for program control, e.g. control units
    • G06F 9/06 - Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 - Multiprogramming arrangements
    • G06F 9/50 - Allocation of resources, e.g. of the central processing unit [CPU]

Abstract

A task processing method, a device, and a machine-readable storage medium. The method includes: selecting a task to be processed from a processing queue; selecting, from a plurality of graphics processors, at least one target graphics processor corresponding to the task to be processed; distributing the task to be processed to the target graphics processor; and processing the task to be processed by the target graphics processor. Embodiments of the present invention ensure real-time data processing, reasonably manage the multiple graphics processors, effectively schedule multiple tasks to be processed, maximize graphics processor utilization, optimize processing speed, effectively ensure the precision and validity of processing results, and improve reliability.

Description

Task processing method, device and machine-readable storage medium

Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a task processing method and apparatus, and a machine-readable storage medium.
Background
At present, the processing of data such as images or radar data can be based on platforms such as ARM (Advanced RISC Machine, an advanced reduced-instruction-set processor), DSP (Digital Signal Processor), or CPU (Central Processing Unit). The amount of data that such a processing platform can handle is limited by factors such as the processor's clock frequency, memory size, and transmission bandwidth. In artificial intelligence application scenarios such as autonomous driving, the number of sensors and their acquisition frequencies push the data rate to the order of GB/s or more, so that platforms such as ARM, DSP, or CPU cannot meet the requirement of real-time processing.
Disclosure of Invention
The invention provides a task processing method, a task processing device and a machine-readable storage medium.
In a first aspect of the present invention, a task processing method is provided, which is applied to a device including a plurality of graphics processors, and the method includes: selecting a task to be processed from a processing queue;
selecting at least one target graphics processor corresponding to the task to be processed from the plurality of graphics processors; distributing the task to be processed to the target graphics processor;
and processing the task to be processed by the target graphics processor.
In a second aspect of the present invention, there is provided a task processing device comprising a scheduler and a plurality of graphics processors; the scheduler is used for selecting a task to be processed from a processing queue, selecting at least one target graphics processor corresponding to the task to be processed from the multiple graphics processors, and distributing the task to be processed to the target graphics processor;
and the target graphics processor is used for processing the task to be processed.
In a third aspect of the present invention, a computer-readable storage medium is provided, on which computer instructions are stored, and when the computer instructions are executed, the task processing method is implemented.
Based on the above technical solution, in the embodiments of the present invention, a target graphics processor corresponding to a task to be processed can be selected from the multiple graphics processors, the task to be processed can be distributed to the target graphics processor, and the task can be processed by the target graphics processor. Data is thus processed in real time on graphics processors, ensuring real-time performance even for data at the GB/s level. In addition, the multiple graphics processors can be reasonably managed and multiple tasks to be processed effectively scheduled, so that graphics processor utilization is maximized, processing speed is optimized, the precision and validity of processing results are effectively guaranteed, and reliability is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in their description are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention; those skilled in the art may obtain other drawings from these drawings without creative effort.
FIG. 1 is a schematic diagram of an application scenario of an embodiment of the present invention;
FIG. 2 is a flow diagram of one embodiment of a task processing method of the present invention;
FIG. 3 is a schematic diagram of processing a task to be processed by a target graphics processor according to the present invention;
FIG. 4 is a schematic diagram of data sharing by different processing threads according to the present invention;
FIG. 5 is a block diagram of one embodiment of a task processing device of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention. In addition, the features in the embodiments and the examples described below may be combined with each other without conflict.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein and in the claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should be understood that the term "and/or" as used herein is meant to encompass any and all possible combinations of one or more of the associated listed items.
Although the terms first, second, third, etc. may be used herein to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information and, similarly, second information may be referred to as first information without departing from the scope of the present invention. Moreover, depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
The embodiment of the present invention provides a task Processing method, which may be applied to a Processing device including multiple Graphics Processing units (GPUs for short), and the type of the Processing device is not limited as long as the Processing device includes multiple GPUs. The task processing method can be a multi-task real-time processing method based on a plurality of graphic processors, takes multi-channel sensor data as input, and can be applied to application scenes such as automatic driving, auxiliary driving, indoor and outdoor operation robots and the like. Further, the sensor data may be image data such as image data collected by a camera, image data collected by a radar (e.g., a laser radar, a millimeter wave radar, etc.), or the like, and the type of data is not limited.
In an artificial intelligence application scenario in which data needs to be processed in real time, as the number of sensors and the acquisition frame rate increase, the data volume multiplies, and platforms such as ARM, DSP, or CPU cannot process the data in real time. This embodiment therefore provides a multi-task real-time processing method based on multiple graphics processors, which can ensure real-time data processing as well as the precision and validity of processing results.
In this embodiment, drawing on the strong computing performance of graphics processors, real-time task processing based on graphics processors is provided. Multi-task parallel scheduling can be realized, with support for task fairness, priority, independence, interruption, recovery, and the like. The graphics processors can be reasonably managed and multiple tasks to be processed effectively scheduled, so that graphics processor utilization is maximized, processing speed is optimized, the precision and validity of processing results are effectively ensured, and reliability is increased.
Referring to FIG. 1, a schematic view of an application scenario according to an embodiment of the present invention: a multi-task scheduling module caches tasks, completes task scheduling, returns task processing results to the upper-layer application, and the like. A processing module is composed of multiple graphics processors, which complete the real-time processing of tasks. A statistics monitoring module monitors and collects statistics on task state information and graphics processor resource information, and feeds this information back to the multi-task scheduling module, which uses it to perform task scheduling.
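The three-module architecture described above can be sketched as follows. This is a minimal structural sketch, not the patent's implementation; all class and field names (StatisticsMonitor, MultiTaskScheduler, report, submit) are assumptions made for illustration.

```python
# Sketch of FIG. 1's modules: the statistics monitor feeds task state and
# GPU resource information back to the scheduler, which caches and schedules tasks.
class StatisticsMonitor:
    def __init__(self):
        self.state_by_type = {}   # task type -> state information
        self.gpu_resources = {}   # gpu id -> resource information

    def report(self):
        return self.state_by_type, self.gpu_resources


class MultiTaskScheduler:
    def __init__(self, monitor):
        self.monitor = monitor
        self.cache = []

    def submit(self, task):
        self.cache.append(task)   # cache the task on arrival

    def schedule(self):
        # Feedback loop: consult the monitor before scheduling.
        states, resources = self.monitor.report()
        # ...a real scheduler would pick a task and target GPU here...
        return self.cache.pop(0) if self.cache else None


monitor = StatisticsMonitor()
scheduler = MultiTaskScheduler(monitor)
scheduler.submit({"type": "image"})
print(scheduler.schedule())  # {'type': 'image'}
```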
The first embodiment is as follows:
as shown in fig. 2, which is a flowchart of a task processing method in an embodiment of the present invention, the method includes:
Step 201, selecting a task to be processed from a processing queue.
Specifically, after receiving the task, the task may be stored in a processing queue, and when the task needs to be processed, the task may be selected from the processing queue, and the selection manner is not limited. For the sake of distinction, a task selected from the processing queue may be referred to as a task to be processed.
Step 202, selecting at least one target graphics processor corresponding to the task to be processed from a plurality of graphics processors (i.e., all the graphics processors). For convenience of distinction, a graphics processor selected from the plurality of graphics processors may be referred to as a target graphics processor corresponding to the task to be processed.
Specifically, after the to-be-processed task is selected from the processing queue, at least one target graphics processor corresponding to the to-be-processed task may be selected from all the graphics processors according to the state information of the to-be-processed task and/or the resource information of each graphics processor, and the selection manner is not limited.
Step 203, the task to be processed is allocated to the target graphics processor.
Specifically, after a task to be processed is selected from the processing queue and a target graphics processor is selected for the task to be processed, the task to be processed may be allocated to the target graphics processor, that is, the task to be processed may be allocated to an idle processing thread of the target graphics processor. For example, the pending task may be assigned to all or a portion of the idle processing threads of the target graphics processor.
Step 204, the task to be processed is processed by the target graphics processor.
Specifically, after the task to be processed is allocated to the idle processing thread of the target graphics processor, the task to be processed can be processed through the idle processing thread, and the processing procedure is not limited. For example, the pending task may include data (e.g., sensor data, etc.) and a task type, and the idle processing thread of the target graphics processor may perform processing corresponding to the task type based on the data.
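Steps 201 through 204 can be sketched as a single dispatch cycle. This is a hedged illustration only: the patent does not specify an API, so names like `GraphicsProcessor` and `schedule_one`, and the idle-thread bookkeeping, are assumptions.

```python
from queue import Queue

class GraphicsProcessor:
    def __init__(self, gpu_id, num_threads):
        self.gpu_id = gpu_id
        self.idle_threads = num_threads

    def process(self, task):
        # Step 204: an idle processing thread runs the handler for the task type.
        return f"gpu{self.gpu_id} handled {task['type']}"

def schedule_one(processing_queue, gpus):
    task = processing_queue.get()                     # Step 201: select a task
    target = max(gpus, key=lambda g: g.idle_threads)  # Step 202: pick a target GPU
    target.idle_threads -= 1                          # Step 203: allocate to an idle thread
    result = target.process(task)                     # Step 204: process the task
    target.idle_threads += 1                          # release the thread afterwards
    return result

q = Queue()
q.put({"type": "image_conversion", "data": b"..."})
gpus = [GraphicsProcessor(0, 4), GraphicsProcessor(1, 2)]
print(schedule_one(q, gpus))  # gpu0 handled image_conversion
```

The GPU-selection rule shown here (most idle threads) is just one of the selection strategies discussed in the later embodiments.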
As described above, in the embodiments of the present invention a target graphics processor is selected from the multiple graphics processors for each task to be processed, the task is distributed to it, and processed by it. Real-time data processing is therefore carried out on graphics processors, ensuring real-time performance even for data at the GB/s level, while the multiple graphics processors are reasonably managed, the tasks to be processed are effectively scheduled, graphics processor utilization is maximized, processing speed is optimized, the precision and validity of processing results are effectively guaranteed, and reliability is improved.
Example two:
before step 201, after receiving a task to be processed, if the task to be processed is a static scheduling task whose time sequence and/or data have a dependency relationship, the task to be processed may be cached in a static processing queue; if the task to be processed is a time-series and/or data-independent dynamic scheduling task, the task to be processed may be buffered in a dynamic processing queue. On this basis, in step 201, selecting a task to be processed from the processing queue may include: if the processing queue is a static processing queue, selecting a task to be processed with a time sequence and/or data having a dependency relationship from the static processing queue; if the processing queue is a dynamic processing queue, time-series and/or data-independent tasks to be processed may be selected from the dynamic processing queue.
In this embodiment, the processing queue may be divided into a Static processing queue (Static) and a Dynamic processing queue (Dynamic). The static processing queue can be used for storing tasks with dependency relationship in time series, or storing tasks with dependency relationship in data, or storing tasks with dependency relationship in time series and data; for the sake of convenience of distinction, the tasks having the dependency relationship described above may be referred to as statically scheduled tasks. In addition, the dynamic processing queue can be used for storing time-series independent tasks, or storing data independent tasks, or storing time-series and data independent tasks; for the sake of distinction, the above-described independent tasks may be referred to as dynamically scheduled tasks.
Moreover, all tasks stored in the static processing queue may be processed on the same graphics processor, or on associated graphics processors; that is, the choice of graphics processor is restricted. The correspondence between a static processing queue and a graphics processor (e.g., graphics processor 1) may be pre-configured, in which case all tasks in that static processing queue must be processed by graphics processor 1. In contrast, tasks stored in the dynamic processing queue may be processed on the same graphics processor or on different graphics processors; there is no restriction on the graphics processor, and any task in the dynamic processing queue may be allocated to any graphics processor.
A task whose time series has a dependency relationship is one whose processing depends on a previous task. For example, in H.264 coding, a non-I-frame task depends on the most recent I-frame task. Based on this, after receiving task 1 and task 2, the multitask scheduling module stores both in the static processing queue if they have a time-series dependency. Similarly, a task whose data has a dependency relationship is one in which the processing of one piece of data depends on previous data; this is not limited. All tasks stored in the static processing queue may be processed on the same graphics processor or on associated graphics processors.
Time-series-independent tasks are tasks whose processing does not depend on a previous task, i.e., independent tasks. For example, image conversion and point cloud algorithm processing are independent tasks that do not depend on the previous task. Based on this, after receiving task 3, the multitask scheduling module stores it in the dynamic processing queue if it is independent in time series. Similarly, data-independent tasks are tasks whose data can be processed independently of other data; this is not limited. All tasks stored in the dynamic processing queue may be processed on the same graphics processor or on different graphics processors.
For the static processing queue, the multitask scheduling module may select one task to be processed and then, whether or not that task has completed, select another task to be processed from the static processing queue. Likewise, for the dynamic processing queue, the multitask scheduling module may select one task to be processed and then, whether or not it has completed, select another task to be processed from the dynamic processing queue.
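The static/dynamic routing rule of this embodiment can be sketched as below. The `depends_on_previous` flag is an assumed field standing in for the time-series and/or data dependency described above; the patent does not prescribe how the dependency is represented.

```python
from collections import deque

# Static queue: tasks with time-series and/or data dependencies.
# Dynamic queue: independent tasks, assignable to any graphics processor.
static_queue, dynamic_queue = deque(), deque()

def enqueue(task):
    if task.get("depends_on_previous"):
        static_queue.append(task)
    else:
        dynamic_queue.append(task)

enqueue({"type": "h264_non_i_frame", "depends_on_previous": True})
enqueue({"type": "point_cloud", "depends_on_previous": False})
print(len(static_queue), len(dynamic_queue))  # 1 1
```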
Example three:
Prior to step 201, after a task to be processed is received, it may be buffered in a task queue (Task Queue), which may hold a plurality of tasks to be processed. For each task in the task queue, if it is a static scheduling task whose time series and/or data have a dependency relationship, it may be cached in the static processing queue; if it is a time-series and/or data-independent dynamic scheduling task, it may be cached in the dynamic processing queue. On this basis, in step 201, selecting a task to be processed from the processing queue may include: if the processing queue is the static processing queue, selecting a task whose time series and/or data have a dependency relationship from the static processing queue; if the processing queue is the dynamic processing queue, selecting a time-series and/or data-independent task from the dynamic processing queue. The processing procedure of the third embodiment is otherwise similar to that of the second embodiment and is not repeated here.
Example four:
In step 201, selecting a task to be processed from the processing queue may include, but is not limited to: acquiring the priority of each task to be processed in the processing queue, and selecting a high-priority task to be processed from the processing queue based on priority. Acquiring the priority of each task to be processed may include: for each task in the processing queue, acquiring its task type, and then querying a mapping table with the task type to obtain the corresponding priority; the mapping table records the correspondence between task types and priorities.
The multitask scheduling module may configure a mapping table, as shown in table 1, where the mapping table is used to record a corresponding relationship between a task type and a priority. On this basis, assuming that the to-be-processed task 1 includes a task type a, the to-be-processed task 2 includes a task type B, and the to-be-processed task 3 includes a task type C, the multi-task scheduling module queries the mapping table through the task type a to obtain a priority 5 of the to-be-processed task 1, queries the mapping table through the task type B to obtain a priority 3 of the to-be-processed task 2, and queries the mapping table through the task type C to obtain a priority 1 of the to-be-processed task 3. Obviously, since the priority 5 of the task 1 to be processed is higher than the priority 3 of the task 2 to be processed, and the priority 3 of the task 2 to be processed is higher than the priority 1 of the task 3 to be processed, the multitask scheduling module selects the task 1 to be processed from the processing queue first, then selects the task 2 to be processed, then selects the task 3 to be processed, and so on.
TABLE 1

Task type      Priority level
Task type A    5
Task type B    3
Task type C    1
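The mapping-table lookup of Table 1 and the priority-based selection can be sketched as below; the dictionary and the `select_highest_priority` helper are illustrative assumptions, with higher numbers meaning higher priority as in the worked example.

```python
# Table 1 as a mapping table: task type -> priority.
PRIORITY = {"A": 5, "B": 3, "C": 1}

def select_highest_priority(pending):
    # Query the mapping table by task type and pick the highest priority.
    return max(pending, key=lambda t: PRIORITY[t["type"]])

pending = [{"id": 2, "type": "B"}, {"id": 3, "type": "C"}, {"id": 1, "type": "A"}]
print(select_highest_priority(pending)["id"])  # 1
```

As in the example above, task 1 (type A, priority 5) is selected before tasks 2 and 3.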
Example five: in order to select the target graphics processor, the state information of the task to be processed and/or the resource information of the graphics processor may be obtained first, and the process of obtaining these information will be described in detail below.
In the first situation, the multi-task scheduling module acquires the state information of the task to be processed.
In the process of processing the task to be processed, the statistical monitoring module can acquire the state information of the task to be processed which is already processed, and record the corresponding relation between the task type of the task to be processed and the state information of the task to be processed in a state information table; and sending the state information table to a multitask scheduling module. On this basis, after selecting the task to be processed from the processing queue, the multitask scheduling module firstly obtains the task type of the task to be processed, queries the state information table according to the task type to obtain the state information corresponding to the task type, and determines the obtained state information as the state information of the task to be processed.
For example, after the to-be-processed tasks 1, 2, and 3 have been processed, the statistics monitoring module may obtain the state information table shown in table 2 and send it to the multitask scheduling module. On this basis, after the multitask scheduling module selects the to-be-processed task 4 from the processing queue, if the task type of the to-be-processed task 4 is task type A, the state information A corresponding to task type A may be obtained; that is, the state information of the to-be-processed task 4 is state information A.
TABLE 2

Task type      Status information
Task type A    Status information A
Task type B    Status information B
Task type C    Status information C
In the second situation, the multitask scheduling module acquires the resource information of the graphics processors.
In the process of processing the task to be processed, the statistics monitoring module may acquire (e.g., periodically acquire) resource information of each graphics processor, and send the resource information of each graphics processor to the multitask scheduling module, so that the multitask scheduling module may obtain the resource information of each graphics processor.
In the above-described embodiments, the state information may include a task processing time, which is the time difference between the task completion time and the task reception time. Further, the resource information may include: the number of idle processing threads; and/or the state of each processing thread (e.g., an occupied state or an idle state).
During the processing of each task to be processed, the statistics monitoring module may collect statistics on the task's state information and notify the multitask scheduling module of the correspondence between task type and state information. Specifically, the processing of a task passes through three stages: task reception and caching, task scheduling and distribution, and task processing. The statistics monitoring module may measure the time consumed by these three stages; this total is the state information of the task, i.e., its task processing time. Since the task processing time is the time difference between the task completion time and the task reception time, the statistics monitoring module may equivalently record those two times for the task and compute their difference, which is the task processing time.
When the statistics monitoring module sends the correspondence between task type and task processing time to the multitask scheduling module, note that one task type corresponds to a plurality of tasks to be processed, each with its own task processing time; the processing times of different tasks may be the same or different. Accordingly, the maximum of the task processing times of the tasks corresponding to a task type may be used as the task processing time for that type; or the minimum may be used; or the average of the task processing times may be calculated and used. Alternatively, the maximum, minimum, and average may all be used as the task processing time corresponding to the task type; this is not limited.
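The per-task-type statistics just described can be sketched as below. This is an illustrative sketch: the `record`/`summarize` helpers are assumptions, and times are given in milliseconds for simplicity.

```python
from collections import defaultdict

# One list of task processing times per task type.
samples = defaultdict(list)

def record(task_type, received_at, completed_at):
    # Task processing time = task completion time - task reception time.
    samples[task_type].append(completed_at - received_at)

def summarize(task_type):
    # Maximum, minimum, and average, as the embodiment allows any of the three.
    times = samples[task_type]
    return {"max": max(times), "min": min(times), "avg": sum(times) / len(times)}

record("A", 0, 30)        # received at t=0 ms, completed at t=30 ms
record("A", 1000, 1050)   # a second task of the same type
print(summarize("A"))  # {'max': 50, 'min': 30, 'avg': 40.0}
```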
In the process of processing each task to be processed, the statistics monitoring module can perform statistics on the resource information of the graphics processor and notify the multi-task scheduling module of the resource information of the graphics processor. Specifically, the graphics processors may support multitask parallel processing, each graphics processor has multiple processing threads, and the tasks to be processed are allocated to the processing threads for processing, so the statistics monitoring module may monitor the processing threads of each graphics processor, such as monitoring the number of idle processing threads of the graphics processor, monitoring the states (such as an occupied state or an idle state) of the processing threads of the graphics processor, and the like.
Example six:
in step 202, selecting at least one target graphics processor corresponding to the task to be processed from the plurality of graphics processors may include: and selecting a target graphic processor corresponding to the task to be processed from the plurality of graphic processors according to the state information of the task to be processed. Or selecting a target graphics processor corresponding to the task to be processed from the plurality of graphics processors according to the resource information of each graphics processor. Or selecting a target graphics processor corresponding to the task to be processed from the plurality of graphics processors according to the state information of the task to be processed and the resource information of each graphics processor.
In the first case, selecting a target graphics processor corresponding to the task to be processed from the plurality of graphics processors according to the state information of the task may include: determining the task processing time from the state information and, for the multiple tasks to be processed whose task processing time is greater than a time threshold (indicating a long processing time), selecting a different target graphics processor for each such task.
The time threshold may be set empirically; a task processing time greater than the time threshold indicates a long processing time. Distributing multiple such tasks to different graphics processors ensures that each task to be processed is scheduled in time, that the resources of all graphics processors are reasonably allocated and utilized, that task processing does not time out, and that long-running tasks do not simultaneously occupy multiple processing threads of the same graphics processor.
In case two, selecting a target graphics processor corresponding to the task to be processed from the plurality of graphics processors according to the resource information of the graphics processors may include: determining the number of idle processing threads according to the resource information of the graphics processors, and selecting the graphics processor with the most idle processing threads from the multiple graphics processors as a target graphics processor. Since the graphics processor with the largest number of idle processing threads is the graphics processor with the least resource utilization, the graphics processor with the largest number of idle processing threads is selected as the target graphics processor, and resources of all the graphics processors can be reasonably allocated and utilized.
Selecting a target graphics processor corresponding to the task to be processed from the plurality of graphics processors according to the resource information of the graphics processors may also include: determining an idle processing thread according to the resource information of the graphics processors, and determining the graphics processor corresponding to the idle processing thread as the target graphics processor. Because the graphics processor corresponding to an idle processing thread is determined as the target graphics processor, once the task to be processed is allocated to the target graphics processor, the target graphics processor has an idle processing thread with which to process it; this avoids the situation where the target graphics processor has no idle processing thread for the task to be processed.
When the state of the processing thread is an idle state, the processing thread is an idle processing thread, and when the state of the processing thread is an occupied state, the processing thread is an occupied processing thread. Obviously, since the graphics processor corresponding to the idle processing thread is determined as the target graphics processor, the task to be processed is allocated only when the processing thread is idle, and the task to be processed can be guaranteed to be processed in time.
Selecting a target graphics processor corresponding to the task to be processed from the plurality of graphics processors according to both the state information of the task to be processed and the resource information of each graphics processor may include: for a plurality of tasks to be processed whose task processing time is greater than the time threshold, selecting a different target graphics processor for each task to be processed; and, when selecting a target graphics processor, selecting from the plurality of graphics processors the graphics processor with the largest number of idle processing threads, or a graphics processor that has an idle processing thread.
Example seven:
In step 202, selecting at least one target graphics processor corresponding to the task to be processed from the plurality of graphics processors may include: for a plurality of tasks to be processed in parallel, selecting a corresponding target graphics processor for each task to be processed, where the target graphics processors corresponding to different tasks to be processed may be the same or different. For example, when different tasks to be processed correspond to different target graphics processors, the multiple tasks to be processed in parallel can be sent to the multiple graphics processors respectively, so that the resources of all the graphics processors are reasonably allocated and utilized and each task to be processed is guaranteed to be processed in time.
In step 202, selecting at least one target graphics processor corresponding to the task to be processed from the plurality of graphics processors may include: if the processing queue is a static processing queue, a graphics processor corresponding to the static processing queue may be queried from among the multiple graphics processors (as described in the above embodiment, the correspondence between the static processing queue and the graphics processor may be configured in advance, and based on this correspondence, the graphics processor corresponding to the static processing queue may be queried), and the queried graphics processor is determined as the target graphics processor corresponding to the task to be processed. If the processing queue is a dynamic processing queue, a target graphics processor is selected from among all the graphics processors, as described in the above embodiment.
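The two branches above can be sketched together. This is an illustrative sketch only; the queue names, the preconfigured mapping, and the tie-breaking rule for the dynamic branch (most idle threads) are assumptions:

```python
# A static processing queue is bound in advance to a specific GPU;
# a dynamic processing queue picks among all GPUs (here: the one with
# the most idle processing threads).

STATIC_QUEUE_TO_GPU = {"camera_pipeline": "gpu0"}  # hypothetical preconfigured mapping

def select_target_gpu(queue_name, is_static, resource_info):
    if is_static:
        return STATIC_QUEUE_TO_GPU[queue_name]
    return max(resource_info, key=resource_info.get)

print(select_target_gpu("camera_pipeline", True, {}))             # gpu0
print(select_target_gpu("adhoc", False, {"gpu0": 1, "gpu1": 4}))  # gpu1
```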
Example eight:
In step 204, processing the task to be processed by the target graphics processor may include: the target graphics processor can process the task to be processed using algorithms such as image processing, feature point tracking, semi-global stereo block matching, radar-and-camera self-calibration, point cloud tracking, local mapping, and deep learning; the process is not limited to these. For example, when an image processing algorithm is used to process the task to be processed, the source image may undergo format conversion and distortion correction, and image data in a desired or special format may be output. As another example, when the feature point tracking algorithm is used to process the task to be processed, the correspondence between the previous frame and the current frame can be obtained from the temporal change of pixels in the image sequence and the correlation between adjacent frames, so as to calculate the motion information of an object between adjacent frames. As a further example, when the deep learning algorithm is used to process the task, the depth of a target object relative to the camera may be calculated from the disparity of the same target object in binocular imaging and the positional relationship between the two cameras.
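The disparity-to-depth computation invoked in the last example follows the standard stereo relation Z = f·B/d. A minimal sketch, with illustrative values (the focal length, baseline, and disparity below are not from the patent):

```python
# Depth of a point from a calibrated binocular pair:
#   Z = f * B / d
# f: focal length in pixels, B: baseline between the two cameras in
# metres, d: disparity in pixels.

def depth_from_disparity(focal_px, baseline_m, disparity_px):
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

print(depth_from_disparity(700.0, 0.12, 14.0))  # 6.0 (metres)
```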
The processing of a task to be processed by the target graphics processor is described below with reference to specific situations.
In the first situation, when the target graphics processor is processing a task to be processed (for example, a low-priority task), if a task to be processed with a higher priority exists, the low-priority task is interrupted and the higher-priority task is processed by the target graphics processor; after the higher-priority task is processed, the low-priority task is resumed.
Referring to FIG. 3, assume the target graphics processor is processing pending task 1 when a new pending task 2 is received. The target graphics processor determines whether an idle processing thread exists. If so, pending task 2 is assigned to the idle processing thread. If not, the priority of pending task 1 is compared with the priority of pending task 2. If pending task 1 has the higher priority, processing of pending task 1 continues and pending task 2 is kept waiting; after pending task 1 completes, pending task 2 is processed. If pending task 2 has the higher priority, pending task 1 can be interrupted and pending task 2 processed; after pending task 2 completes, pending task 1 is resumed.
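The FIG. 3 decision flow can be sketched as follows. This is an illustrative simplification (the state dictionary, return strings, and "larger number = higher priority" convention are assumptions), and the actual resume bookkeeping is elided:

```python
# A GPU with an idle thread takes the new task immediately; otherwise
# priorities are compared, and the running task is preempted only if
# the new task outranks it (it is resumed after the new task finishes).

def on_new_task(gpu, new_priority):
    if gpu["idle_threads"] > 0:
        gpu["idle_threads"] -= 1
        return "assign_to_idle_thread"
    if new_priority > gpu["running_priority"]:
        return "preempt_and_resume_later"
    return "wait_until_running_finishes"

gpu = {"idle_threads": 0, "running_priority": 2}
print(on_new_task(gpu, 5))  # preempt_and_resume_later
print(on_new_task(gpu, 1))  # wait_until_running_finishes
```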
In the second situation, when the task to be processed is being processed by the target graphics processor, if an exception occurs in the task, the task is interrupted, its priority is raised, and it is cached back into the processing queue. Raising the priority of the task to be processed allows it to be preferentially selected from the processing queue, avoiding a waiting timeout for the task.
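The re-queue-with-boost step can be sketched with a priority heap. The heap representation and the boost amount are illustrative assumptions, not from the patent:

```python
import heapq

# On an exception, the failed task is pushed back with a raised
# priority so the scheduler prefers it on the next selection. The heap
# is keyed on -priority, so the highest priority pops first.

def requeue_with_boost(queue, name, priority, boost=1):
    heapq.heappush(queue, (-(priority + boost), name))

queue = []
heapq.heappush(queue, (-2, "mapping"))
requeue_with_boost(queue, "sgbm", 2)   # failed at priority 2 -> re-queued at 3
print(heapq.heappop(queue)[1])         # sgbm now pops first
```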
In the third situation, a plurality of tasks to be processed that have been distributed to the target graphics processor are processed in parallel by the target graphics processor; the parallel processing may include synchronous serial processing and kernel asynchronous processing.
When the target graphics processor processes a plurality of tasks to be processed, it can process them in parallel using, for example, synchronous serial processing or kernel asynchronous processing; the processing mode is not limited. This effectively saves processing time and improves processing efficiency.
In the fourth situation, when the target graphics processor processes the task to be processed, the address of the central processing unit is page-locked, and the data exchanged between the target graphics processor and the central processing unit is transferred through the DMA controller. For example, when the data volume of the task to be processed is large, the cudaHostRegister interface is called to page-lock the address on the central processing unit side, and the DMA controller is used to implement data interaction between the central processing unit and the graphics processor, which significantly improves bandwidth and reduces unnecessary data copying.
In the fifth situation, when the target graphics processor processes the task to be processed, data sharing is performed between different processing threads. Specifically, if a plurality of tasks to be processed are completed by a plurality of processing threads, data can be shared between the processing threads, avoiding repeated copying between host memory and device memory; sharing of host memory and device memory within the processing threads removes the copying step and can save a large amount of processing time.
In this embodiment, the target graphics processor may process the task to be processed using algorithms such as image processing, feature point tracking, semi-global stereo block matching, radar-and-camera self-calibration, point cloud tracking, local mapping, and deep learning. Accordingly, the processing threads may include: an image processing thread, a feature point tracking processing thread, a semi-global block matching (SGBM) processing thread, a radar-and-camera self-calibration processing thread (hereinafter, the self-calibration processing thread), a point cloud tracking processing thread, a map processing thread, and a deep learning processing thread.
The image processing thread is used for receiving raw data and processing it to obtain a gray-scale image and an RGB image after distortion removal and epipolar-line correction. The feature point tracking processing thread is used for detecting and tracking feature points of the image. The semi-global stereo block matching processing thread is used for matching binocular images to obtain a disparity map and calculating a three-dimensional point cloud. The self-calibration processing thread is used for calibrating the extrinsic parameters between the camera and the radar. The point cloud tracking processing thread is used for segmenting the three-dimensional point cloud into different objects, performing target tracking and area detection, and sending the result to the map processing thread. The deep learning processing thread is used for performing detection and target tracking on the RGB image and sending the result to the map processing thread. The map processing thread is used for receiving the processing results of the point cloud tracking processing thread and the deep learning processing thread and generating a local map from the received information.
Referring to fig. 4, a schematic diagram of data sharing by different processing threads is shown.
When the image processing thread processes the task to be processed, the output data can be provided for the self-calibration processing thread, and the self-calibration processing thread processes the task to be processed according to the output data; or, the output data can be provided to the feature point tracking processing thread, and the feature point tracking processing thread processes the task to be processed according to the output data; or, the output data can be provided to a semi-global stereo block matching processing thread, and the semi-global stereo block matching processing thread processes the task to be processed according to the output data; alternatively, the output data may be provided to a deep learning processing thread, which processes the task to be processed according to the output data.
When the semi-global stereo block matching processing thread processes the task to be processed, its output data can be provided to the point cloud tracking processing thread, which processes the task to be processed according to that output data. Further, when the point cloud tracking processing thread processes the task to be processed, its output data can be provided to the map processing thread; likewise, when the deep learning processing thread processes the task to be processed, its output data can also be provided to the map processing thread. On this basis, the map processing thread can process the task to be processed according to the input data provided by the point cloud tracking processing thread and the input data provided by the deep learning processing thread.
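The fig. 4 data flow can be summarized as a dependency graph, with each thread consuming the (shared, not copied) output of the threads it depends on. The sketch below is illustrative; the node names abbreviate the threads described above, and the simple topological sort stands in for the actual scheduling:

```python
# fig. 4 data flow as a dependency graph.
PIPELINE = {
    "image":         [],               # consumes raw frames
    "self_calib":    ["image"],
    "feature_track": ["image"],
    "sgbm":          ["image"],
    "deep_learning": ["image"],
    "point_cloud":   ["sgbm"],
    "local_map":     ["point_cloud", "deep_learning"],
}

def run_order(graph):
    """Topological order: a thread runs only after all its inputs exist."""
    order, done = [], set()
    while len(done) < len(graph):
        for node, deps in graph.items():
            if node not in done and all(d in done for d in deps):
                order.append(node)
                done.add(node)
    return order

print(run_order(PIPELINE))
```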
Example eight:
based on the same inventive concept as the method, an embodiment of the present invention further provides a task processing device, as shown in fig. 5, the task processing device includes a scheduler and a plurality of graphics processors; the scheduler is used for selecting a task to be processed from a processing queue, selecting at least one target graphics processor corresponding to the task to be processed from the multiple graphics processors, and distributing the task to be processed to the target graphics processor; and the target graphics processor is used for processing the task to be processed.
The scheduler is specifically configured to, when selecting a task to be processed from the processing queue: and if the processing queue is a static processing queue, selecting the tasks to be processed with time sequences and/or data having dependency relationship from the static processing queue.
The scheduler is specifically configured to, when selecting a task to be processed from the processing queue: and if the processing queue is a dynamic processing queue, selecting a time sequence and/or data independent task to be processed from the dynamic processing queue.
The scheduler is also used for receiving the tasks to be processed; if the task to be processed is a static scheduling task with a time sequence and/or data having a dependency relationship, caching the task to be processed into a static processing queue; and if the task to be processed is a dynamic scheduling task independent of time sequence and/or data, caching the task to be processed into a dynamic processing queue.
The scheduler is also used for receiving the tasks to be processed and caching the tasks to be processed into a task queue; for the task to be processed in the task queue, if the task to be processed is a static scheduling task with a time sequence and/or data having a dependency relationship, caching the task to be processed into a static processing queue; and if the task to be processed is a time sequence and/or data independent dynamic scheduling task, caching the task to be processed into a dynamic processing queue.
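The routing rule the scheduler applies when caching a received task can be sketched as follows. The field name `depends_on` and the list-based queues are illustrative assumptions:

```python
# A task whose timing or data depend on another task is a static
# scheduling task and goes to the static processing queue; an
# independent task goes to the dynamic processing queue.

def route_task(task, static_queue, dynamic_queue):
    if task.get("depends_on"):
        static_queue.append(task)
    else:
        dynamic_queue.append(task)

static_q, dynamic_q = [], []
route_task({"name": "sgbm", "depends_on": ["image"]}, static_q, dynamic_q)
route_task({"name": "logging"}, static_q, dynamic_q)
print([t["name"] for t in static_q], [t["name"] for t in dynamic_q])
# ['sgbm'] ['logging']
```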
When the scheduler selects the task to be processed from the processing queue, the scheduler is used for: acquiring the priority of a task to be processed in a processing queue; a high priority pending task is selected from the processing queue based on priority.
The scheduler is specifically configured to, when acquiring the priority of the task to be processed in the processing queue: acquiring the task type of the task to be processed in the processing queue; inquiring a mapping table according to the task type to obtain the priority corresponding to the task type; the mapping table is used for recording the corresponding relation between the task type and the priority.
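The mapping table lookup can be sketched as below. The table contents are hypothetical; the patent only specifies that a task-type-to-priority correspondence is recorded and queried:

```python
# Mapping table: task type -> priority (illustrative entries).
PRIORITY_TABLE = {"obstacle_avoidance": 3, "local_map": 2, "logging": 1}

def priority_of(task_type, default=0):
    """Query the mapping table for the priority of a task type."""
    return PRIORITY_TABLE.get(task_type, default)

print(priority_of("obstacle_avoidance"))  # 3
```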
In one example, the scheduler is specifically configured to, when selecting at least one target graphics processor corresponding to the task to be processed from the plurality of graphics processors: and selecting at least one target graphic processor corresponding to the task to be processed from the plurality of graphic processors according to the state information of the task to be processed and/or the resource information of the graphic processor.
The scheduler is also used for acquiring the task type of the task to be processed; querying a state information table through the task type to obtain state information corresponding to the task type; the state information table is used for recording the corresponding relation between the task type and the state information; and determining the obtained state information as the state information of the task to be processed.
In one example, referring to fig. 5, the task processing device further includes: a monitor, which is used for acquiring the state information of tasks that have finished processing, recording the correspondence between the task type of each such task and its state information in a state information table, and sending the state information table to the scheduler.
In the above embodiment, the state information includes a task processing time; the task processing time is a time difference between the task completion time and the task reception time.
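The monitor's bookkeeping, with task processing time defined as completion time minus reception time, can be sketched as follows (class and field names are illustrative):

```python
# The monitor records task processing time per task type in a state
# information table that is later handed to the scheduler.

class Monitor:
    def __init__(self):
        self.state_table = {}  # task type -> last observed processing time

    def record(self, task_type, received_at, completed_at):
        self.state_table[task_type] = completed_at - received_at

m = Monitor()
m.record("sgbm", received_at=100.0, completed_at=140.0)
print(m.state_table)  # {'sgbm': 40.0}
```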
The monitor is further configured to obtain resource information of a graphics processor, and send the resource information of the graphics processor to the scheduler; the scheduler is also used for acquiring resource information of the graphics processor;
wherein the resource information includes: the number of idle processing threads; and/or, the state of the processing thread; the state is either an occupied state or an idle state.
The scheduler is specifically configured to, when selecting at least one target graphics processor corresponding to the task to be processed from the multiple graphics processors according to the state information of the task to be processed and/or the resource information of the graphics processors: and determining task processing time according to the state information, and selecting different target graphic processors for different tasks to be processed aiming at a plurality of tasks to be processed with the task processing time being greater than a time threshold.
The scheduler is specifically configured to, when selecting at least one target graphics processor corresponding to the task to be processed from the multiple graphics processors according to the state information of the task to be processed and/or the resource information of the graphics processors: and determining the number of idle processing threads according to the resource information, and selecting the graphics processor with the most idle processing threads from the multiple graphics processors as a target graphics processor.
The scheduler is specifically configured to, when selecting at least one target graphics processor corresponding to the task to be processed from the multiple graphics processors according to the state information of the task to be processed and/or the resource information of the graphics processors: and determining an idle processing thread according to the resource information, and determining a graphics processor corresponding to the idle processing thread as a target graphics processor corresponding to the task to be processed.
In one example, the scheduler is specifically configured to, when selecting at least one target graphics processor corresponding to the task to be processed from the plurality of graphics processors: and aiming at a plurality of tasks to be processed which are processed in parallel, respectively selecting corresponding target graphic processors for the plurality of tasks to be processed, wherein the target graphic processors corresponding to different tasks to be processed are the same or different.
The scheduler is specifically configured to, when selecting at least one target graphics processor corresponding to the task to be processed from the plurality of graphics processors: if the processing queue is a static processing queue, inquiring a graphics processor corresponding to the static processing queue from the plurality of graphics processors, and determining the inquired graphics processor as a target graphics processor corresponding to the task to be processed.
The target graphics processor, when processing the task to be processed, is specifically configured to: if the task to be processed with the priority higher than that of the task to be processed exists, interrupting the task to be processed and processing the task to be processed with the priority higher than that of the task to be processed; and after the task to be processed with the higher priority is processed, recovering the task to be processed. The target graphics processor, when processing the task to be processed, is specifically configured to: and if the task to be processed is abnormal, interrupting the task to be processed, improving the priority of the task to be processed, and caching the task to be processed into the processing queue. The target graphics processor, when processing the task to be processed, is specifically configured to: processing a plurality of tasks to be processed distributed to the target graphics processor in parallel; the parallel processing comprises synchronous serial processing and kernel asynchronous processing.
The target graphics processor, when processing the task to be processed, is specifically configured to: and when the task to be processed is processed, the address of the central processing unit is latched, and the interactive data between the target graphic processor and the central processing unit is transmitted through the DMA controller.
The target graphics processor, when processing the task to be processed, is specifically configured to: and when the task to be processed is processed, carrying out data sharing through different processing threads.
The target graphics processor is specifically configured to, when performing data sharing through different processing threads:
when the image processing thread processes the task to be processed, the output data is provided for the self-calibration processing thread, and the self-calibration processing thread processes the task to be processed according to the output data; alternatively,
providing output data to a feature point tracking processing thread, and processing a task to be processed by the feature point tracking processing thread according to the output data; alternatively,
providing output data to a semi-global stereo block matching processing thread, and processing a task to be processed by the semi-global stereo block matching processing thread according to the output data; alternatively,
and providing the output data for a deep learning processing thread, and processing the task to be processed by the deep learning processing thread according to the output data.
The target graphics processor is specifically configured to, when performing data sharing through different processing threads: and when the semi-global stereo block matching processing thread processes the task to be processed, the output data is provided for the point cloud tracking processing thread, and the point cloud tracking processing thread processes the task to be processed according to the output data. The target graphics processor is specifically configured to, when performing data sharing through different processing threads: when the point cloud tracking processing thread processes the task to be processed, output data are provided for the map processing thread; when the deep learning processing thread processes the task to be processed, the output data is provided for the map processing thread; and the map processing thread processes the task to be processed according to the input data provided by the point cloud tracking processing thread and the input data provided by the deep learning processing thread.
Example nine:
Based on the same inventive concept as the above method, an embodiment of the present invention further provides a computer-readable storage medium on which computer instructions are stored; when the computer instructions are executed, the task processing method described above is implemented.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by an article of manufacture with certain functionality. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functions of the units may be implemented in the same software and/or hardware or in a plurality of software and/or hardware when implementing the invention.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Furthermore, these computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (which may include, but is not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The above description is only an example of the present invention, and is not intended to limit the present invention. Various modifications and alterations to this invention will become apparent to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the claims of the present invention.

Claims (51)

  1. A task processing method applied to a device including a plurality of graphics processors, the method comprising:
    selecting a task to be processed from a processing queue;
    selecting at least one target graphics processor corresponding to the task to be processed from the plurality of graphics processors; distributing the task to be processed to the target graphics processor;
    and processing the task to be processed by the target graphics processor.
  2. The method of claim 1,
    the selecting the task to be processed from the processing queue comprises the following steps:
    and if the processing queue is a static processing queue, selecting the tasks to be processed with time sequences and/or data having dependency relationship from the static processing queue.
  3. The method of claim 1,
    the selecting the task to be processed from the processing queue comprises the following steps: and if the processing queue is a dynamic processing queue, selecting a time sequence and/or data independent task to be processed from the dynamic processing queue.
  4. The method according to claim 2 or 3,
    before the task to be processed is selected from the processing queue, the method further comprises the following steps:
    receiving a task to be processed;
    if the task to be processed is a static scheduling task with a time sequence and/or data having a dependency relationship, caching the task to be processed into a static processing queue;
    and if the task to be processed is a dynamic scheduling task independent of time sequence and/or data, caching the task to be processed into a dynamic processing queue.
  5. The method according to claim 2 or 3,
    before the task to be processed is selected from the processing queue, the method further comprises the following steps:
    receiving a task to be processed, and caching the task to be processed into a task queue;
    for a task to be processed in a task queue, if the task to be processed is a static scheduling task with a time sequence and/or data having a dependency relationship, caching the task to be processed into a static processing queue;
    and if the task to be processed is a dynamic scheduling task independent of time sequence and/or data, caching the task to be processed into a dynamic processing queue.
  6. The method of claim 1,
    the selecting the task to be processed from the processing queue comprises the following steps:
    acquiring the priority of the tasks to be processed in the processing queue;
    and selecting the high-priority to-be-processed task from the processing queue based on the priority.
  7. The method of claim 6,
    the obtaining the priority of the task to be processed in the processing queue includes:
    acquiring the task type of the task to be processed in the processing queue;
    inquiring a mapping table according to the task type to obtain the priority corresponding to the task type;
    the mapping table is used for recording the corresponding relation between the task type and the priority.
  8. The method of claim 1, wherein selecting at least one target graphics processor from the plurality of graphics processors that corresponds to the pending task comprises:
    and selecting at least one target graphics processor corresponding to the task to be processed from the plurality of graphics processors according to the state information of the task to be processed and/or the resource information of the graphics processors.
  9. The method according to claim 8, wherein before selecting at least one target graphics processor corresponding to the task to be processed from the plurality of graphics processors according to the state information of the task to be processed and/or the resource information of the graphics processors, further comprising:
    acquiring the task type of the task to be processed;
    querying a state information table through the task type to obtain state information corresponding to the task type; the state information table is used for recording the corresponding relation between the task type and the state information;
    and determining the obtained state information as the state information of the task to be processed.
  10. The method according to claim 9, wherein before querying a status information table by the task type to obtain the status information corresponding to the task type, the method further comprises:
    acquiring state information of a to-be-processed task which is processed, and recording a corresponding relation between a task type of the to-be-processed task and the state information of the to-be-processed task in a state information table.
  11. The method of claim 8, wherein the status information includes a task processing time; the task processing time is a time difference between the task completion time and the task reception time.
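Claims 10 and 11 together suggest a monitor that records, per task type, the task processing time measured as completion time minus reception time. A sketch under those assumptions:

```python
state_table = {}  # task type -> most recent task processing time

def record_completion(task_type, received_at, completed_at):
    """Record the task processing time (completion minus reception) per type."""
    state_table[task_type] = completed_at - received_at

# Times are plain seconds here for determinism; a real monitor would read a clock.
record_completion("stereo-matching", received_at=10.0, completed_at=10.5)
```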
  12. The method according to claim 8, wherein before selecting at least one target graphics processor corresponding to the task to be processed from the plurality of graphics processors according to the state information of the task to be processed and/or the resource information of the graphics processors, further comprising:
    acquiring resource information of a graphics processor; wherein the resource information includes: the number of idle processing threads; and/or the state of the processing thread; the state is either an occupied state or an idle state.
  13. The method according to claim 8, wherein the selecting at least one target graphics processor corresponding to the task to be processed from the plurality of graphics processors according to the state information of the task to be processed and/or the resource information of the graphics processors comprises:
    and determining task processing time according to the state information, and, for a plurality of tasks to be processed whose task processing time is greater than a time threshold, selecting different target graphics processors for the different tasks to be processed.
  14. The method according to claim 8, wherein the selecting at least one target graphics processor corresponding to the task to be processed from the plurality of graphics processors according to the state information of the task to be processed and/or the resource information of the graphics processors comprises:
    and determining the number of idle processing threads according to the resource information, and selecting the graphics processor with the most idle processing threads from the multiple graphics processors as the target graphics processor.
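The selection rule of claim 14, picking the graphics processor that reports the most idle processing threads, reduces to a maximum over the resource information; the dictionary shape is an assumption for illustration:

```python
def pick_gpu(gpus):
    """Select the graphics processor with the most idle processing threads."""
    return max(gpus, key=lambda g: g["idle_threads"])

gpus = [{"id": 0, "idle_threads": 2}, {"id": 1, "idle_threads": 5}]
target = pick_gpu(gpus)
```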
  15. The method according to claim 8, wherein the selecting at least one target graphics processor corresponding to the task to be processed from the plurality of graphics processors according to the state information of the task to be processed and/or the resource information of the graphics processors comprises:
    and determining an idle processing thread according to the resource information, and determining a graphics processor corresponding to the idle processing thread as a target graphics processor corresponding to the task to be processed.
  16. The method of claim 1, wherein selecting at least one target graphics processor from the plurality of graphics processors that corresponds to the pending task comprises:
    and for a plurality of tasks to be processed in parallel, selecting a corresponding target graphics processor for each of the plurality of tasks to be processed, wherein the target graphics processors corresponding to different tasks to be processed are the same or different.
  17. The method of claim 1, wherein selecting at least one target graphics processor from the plurality of graphics processors that corresponds to the pending task comprises:
    if the processing queue is a static processing queue, inquiring a graphics processor corresponding to the static processing queue from the plurality of graphics processors, and determining the inquired graphics processor as a target graphics processor corresponding to the task to be processed.
  18. The method of claim 1, wherein while processing the task to be processed by the target graphics processor, the method further comprises:
    if a task to be processed with a higher priority than the current task exists, interrupting the current task, and processing the higher-priority task through the target graphics processor;
    and after the higher-priority task is processed, resuming the interrupted task.
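The interrupt-and-resume behaviour of claim 18 can be modelled with the small function below; the completion-order formulation and the convention that a lower number means a higher priority are assumptions for illustration:

```python
def completion_order(running, arriving):
    """Return the task completion order when `arriving` appears while `running`
    executes. Each task is a (priority, name) pair; lower number = higher
    priority. A higher-priority arrival preempts the running task, which
    resumes (and finishes) after the preemptor completes."""
    run_prio, run_name = running
    arr_prio, arr_name = arriving
    if arr_prio < run_prio:
        return [arr_name, run_name]  # interrupt, finish preemptor, then resume
    return [run_name, arr_name]      # no preemption: finish the current task

completion_order((2, "mapping"), (1, "image"))
```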
  19. The method of claim 1, wherein while processing the task to be processed by the target graphics processor, the method further comprises:
    and if an exception occurs in the task to be processed, interrupting the task to be processed, raising its priority, and caching the task to be processed back into the processing queue.
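Claim 19's exception path, interrupting the task, raising its priority, and re-queuing it, might look like this sketch, where priority is a plain integer with larger meaning more urgent (an assumption):

```python
from collections import deque

processing_queue = deque()

def on_exception(task):
    """Interrupt an abnormal task, raise its priority, and cache it back
    into the processing queue so it is rescheduled sooner."""
    task["priority"] += 1
    processing_queue.append(task)

task = {"name": "stereo-matching", "priority": 1}
on_exception(task)
```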
  20. The method of claim 1, wherein the processing the task to be processed by the target graphics processor comprises:
    processing, by the target graphics processor, a plurality of tasks to be processed assigned to the target graphics processor in parallel; the parallel processing comprises synchronous serial processing and kernel asynchronous processing.
  21. The method of claim 1, wherein the processing the task to be processed by the target graphics processor comprises:
    and when the task to be processed is processed, page-locking the memory address of the central processing unit, and transferring interactive data between the target graphics processor and the central processing unit through a DMA controller.
  22. The method of claim 1, wherein the processing the task to be processed by the target graphics processor comprises:
    and when the task to be processed is processed, carrying out data sharing through different processing threads.
  23. The method of claim 22, wherein the sharing of data by different processing threads comprises:
    when the image processing thread processes the task to be processed, providing the output data to a self-calibration processing thread, and processing the task to be processed by the self-calibration processing thread according to the output data; or,
    providing the output data to a feature point tracking processing thread, and processing the task to be processed by the feature point tracking processing thread according to the output data; or,
    providing the output data to a semi-global stereo block matching processing thread, and processing the task to be processed by the semi-global stereo block matching processing thread according to the output data; or,
    providing the output data to a deep learning processing thread, and processing the task to be processed by the deep learning processing thread according to the output data.
  24. The method of claim 22, wherein the sharing of data by different processing threads comprises: when the semi-global stereo block matching processing thread processes the task to be processed, output data are provided for the point cloud tracking processing thread, and the point cloud tracking processing thread processes the task to be processed according to the output data.
  25. The method of claim 22, wherein the sharing of data by different processing threads comprises:
    when the point cloud tracking processing thread processes the task to be processed, providing output data to the map processing thread; when the deep learning processing thread processes the task to be processed, providing output data to the map processing thread; and the map processing thread processes the task to be processed according to the input data provided by the point cloud tracking processing thread and the input data provided by the deep learning processing thread.
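Claims 23 to 25 describe a data-sharing pipeline: the image thread feeds self-calibration, feature point tracking, semi-global stereo block matching, and deep learning; stereo matching feeds point cloud tracking; and the map thread fuses the point cloud and deep learning outputs. A toy sketch of that dataflow, with placeholder stage functions rather than the patent's actual processing kernels:

```python
def image_stage(frame):        return {"rectified": frame}
def sgbm_stage(data):          return {"disparity": data["rectified"]}
def point_cloud_stage(data):   return {"points": data["disparity"]}
def deep_learning_stage(data): return {"labels": data["rectified"]}

def map_stage(points, labels):
    """Map thread: fuses point cloud tracking output with deep learning output."""
    return {"map": (points["points"], labels["labels"])}

img = image_stage("frame-0")             # image thread output, shared downstream
pc = point_cloud_stage(sgbm_stage(img))  # stereo matching -> point cloud tracking
dl = deep_learning_stage(img)            # deep learning consumes the same output
fused = map_stage(pc, dl)                # map thread fuses both inputs
```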
  26. A task processing apparatus comprising a scheduler and a plurality of graphics processors;
    the scheduler is used for selecting a task to be processed from a processing queue, selecting at least one target graphics processor corresponding to the task to be processed from the multiple graphics processors, and distributing the task to be processed to the target graphics processor;
    and the target graphics processor is used for processing the task to be processed.
  27. The apparatus according to claim 26, wherein the scheduler, when selecting the task to be processed from the processing queue, is specifically configured to: and if the processing queue is a static processing queue, selecting the tasks to be processed with time sequences and/or data having dependency relationship from the static processing queue.
  28. The apparatus according to claim 26, wherein the scheduler, when selecting the task to be processed from the processing queue, is specifically configured to: and if the processing queue is a dynamic processing queue, selecting a time sequence and/or data independent task to be processed from the dynamic processing queue.
  29. The apparatus of claim 27 or 28,
    the scheduler is also used for receiving the tasks to be processed; if the task to be processed is a static scheduling task with a time sequence and/or data having a dependency relationship, caching the task to be processed into a static processing queue; and if the task to be processed is a dynamic scheduling task independent of time sequence and/or data, caching the task to be processed into a dynamic processing queue.
  30. The apparatus according to claim 27 or 28, wherein the scheduler is further configured to receive a pending task, buffer the pending task to a task queue; for a task to be processed in a task queue, if the task to be processed is a static scheduling task with a time sequence and/or data having a dependency relationship, caching the task to be processed into a static processing queue; and if the task to be processed is a dynamic scheduling task independent of time sequence and/or data, caching the task to be processed into a dynamic processing queue.
  31. The apparatus according to claim 26, wherein the scheduler, when selecting the task to be processed from the processing queue, is specifically configured to: acquiring the priority of the tasks to be processed in the processing queue; and selecting the high-priority to-be-processed task from the processing queue based on the priority.
  32. The apparatus according to claim 31, wherein the scheduler, when obtaining the priority of the to-be-processed task in the processing queue, is specifically configured to: acquiring the task type of the task to be processed in the processing queue; inquiring a mapping table according to the task type to obtain the priority corresponding to the task type; the mapping table is used for recording the correspondence between task types and priorities.
  33. The apparatus according to claim 26, wherein the scheduler, when selecting at least one target graphics processor corresponding to the task to be processed from the plurality of graphics processors, is specifically configured to: and selecting at least one target graphics processor corresponding to the task to be processed from the plurality of graphics processors according to the state information of the task to be processed and/or the resource information of the graphics processors.
  34. The apparatus of claim 33, wherein the scheduler is further configured to obtain a task type of the pending task; querying a state information table through the task type to obtain state information corresponding to the task type; the state information table is used for recording the corresponding relation between the task type and the state information; and determining the obtained state information as the state information of the task to be processed.
  35. The apparatus of claim 34, further comprising:
    the monitor is used for acquiring the state information of the to-be-processed task which is processed, recording the corresponding relation between the task type of the to-be-processed task and the state information of the to-be-processed task in a state information table, and sending the state information table to the scheduler.
  36. The apparatus of claim 33, wherein the status information comprises a task processing time; the task processing time is a time difference between the task completion time and the task reception time.
  37. The apparatus of claim 33, further comprising a monitor for obtaining resource information of a graphics processor, and sending the resource information of the graphics processor to the scheduler;
    the scheduler is also used for acquiring resource information of the graphics processor;
    wherein the resource information includes: the number of idle processing threads; and/or, the state of the processing thread; the state is either an occupied state or an idle state.
  38. The apparatus according to claim 33, wherein the scheduler is configured to, when selecting at least one target graphics processor corresponding to the task to be processed from the plurality of graphics processors according to the state information of the task to be processed and/or the resource information of the graphics processors:
    and determining task processing time according to the state information, and, for a plurality of tasks to be processed whose task processing time is greater than a time threshold, selecting different target graphics processors for the different tasks to be processed.
  39. The apparatus according to claim 33, wherein the scheduler is configured to, when selecting at least one target graphics processor corresponding to the task to be processed from the plurality of graphics processors according to the state information of the task to be processed and/or the resource information of the graphics processors:
    and determining the number of idle processing threads according to the resource information, and selecting the graphics processor with the most idle processing threads from the multiple graphics processors as the target graphics processor.
  40. The apparatus according to claim 33, wherein the scheduler is configured to, when selecting at least one target graphics processor corresponding to the task to be processed from the plurality of graphics processors according to the state information of the task to be processed and/or the resource information of the graphics processors:
    and determining an idle processing thread according to the resource information, and determining a graphics processor corresponding to the idle processing thread as a target graphics processor corresponding to the task to be processed.
  41. The apparatus according to claim 26, wherein the scheduler, when selecting at least one target graphics processor corresponding to the task to be processed from the plurality of graphics processors, is specifically configured to: and for a plurality of tasks to be processed in parallel, selecting a corresponding target graphics processor for each of the plurality of tasks to be processed, wherein the target graphics processors corresponding to different tasks to be processed are the same or different.
  42. The apparatus of claim 26,
    the scheduler is specifically configured to, when selecting at least one target graphics processor corresponding to the task to be processed from the plurality of graphics processors: if the processing queue is a static processing queue, inquiring a graphics processor corresponding to the static processing queue from the plurality of graphics processors, and determining the inquired graphics processor as a target graphics processor corresponding to the task to be processed.
  43. The apparatus of claim 26, wherein the target graphics processor, when processing the task to be processed, is specifically configured to: if a task to be processed with a higher priority than the current task exists, interrupting the current task and processing the higher-priority task; and after the higher-priority task is processed, resuming the interrupted task.
  44. The apparatus of claim 26,
    the target graphics processor, when processing the task to be processed, is specifically configured to: and if an exception occurs in the task to be processed, interrupting the task to be processed, raising its priority, and caching the task to be processed back into the processing queue.
  45. The apparatus of claim 26, wherein the target graphics processor, when processing the task to be processed, is specifically configured to: processing a plurality of tasks to be processed distributed to the target graphics processor in parallel; the parallel processing comprises synchronous serial processing and kernel asynchronous processing.
  46. The apparatus of claim 26,
    the target graphics processor, when processing the task to be processed, is specifically configured to: and when the task to be processed is processed, page-locking the memory address of the central processing unit, and transferring interactive data between the target graphics processor and the central processing unit through the DMA controller.
  47. The apparatus of claim 26,
    the target graphics processor, when processing the task to be processed, is specifically configured to: and when the task to be processed is processed, carrying out data sharing through different processing threads.
  48. The apparatus of claim 47,
    the target graphics processor is specifically configured to, when performing data sharing through different processing threads:
    when the image processing thread processes the task to be processed, providing the output data to a self-calibration processing thread, and processing the task to be processed by the self-calibration processing thread according to the output data; or,
    providing the output data to a feature point tracking processing thread, and processing the task to be processed by the feature point tracking processing thread according to the output data; or,
    providing the output data to a semi-global stereo block matching processing thread, and processing the task to be processed by the semi-global stereo block matching processing thread according to the output data; or,
    providing the output data to a deep learning processing thread, and processing the task to be processed by the deep learning processing thread according to the output data.
  49. The apparatus of claim 47, wherein the target graphics processor, when sharing data via different processing threads, is further configured to: and when the semi-global stereo block matching processing thread processes the task to be processed, the output data is provided to the point cloud tracking processing thread, and the point cloud tracking processing thread processes the task to be processed according to the output data.
  50. The apparatus of claim 47,
    the target graphics processor is specifically configured to, when performing data sharing through different processing threads:
    when the point cloud tracking processing thread processes the task to be processed, providing output data to the map processing thread; when the deep learning processing thread processes the task to be processed, providing output data to the map processing thread; and the map processing thread processes the task to be processed according to the input data provided by the point cloud tracking processing thread and the input data provided by the deep learning processing thread.
  51. A computer-readable storage medium having stored thereon computer instructions which, when executed, implement the task processing method of any one of claims 1 to 25.
CN201880012037.2A 2018-03-28 2018-03-28 Task processing method, equipment and machine readable storage medium Pending CN110494848A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/080970 WO2019183861A1 (en) 2018-03-28 2018-03-28 Method, device, and machine readable storage medium for task processing

Publications (1)

Publication Number Publication Date
CN110494848A true CN110494848A (en) 2019-11-22

Family

ID=68062482

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201880012037.2A Pending CN110494848A (en) 2018-03-28 2018-03-28 Task processing method, equipment and machine readable storage medium

Country Status (2)

Country Link
CN (1) CN110494848A (en)
WO (1) WO2019183861A1 (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111190712A (en) * 2019-12-25 2020-05-22 北京推想科技有限公司 Task scheduling method, device, equipment and medium
CN111209112A (en) * 2019-12-31 2020-05-29 杭州迪普科技股份有限公司 Exception handling method and device
CN111625358A (en) * 2020-05-25 2020-09-04 浙江大华技术股份有限公司 Resource allocation method and device, electronic equipment and storage medium
CN111694648A (en) * 2020-06-09 2020-09-22 北京百度网讯科技有限公司 Task scheduling method and device and electronic equipment
CN111708639A (en) * 2020-06-22 2020-09-25 中国科学技术大学 Task scheduling system and method, storage medium and electronic device
CN111858078A (en) * 2020-07-15 2020-10-30 江门市俐通环保科技有限公司 High-concurrency data erasing method, device, equipment and storage medium
CN112214020A (en) * 2020-09-23 2021-01-12 北京特种机械研究所 Method and device for establishing task framework and processing tasks of AGV (automatic guided vehicle) scheduling system
CN113010301A (en) * 2019-12-20 2021-06-22 辉达公司 User-defined measured priority queue
CN115955550A (en) * 2023-03-15 2023-04-11 浙江宇视科技有限公司 Image analysis method and system of GPU (graphics processing Unit) cluster
CN116954954A (en) * 2023-09-20 2023-10-27 摩尔线程智能科技(北京)有限责任公司 Method and device for processing multi-task queues, storage medium and electronic equipment

Families Citing this family (2)

Publication number Priority date Publication date Assignee Title
CN112015553A (en) * 2020-08-27 2020-12-01 深圳壹账通智能科技有限公司 Data processing method, device, equipment and medium based on machine learning model
CN113342493B (en) * 2021-06-15 2022-09-20 上海哔哩哔哩科技有限公司 Task execution method and device and computer equipment

Citations (3)

Publication number Priority date Publication date Assignee Title
CN104156264A (en) * 2014-08-01 2014-11-19 西北工业大学 Baseband signal processing task parallelism real-time scheduling method based on multiple GPUs
CN107122243A (en) * 2017-04-12 2017-09-01 杭州远算云计算有限公司 Heterogeneous Cluster Environment and CFD computational methods for CFD simulation calculations
US20170300361A1 (en) * 2016-04-15 2017-10-19 Intel Corporation Employing out of order queues for better gpu utilization

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US20110161675A1 (en) * 2009-12-30 2011-06-30 Nvidia Corporation System and method for gpu based encrypted storage access
CN107273331A (en) * 2017-06-30 2017-10-20 山东超越数控电子有限公司 A kind of heterogeneous computing system and method based on CPU+GPU+FPGA frameworks

Patent Citations (3)

Publication number Priority date Publication date Assignee Title
CN104156264A (en) * 2014-08-01 2014-11-19 西北工业大学 Baseband signal processing task parallelism real-time scheduling method based on multiple GPUs
US20170300361A1 (en) * 2016-04-15 2017-10-19 Intel Corporation Employing out of order queues for better gpu utilization
CN107122243A (en) * 2017-04-12 2017-09-01 杭州远算云计算有限公司 Heterogeneous Cluster Environment and CFD computational methods for CFD simulation calculations

Cited By (13)

Publication number Priority date Publication date Assignee Title
CN113010301A (en) * 2019-12-20 2021-06-22 辉达公司 User-defined measured priority queue
US11954518B2 (en) 2019-12-20 2024-04-09 Nvidia Corporation User-defined metered priority queues
CN111190712A (en) * 2019-12-25 2020-05-22 北京推想科技有限公司 Task scheduling method, device, equipment and medium
CN111209112A (en) * 2019-12-31 2020-05-29 杭州迪普科技股份有限公司 Exception handling method and device
CN111625358A (en) * 2020-05-25 2020-09-04 浙江大华技术股份有限公司 Resource allocation method and device, electronic equipment and storage medium
CN111625358B (en) * 2020-05-25 2023-06-20 浙江大华技术股份有限公司 Resource allocation method and device, electronic equipment and storage medium
CN111694648A (en) * 2020-06-09 2020-09-22 北京百度网讯科技有限公司 Task scheduling method and device and electronic equipment
CN111694648B (en) * 2020-06-09 2023-08-15 阿波罗智能技术(北京)有限公司 Task scheduling method and device and electronic equipment
CN111708639A (en) * 2020-06-22 2020-09-25 中国科学技术大学 Task scheduling system and method, storage medium and electronic device
CN111858078A (en) * 2020-07-15 2020-10-30 江门市俐通环保科技有限公司 High-concurrency data erasing method, device, equipment and storage medium
CN112214020A (en) * 2020-09-23 2021-01-12 北京特种机械研究所 Method and device for establishing task framework and processing tasks of AGV (automatic guided vehicle) scheduling system
CN115955550A (en) * 2023-03-15 2023-04-11 浙江宇视科技有限公司 Image analysis method and system of GPU (graphics processing Unit) cluster
CN116954954A (en) * 2023-09-20 2023-10-27 摩尔线程智能科技(北京)有限责任公司 Method and device for processing multi-task queues, storage medium and electronic equipment

Also Published As

Publication number Publication date
WO2019183861A1 (en) 2019-10-03

Similar Documents

Publication Publication Date Title
CN110494848A (en) Task processing method, equipment and machine readable storage medium
US11611715B2 (en) System and method for event camera data processing
US10909394B2 (en) Real-time multiple vehicle detection and tracking
US10338879B2 (en) Synchronization object determining method, apparatus, and system
JP2013500536A5 (en)
KR20160106338A (en) Apparatus and Method of tile based rendering for binocular disparity image
CN111027438A (en) Human body posture migration method, mobile terminal and computer storage medium
CN111274019A (en) Data processing method and device and computer readable storage medium
US9946645B2 (en) Information processing apparatus and memory control method
JP7273975B2 (en) Data processing method, device, equipment and storage medium
US9983911B2 (en) Analysis controller, analysis control method and computer-readable medium
US9323995B2 (en) Image processor with evaluation layer implementing software and hardware algorithms of different precision
US20180144521A1 (en) Geometric Work Scheduling of Irregularly Shaped Work Items
US11856284B2 (en) Method of controlling a portable device and a portable device
US11281935B2 (en) 3D object detection from calibrated 2D images
CN114387324A (en) Depth imaging method, depth imaging device, electronic equipment and computer readable storage medium
CN115525356A (en) Keep-alive method of video application and electronic equipment
WO2015149587A1 (en) Graph calculation preprocessing device, method and system
CN114446077B (en) Device and method for parking space detection, storage medium and vehicle
CN114721834B (en) Resource allocation processing method, device, equipment, vehicle and medium
CN111314189B (en) Service message sending method and device
CN109101188B (en) Data processing method and device
CN115619631A (en) Graph data processing method and device, electronic equipment and storage medium
CN116974743A (en) Data reading method, device, equipment and storage medium
CN114416335A (en) Data processing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20191122