WO2019228285A1 - Task scheduling method and device - Google Patents

Task scheduling method and device

Info

Publication number
WO2019228285A1
WO2019228285A1 PCT/CN2019/088450 CN2019088450W WO2019228285A1 WO 2019228285 A1 WO2019228285 A1 WO 2019228285A1 CN 2019088450 W CN2019088450 W CN 2019088450W WO 2019228285 A1 WO2019228285 A1 WO 2019228285A1
Authority
WO
WIPO (PCT)
Prior art keywords
delay
task
sensitive
power consumption
tasks
Prior art date
Application number
PCT/CN2019/088450
Other languages
French (fr)
Chinese (zh)
Inventor
余国生
代成成
董晓文
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2019228285A1

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B19/00 Programme-control systems
    • G05B19/02 Programme-control systems electric
    • G05B19/418 Total factory control, i.e. centrally controlling a plurality of machines, e.g. direct or distributed numerical control [DNC], flexible manufacturing systems [FMS], integrated manufacturing systems [IMS], computer integrated manufacturing [CIM]
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B19/00 Programme-control systems
    • G05B19/02 Programme-control systems electric
    • G05B19/418 Total factory control, i.e. centrally controlling a plurality of machines, e.g. direct or distributed numerical control [DNC], flexible manufacturing systems [FMS], integrated manufacturing systems [IMS], computer integrated manufacturing [CIM]
    • G05B19/41865 Total factory control, i.e. centrally controlling a plurality of machines, e.g. direct or distributed numerical control [DNC], flexible manufacturing systems [FMS], integrated manufacturing systems [IMS], computer integrated manufacturing [CIM] characterised by job scheduling, process planning, material flow
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B2219/00 Program-control systems
    • G05B2219/30 Nc systems
    • G05B2219/32 Operator till task planning
    • G05B2219/32252 Scheduling production, machining, job shop
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02 Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Definitions

  • The embodiments of the present application relate to the field of artificial intelligence, and in particular to a task scheduling method and device.
  • Automated driving systems use technologies such as computers, modern sensing, information fusion, communication, artificial intelligence, and automatic control to achieve functions such as environmental perception, behavioral decision-making, path planning, and motion control.
  • The processing flow of an automated driving system is as follows: the vehicle-mounted device gathers the image data captured by the cameras distributed around the vehicle body and passes the data to its computing device. Deep-learning processing of the image data identifies the environment around the vehicle body (for example, the number of pedestrians and the distance to them). Finally, decisions such as controlling the vehicle speed or changing the path are made based on the identified environment.
  • Task delay requirements are strict: the computing device of the vehicle-mounted device must complete a task, such as identifying the environment around the vehicle body, within a strict time limit.
  • The computation load of the computing unit then increases greatly, which greatly increases the energy consumption of the vehicle-mounted device.
  • In-vehicle devices are usually sensitive to energy consumption, and energy consumption should be minimized while driving.
  • If computing energy consumption is kept low while driving, it is difficult to meet the strict delay requirements of driving. The existing technology therefore cannot meet the energy consumption and delay requirements of a vehicle environment at the same time.
  • The embodiments of the present application provide a task scheduling method and device that consider delay requirements and power consumption requirements together in a driving state and improve the working efficiency of a vehicle-mounted device.
  • A task scheduling method is disclosed.
  • The vehicle-mounted device first determines a motion state of a mobile device, determines M tasks to be executed according to that motion state, and determines N delay-sensitive tasks among the M tasks.
  • M is an integer greater than or equal to 1, and N is an integer less than or equal to M.
  • For each of the N delay-sensitive tasks, the power consumption requirement of the task is calculated. A computing unit whose target power consumption margin range matches that requirement, and which can execute the task within the task's delay requirement duration, is determined as the target computing unit for the task. The delay-sensitive task is then assigned to its target computing unit.
  • The target power consumption margin range is determined from the target power consumption range of a computing unit of the mobile device and the current power consumption of that computing unit. The power consumption requirement of a delay-sensitive task is the power consumption required to execute the task.
  • The delay requirement duration of a delay-sensitive task is the time required to execute the task.
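  • The relationship between the target power consumption range, the current power consumption and the margin range can be sketched as follows. This is a minimal illustration only: the text says the margin is derived from the two quantities but does not give the rule, so the subtraction and clamping used here, the function names `power_margin_range` and `matches`, and the use of watts are all assumptions.

```python
from typing import Tuple


def power_margin_range(target_range: Tuple[float, float],
                       current_power: float) -> Tuple[float, float]:
    """Return the (low, high) extra power a computing unit could take on.

    ASSUMPTION: the margin is the target range shifted down by the current
    power draw and clamped at zero. The source does not state the exact rule.
    """
    low, high = target_range
    if low > high:
        raise ValueError("target_range must be (low, high) with low <= high")
    return (max(low - current_power, 0.0), max(high - current_power, 0.0))


def matches(power_requirement: float, margin: Tuple[float, float]) -> bool:
    """True if a task's power requirement falls inside the margin range."""
    low, high = margin
    return low <= power_requirement <= high


# Hypothetical unit: optimal between 40 W and 60 W, currently drawing 35 W.
margin = power_margin_range((40.0, 60.0), 35.0)
```

  • Under these assumptions, a task that needs 10 W fits the example margin, while one that needs 30 W would push the unit above its target range.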
  • The in-vehicle device can thus reasonably schedule the tasks that need to be executed while the vehicle is travelling, which both meets the delay requirements of the tasks and lets the computing units work in the target power consumption state. Because driving state, delay and power consumption are considered together, the working efficiency of the vehicle platform is improved.
  • The vehicle-mounted device determines the N delay-sensitive tasks among the M tasks as follows: it determines the delay sensitivity of each of the M tasks according to the motion state; it determines the priority level of each of the M tasks according to that task's delay sensitivity, where the priority level is proportional to the delay sensitivity; finally, it determines the N tasks with the highest priority levels among the M tasks as the N delay-sensitive tasks.
  • A task with high delay sensitivity may thus be determined as a delay-sensitive task. In subsequent processing, the vehicle-mounted device assigns computing units to the delay-sensitive tasks first, so that they are executed preferentially.
  • Calculating the power consumption requirement of a delay-sensitive task, and determining as its target computing unit a computing unit whose target power consumption margin range matches that requirement and which can execute the task within the task's delay requirement duration, specifically includes:
  • calculating the power consumption requirement of the delay-sensitive task; for each computing unit of the mobile device, determining the duration the computing unit needs to execute the delay-sensitive task and the target power consumption margin range of the computing unit; determining whether the power consumption requirement of the task falls within the target power consumption margin range of the computing unit and whether the duration the computing unit needs to execute the task does not exceed the task's delay requirement duration; and, if both conditions hold, determining the computing unit as the target computing unit of the delay-sensitive task.
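  • The two checks described above (power requirement inside the margin range, execution duration within the delay requirement) can be sketched as a simple search over the computing units. The class names, field names, the linear duration estimate and the example figures are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ComputingUnit:
    name: str
    margin: Tuple[float, float]   # target power consumption margin range (hypothetical watts)
    seconds_per_gflop: float      # hypothetical speed figure used to estimate duration


@dataclass
class Task:
    name: str
    power_requirement: float      # power needed to execute the task (hypothetical watts)
    delay_requirement: float      # delay requirement duration in seconds
    work_gflop: float             # hypothetical amount of computation


def estimated_duration(unit: ComputingUnit, task: Task) -> float:
    # ASSUMPTION: duration scales linearly with the amount of work.
    return task.work_gflop * unit.seconds_per_gflop


def select_target_unit(task: Task, units: List[ComputingUnit]) -> Optional[ComputingUnit]:
    """Return the first unit that passes both checks, or None if no unit does."""
    for unit in units:
        low, high = unit.margin
        power_ok = low <= task.power_requirement <= high
        time_ok = estimated_duration(unit, task) <= task.delay_requirement
        if power_ok and time_ok:
            return unit
    return None


units = [
    ComputingUnit("gpu0", (0.0, 8.0), 0.001),
    ComputingUnit("npu0", (5.0, 25.0), 0.002),
]
task = Task("detect_pedestrians", power_requirement=10.0,
            delay_requirement=0.033, work_gflop=10.0)
target = select_target_unit(task, units)
```

  • Returning the first match is a design choice made for brevity; the text does not say how to choose when several computing units qualify.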
  • The target power consumption range is the range within which the performance of the computing unit is optimal. If the power consumption requirement of a task falls within the target power consumption margin range, the computing unit can execute the task in an optimal state, which improves the computing power of the computing unit and further improves the working efficiency of the vehicle platform.
  • The method further includes: if the power consumption requirement of the delay-sensitive task matches none of the target power consumption margin ranges of the computing units of the mobile device, splitting the input data of the delay-sensitive task into at least two sub-data blocks; for each of the sub-data blocks, determining the power consumption required to process the sub-data block and, if the target power consumption margin range of a computing unit of the mobile device matches that power consumption, determining the computing unit as the target computing unit corresponding to the sub-data block; and determining the target computing units corresponding to all sub-data blocks as the target computing units of the delay-sensitive task.
  • Delay-sensitive tasks can thus be split and executed by multiple computing units. This ensures that a delay-sensitive task is executed quickly and that its delay requirement is met, so that subsequent decision-making of the vehicle-mounted device is not affected.
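  • The splitting fallback can be sketched as follows. The sketch assumes that the power needed for a sub-data block grows linearly with its size, that blocks are split evenly, and that each computing unit takes at most one block; none of these details is specified in the text, and the names `split_evenly` and `assign_blocks` are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def split_evenly(data: Sequence, parts: int) -> List[Sequence]:
    """Split data into `parts` contiguous blocks whose sizes differ by at most one."""
    size, extra = divmod(len(data), parts)
    blocks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        blocks.append(data[start:end])
        start = end
    return blocks


def assign_blocks(data: Sequence,
                  power_per_item: float,
                  margins: Dict[str, Tuple[float, float]],
                  parts: int = 2) -> Optional[List[Tuple[str, Sequence]]]:
    """Match each sub-data block to a unit whose margin range covers its power need.

    ASSUMPTIONS: power need = power_per_item * block length, and a unit
    accepts only one block. Returns None if some block finds no unit.
    """
    free = dict(margins)
    plan = []
    for block in split_evenly(data, parts):
        need = power_per_item * len(block)
        unit = next((name for name, (low, high) in free.items()
                     if low <= need <= high), None)
        if unit is None:
            return None
        plan.append((unit, block))
        del free[unit]
    return plan


# The whole task would need 16 W, which fits neither hypothetical unit.
frames = list(range(8))
margins = {"gpu0": (0.0, 10.0), "npu0": (0.0, 9.0)}
plan = assign_blocks(frames, power_per_item=2.0, margins=margins, parts=2)
```

  • In this example each half of the input needs 8 W, so one half goes to each computing unit.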
  • Assigning the delay-sensitive tasks to their target computing units specifically includes: calculating the execution time of the N delay-sensitive tasks, and determining the delay requirement duration of each delay-sensitive task according to its execution time; and generating a task list that records the target computing unit corresponding to each of the N delay-sensitive tasks and the delay requirement duration corresponding to each target computing unit, so that each computing unit of the mobile device determines from the task list the delay-sensitive task it is to execute and completes that task within the task's delay requirement duration.
  • A task may thus be assigned to its computing unit by means of a task list, and the computing unit performs the corresponding task according to the task list.
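  • One possible shape for such a task list is sketched below. The entry fields, the helper names and the choice to hand a computing unit its entries with the tightest delay requirement first are illustrative assumptions; the text only states what the list records.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class TaskListEntry:
    task: str                   # delay-sensitive task name
    target_unit: str            # computing unit the task is assigned to
    delay_requirement_s: float  # delay requirement duration in seconds


def build_task_list(assignments: Iterable[Tuple[str, str, float]]) -> List[TaskListEntry]:
    """Turn (task, unit, delay requirement) triples into task-list entries."""
    return [TaskListEntry(task, unit, delay) for task, unit, delay in assignments]


def tasks_for_unit(task_list: List[TaskListEntry], unit: str) -> List[TaskListEntry]:
    """Entries a given unit must execute, tightest delay requirement first."""
    mine = (entry for entry in task_list if entry.target_unit == unit)
    return sorted(mine, key=lambda entry: entry.delay_requirement_s)


task_list = build_task_list([
    ("detect_lane_lines", "gpu0", 0.050),
    ("detect_pedestrians", "npu0", 0.033),
    ("detect_vehicles", "gpu0", 0.033),
])
gpu0_work = tasks_for_unit(task_list, "gpu0")
```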
  • The method provided by the embodiment of the present invention further includes: after each of the N delay-sensitive tasks has been assigned to its target computing unit, for each of the M tasks other than the N delay-sensitive tasks, calculating the power consumption requirement of the task; determining as the target computing unit for the task a computing unit whose target power consumption margin range matches that requirement and which can execute the task within the task's delay requirement duration; and assigning the task to that target computing unit.
  • The power consumption requirement of a task is the power consumption required to execute the task, and the delay requirement of a task is the time required to execute the task.
  • A computing unit may thus also be assigned to each non-delay-sensitive task, which ensures that the non-delay-sensitive tasks are executed and avoids affecting the subsequent decision-making of the vehicle-mounted device.
  • The computing unit can execute non-delay-sensitive tasks in its optimal state, which improves the computing power of the computing unit and further improves the working efficiency of the vehicle platform.
  • The duration T for which a computing unit executes the delay-sensitive task satisfies:
  • Nc and Nm are the numbers of basic computing units inside the computing unit; f_1 is the core frequency of the computing unit; f_2 is the memory frequency of the computing unit; C_i is the number of double-precision floating-point operations required to execute the delay-sensitive task; and m_j is the memory access frequency required to complete the delay-sensitive task, where the memory access frequency is the number of times the memory is accessed per second.
  • The time each computing unit requires to execute a given task can be calculated with the foregoing formula, and it can then be determined whether the computing unit meets the delay requirement of the task.
  • The power consumption each computing unit requires to execute a given task can be calculated with the foregoing formula, and it can then be determined whether the computing unit meets the power consumption requirement of the task.
  • An apparatus is disclosed, including: a determining unit, configured to determine a motion state of a mobile device, determine M tasks to be executed according to the motion state of the mobile device, and determine N delay-sensitive tasks among the M tasks, where M is an integer greater than or equal to 1 and N is an integer less than or equal to M; a calculation unit, configured to calculate, for each of the N delay-sensitive tasks, the power consumption requirement of the delay-sensitive task; the determining unit being further configured to determine, as the target computing unit that executes the delay-sensitive task, a computing unit whose target power consumption margin range matches the power consumption requirement of the task and which can execute the task within the task's delay requirement duration; and an allocation unit, configured to assign the delay-sensitive task to its target computing unit.
  • The target power consumption margin range is determined from the target power consumption range of a computing unit of the mobile device and the current power consumption of that computing unit. The power consumption requirement of a delay-sensitive task is the power consumption required to execute the task.
  • The delay requirement duration of a delay-sensitive task is the time required to execute the task.
  • The device provided in the embodiment of the present invention can reasonably schedule the tasks that need to be executed while the vehicle is travelling, which both meets the delay requirements of the tasks and lets the computing units work in the target power consumption state. Because driving state, delay and power consumption are considered together, the working efficiency of the vehicle platform is improved.
  • The determining unit is specifically configured to: determine the delay sensitivity of each of the M tasks according to the motion state; determine the priority level of each of the M tasks according to that task's delay sensitivity, where the priority level is proportional to the delay sensitivity; and determine the N tasks with the highest priority levels among the M tasks as the N delay-sensitive tasks.
  • The calculation unit is specifically configured to calculate the power consumption requirement of the delay-sensitive task and, for each computing unit of the mobile device, to determine the duration the computing unit needs to execute the delay-sensitive task and the target power consumption margin range of the computing unit. The determining unit is specifically configured to determine whether the power consumption requirement of the delay-sensitive task falls within the target power consumption margin range of the computing unit and whether the duration the computing unit needs to execute the task does not exceed the task's delay requirement duration; if both conditions hold, the computing unit is determined as the target computing unit of the delay-sensitive task.
  • The determining unit is further configured to: if the power consumption requirement of the delay-sensitive task matches none of the target power consumption margin ranges of the computing units of the mobile device, split the input data of the delay-sensitive task into at least two sub-data blocks; for each of the sub-data blocks, determine the power consumption required to process the sub-data block and, if the target power consumption margin range of a computing unit of the mobile device matches that power consumption, determine the computing unit as the target computing unit corresponding to the sub-data block; and determine the target computing units corresponding to all sub-data blocks as the target computing units of the delay-sensitive task.
  • The allocation unit is specifically configured to calculate the execution time of the N delay-sensitive tasks, determine the delay requirement duration of each delay-sensitive task according to its execution time, and generate a task list.
  • The task list records the target computing unit corresponding to each of the N delay-sensitive tasks and the delay requirement duration corresponding to each target computing unit, so that each computing unit of the mobile device determines from the task list the delay-sensitive task it is to execute and completes that task within the task's delay requirement duration.
  • The calculation unit is further configured to, after each of the N delay-sensitive tasks has been assigned to its target computing unit, calculate the power consumption requirement of each of the M tasks other than the N delay-sensitive tasks. The allocation unit is further configured to determine, as the target computing unit that executes the task, a computing unit whose target power consumption margin range matches the power consumption requirement of the task and which can complete the task within the task's delay requirement duration, and to assign the task to that target computing unit. The power consumption requirement of a task is the power consumption required to execute the task.
  • The delay requirement of a task is the time required to execute the task.
  • The duration T for which a computing unit executes the delay-sensitive task satisfies:
  • Nc and Nm are the numbers of basic computing units inside the computing unit; f_1 is the core frequency of the computing unit; f_2 is the memory frequency of the computing unit; C_i is the number of double-precision floating-point operations required to execute the delay-sensitive task; and m_j is the memory access frequency required to complete the delay-sensitive task, where the memory access frequency is the number of times the memory is accessed per second.
  • FIG. 1 is a structural diagram of a vehicle-mounted system provided by an embodiment of the present application.
  • FIG. 2 is another architecture diagram of a vehicle-mounted system provided by an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of preprocessing provided by an embodiment of the present invention.
  • FIG. 4 is a schematic flowchart of a task scheduling method according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of input data of a task according to an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of high and low speeds provided by an embodiment of the present invention.
  • FIG. 7 is a schematic flowchart of a task allocation method according to an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of a task state according to an embodiment of the present invention.
  • FIG. 9 is a schematic diagram of another task state according to an embodiment of the present invention.
  • FIG. 10 is a structural block diagram of a device according to an embodiment of the present invention.
  • FIG. 11 is another structural block diagram of an apparatus according to an embodiment of the present invention.
  • FIG. 12 is another structural block diagram of a device provided by an embodiment of the present invention.
  • The vehicle body is provided with a monitoring device, and a vehicle-mounted device is installed inside the vehicle.
  • The monitoring device may be a sensor, such as a camera or a laser sensor.
  • The monitoring device collects environmental information around the vehicle body in real time. Calculations are performed on the collected information (such as images taken by a camera), objects around the vehicle body are identified from the calculation results, and the results provide input for subsequent decision-making, for example controlling the speed and route of the vehicle.
  • An embodiment of the present invention provides a task scheduling method.
  • An in-vehicle device of a mobile device first determines a motion state of the mobile device and determines M tasks to be executed according to that motion state. It then determines N delay-sensitive tasks among the M tasks. For each of the N delay-sensitive tasks, the following steps are performed: the vehicle-mounted device calculates the power consumption requirement of the delay-sensitive task, and determines as the target computing unit for the task a computing unit whose target power consumption margin range matches that requirement and which can execute the task within the task's delay requirement duration. The delay-sensitive task is then assigned to its target computing unit.
  • The in-vehicle device can thus reasonably schedule the tasks that need to be executed while the vehicle is travelling, which both meets the delay requirements of the tasks and lets the computing units work in the target power consumption state. Because driving state, delay and power consumption are considered together, the working efficiency of the vehicle platform is improved.
  • The vehicle-mounted system includes a vehicle-mounted device 10 and at least one detection device 20.
  • The vehicle-mounted device 10 includes input/output (IO) logic 101, a scheduling module 102, and at least one computing unit 103.
  • The scheduling module 102 may be a central processing unit (CPU), and the computing unit 103 may be an accelerator (ACC), a graphics processing unit (GPU), or a neural network chip.
  • The scheduling module 102 may allocate one or more tasks to be executed to the computing unit 103.
  • The detection device 20 may be a sensing device such as a camera, a laser sensor, or a millimeter-wave radar.
  • The cameras installed on the mobile device include the following types:
  • Long-distance camera: located at the front of the vehicle body and used to identify distant targets, such as traffic lights;
  • Medium-range camera: located at the front of the vehicle body, mainly used to identify lane lines, people, cars, non-motor vehicles, obstacles, and so on, and can also be used for distance measurement;
  • Close-range camera: located around the vehicle body, mainly used to identify people, cars, non-motor vehicles, obstacles, and driveable areas.
  • Other cameras can also be installed on the mobile device, such as a camera at the rear of the vehicle body, which can be used for parking, and a camera near the driver's seat inside the vehicle, which can be used to monitor fatigue driving.
  • The detection device 20 collects information about the environment around the vehicle body. For example, the cameras distributed around the vehicle body capture images with different angles, breadths, detection distances, frame rates, and resolutions.
  • The detection device 20 passes the collected information to the IO logic 101, and the IO logic 101 manages and preprocesses the received information in a unified way.
  • The preprocessing that the IO logic 101 applies to the received information (such as images) may be to classify the images captured by the cameras. As shown in FIG. 3, each camera captures m frames of images, and the images captured by all cameras can be grouped in order of frame number. For example, the first frames (frame0) captured by all cameras form one group, the second frames (frame1) captured by all cameras form another group, and so on.
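  • The frame-number grouping described above can be sketched as follows. The function name, the dictionary layout and the string placeholders standing in for image data are assumptions made for illustration; the handling of cameras that delivered different numbers of frames is also an assumption.

```python
from typing import Dict, List


def group_by_frame_index(camera_frames: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Group frame k of every camera together, for k = 0, 1, 2, ...

    ASSUMPTION: if cameras delivered different numbers of frames, only the
    frame indices present for every camera are grouped.
    """
    if not camera_frames:
        return []
    frame_count = min(len(frames) for frames in camera_frames.values())
    return [
        {camera: frames[k] for camera, frames in camera_frames.items()}
        for k in range(frame_count)
    ]


captured = {
    "camera0": ["c0_frame0", "c0_frame1"],
    "camera1": ["c1_frame0", "c1_frame1"],
    "camera2": ["c2_frame0", "c2_frame1"],
}
groups = group_by_frame_index(captured)
```

  • Here `groups[0]` holds frame0 from every camera and `groups[1]` holds frame1 from every camera.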
  • The IO logic 101 passes the preprocessed images to the scheduling module 102.
  • The scheduling module 102 generates computing tasks according to the information input by the IO logic 101 and sends the computing tasks to the computing unit 103 for processing.
  • The computing unit 103 runs a deep neural network (DNN) algorithm and returns the calculation result to the scheduling module 102.
  • The scheduling module 102 can make subsequent decisions according to the calculation results.
  • An embodiment of the present invention provides a task scheduling method that supports the task scheduling performed by the scheduling module 102. As shown in FIG. 4, the method includes the following steps:
  • The scheduling module 102 determines the current motion state of the mobile device and determines M tasks to be executed according to that motion state.
  • The scheduling module 102 may receive data sent by the detection device 20 and calculate the current motion state of the mobile device, such as its running speed, from that data.
  • The mobile device in the embodiment of the present invention may be a device capable of moving, such as a vehicle or a drone.
  • M is an integer greater than or equal to 1.
  • The scheduling module 102 determines the DNN tasks that need to be executed according to the vehicle sensing system algorithm. These are the M tasks described in the embodiment of the present invention, such as detecting lane lines, detecting pedestrians, detecting motor vehicles, and detecting non-motor vehicles.
  • The input of each task is data (for example, images taken by cameras) obtained by multiple detection devices 20, and different tasks are executed by different computing units.
  • For example, the mobile device has three tasks in the current motion state: Task 0, Task 1, and Task 2.
  • The input data required to execute Task 0 is the images captured by camera 0, camera 1, and camera 3;
  • the input data required to execute Task 1 is the images captured by camera 1 and camera 2;
  • the input data required to execute Task 2 is the images captured by camera 2 and camera 3.
  • The scheduling module 102 determines N delay-sensitive tasks among the M tasks.
  • The delay sensitivity of a task differs across motion states of the mobile device.
  • The in-vehicle device 10 is required to quickly identify vehicles, pedestrians, obstacles, and so on in front of the mobile device. Therefore, tasks such as “identify vehicles in front” and “identify pedestrians in front” have higher priority, while tasks that identify conditions behind the mobile device, such as “identify vehicles in the rear” and “identify pedestrians in the rear”, have relatively low priority.
  • N is an integer less than or equal to M.
  • The scheduling module 102 may determine the delay sensitivity of each of the M tasks according to the motion state of the mobile device. Because the priority level of a task is proportional to its delay sensitivity, the scheduling module 102 may also determine the priority level of each task from its delay sensitivity. In addition, the delay sensitivity of a task can be determined from the task's delay requirement duration: the shorter the delay requirement duration, the higher the delay sensitivity. For example, if the delay requirements of task 1, task 2, and task 3 are 5 s, 10 s, and 8 s respectively, then in descending order of delay sensitivity, and therefore of priority level, the tasks are task 1, task 3, and task 2.
  • The delay requirement duration of a task may be an estimate of the time required to execute the task, which can generally be determined from the amount of input data the task requires.
  • The scheduling module 102 may also arrange the M tasks in descending order of priority level and determine the N tasks with the highest priority levels as the N delay-sensitive tasks.
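  • The ranking in the example above can be reproduced with a short sketch. It uses only the rule stated in the text (a shorter delay requirement duration means higher delay sensitivity and therefore higher priority); the function names are hypothetical, and ties are left in input order, which is an assumption.

```python
from typing import Dict, List


def rank_by_delay_sensitivity(delay_requirements_s: Dict[str, float]) -> List[str]:
    """Order task names from most to least delay-sensitive.

    A shorter delay requirement duration means higher delay sensitivity.
    ASSUMPTION: tasks with equal durations keep their input order.
    """
    return sorted(delay_requirements_s, key=delay_requirements_s.get)


def pick_delay_sensitive(delay_requirements_s: Dict[str, float], n: int) -> List[str]:
    """Return the N highest-priority tasks (N must not exceed M)."""
    if not 0 <= n <= len(delay_requirements_s):
        raise ValueError("N must be between 0 and the number of tasks M")
    return rank_by_delay_sensitivity(delay_requirements_s)[:n]


# Delay requirements from the example: 5 s, 10 s and 8 s.
requirements = {"task 1": 5.0, "task 2": 10.0, "task 3": 8.0}
ranking = rank_by_delay_sensitivity(requirements)
top_two = pick_delay_sensitive(requirements, 2)
```

  • With N = 2, the delay-sensitive tasks in this example are task 1 and task 3.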
  • the motion state of the mobile device can be divided into high speed and low speed, and the delay requirements of tasks under different motion states are different. Specifically, when the motion state of the mobile device is high speed, it is required to be able to process tasks quickly, and the time delay of the task is relatively high; when the motion state of the mobile device is low speed, the time delay requirement of the task is relatively low.
  • high speed and low speed may be defined according to traffic rules in different regions, and of course, high speed and low speed may also be defined according to other factors, and the embodiment of the present invention does not limit this.
  • FIG. 6 is an example of low speed and high speed provided by an embodiment of the present invention. Referring to FIG. 6, when the speed of the mobile device is 80 to 120 km/h, the motion state of the mobile device is high speed; when the speed of the mobile device is below 80 km/h, the motion state of the mobile device is low speed.
  • in the high-speed state of the mobile device, the processing frame rate required of the computing unit 103 is high: the processing frame rate of the computing unit is not less than 30 FPS, that is, the processing delay of each frame does not exceed 33 ms.
  • in the low-speed state of the mobile device, a lower processing frame rate can meet the demand.
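As a rough illustration of the FIG. 6 example, the sketch below classifies a speed into a motion state and converts a frame rate into a per-frame time budget. The function names are hypothetical, and treating every speed of 80 km/h or more as high speed is an assumption (the text only names the 80 to 120 km/h band).

```python
def motion_state(speed_kmh: float) -> str:
    """Classify the motion state; the 80 km/h threshold follows the FIG. 6 example."""
    # Assumption: speeds above 120 km/h are also treated as high speed.
    return "high" if speed_kmh >= 80 else "low"


def per_frame_budget_ms(fps: float) -> float:
    """Time available to process one frame at the given frame rate, in milliseconds."""
    return 1000.0 / fps


budget = per_frame_budget_ms(30)  # about 33.3 ms, matching the 33 ms bound in the text
```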
  • the scheduling module 102 executes the following steps 403 to 404 for each delay-sensitive task of the N delay-sensitive tasks to determine a target processor for each delay-sensitive task.
  • the scheduling module 102 may assign a target computing unit to each of the N delay-sensitive tasks in descending order of priority level, or in descending order of computation amount. If the priority levels of delay-sensitive tasks are the same, the target computing units can be allocated in descending order of computation amount.
  • the delay-sensitive task 1 is taken as an example, and it is described in detail that the target processor is allocated to the delay-sensitive task in the embodiment of the present invention.
  • the scheduling module 102 determines a target power consumption margin range of each computing unit 103 of the vehicle-mounted device and the time required for each computing unit 103 to execute the delay-sensitive task 1.
  • the scheduling module 102 can collect the current state of each computing unit through the following two steps to determine the target power consumption margin range of a computing unit 103, which specifically includes:
  • Step A The scheduling module 102 first determines the current power consumption of the computing unit, that is, the power consumption occupied by the computing task currently being performed by the computing unit.
  • Step B The scheduling module 102 determines the target power consumption margin range of the computing unit according to the current power consumption of the computing unit and the target power consumption range of the computing unit. It should be noted that the target power consumption range refers to the performance of the computing unit when the power consumption of the computing unit is maintained within this range.
  • the scheduling module 102 can read the target power consumption range of the computing unit from the register.
  • the target power consumption range of the computing unit can be obtained from the chip or test.
  • the scheduling module 102 can also obtain the maximum power consumption limit of the computing unit. When performing task allocation, the scheduling module 102 needs to take this limit into account: the power consumption of the computing unit cannot exceed it.
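Steps A and B can be sketched as below. This is only an illustration under stated assumptions: the margin range is assumed to be obtained by subtracting the current power consumption from both bounds of the target power consumption range, which mirrors the (MW0-TW0) to (MW1-TW0) expressions used later in the text. The function names and the numeric values are hypothetical.

```python
def target_margin_range(target_low_w, target_high_w, current_w):
    """Step B (assumed form): shift the target power range by the current power consumption."""
    return (target_low_w - current_w, target_high_w - current_w)


def within_max_limit(current_w, task_w, max_limit_w):
    """Check that adding a task keeps the computing unit at or below its maximum power limit."""
    return current_w + task_w <= max_limit_w


# Hypothetical unit: target range 20 W to 40 W, currently drawing 10 W, hard limit 50 W.
margin = target_margin_range(20, 40, 10)   # (10, 30)
ok = within_max_limit(10, 30, 50)          # True: 40 W stays under the 50 W limit
```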
  • the scheduling module 102 can estimate the time T required for a computing unit to complete a task according to the following formula (1). In the task assignment process, not only must the computing unit work at the target power consumption, but the computing unit must also be guaranteed to complete the task within the delay requirement duration t of the task. Therefore, T must not exceed t, that is, T is less than or equal to t, so as to ensure that the calculation of the task can be completed.
  • Nc and Nm in formula (1) are the numbers of basic computing units inside the computing unit; f1 is the core frequency of the computing unit; f2 is the memory frequency of the computing unit; Ci is the number of double-precision floating-point operations required to complete the task; mj is the memory access frequency required to complete the delay-sensitive task, where the memory access frequency is the number of memory accesses per second.
  • the capability information of each computing unit may be recorded in a capability information table, including the target power consumption margin range of the computing unit and the time required for the computing unit to perform the delay-sensitive task 1.
  • Table 1 is a possible implementation manner of the capability information table. Of course, the embodiment of the present invention does not limit the implementation manner of the capability information table. Referring to Table 1, the capability information table may also include the target power consumption range of the computing unit and the current power consumption.
  • the vehicle-mounted device should include the capability information of all the computing units (the computing unit used to calculate the DNN task).
  • Table 1 takes the vehicle-mounted device including two computing units as an example.
  • Table 1 should include the time required for each computing unit of the vehicle-mounted device to perform the delay-sensitive task.
  • the scheduling module 102 determines a target processor for the delay-sensitive task 1.
  • the target computing unit determined for the delay-sensitive task 1 is the computing unit that performs the calculation corresponding to the delay-sensitive task.
  • the scheduling module 102 may first calculate the power consumption requirement of the delay-sensitive task 1 and estimate the delay required duration for executing the delay-sensitive task 1.
  • the power consumption requirements of delay-sensitive tasks and the length of delay requirements can also be recorded through the requirement information table.
  • Table 2 is a possible implementation of the requirement information table.
  • the scheduling module 102 may calculate a power consumption requirement W of a task (which may be a delay-sensitive task) according to formula (2).
  • n is the number of basic operations required to complete the task
  • w is the power consumption of a basic operation performed by the computing unit. If the task is a neural network task (that is, a DNN task), n is the total number of multiply-accumulate operations of the neural network.
  • the delay required duration of a task refers to the deadline for completing the task at the current vehicle speed, that is, the minimum required time to complete the task.
  • the scheduling module 102 traverses each computing unit 103 of the mobile device to determine whether the power consumption requirement of the delay-sensitive task falls within the target power consumption margin range of the computing unit 103, and whether the computing unit 103 can complete delay-sensitive task 1 within the delay requirement duration of delay-sensitive task 1. For example, after the scheduling module 102 determines the requirement information of delay-sensitive task 1, it looks up Table 1 to determine the target computing unit of delay-sensitive task 1. Specifically, for computing unit G0, it is first determined whether the duration T0 required by G0 to execute delay-sensitive task 1 is less than or equal to the delay requirement duration t1 of delay-sensitive task 1.
  • if T0 is less than or equal to t1, it is further determined whether the power consumption requirement of delay-sensitive task 1 falls within the range of (MW0-TW0) to (MW1-TW0).
  • if it falls within this range, computing unit G0 is determined as the target computing unit of delay-sensitive task 1.
  • if T0 is greater than t1, or the power consumption requirement of delay-sensitive task 1 does not fall within the range of (MW0-TW0) to (MW1-TW0), it is determined whether computing unit G1 can be used as the target computing unit of delay-sensitive task 1.
  • specifically, it is first determined whether the duration T1 required by computing unit G1 to execute delay-sensitive task 1 is less than or equal to t1. If T1 is less than or equal to t1, it is further determined whether the power consumption requirement of delay-sensitive task 1 falls within the range of (MW2-TW1) to (MW3-TW1); if it falls within this range, computing unit G1 is determined as the target computing unit of delay-sensitive task 1.
  • whether the power consumption requirement of a delay-sensitive task falls within the target power consumption margin range of a computing unit 103 may be judged as follows: if the power consumption requirement of the delay-sensitive task is 20 W, the target power consumption margin range of computing unit 1 is 10 W to 30 W, and the target power consumption margin range of computing unit 2 is 5 W to 18 W, then the power consumption requirement of the delay-sensitive task falls within the target power consumption margin range of computing unit 1, so the target computing unit of this delay-sensitive task is computing unit 1.
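The traversal described above can be sketched as a first-match search over the computing units. The `Unit` class and `pick_target_unit` function are hypothetical names; the margin ranges and the 20 W requirement come from the example in the text, while the execution times and the 33 ms delay requirement are assumed values chosen only so that both units meet the delay condition.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Unit:
    name: str
    margin_low_w: float   # lower bound of the target power consumption margin range
    margin_high_w: float  # upper bound of the target power consumption margin range
    exec_time_s: float    # estimated time T for this unit to execute the task


def pick_target_unit(units: List[Unit], task_power_w: float, task_delay_s: float) -> Optional[Unit]:
    """Return the first unit that meets the delay requirement and whose margin range fits the task."""
    for u in units:
        meets_delay = u.exec_time_s <= task_delay_s
        fits_power = u.margin_low_w <= task_power_w <= u.margin_high_w
        if meets_delay and fits_power:
            return u
    return None  # no single unit matches; the text then falls back to splitting the input data


units = [Unit("computing unit 1", 10, 30, 0.02), Unit("computing unit 2", 5, 18, 0.02)]
chosen = pick_target_unit(units, task_power_w=20, task_delay_s=0.033)
```

Checking the delay condition before the power condition follows the order used for G0 and G1 in the text; the result does not depend on that order.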
  • the input data of the task can be split so that multiple computing units jointly perform the calculation of one delay-sensitive task. Specifically, if the scheduling module 102 traverses all the computing units and the power consumption requirement of the delay-sensitive task does not match the target power consumption margin range of any computing unit of the mobile device, the scheduling module 102 may split the input data into at least two sub-data blocks.
  • for each sub-data block, the scheduling module 102 traverses all the computing units to determine whether the power consumption required to calculate the sub-data block matches the target power consumption margin range of a computing unit. If the target power consumption margin range of a computing unit of the mobile device matches this power consumption, that computing unit is determined as the target computing unit corresponding to the sub-data block. Finally, the target computing units corresponding to all the sub-data blocks are determined as the target computing units corresponding to the delay-sensitive task.
  • the input data for executing Task2 consists of the images captured by camera 2 and camera 3 and can be divided by camera: the images captured by camera 2 form sub-data block 1, and the images captured by camera 3 form sub-data block 2.
  • the target power consumption margin range of computing unit G3 meets the power consumption required to calculate sub-data block 1
  • the target power consumption margin range of computing unit G4 meets the power consumption required to calculate sub-data block 2.
  • the target computing units corresponding to Task2 are therefore G3 and G4.
  • the delay requirement duration corresponding to each sub-data block may be determined according to the data amount of each sub-data block, for example, according to the number of images in each sub-data block.
  • if the input data of a task cannot be split, the scheduling module 102 considers only whether a computing unit can meet the delay requirement of the task when determining the target computing unit for the task. For example, suppose the input data of delay-sensitive task 1 is not allowed to be split, and the scheduling module 102 determines that the power consumption requirement W1 of delay-sensitive task 1 is neither within the range of (MW0-TW0) to (MW1-TW0) nor within the range of (MW2-TW1) to (MW3-TW1). If the duration T0 required by computing unit G0 to execute delay-sensitive task 1 is greater than the delay requirement duration t1 of delay-sensitive task 1, and the duration T1 required by computing unit G1 to execute delay-sensitive task 1 is less than t1, then computing unit G1 is determined as the target computing unit of delay-sensitive task 1.
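The delay-only fallback can be sketched as follows. The function name and the numeric times are hypothetical; the sketch assumes the power check has already failed for every computing unit and that the input data may not be split.

```python
def pick_by_delay_only(exec_times_s, task_delay_s):
    """Return the first unit whose estimated execution time meets the delay requirement."""
    for name, exec_time in exec_times_s.items():
        if exec_time <= task_delay_s:
            return name
    return None  # no unit can finish the task in time


# Hypothetical values: T0 = 50 ms exceeds t1 = 33 ms, while T1 = 20 ms does not.
target = pick_by_delay_only({"G0": 0.050, "G1": 0.020}, task_delay_s=0.033)
```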
  • the scheduling module executes steps 403 to 404 for each delay-sensitive task, so as to determine a target processor for each delay-sensitive task.
  • the scheduling module 102 assigns each delay-sensitive task to a corresponding target computing unit, and instructs the target computing unit corresponding to the delay-sensitive task to execute the delay-sensitive task within the delay-required duration of the delay-sensitive task.
  • the scheduling module 102 generates a task list, which records the target computing unit corresponding to each of the N delay-sensitive tasks and the delay requirement duration corresponding to each target computing unit.
  • the arithmetic unit 103 determines the delay-sensitive task to be executed according to the task list, and completes the delay-sensitive task to be executed within a delay required period of the delay-sensitive task to be executed. Table 3 below is one possible implementation of the task list.
  • Table 3
    Task  | Target computing unit | Delay required
    Task0 | G2                    | t0
    Task1 | G1                    | t1
    Task2 | G3, G4                | t2, t3
  • the arithmetic unit G2 needs to execute all calculations of the task Task0 within t0, and the arithmetic unit G1 needs to execute all calculations of the task Task1 within t1.
  • if a delay-sensitive task is performed by multiple computing units, that is, the data of the delay-sensitive task is split into multiple sub-data blocks and different sub-data blocks are calculated by different computing units, then each sub-data block also has a corresponding delay requirement duration, which is determined according to the size of the sub-data block calculated by the computing unit.
  • Task2 in Table 3 is jointly executed by computing units G3 and G4, where G3 needs to calculate sub-data block 1 of Task2 within t2, and G4 needs to calculate sub-data block 2 of Task2 within t3.
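One way to picture the task list of Table 3 is as a mapping from each task to its (computing unit, delay requirement) pairs. This is a hypothetical data layout, not one prescribed by the source; the delay requirements are kept as the symbolic names t0 to t3 because the text gives no numeric values, and pairing G3 with t2 and G4 with t3 is an assumed reading of the "t2, t3" cell.

```python
# Task list mirroring Table 3: task -> list of (computing unit, delay requirement).
TASK_LIST = {
    "Task0": [("G2", "t0")],
    "Task1": [("G1", "t1")],
    "Task2": [("G3", "t2"), ("G4", "t3")],  # Task2 is split across two computing units
}


def work_for(unit, task_list=TASK_LIST):
    """Return the (task, delay requirement) pairs that a given computing unit must execute."""
    return [
        (task, delay)
        for task, entries in task_list.items()
        for assigned_unit, delay in entries
        if assigned_unit == unit
    ]


g4_work = work_for("G4")
```

Each computing unit can then look up only its own entries and the delay within which each must be completed.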
  • the scheduling module 102 assigns a corresponding operator to the non-delay-sensitive tasks.
  • the scheduling module 102 can also assign a computing unit to each non-delay-sensitive task. Specifically, for each of the M tasks other than the N delay-sensitive tasks (that is, each non-delay-sensitive task), the power consumption requirement of the task is calculated; a computing unit whose target power consumption margin range matches the power consumption requirement of the task and which is capable of executing the task within the delay requirement duration of the task is determined as the target computing unit for executing the task, and the task is assigned to that target computing unit.
  • the power consumption requirement of the task is the power consumption required to execute the task
  • the power consumption requirement of the non-delay-sensitive task can be calculated by referring to the above formula 2.
  • the delay request duration of the task is the time required to execute the task.
  • the delay requirement duration of a non-delay-sensitive task can be estimated according to the amount of input data of the non-delay-sensitive task. For details, refer to the description in step 402 above of how the delay requirement duration of a delay-sensitive task is estimated, which is not repeated here.
  • a non-delay-sensitive task can be assigned to a computing unit that is able to finish the task within the delay requirement duration of the task.
  • the scheduling module 102 may combine the input data of at least two non-delay-sensitive tasks into one data block.
  • the power consumption required to calculate the data block is determined; if the target power consumption margin range of a computing unit of the mobile device matches this power consumption, the data block is allocated to that computing unit, and the computing unit is instructed to finish calculating the data block within the delay requirement duration corresponding to the data block.
  • the delay requirement of the data block is an estimated time required to complete the calculation of the data block, and the delay requirement of the data block may be estimated according to the data amount of the data block.
  • for example, the merged data includes 10 images, and the processing time of each image cannot exceed 33 ms, so the delay requirement duration of the data block is 330 ms.
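The arithmetic of this example can be sketched as below. The function name is hypothetical, and the split of the 10 images into 4 and 6 between two merged tasks is an assumed value; only the total of 10 images and the 33 ms per-image bound come from the text.

```python
PER_IMAGE_BUDGET_MS = 33  # per-image processing bound from the text


def merged_delay_requirement_ms(image_counts, per_image_ms=PER_IMAGE_BUDGET_MS):
    """Estimate the delay requirement of a merged data block from its total image count."""
    return sum(image_counts) * per_image_ms


# Two non-delay-sensitive tasks merged into one block of 10 images (4 + 6 is hypothetical).
required_ms = merged_delay_requirement_ms([4, 6])  # 10 * 33 = 330
```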
  • Task1 and Task2 are both non-delay-sensitive tasks, and the data corresponding to Task1 and Task2 is combined into one data block (that is, the images taken by cameras 3 and 2). If the power consumption required to calculate the images taken by cameras 3 and 2 falls within the target power consumption margin range of computing unit G5, computing unit G5 is instructed to execute Task1 and Task2, that is, to calculate the images taken by cameras 3 and 2.
  • merging the input data of non-delay-sensitive tasks requires the computational resources of the operator, and the merged data block should be as small as possible.
  • the data of two of the non-delay-sensitive tasks Task4 and Task5 can be merged first, and an operator is assigned to the merged data.
  • an arithmetic unit can be assigned to Task6. For example, some arithmetic units have remaining power consumption after performing calculations, and Task6 can be assigned to the arithmetic unit.
  • the scheduling module 102 may perform task allocation with reference to the method shown in FIG. 7. Specifically, referring to FIG. 7, the following steps are included:
  • Step S1: the scheduling module 102 sorts the tasks to be executed in descending order of priority: Task0, Task1, and Task2.
  • Step S2: the scheduling module 102 takes out the task Task0 with the highest priority level, and calculates the power consumption requirement w0 and the delay requirement duration t0 of Task0.
  • Step S3: the scheduling module 102 obtains the target power consumption margin range of each computing unit and the time required for each computing unit to execute Task0.
  • Step S4: it is determined whether the power consumption requirement of Task0 falls within the target power consumption margin range of the computing unit, and whether the time required for the computing unit to execute Task0 is less than or equal to the delay requirement duration of Task0.
  • Step S5: if the power consumption requirement of Task0 falls within the target power consumption margin range of a certain computing unit, Task0 is assigned to that computing unit, and the computing unit is instructed to return the calculation result of Task0 by time t (the current time) + t0.
  • Step S6: if the power consumption requirement of Task0 does not fall within the target power consumption margin range of any computing unit, the data corresponding to Task0 is split into multiple sub-data blocks, and the sub-data blocks with a larger amount of calculation are preferentially allocated to the computing units with more resources.
  • the delay-sensitive tasks that need to be executed are Task0, Task1, and Task2 in order of priority level from high to low.
  • the power requirements of Task0, Task1, and Task2 are 10W, 20W, and 30W, respectively.
  • the vehicle-mounted device has two arithmetic units G0 and G1.
  • the maximum power consumption limit of the arithmetic unit G0 is 40W
  • the target power consumption is 30W
  • the maximum power consumption limit of the arithmetic unit G1 is 60W
  • the target power consumption is 40W. It is assumed that G0 and G1 both meet the delay requirements of Task0, Task1, and Task2.
  • a computing unit may be allocated to each task in descending order of priority level.
  • the occupied power consumption of the arithmetic unit G0 is 10W
  • the target power consumption margin is 20W
  • the arithmetic unit G1 is idle
  • the target power consumption margin is 40W.
  • the target power consumption margin of the arithmetic unit G0 is 20W
  • the power consumption requirement of Task0 is 10W.
  • the target power consumption margin of computing unit G0 can meet the 10 W power consumption requirement of Task0. Therefore, Task0 can be assigned to computing unit G0.
  • the target power consumption margin of computing unit G0 becomes 10 W
  • the target power consumption margin of computing unit G1 is still 40 W.
  • the target power consumption margin of computing unit G1 can meet the 20 W power consumption requirement of Task1, so Task1 can be assigned to computing unit G1.
  • the target power consumption margin of computing unit G0 is still 10 W, and the target power consumption margin of computing unit G1 becomes 20 W.
  • the 30 W power consumption requirement of Task2 is greater than the 10 W target power consumption margin of computing unit G0 and greater than the 20 W target power consumption margin of computing unit G1, so the data of Task2 can be split into two parts.
  • a part of the power consumption requirement is 10W, which matches the target power consumption margin of the arithmetic unit G0 of 10W, so this part of the data can be allocated to the arithmetic unit G0 for calculation.
  • the power consumption requirement of the other part is 20W, which matches the target power consumption margin of 20W of the arithmetic unit G1. Then, this part of data can be allocated to the arithmetic unit G1 for calculation.
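The worked example above can be reproduced with a small allocation sketch. It rests on several assumptions that are not stated in the source: the margin is treated as a single upper bound (as in this example) rather than a range, a task goes to the first computing unit in traversal order whose margin is large enough, a task that fits nowhere is split by filling the remaining margins in the same order, and all delay requirements are taken as met (as the example itself assumes). The `allocate` function name is hypothetical.

```python
def allocate(tasks, margins):
    """tasks: list of (name, power_w) in descending priority.
    margins: {unit: remaining target power consumption margin in W}.
    Returns {task: {unit: power_w assigned to that unit}}."""
    margins = dict(margins)  # work on a copy so the caller's values are untouched
    plan = {}
    for name, power in tasks:
        # First try to place the whole task on one unit (first fit in traversal order).
        unit = next((u for u, m in margins.items() if power <= m), None)
        if unit is not None:
            plan[name] = {unit: power}
            margins[unit] -= power
            continue
        # No single unit fits: split the task across units, filling remaining margins.
        parts, remaining = {}, power
        for u, m in margins.items():
            share = min(m, remaining)
            if share > 0:
                parts[u] = share
                remaining -= share
        if remaining > 0:
            raise ValueError(f"not enough total margin for {name}")
        for u, share in parts.items():
            margins[u] -= share
        plan[name] = parts
    return plan


plan = allocate(
    [("Task0", 10), ("Task1", 20), ("Task2", 30)],
    {"G0": 20, "G1": 40},  # initial margins from the example
)
```

Task0 lands on G0 and Task1 on G1, and Task2 is split into a 10 W part on G0 and a 20 W part on G1, as in the text.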
  • the delay-sensitive tasks that need to be executed are Task0, Task1, and Task2 in order of priority level from high to low.
  • the power requirements of Task0, Task1, and Task2 are 10W, 20W, and 30W, respectively.
  • the vehicle-mounted device has two processors, G0 and G1.
  • the maximum power consumption limit of the processor G0 is 40W
  • the target power consumption is 30W
  • the maximum power consumption limit of the processor G1 is 60W
  • the target power consumption is 40W.
  • the occupied power consumption of the arithmetic unit G0 is 10W
  • the target power consumption margin is 20W
  • the arithmetic unit G1 is idle
  • the target power consumption margin is 40W.
  • tasks with lower delay requirements may be merged, and the merged tasks may be allocated to a processor.
  • the data of Task0 and Task1 can be merged into one task.
  • the target power consumption margin of the arithmetic unit G1 is 10W
  • the target power consumption margin of the arithmetic unit G0 is 20W.
  • a computing unit is then assigned to Task2.
  • An embodiment of the present invention provides a device, which may be a scheduling module in an in-vehicle device according to an embodiment of the present invention, such as the scheduling module 102 of the in-vehicle device 10 shown in FIG. 2.
  • FIG. 10 shows a possible structural diagram of the foregoing communication device. As shown in FIG. 10, the device includes a determination unit 1001, a calculation unit 1002, and an allocation unit 1003.
  • a determining unit 1001, configured to support the apparatus to perform step 401 and step 402 in the above embodiments, and/or other processes for the technology described herein;
  • a computing unit 1002, configured to support the apparatus to perform step 403 in the above embodiments, and/or other processes for the technology described herein;
  • an allocating unit 1003, configured to support the apparatus to perform step 405 in the above embodiments, and/or other processes for the technology described herein.
  • the apparatus includes a processing module 1101 and a communication module 1102.
  • the processing module 1101 is used to control and manage the actions of the device, for example, to execute the steps performed by the determining unit 1001, the computing unit 1002, and the allocating unit 1003, and/or other processes for performing the techniques described herein.
  • the communication module 1102 is used to support interaction between the device and other devices.
  • the device may further include a storage module 1103.
  • the storage module 1103 is configured to store program code and data of the device.
  • the task scheduling method provided by the embodiment of the present invention may be applied to the device shown in FIG. 12, and the device may be the scheduling module 102 according to the embodiment of the present invention.
  • the device may include at least one processor 1201, a memory 1202, a transceiver 1203, and a communication bus 1204.
  • the processor 1201 is a control center of the device, and may be a processor or a collective name of a plurality of processing elements.
  • the processor 1201 is a central processing unit (CPU), or may be an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement an embodiment of the present invention, for example, one or more digital signal processors (DSPs) or one or more field programmable gate arrays (FPGAs).
  • the processor 1201 can execute various functions of the device by running or executing a software program stored in the memory 1202 and calling data stored in the memory 1202.
  • the processor 1201 may include one or more CPUs, such as CPU0 and CPU1 shown in FIG. 12.
  • the device may include multiple processors, such as the processor 1201 and the processor 1205 shown in FIG. 12.
  • processors can be a single-core processor (single-CPU) or a multi-core processor (multi-CPU).
  • a processor herein may refer to one or more devices, circuits, and / or processing cores for processing data (such as computer program instructions).
  • the memory 1202 may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM) or another type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, and the like), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto.
  • the memory 1202 may exist independently, and is connected to the processor 1201 through a communication bus 1204.
  • the memory 1202 may also be integrated with the processor 1201.
  • the memory 1202 is configured to store a software program that executes the solution of the present invention, and is controlled and executed by the processor 1201.
  • the transceiver 1203 uses any device such as a transceiver to communicate with other nodes in the system shown in FIG. 1, such as other relay nodes, core nodes, target nodes, or source nodes. Or used to implement communication between the device and the base station in FIG. 1. It can also be used to communicate with communication networks, such as Ethernet, radio access network (RAN), wireless local area networks (WLAN), and so on.
  • the transceiver 1203 may include a receiving unit to implement a receiving function, and a transmitting unit to implement a transmitting function.
  • the communication bus 1204 may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, or an Extended Industry Standard Architecture (EISA) bus.
  • the bus can be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only a thick line is used in FIG. 12, but it does not mean that there is only one bus or one type of bus.
  • the device structure shown in FIG. 12 does not constitute a limitation on the device, and may include more or fewer parts than shown, or some parts may be combined, or different parts may be arranged.
  • the processor 1201 may run code in the memory 1202 to execute the methods shown in FIG. 4 and FIG. 7 of the embodiment of the present invention.
  • the disclosed database access device and method can be implemented in other ways.
  • the embodiments of the database access device described above are merely schematic.
  • the division of the modules or units is only a logical function division.
  • Components can be combined or integrated into another device, or some features can be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection of the database access device or unit through some interfaces, which may be electrical, mechanical or other forms.
  • the unit described as a separate component may or may not be physically separated, and the component displayed as a unit may be one physical unit or multiple physical units, that is, it may be located in one place or distributed to multiple different places. Some or all of the units may be selected according to actual needs to achieve the objective of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each of the units may exist separately physically, or two or more units may be integrated into one unit.
  • the above integrated unit may be implemented in the form of hardware or in the form of software functional unit.
  • the integrated unit When the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a readable storage medium.
  • the technical solutions of the embodiments of the present application, in essence, or the part that contributes to the existing technology, or all or part of the technical solutions, may be embodied in the form of a software product. The software product is stored in a storage medium and includes a number of instructions for causing a device (which can be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the method described in the embodiments of the present application.
  • the foregoing storage medium includes various media that can store program codes, such as a U disk, a mobile hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.

Abstract

Embodiments of the present application relate to the field of artificial intelligence. Disclosed are a task scheduling method and device, capable of improving the working efficiency of a vehicle-mounted device in comprehensive consideration of delay requirements and power consumption requirements in a running state. The method comprises: determining a motion state of a mobile device, and determining, according to the motion state of the mobile device, M tasks to be carried out; determining N delay-sensitive tasks in the M tasks; for the delay-sensitive tasks, computing power consumption demands of the delay-sensitive tasks; determining operators having target power consumption headroom ranges matching the power consumption demands of the delay-sensitive tasks and being capable of finishing the delay-sensitive tasks within delay-required durations of the delay-sensitive tasks as target operators for carrying out the delay-sensitive tasks; and allocating the delay-sensitive tasks to the target operators for delay-sensitive tasks.

Description

Task scheduling method and device
Technical field
The embodiments of the present application relate to the field of artificial intelligence, and in particular, to a task scheduling method and device.
Background
Automated driving systems use technologies such as computers, modern sensing, information fusion, communication, artificial intelligence, and automatic control to achieve functions such as environmental perception, behavioral decision-making, path planning, and motion control. At present, the processing flow of an automated driving system is as follows: the vehicle-mounted device collects the image data captured by the cameras distributed around the vehicle body and transmits the image data to the computing unit of the vehicle-mounted device. Through deep learning on the image data, the environment around the vehicle body is identified (for example, the number of pedestrians and the distance to pedestrians). Finally, decisions can be made based on the identified environment, such as controlling the vehicle speed or changing the path.
In an automated driving scenario, the requirements on task delay are strict; that is, the computing unit of the vehicle-mounted device must finish executing a task, such as identifying the environment around the vehicle body, within a strict time limit. With such strict delay requirements, the computation load of the computing unit rises sharply, which greatly increases the energy consumption of the vehicle-mounted device. In addition, vehicle-mounted devices are usually sensitive to energy consumption, and computing energy consumption should be reduced as much as possible during driving. However, if computing energy consumption is kept low during driving, it is difficult to meet the strict delay requirements of driving. It can be seen that the existing technology cannot simultaneously meet the requirements on energy consumption and delay in a vehicle environment.
Summary of the Invention
The embodiments of the present application provide a task scheduling method and device that take both the delay requirements and the power consumption requirements in a driving state into account, thereby improving the working efficiency of a vehicle-mounted device.
To achieve the above objective, the embodiments of the present application adopt the following technical solutions:
In a first aspect, a task scheduling method is disclosed. A vehicle-mounted device first determines a motion state of a mobile device, determines, according to the motion state of the mobile device, M tasks to be executed, and determines N delay-sensitive tasks among the M tasks, where M is an integer greater than or equal to 1 and N is an integer less than or equal to M. Then, for each of the N delay-sensitive tasks, the vehicle-mounted device calculates the power consumption demand of the delay-sensitive task; determines, as the target computing unit for executing the delay-sensitive task, a computing unit whose target power consumption headroom range matches the power consumption demand of the delay-sensitive task and which can finish executing the delay-sensitive task within the delay-required duration of the delay-sensitive task; and then allocates the delay-sensitive task to that target computing unit. It should be noted that the target power consumption headroom range is determined according to the target power consumption range of a computing unit of the mobile device and the current power consumption of that computing unit; the power consumption demand of a delay-sensitive task is the power consumption required to execute the delay-sensitive task; and the delay-required duration of a delay-sensitive task is the duration required for executing the delay-sensitive task.
In the task scheduling method provided in the embodiments of the present invention, the vehicle-mounted device can reasonably schedule the tasks to be executed while the vehicle is travelling. This not only satisfies the delay-required durations of the tasks but also allows the computing units to work in the target power consumption state. The driving state, delay, and power consumption requirements are all taken into account, which improves the working efficiency of the vehicle-mounted platform.
With reference to the first aspect, in a first possible implementation of the first aspect, the determining, by the vehicle-mounted device, of the N delay-sensitive tasks among the M tasks specifically includes: determining the delay sensitivity of each of the M tasks according to the motion state; determining the priority level of each of the M tasks according to its delay sensitivity, where the priority level is directly proportional to the delay sensitivity; and finally determining the N tasks with the higher priority levels among the M tasks as the N delay-sensitive tasks.
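By way of illustration only, the selection described above may be sketched as follows. The `Task` class, the numeric sensitivity scores, and the task names are hypothetical; the source does not specify how delay sensitivity is quantified, only that the priority level is proportional to it.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    delay_sensitivity: float  # hypothetical score derived from the motion state


def select_delay_sensitive(tasks, n):
    """Return the n tasks with the highest priority level."""
    if not 0 <= n <= len(tasks):
        raise ValueError("n must satisfy 0 <= n <= M")
    # Priority is assumed proportional to delay sensitivity (positive factor),
    # so ranking by sensitivity gives the same order as ranking by priority.
    ranked = sorted(tasks, key=lambda t: t.delay_sensitivity, reverse=True)
    return ranked[:n]


# Hypothetical example: M = 3 tasks, N = 2 delay-sensitive tasks.
tasks = [Task("lane_lines", 0.4), Task("pedestrians", 0.9), Task("vehicles", 0.7)]
selected = select_delay_sensitive(tasks, 2)
```

Because the ranking depends only on the order of the sensitivity scores, any positive proportionality factor between sensitivity and priority yields the same N tasks.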
In the embodiments of the present invention, tasks with higher delay sensitivity can be determined as delay-sensitive tasks. In subsequent processing, the vehicle-mounted device allocates computing units to the delay-sensitive tasks first, which ensures that the delay-sensitive tasks are executed with priority.
With reference to the first aspect or the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the calculating of the power consumption demand of the delay-sensitive task and the determining of the target computing unit specifically include: calculating the power consumption demand of the delay-sensitive task; for each computing unit of the mobile device, determining the duration in which the computing unit executes the delay-sensitive task and the target power consumption headroom range of the computing unit; judging whether the power consumption demand of the delay-sensitive task falls within the target power consumption headroom range of the computing unit and whether the duration in which the computing unit executes the delay-sensitive task does not exceed the delay-required duration of the delay-sensitive task; and, if the power consumption demand of the delay-sensitive task falls within the target power consumption headroom range of the computing unit and the duration in which the computing unit executes the delay-sensitive task does not exceed the delay-required duration of the delay-sensitive task, determining the computing unit as the target computing unit of the delay-sensitive task.
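By way of illustration only, the two checks described above (power consumption demand within the headroom range, and execution duration within the delay-required duration) may be sketched as follows. The `ComputeUnit` class, the unit names, and in particular the rule for deriving the headroom range from the target range and the current power consumption are assumptions; the source states only that the headroom range is determined from those two quantities.

```python
from dataclasses import dataclass


@dataclass
class ComputeUnit:
    name: str
    target_power: tuple   # (low, high) range in which the unit performs best
    current_power: float  # power the unit is consuming now

    def headroom(self):
        # Assumed rule: shift the target range down by the current consumption.
        low, high = self.target_power
        return (max(low - self.current_power, 0.0),
                max(high - self.current_power, 0.0))


def pick_target_unit(units, power_demand, exec_time, deadline):
    """Return the first unit passing both checks, or None.

    exec_time maps a unit name to the estimated duration of the task on it.
    """
    for unit in units:
        low, high = unit.headroom()
        fits_power = low <= power_demand <= high
        fits_time = exec_time[unit.name] <= deadline
        if fits_power and fits_time:
            return unit
    return None


# Hypothetical units and figures.
units = [ComputeUnit("gpu0", (20, 40), 35), ComputeUnit("npu0", (10, 30), 5)]
exec_time = {"gpu0": 8, "npu0": 15}
chosen = pick_target_unit(units, power_demand=12, exec_time=exec_time, deadline=20)
```

In this example gpu0 is rejected because its remaining headroom is too small for the demand, so npu0 is chosen even though it is slower, since it still finishes within the deadline.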
In the embodiments of the present invention, the target power consumption range is the range within which the performance of a computing unit is optimal when its power consumption is kept in that range. If the power consumption demand of a task falls within the target power consumption headroom range of a computing unit, the computing unit can execute the task in its optimal state. This improves the computing efficiency of the computing unit and further improves the working efficiency of the vehicle-mounted platform.
With reference to the second possible implementation of the first aspect, in a third possible implementation of the first aspect, the method further includes: if the power consumption demand of the delay-sensitive task does not match the target power consumption headroom range of any computing unit of the mobile device, splitting the input data of the delay-sensitive task into at least two sub-data blocks; for each of the at least one sub-data block, determining the power consumption required to compute the sub-data block and, if the target power consumption headroom range of a computing unit of the mobile device matches that power consumption, determining that computing unit as the target computing unit corresponding to the sub-data block; and determining the target computing units corresponding to all the sub-data blocks as the target computing units corresponding to the delay-sensitive task.
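By way of illustration only, the splitting step described above may be sketched as follows. The even split, the assumption that the power consumption of a sub-data block scales with its number of items, and the restriction of one sub-data block per computing unit are all hypothetical simplifications that the source does not prescribe.

```python
def split_into_blocks(data, parts):
    """Split a list of input items into `parts` nearly equal sub-data blocks."""
    size, extra = divmod(len(data), parts)
    blocks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        blocks.append(data[start:end])
        start = end
    return blocks


def assign_blocks(blocks, headrooms, power_per_item):
    """Map each block index to a unit whose headroom range contains its demand.

    headrooms maps a unit name to its (low, high) headroom range.
    Returns None if some block fits no remaining unit.
    """
    free = dict(headrooms)
    assignment = {}
    for idx, block in enumerate(blocks):
        demand = len(block) * power_per_item  # assumed linear power model
        match = next((u for u, (lo, hi) in free.items() if lo <= demand <= hi), None)
        if match is None:
            return None
        assignment[idx] = match
        del free[match]  # assumed: one sub-data block per unit
    return assignment


# Hypothetical example: the whole task (demand 16.0) fits no unit, two halves do.
frames = list(range(8))
headrooms = {"gpu0": (4, 10), "npu0": (6, 12)}
blocks = split_into_blocks(frames, 2)
assignment = assign_blocks(blocks, headrooms, power_per_item=2.0)
```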
In the embodiments of the present invention, a delay-sensitive task can be split and executed by multiple computing units. This ensures that the delay-sensitive task is executed quickly and that its delay requirement is met, so that subsequent decisions of the vehicle-mounted device are not affected.
With reference to the first aspect or any one of the first to third possible implementations of the first aspect, in a fourth possible implementation of the first aspect, the allocating of the delay-sensitive task to the target computing unit of the delay-sensitive task specifically includes: calculating the execution durations of the N delay-sensitive tasks, and determining the delay-required duration of each delay-sensitive task according to its execution duration; and generating a task list, where the task list records the target computing unit corresponding to each of the N delay-sensitive tasks and the delay-required duration corresponding to each target computing unit, so that a computing unit of the mobile device determines, according to the task list, the delay-sensitive task to be executed and finishes executing it within its delay-required duration.
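By way of illustration only, a task list of the kind described above may be sketched as a mapping from each target computing unit to its tasks and their delay-required durations. The dictionary layout, the names, and the ordering of each unit's entries by tightest deadline are assumptions; the source does not define the format of the task list.

```python
def build_task_list(assignments, deadlines):
    """Build a per-unit task list.

    assignments maps a task name to its target unit name;
    deadlines maps a task name to its delay-required duration.
    """
    task_list = {}
    for task, unit in assignments.items():
        entry = {"task": task, "deadline": deadlines[task]}
        task_list.setdefault(unit, []).append(entry)
    # Assumed policy: each unit handles its tightest deadline first.
    for entries in task_list.values():
        entries.sort(key=lambda e: e["deadline"])
    return task_list


# Hypothetical assignments and delay-required durations.
assignments = {"pedestrians": "npu0", "vehicles": "gpu0", "lane_lines": "npu0"}
deadlines = {"pedestrians": 20, "vehicles": 30, "lane_lines": 15}
task_list = build_task_list(assignments, deadlines)
```

Each computing unit then only needs to read its own entry in the list to learn which tasks to execute and by when.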
In the embodiments of the present invention, tasks can be allocated to the corresponding computing units by means of a task list, and each computing unit can perform the computation of its tasks according to the task list.
With reference to the first possible implementation of the first aspect, in a fifth possible implementation of the first aspect, the method provided in the embodiments of the present invention further includes: after each of the N delay-sensitive tasks has been allocated to its corresponding target computing unit, for each of the M tasks other than the N delay-sensitive tasks, calculating the power consumption demand of the task; determining, as the target computing unit for executing the task, a computing unit whose target power consumption headroom range matches the power consumption demand of the task and which can finish executing the task within the delay-required duration of the task; and allocating the task to the target computing unit of the delay-sensitive task. Here, the power consumption demand of a task is the power consumption required to execute the task, and the delay-required duration of a task is the duration required for executing the task.
In the embodiments of the present invention, after computing units have been allocated to the delay-sensitive tasks, computing units can also be allocated to the non-delay-sensitive tasks. This ensures that the non-delay-sensitive tasks are executed and do not affect subsequent decisions of the vehicle-mounted device. In addition, the computing units can execute the non-delay-sensitive tasks in their optimal state, which improves the computing efficiency of the computing units and further improves the working efficiency of the vehicle-mounted platform.
With reference to the second possible implementation of the first aspect, in a sixth possible implementation of the first aspect, the duration T in which a computing unit executes the delay-sensitive task satisfies:
Figure PCTCN2019088450-appb-000001
where Nc and Nm are the numbers of basic computing units inside the computing unit; f1 is the frequency of the cores of the computing unit; f2 is the frequency of the memory of the computing unit; Ci is the number of double-precision floating-point operations required to complete the delay-sensitive task; and mj is the memory access frequency required to complete the delay-sensitive task, the memory access frequency being the number of memory accesses per second.
In the embodiments of the present invention, the duration required for each computing unit to execute a given task can be calculated with the above formula, and it can then be judged whether the computing unit satisfies the delay requirement of the task.
With reference to the first aspect or any one of the first to sixth possible implementations of the first aspect, in a seventh possible implementation of the first aspect, the power consumption demand W of the delay-sensitive task satisfies W = n × ŵ, where n is the number of basic operations required to complete the delay-sensitive task and ŵ is the power consumed by the computing unit in performing one basic operation.
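By way of illustration only, the relation between the power consumption demand, the number of basic operations, and the power consumed per basic operation may be worked through as follows. The unit names and numeric values are hypothetical and carry no particular physical unit.

```python
def power_demand(num_basic_ops, power_per_op):
    """Power consumption demand: op count times power per basic operation."""
    if num_basic_ops < 0 or power_per_op < 0:
        raise ValueError("inputs must be non-negative")
    return num_basic_ops * power_per_op


# Hypothetical per-operation power of two computing units, and a task of 2000 ops.
per_op_power = {"gpu0": 0.5, "npu0": 0.25}
n_ops = 2000
demands = {unit: power_demand(n_ops, w) for unit, w in per_op_power.items()}
```

Because the power consumed per basic operation is a property of the computing unit, the same task can present a different demand to each unit, which is why the demand is compared against each unit's headroom range separately.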
In the embodiments of the present invention, the power consumption required for each computing unit to execute a given task can be calculated with the above formula, and it can then be judged whether the computing unit satisfies the power consumption demand of the task.
In a second aspect, a device is disclosed, including: a determining unit, configured to determine a motion state of a mobile device, determine, according to the motion state of the mobile device, M tasks to be executed, and determine N delay-sensitive tasks among the M tasks, where M is an integer greater than or equal to 1 and N is an integer less than or equal to M; a calculation unit, configured to calculate, for each of the N delay-sensitive tasks, the power consumption demand of the delay-sensitive task, where the determining unit is further configured to determine, as the target computing unit for executing the delay-sensitive task, a computing unit whose target power consumption headroom range matches the power consumption demand of the delay-sensitive task and which can finish executing the delay-sensitive task within the delay-required duration of the delay-sensitive task; and an allocation unit, configured to allocate the delay-sensitive task to the target computing unit of the delay-sensitive task. It should be noted that the target power consumption headroom range is determined according to the target power consumption range of a computing unit of the mobile device and the current power consumption of that computing unit; the power consumption demand of a delay-sensitive task is the power consumption required to execute the delay-sensitive task; and the delay-required duration of a delay-sensitive task is the duration required for executing the delay-sensitive task.
The device provided in the embodiments of the present invention can reasonably schedule the tasks to be executed while the vehicle is travelling. This not only satisfies the delay-required durations of the tasks but also allows the computing units to work in the target power consumption state. The driving state, delay, and power consumption requirements are all taken into account, which improves the working efficiency of the vehicle-mounted platform.
With reference to the second aspect, in a first possible implementation of the second aspect, the determining unit is specifically configured to: determine the delay sensitivity of each of the M tasks according to the motion state; determine the priority level of each of the M tasks according to its delay sensitivity, where the priority level is directly proportional to the delay sensitivity; and determine the N tasks with the higher priority levels among the M tasks as the N delay-sensitive tasks.
With reference to the second aspect or the first possible implementation of the second aspect, in a second possible implementation of the second aspect, the calculation unit is specifically configured to: calculate the power consumption demand of the delay-sensitive task; and, for each computing unit of the mobile device, determine the duration in which the computing unit executes the delay-sensitive task and the target power consumption headroom range of the computing unit. The determining unit is specifically configured to: judge whether the power consumption demand of the delay-sensitive task falls within the target power consumption headroom range of the computing unit and whether the duration in which the computing unit executes the delay-sensitive task does not exceed the delay-required duration of the delay-sensitive task; and, if the power consumption demand of the delay-sensitive task falls within the target power consumption headroom range of the computing unit and the duration in which the computing unit executes the delay-sensitive task does not exceed the delay-required duration of the delay-sensitive task, determine the computing unit as the target computing unit of the delay-sensitive task.
With reference to the second possible implementation of the second aspect, in a third possible implementation of the second aspect, the determining unit is further configured to: if the power consumption demand of the delay-sensitive task does not match the target power consumption headroom range of any computing unit of the mobile device, split the input data of the delay-sensitive task into at least two sub-data blocks; for each of the at least one sub-data block, determine the power consumption required to compute the sub-data block and, if the target power consumption headroom range of a computing unit of the mobile device matches that power consumption, determine that computing unit as the target computing unit corresponding to the sub-data block; and determine the target computing units corresponding to all the sub-data blocks as the target computing units corresponding to the delay-sensitive task.
With reference to the second aspect or any one of the first to third possible implementations of the second aspect, in a fourth possible implementation of the second aspect, the allocation unit is specifically configured to: calculate the execution durations of the N delay-sensitive tasks, and determine the delay-required duration of each delay-sensitive task according to its execution duration; and generate a task list, where the task list records the target computing unit corresponding to each of the N delay-sensitive tasks and the delay-required duration corresponding to each target computing unit, so that a computing unit of the mobile device determines, according to the task list, the delay-sensitive task to be executed and finishes executing it within its delay-required duration.
With reference to the first possible implementation of the second aspect, in a fifth possible implementation of the second aspect, the calculation unit is further configured to: after each of the N delay-sensitive tasks has been allocated to its corresponding target computing unit, calculate, for each of the M tasks other than the N delay-sensitive tasks, the power consumption demand of the task. The allocation unit is further configured to: determine, as the target computing unit for executing the task, a computing unit whose target power consumption headroom range matches the power consumption demand of the task and which can finish executing the task within the delay-required duration of the task, and allocate the task to the target computing unit of the delay-sensitive task. Here, the power consumption demand of a task is the power consumption required to execute the task, and the delay-required duration of a task is the duration required for executing the task.
With reference to the second possible implementation of the second aspect, in a sixth possible implementation of the second aspect, the duration T in which a computing unit executes the delay-sensitive task satisfies:
Figure PCTCN2019088450-appb-000002
where Nc and Nm are the numbers of basic computing units inside the computing unit; f1 is the frequency of the cores of the computing unit; f2 is the frequency of the memory of the computing unit; Ci is the number of double-precision floating-point operations required to complete the delay-sensitive task; and mj is the memory access frequency required to complete the delay-sensitive task, the memory access frequency being the number of memory accesses per second.
With reference to the second aspect or any one of the first to sixth possible implementations of the second aspect, in a seventh possible implementation of the second aspect, the power consumption demand W of the delay-sensitive task satisfies W = n × ŵ, where n is the number of basic operations required to complete the delay-sensitive task and ŵ is the power consumed by the computing unit in performing one basic operation.
Brief Description of the Drawings
FIG. 1 is an architecture diagram of a vehicle-mounted system according to an embodiment of the present application;
FIG. 2 is another architecture diagram of a vehicle-mounted system according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of preprocessing according to an embodiment of the present invention;
FIG. 4 is a schematic flowchart of a task scheduling method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of input data of tasks according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of high and low speeds according to an embodiment of the present invention;
FIG. 7 is a schematic flowchart of a task allocation method according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a task state according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of another task state according to an embodiment of the present invention;
FIG. 10 is a structural block diagram of a device according to an embodiment of the present invention;
FIG. 11 is another structural block diagram of a device according to an embodiment of the present invention;
FIG. 12 is another structural block diagram of a device according to an embodiment of the present invention.
Detailed Description
At present, the architecture of a vehicle-mounted system is as shown in FIG. 1. Referring to FIG. 1, the vehicle body is provided with monitoring devices, and a vehicle-mounted device is installed inside the vehicle. A monitoring device may be a sensor, such as a camera or a laser sensor. In an automated driving scenario, the monitoring devices collect environmental information around the vehicle body in real time, computation is performed on the collected information (for example, images captured by a camera), and objects around the vehicle body are recognized from the computation results to provide input for subsequent decisions, for example, controlling the vehicle speed or the driving route according to the computation results.
The delay requirements in an automated driving scenario are extremely high. Because the environment around the vehicle body is complex, the vehicle-mounted device has to perform a large amount of computation quickly while the vehicle is travelling at high speed so that objects around the vehicle body can be recognized in time. This large amount of computation causes the vehicle-mounted device to consume a great deal of energy. However, vehicle-mounted devices are usually sensitive to energy consumption, and energy consumption should be kept as low as possible while driving. It can be seen that the prior art cannot balance the energy consumption and delay requirements of a vehicle-mounted device in a vehicle-mounted environment.
An embodiment of the present invention provides a task scheduling method. A vehicle-mounted device of a mobile device first determines the motion state of the mobile device, determines, according to the motion state, M tasks to be executed, and then determines N delay-sensitive tasks among the M tasks. Further, the following steps are performed for each of the N delay-sensitive tasks: the vehicle-mounted device calculates the power consumption demand of the delay-sensitive task, and determines, as the target computing unit for executing the delay-sensitive task, a computing unit whose target power consumption headroom range matches the power consumption demand of the delay-sensitive task and which can finish executing the delay-sensitive task within the delay-required duration of the delay-sensitive task. The delay-sensitive task is then allocated to its target computing unit. It can be seen that, in the embodiments of the present invention, the vehicle-mounted device can reasonably schedule the tasks to be executed while the vehicle is travelling. This not only satisfies the delay-required durations of the tasks but also allows the computing units to work in the target power consumption state. The driving state, delay, and power consumption requirements are all taken into account, which improves the working efficiency of the vehicle-mounted platform.
An embodiment of the present invention further provides a vehicle-mounted system. As shown in FIG. 2, the vehicle-mounted system includes a vehicle-mounted device 10 and at least one detection device 20. The vehicle-mounted device 10 includes input/output (IO) logic 101, a scheduling module 102, and at least one computing unit 103. Specifically, the scheduling module 102 may be a central processing unit (CPU), and the computing unit 103 may be an ACC (Accumulator), a graphics processing unit (GPU), or a neural network chip. The scheduling module 102 may allocate one or more tasks to be executed to a computing unit 103. The detection device 20 may be a sensing device such as a camera, a laser sensor, or a millimeter-wave radar.
It should be noted that, taking a vehicle as an example of the mobile device, the cameras installed on the mobile device include the following types:
(1) Long-range camera: located at the front of the vehicle body and used to recognize distant targets, such as traffic lights;
(2) Medium-range camera: located at the front of the vehicle body and mainly used to recognize lane lines, people, vehicles, non-motor vehicles, obstacles, and the like; it can also be used for distance measurement;
(3) Short-range camera: located around the vehicle body and mainly used to recognize people, vehicles, non-motor vehicles, obstacles, drivable areas, and the like.
In addition, other cameras may be installed on the mobile device, for example, a camera at the rear of the vehicle body, which can be used for stopping and parking, and a camera installed near the driver's seat inside the vehicle, which can be used to monitor driver fatigue.
In the embodiments of the present invention, the detection devices 20 collect information about the environment around the vehicle body; for example, the cameras distributed around the vehicle body capture images at different angles, fields of view, detection distances, frame rates, and resolutions. The detection devices 20 pass the collected information to the IO logic 101, which manages and preprocesses the received information in a unified manner. The preprocessing of the received information (for example, images) by the IO logic 101 may consist of classifying the images captured by the cameras. As shown in FIG. 3, each camera captures m frames of images, and the images captured by all the cameras can be classified in sequence by frame number. For example, the first frame (frame0) captured by every camera is placed in one group, the second frame (frame1) captured by every camera is placed in another group, and so on, until all the images captured by the cameras have been classified.
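By way of illustration only, the grouping of images by frame number described above may be sketched as follows. The camera identifiers and the string placeholders standing in for image data are hypothetical.

```python
def group_by_frame(camera_frames):
    """Group frames by frame number.

    camera_frames maps a camera id to its list of frames in capture order.
    Element k of the result holds (camera id, frame k) for every camera.
    zip() stops at the shortest list, matching the case in which every
    camera has captured the same number m of frames.
    """
    cameras = list(camera_frames.keys())
    return [list(zip(cameras, frames)) for frames in zip(*camera_frames.values())]


# Hypothetical input: two cameras, m = 2 frames each.
captured = {"cam0": ["a_frame0", "a_frame1"], "cam1": ["b_frame0", "b_frame1"]}
groups = group_by_frame(captured)
```

Here `groups[0]` collects frame0 from every camera and `groups[1]` collects frame1, which mirrors the grouping shown in FIG. 3.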
Subsequently, the IO logic 101 passes the preprocessed images to the scheduling module 102. The scheduling module 102 generates computing tasks according to the information input by the IO logic 101 and sends the computing tasks to the computing unit 103 for processing. The computing unit 103 runs a deep neural network (DNN) algorithm and returns the calculation results to the scheduling module 102. The scheduling module 102 can make subsequent decisions according to the calculation results.
An embodiment of the present invention provides a task scheduling method that can support the task scheduling performed by the scheduling module 102. As shown in FIG. 4, the method includes the following steps:
401. The scheduling module 102 determines the current motion state of the mobile device and, according to that motion state, determines M tasks that need to be executed.
It should be noted that the scheduling module 102 may receive data sent by the detection device 20 and calculate the current motion state of the mobile device, for example its running speed, from that data. The mobile device in this embodiment of the present invention may be any device capable of moving, such as a vehicle or a drone. M is an integer greater than or equal to 1.
In a specific implementation, the scheduling module 102 determines, according to the in-vehicle perception system algorithm, the DNN tasks that need to be executed, that is, the M tasks described in this embodiment of the present invention, for example: detecting lane lines, detecting pedestrians, detecting motor vehicles, and detecting non-motor vehicles. In this embodiment of the present invention, each task takes as input the data obtained by multiple detection devices 20 (for example, images captured by cameras), and different tasks are executed by different computing units. For example, referring to FIG. 5, the mobile device has three tasks in its current motion state: Task 0, Task 1, and Task 2. The input data required to execute Task 0 are the images captured by camera 0, camera 1, and camera 3; the input data required to execute Task 1 are the images captured by camera 1 and camera 2; and the input data required to execute Task 2 are the images captured by camera 2 and camera 3.
402. The scheduling module 102 determines N delay-sensitive tasks among the M tasks.
It should be noted that the delay sensitivity of a task depends on the motion state of the mobile device. For example, when the mobile device is traveling at high speed, the in-vehicle device 10 must be able to quickly recognize vehicles, pedestrians, obstacles, and the like directly in front of the mobile device, so tasks such as "recognize vehicles ahead" and "recognize pedestrians ahead" have a higher priority, whereas tasks that recognize conditions behind the mobile device, such as "recognize vehicles behind" and "recognize pedestrians behind", have a relatively lower priority. N is an integer less than or equal to M.
In a specific implementation, the scheduling module 102 may determine the delay sensitivity of each of the M tasks according to the motion state of the mobile device. Further, because the priority level of a task is proportional to its delay sensitivity, the scheduling module 102 may also determine the priority level of each task from its delay sensitivity. The delay sensitivity of a task may also be determined from the task's delay requirement duration: the shorter the delay requirement duration, the higher the delay sensitivity. For example, if the delay requirement durations of task 1, task 2, and task 3 are 5 s, 10 s, and 8 s respectively, then the three tasks in descending order of delay sensitivity are task 1, task 3, task 2, which is also their order from highest to lowest priority level.
In this embodiment of the present invention, the delay requirement duration of a task may be the estimated time required to execute the task, which can usually be determined from the amount of input data the task requires. For example, suppose the input data required to execute task 1 are the images captured by camera 1 and camera 2 within 1 s, and each of the two cameras captures 8 images within 1 s, so that task 1 requires 16 images as input. If the processing time of one image is specified not to exceed 2 ms, then the time required to execute the task cannot exceed 2*16=32 ms, and the delay requirement duration of the task is therefore 32 ms.
Subsequently, the scheduling module 102 may arrange the M tasks in descending order of priority and determine the N tasks with the highest priority levels as the N delay-sensitive tasks.
In some embodiments, the motion state of the mobile device may be divided into high speed and low speed, and the delay requirement duration of a task differs between motion states. Specifically, when the motion state of the mobile device is high speed, tasks must be processed quickly and the delay requirement is strict; when the motion state is low speed, the delay requirement is relatively relaxed.
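As a minimal sketch of the ranking just described, the tasks can be sorted by delay requirement duration (shorter duration meaning higher delay sensitivity and priority) and the first N taken as delay-sensitive. The function name `pick_delay_sensitive` and the dictionary layout are assumptions for illustration; the durations reuse the 5 s, 10 s, 8 s example from the text.

```python
def pick_delay_sensitive(delay_requirements, n):
    """Return (delay_sensitive, others), ranking tasks by ascending delay requirement.

    `delay_requirements` maps a task name to its delay requirement duration.
    A shorter duration is treated as a higher priority, as in the example above.
    """
    ranked = sorted(delay_requirements, key=delay_requirements.get)
    return ranked[:n], ranked[n:]


# Delay requirement durations in seconds, from the example in the text.
delay_requirements_s = {"task1": 5, "task2": 10, "task3": 8}
sensitive, others = pick_delay_sensitive(delay_requirements_s, n=2)
```

With N = 2, the delay-sensitive tasks are task 1 and task 3, and task 2 is handled later as a non-delay-sensitive task.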
In this embodiment of the present invention, high speed and low speed may be defined according to the traffic rules of different regions, or according to other factors; this embodiment of the present invention imposes no limitation in this respect. FIG. 6 shows an example of low speed and high speed provided by an embodiment of the present invention. Referring to FIG. 6, when the speed of the mobile device is 80 to 120 km/h, its motion state is high speed; when the speed is below 80 km/h, its motion state is low speed. At high speed, a high processing frame rate is required of the computing unit 103; for example, when the speed of the mobile device is 80 to 120 km/h, the frame rate computed by the computing unit is not lower than 30 FPS, that is, the processing delay of each frame does not exceed 33 ms. At low speed, a lower processing frame rate is sufficient; for example, when the speed of the mobile device is 20 km/h, the computed frame rate is 5 FPS, that is, the processing time of each frame is 200 ms. Further, the delay requirement duration of a task can be determined from the per-frame processing delay. For example, when the mobile device is traveling at high speed and the input data required for task 1 are 10 images captured by camera 1 and camera 2, the time required to execute task 1 can be estimated as 10*33=330 ms, that is, the delay requirement duration of task 1 is 330 ms.
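The arithmetic in this example can be sketched as below. The 80 km/h threshold and the 30 FPS and 5 FPS figures come from the FIG. 6 example; applying 5 FPS to every low speed, treating speeds above 120 km/h as high speed, and the function names are assumptions made only for this illustration. The per-frame budget is truncated to whole milliseconds to match the 33 ms figure in the text.

```python
# Example values from the FIG. 6 discussion; a real system may define these differently.
HIGH_SPEED_THRESHOLD_KMH = 80
MIN_FPS = {"high": 30, "low": 5}  # assumption: 5 FPS applies to all low speeds


def motion_state(speed_kmh):
    """Classify the motion state; speeds above 120 km/h are assumed to be 'high' too."""
    return "high" if speed_kmh >= HIGH_SPEED_THRESHOLD_KMH else "low"


def per_frame_budget_ms(state):
    """Per-frame processing budget in whole milliseconds (1000 ms / frame rate)."""
    return 1000 // MIN_FPS[state]


def delay_requirement_ms(num_images, state):
    """Delay requirement duration = number of input images * per-frame budget."""
    return num_images * per_frame_budget_ms(state)


print(delay_requirement_ms(10, motion_state(100)))  # 330
```

The same helper reproduces the low-speed figure: at 20 km/h the per-frame budget is 200 ms.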
Subsequently, the scheduling module 102 performs the following steps 403 to 404 for each of the N delay-sensitive tasks to determine a target computing unit for each delay-sensitive task. In this embodiment of the present invention, the scheduling module 102 may assign target computing units to the N delay-sensitive tasks in descending order of priority level, or in descending order of computation amount. If delay-sensitive tasks have the same priority level, target computing units may be assigned to them in descending order of computation amount.
The following description of steps 403 to 404 uses delay-sensitive task 1 as an example to explain in detail how a target computing unit is assigned to a delay-sensitive task in this embodiment of the present invention.
403. The scheduling module 102 determines the target power consumption margin range of each computing unit 103 of the in-vehicle device and the time each computing unit 103 needs to execute delay-sensitive task 1.
In a specific implementation, the scheduling module 102 may collect the current state of each computing unit and determine the target power consumption margin range of a computing unit 103 through the following two steps:
Step A: The scheduling module 102 first determines the current power consumption of the computing unit, that is, the power consumed by the computing tasks the computing unit is currently performing.
Step B: The scheduling module 102 determines the target power consumption margin range of the computing unit according to the current power consumption of the computing unit and the target power consumption range of the computing unit. It should be noted that the target power consumption range is the range within which the power consumption of the computing unit must be kept for the computing unit to achieve its best performance. The scheduling module 102 may read the target power consumption range of the computing unit from a register; the target power consumption range may also be obtained from the chip or through testing.
It should be noted that the scheduling module 102 may also obtain the maximum power consumption limit of the computing unit. The scheduling module 102 must take this limit into account when assigning tasks: after a task is assigned to a computing unit, the power consumption of the computing unit must not exceed the limit.
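Steps A and B can be sketched as a subtraction of the current power consumption from both ends of the target power consumption range, which is the form (MW0-TW0) to (MW1-TW0) used later in the description. The function names and all wattage values below are hypothetical and serve only to illustrate the calculation.

```python
def margin_range(target_low_w, target_high_w, current_w):
    """Target power consumption margin range: (low - current, high - current)."""
    return (target_low_w - current_w, target_high_w - current_w)


def within_max_limit(current_w, task_w, max_limit_w):
    """Check that assigning the task keeps the unit at or below its maximum power limit."""
    return current_w + task_w <= max_limit_w


# Hypothetical unit: target range 30 W to 50 W, currently drawing 20 W, hard limit 60 W.
low_margin, high_margin = margin_range(30, 50, 20)   # (10, 30)
ok = within_max_limit(current_w=20, task_w=20, max_limit_w=60)
```

A task needing 20 W falls inside the 10 W to 30 W margin and keeps this hypothetical unit under its 60 W limit.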
In addition, the scheduling module 102 may estimate, according to the following formula (1), the time T that a computing unit needs to finish executing a task. When assigning tasks, it is necessary not only to keep the computing unit working at its target power consumption, but also to ensure that the computing unit finishes the task within the task's delay requirement duration t. Therefore T must not exceed t, that is, T is less than or equal to t; only then can the computation of the task be guaranteed to complete.
Figure PCTCN2019088450-appb-000003
In formula (1), Nc and Nm are the numbers of basic computing units inside the computing unit; f_1 is the frequency of the cores of the computing unit; f_2 is the frequency of the memory of the computing unit; C_i is the number of double-precision floating-point operations required to finish the task; and m_j is the memory access frequency required to finish the delay-sensitive task, where the memory access frequency is the number of memory accesses per second.
In this embodiment of the present invention, the capability information of each computing unit may be recorded in a capability information table, including the target power consumption margin range of the computing unit and the time the computing unit needs to execute delay-sensitive task 1. Table 1 is one possible implementation of the capability information table; this embodiment of the present invention does not limit how the capability information table is implemented. Referring to Table 1, the capability information table may also include the target power consumption range and the current power consumption of the computing unit.
Table 1
Figure PCTCN2019088450-appb-000004
It should be noted that Table 1 should contain the capability information of all computing units of the in-vehicle device (the computing units used to compute DNN tasks); Table 1 takes an in-vehicle device with two computing units as an example. In addition, when a target computing unit is being assigned to another delay-sensitive task, Table 1 should contain the time each computing unit of the in-vehicle device needs to execute that delay-sensitive task.
404. The scheduling module 102 determines a target computing unit for delay-sensitive task 1.
It should be noted that the target computing unit determined for delay-sensitive task 1 is the computing unit that performs the computation corresponding to that delay-sensitive task. In some embodiments, the scheduling module 102 may first calculate the power consumption requirement of delay-sensitive task 1 and estimate the delay requirement duration for executing delay-sensitive task 1. For example, the power consumption requirement and the delay requirement duration of a delay-sensitive task may be recorded in a requirement information table. Table 2 is one possible implementation of the requirement information table.
Table 2
Task                             Delay-sensitive task 1 (Task 1)
Power consumption requirement    W1
Delay requirement duration       t1
In a specific implementation, the scheduling module 102 may calculate the power consumption requirement W of a task (which may be a delay-sensitive task) according to formula (2).
W = n * w^   (2)
In formula (2), n is the number of basic operations required to finish the task, and w^ is the power consumed by the computing unit in performing one basic operation. If the task is a neural network task (that is, a DNN task), n is the total number of multiply-accumulate operations of the neural network.
In addition, the delay requirement duration of a task (which may be a delay-sensitive task) is the deadline for finishing the task at the current vehicle speed, that is, the time within which the task must be completed. In this embodiment of the present invention, the processing time of each frame can be determined from the current vehicle speed, and the time required to execute the task, that is, the task's delay requirement duration, can then be estimated from the per-frame processing time. For example, when the mobile device is traveling at high speed and the input data required for task 1 are 10 images captured by camera 1 and camera 2, the time required to execute task 1 can be estimated as 10*33=330 ms, that is, the delay requirement duration of task 1 is 330 ms.
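Formula (2) can be illustrated with a small sketch. Counting the multiply-accumulate operations of a fully connected layer as inputs times outputs is a common convention used here as an assumption; the layer sizes, the per-operation power value, and the function names are hypothetical and are not taken from the embodiment.

```python
def dense_macs(layer_shapes):
    """Total multiply-accumulate count for fully connected layers given as (inputs, outputs).

    Assumption: one dense layer with n_in inputs and n_out outputs needs n_in * n_out MACs.
    """
    return sum(n_in * n_out for n_in, n_out in layer_shapes)


def power_requirement(n_ops, w_per_op):
    """Formula (2): W = n * w^, with w^ the power of one basic operation."""
    return n_ops * w_per_op


n = dense_macs([(4, 3), (3, 2)])             # 4*3 + 3*2 = 18 MACs
W = power_requirement(n, w_per_op=0.5)       # hypothetical per-operation power
```

For a real network, n would be summed over all layers of the DNN, and w^ would be a measured property of the specific computing unit.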
Further, the scheduling module 102 traverses each computing unit 103 of the mobile device and determines whether the power consumption requirement of the delay-sensitive task falls within the target power consumption margin range of the computing unit 103, and whether the computing unit 103 can finish the task within the delay requirement duration of delay-sensitive task 1. For example, after determining the requirement information of delay-sensitive task 1, the scheduling module 102 looks up Table 1 to determine the target computing unit of delay-sensitive task 1. Specifically, for computing unit G0, it first determines whether the time T0 that G0 needs to execute delay-sensitive task 1 is less than or equal to the delay requirement duration t1 of delay-sensitive task 1. If T0 is less than or equal to t1, it further determines whether the power consumption requirement of delay-sensitive task 1 falls within the range (MW0-TW0) to (MW1-TW0); if so, G0 is determined to be the target computing unit of delay-sensitive task 1. If T0 is greater than t1, or the power consumption requirement of delay-sensitive task 1 does not fall within the range (MW0-TW0) to (MW1-TW0), the scheduling module 102 goes on to determine whether computing unit G1 can serve as the target computing unit of delay-sensitive task 1. It first determines whether the time T1 that G1 needs to execute delay-sensitive task 1 is less than or equal to t1. If T1 is less than or equal to t1, it further determines whether the power consumption requirement of delay-sensitive task 1 falls within the range (MW2-TW1) to (MW3-TW1); if so, G1 is determined to be the target computing unit of delay-sensitive task 1.
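The traversal described above can be sketched as a loop that checks, for each computing unit in turn, first the time condition (T less than or equal to t) and then the power condition (W inside the margin range). The `Unit` dataclass, its field names, and the numeric values are assumptions for illustration; they stand in for the rows of the capability information table.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    """One row of a hypothetical capability information table."""
    name: str
    target_low_w: float    # lower bound of the target power consumption range
    target_high_w: float   # upper bound of the target power consumption range
    current_w: float       # current power consumption
    exec_time_ms: float    # estimated time T to execute the task on this unit


def select_target_unit(units, task_power_w, delay_requirement_ms):
    """Return the name of the first unit that meets both conditions, or None."""
    for unit in units:
        if unit.exec_time_ms > delay_requirement_ms:
            continue  # T > t: this unit cannot finish in time
        low = unit.target_low_w - unit.current_w
        high = unit.target_high_w - unit.current_w
        if low <= task_power_w <= high:
            return unit.name
    return None


units = [
    Unit("G0", target_low_w=30, target_high_w=50, current_w=20, exec_time_ms=300),
    Unit("G1", target_low_w=20, target_high_w=33, current_w=15, exec_time_ms=200),
]
chosen = select_target_unit(units, task_power_w=20, delay_requirement_ms=330)
```

With these hypothetical numbers G0 has a margin of 10 W to 30 W and finishes within 330 ms, so it is chosen. A `None` result corresponds to the case where no unit matches and the input data may need to be split.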
Determining whether the power consumption requirement of a delay-sensitive task falls within the target power consumption margin range of a computing unit 103 may proceed as follows: if the power consumption requirement of the delay-sensitive task is 20 W, the target power consumption margin range of computing unit 1 is 10 W to 30 W, and the target power consumption margin range of computing unit 2 is 5 W to 18 W, then the power consumption requirement falls within the target power consumption margin range of computing unit 1, so the target computing unit of the delay-sensitive task is computing unit 1.
In some embodiments, if no computing unit 103 has a target power consumption margin range that matches the power consumption requirement of the delay-sensitive task, the input data of the task may be split so that multiple computing units jointly perform the computation of one delay-sensitive task. Specifically, if, after the scheduling module 102 has traversed all computing units, the power consumption requirement of the delay-sensitive task matches none of the target power consumption margin ranges of the computing units of the mobile device, the scheduling module 102 splits the input data of the delay-sensitive task into at least two sub-data blocks.
Further, for each of the at least two sub-data blocks, the scheduling module 102 determines the power consumption a computing unit requires to compute that sub-data block. It then traverses all computing units and determines whether that power consumption matches the target power consumption margin range of a computing unit. If the target power consumption margin range of a computing unit of the mobile device matches the power consumption, that computing unit is determined to be the target computing unit for the sub-data block. Finally, the target computing units of all sub-data blocks are determined to be the target computing units of the delay-sensitive task.
For example, the input data for executing Task2 are the images captured by camera 2 and camera 3. The data can be divided by camera: the images captured by camera 2 form sub-data block 1, and the images captured by camera 3 form sub-data block 2. If the target power consumption margin range of computing unit G3 covers the power consumption required to compute sub-data block 1, and the target power consumption margin range of computing unit G4 covers the power consumption required to compute sub-data block 2, then the target computing units of Task2 are G3 and G4. In addition, the delay requirement duration of each sub-data block can be determined from the amount of data in that sub-data block, for example, from the number of images it contains.
Of course, for special tasks whose input data must not be split, the scheduling module 102 considers only whether a computing unit can meet the task's delay requirement duration when determining the target computing unit. For example, suppose the input data of delay-sensitive task 1 must not be split, and the scheduling module 102 determines that the power consumption requirement W1 of delay-sensitive task 1 lies neither in the range (MW0-TW0) to (MW1-TW0) nor in the range (MW2-TW1) to (MW3-TW1). If the time T0 that computing unit G0 needs to execute delay-sensitive task 1 is greater than the delay requirement duration t1 of delay-sensitive task 1, and the time T1 that computing unit G1 needs is less than t1, then computing unit G1 is determined to be the target computing unit of delay-sensitive task 1.
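The camera-based split in this example might be sketched as follows. Several simplifications are assumptions made only for illustration: the power needed for a sub-data block is taken as proportional to its image count, each computing unit receives at most one sub-data block, and the names `split_by_camera`, `assign_blocks`, and `watts_per_image` are hypothetical.

```python
def split_by_camera(images):
    """Split (camera_id, image) pairs into one sub-data block per camera."""
    blocks = {}
    for camera_id, image in images:
        blocks.setdefault(camera_id, []).append(image)
    return blocks


def assign_blocks(blocks, margins, watts_per_image):
    """Map each sub-data block to a unit whose margin range covers its power need.

    `margins` maps unit name -> (low, high) target power consumption margin range.
    Returns None if some sub-data block cannot be placed.
    """
    free = dict(margins)          # units still available (assumption: one block per unit)
    assignment = {}
    for camera_id, block in blocks.items():
        need = len(block) * watts_per_image   # assumed linear power model
        unit = next((u for u, (low, high) in free.items() if low <= need <= high), None)
        if unit is None:
            return None
        assignment[camera_id] = unit
        free.pop(unit)
    return assignment


task2_images = [(2, "cam2_a"), (2, "cam2_b"), (3, "cam3_a")]
blocks = split_by_camera(task2_images)
assignment = assign_blocks(blocks, {"G3": (3, 10), "G4": (1, 2.5)}, watts_per_image=2)
```

With these hypothetical margins, the camera 2 block goes to G3 and the camera 3 block goes to G4, mirroring the Task2 example.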
By performing steps 403 to 404 for each delay-sensitive task, the scheduling module 102 determines a target computing unit for each delay-sensitive task.
405. The scheduling module 102 assigns each delay-sensitive task to its target computing unit and instructs the target computing unit to finish executing the delay-sensitive task within the delay requirement duration of that task.
In a specific implementation, the scheduling module 102 generates a task list, which records the target computing unit of each of the N delay-sensitive tasks and the delay requirement duration corresponding to each target computing unit. The computing unit 103 determines from the task list the delay-sensitive tasks it is to execute and finishes each of them within its delay requirement duration. Table 3 below is one possible implementation of the task list.
Table 3
Task     Target computing unit    Delay requirement duration
Task0    G2                       t0
Task1    G1                       t1
Task2    G3, G4                   t2, t3
Referring to Table 3, computing unit G2 must finish all computations of Task0 within t0, and computing unit G1 must finish all computations of Task1 within t1. It should be noted that if a delay-sensitive task is executed jointly by multiple computing units, that is, the data of the delay-sensitive task is split into multiple sub-data blocks and different sub-data blocks are computed by different computing units, then each sub-data block has its own delay requirement duration, determined from the time a computing unit needs to finish computing that sub-data block. For example, Task2 in Table 3 is executed jointly by computing units G3 and G4, where G3 must finish computing sub-data block 1 of Task2 within t2, and G4 must finish computing sub-data block 2 of Task2 within t3.
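One way to picture the task list of Table 3 is as a list of (task, target computing unit, delay requirement duration) entries from which each computing unit looks up its own work. The list layout and the helper name `pending_for` are assumptions for illustration, and the durations are kept symbolic as in the table.

```python
# Entries mirror Table 3; a split task (Task2) contributes one entry per sub-data block.
task_list = [
    ("Task0", "G2", "t0"),
    ("Task1", "G1", "t1"),
    ("Task2", "G3", "t2"),   # sub-data block 1
    ("Task2", "G4", "t3"),   # sub-data block 2
]


def pending_for(unit, entries):
    """Return the (task, delay requirement duration) pairs assigned to `unit`."""
    return [(task, deadline) for task, target, deadline in entries if target == unit]


g4_work = pending_for("G4", task_list)
```

Here `g4_work` contains only the Task2 entry with deadline t3, which is what G4 must complete.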
406. The scheduling module 102 assigns computing units to the non-delay-sensitive tasks.
It should be noted that, after assigning a computing unit to each delay-sensitive task, the scheduling module 102 may also assign computing units to the non-delay-sensitive tasks. Specifically, for each of the M tasks other than the N delay-sensitive tasks (that is, each non-delay-sensitive task), the scheduling module 102 calculates the power consumption requirement of the task, determines as the target computing unit a computing unit whose target power consumption margin range matches the power consumption requirement of the task and which can finish the task within the task's delay requirement duration, and assigns the task to that target computing unit.
Here, the power consumption requirement of the task is the power consumption required to execute the task, and can be calculated with reference to formula (2) above. The delay requirement duration of the task is the time required to execute the task; likewise, the delay requirement duration of a non-delay-sensitive task can be estimated from the amount of its input data. For details, refer to the description in step 402 above of how the delay requirement duration of a delay-sensitive task is estimated; the details are not repeated here.
It should be noted that if the power consumption requirement of a non-delay-sensitive task matches none of the target power consumption margin ranges of the computing units, the non-delay-sensitive task may be assigned to a computing unit that can finish it within its delay requirement duration.
Alternatively, a computing unit need not be assigned to the non-delay-sensitive task immediately. Instead, after steps 401 to 405 have been performed again, the input data of the non-delay-sensitive task can be merged with the input data of other non-delay-sensitive tasks, and a computing unit determined for the merged data. However, the interval between the current moment (the moment at which step 406 is performed this time) and the moment at which step 405 is next completed must not exceed the delay requirement duration of the non-delay-sensitive task.
In a specific implementation, the scheduling module 102 may merge the input data of at least two non-delay-sensitive tasks into one data block.
Further, the scheduling module 102 determines the power consumption a computing unit of the mobile device requires to compute the data block. If the target power consumption margin range of a computing unit of the mobile device matches that power consumption, the data block is assigned to that computing unit, and the computing unit is instructed to finish computing the data block within the delay requirement duration corresponding to the data block.
Here, the delay requirement of the data block is the estimated time required to finish computing the data block, and can be estimated from the amount of data in the data block. For example, if the merged data comprises 10 images and, with the mobile device traveling at high speed, the processing time of each image must not exceed 33 ms, then the time required for the data block can be estimated as 10*33=330 ms, that is, the delay requirement duration of the data block is 330 ms.
示例的,假设执行Task1要计算的数据为摄像头3拍摄的图像,执行Task2要计算的数据为摄像头2拍摄的图像;Task1、Task2均为非时延敏感任务,将Task1、Task2对应的数据合并为一个数据块(即摄像头3、2拍摄的图像),若计算摄像头3、2拍摄的图像所需的功耗落入了运算器G5的目标功耗余量范围内,则指示运算器G5执行Task1、Task2,计算摄像头3、2拍摄的图像。For example, suppose that the data to be calculated by executing Task1 is the image taken by camera 3, and the data to be calculated by executing Task2 is the image taken by camera 2. Task1 and Task2 are both non-delay-sensitive tasks, and the data corresponding to Task1 and Task2 are combined into A data block (ie, images taken by cameras 3 and 2). If the power consumption required to calculate the images taken by cameras 3 and 2 falls within the target power consumption margin of the processor G5, instruct the processor G5 to execute Task1 And Task2 to calculate the images taken by cameras 3 and 2.
It should be noted that merging the input data of non-delay-sensitive tasks must take the computing resources of the arithmetic unit into account: the merged data block should, as far as possible, require less than the computing resources of the arithmetic unit. For example, given three non-delay-sensitive tasks Task4, Task5, and Task6, the data of two of them, Task4 and Task5, may be merged first and an arithmetic unit allocated to the merged data. An arithmetic unit may be allocated to Task6 once new computing resources are released; for example, if an arithmetic unit has power consumption to spare after finishing its computation, Task6 may be allocated to that arithmetic unit.
In addition, whether a task is delay-sensitive or non-delay-sensitive, once the task has been allocated to an arithmetic unit, the arithmetic unit must output the calculation result within the delay requirement duration of the task.
In some embodiments, the scheduling module 102 may allocate tasks with reference to the method shown in FIG. 7. Specifically, referring to FIG. 7, the method includes the following steps:
Step S1: The scheduling module 102 sorts the tasks to be executed in descending order of priority level: Task0, Task1, Task2.
Step S2: The scheduling module 102 takes out the task with the highest priority level, Task0, and calculates the power consumption requirement w0 and the delay requirement duration t0 of Task0.
Step S3: The scheduling module 102 obtains the target power consumption margin range of each arithmetic unit and the time each arithmetic unit needs to execute Task0.
Step S4: It is determined whether the power consumption requirement of Task0 falls within the target power consumption margin range of an arithmetic unit, and whether the time that arithmetic unit needs to execute Task0 is less than or equal to the delay requirement duration of Task0.
Step S5: If the power consumption requirement of Task0 falls within the target power consumption margin range of an arithmetic unit, Task0 is allocated to that arithmetic unit, and the arithmetic unit is instructed that it must return the calculation result of Task0 by time t + t0, where t is the current time.
Step S6: If the power consumption requirement of Task0 does not fall within the target power consumption margin range of any arithmetic unit, the data corresponding to Task0 is split into multiple sub-data blocks, and sub-data blocks with a larger amount of computation are preferentially allocated to arithmetic units with more available resources.
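Steps S1 to S5 can be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation of the application: the class and function names are hypothetical, "falls within the target power consumption margin range" is modelled as the requirement not exceeding (target power minus current power), a larger priority number means a higher priority, and units are tried in the order listed. Step S6 (splitting) is only marked, not implemented.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    name: str
    target_power_w: float   # target power consumption of the arithmetic unit
    current_power_w: float  # power already in use

    @property
    def margin_w(self) -> float:
        # Assumed model: margin = target power - current power
        return self.target_power_w - self.current_power_w


@dataclass
class Task:
    name: str
    priority: int      # assumption: larger value = higher priority
    power_w: float     # power consumption requirement (w0 in step S2)
    deadline_s: float  # delay requirement duration (t0 in step S2)


def pick_unit(task, units, exec_time_s):
    """Steps S3-S4: first unit whose margin covers the power requirement
    and whose execution time does not exceed the delay requirement."""
    for unit in units:
        fits_power = task.power_w <= unit.margin_w
        fits_time = exec_time_s[(task.name, unit.name)] <= task.deadline_s
        if fits_power and fits_time:
            return unit
    return None  # step S6 (split the task's data) would apply here


def schedule(tasks, units, exec_time_s, now_s=0.0):
    plan = {}
    # Step S1: highest priority first
    for task in sorted(tasks, key=lambda t: t.priority, reverse=True):
        unit = pick_unit(task, units, exec_time_s)
        if unit is None:
            plan[task.name] = None
            continue
        unit.current_power_w += task.power_w
        # Step S5: the result is due by t + t0
        plan[task.name] = (unit.name, now_s + task.deadline_s)
    return plan


units = [Unit("G0", 30, 10), Unit("G1", 40, 0)]
tasks = [Task("Task1", 2, 20, 0.05), Task("Task0", 3, 10, 0.05)]
times = {(t.name, u.name): 0.02 for t in tasks for u in units}
print(schedule(tasks, units, times))
```

With these made-up execution times, Task0 lands on G0 and Task1 on G1, each with a due time equal to the current time plus its delay requirement duration.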
The scheduling of delay-sensitive tasks provided by the embodiments of the present invention is described below with reference to the accompanying drawings. Referring to FIG. 8, the delay-sensitive tasks to be executed are, in descending order of priority level, Task0, Task1, and Task2, with power consumption requirements of 10 W, 20 W, and 30 W respectively. The in-vehicle device has two arithmetic units, G0 and G1. Arithmetic unit G0 has a maximum power consumption limit of 40 W and a target power consumption of 30 W; arithmetic unit G1 has a maximum power consumption limit of 60 W and a target power consumption of 40 W. It is assumed that both G0 and G1 satisfy the delay requirement durations of Task0, Task1, and Task2. In the embodiments of the present invention, arithmetic units may be allocated to the tasks in descending order of priority level.
For example, referring to FIG. 8, in the initial state arithmetic unit G0 is already consuming 10 W, so its target power consumption margin is 20 W; arithmetic unit G1 is idle, so its target power consumption margin is 40 W. First, the target power consumption margin of G0 is 20 W and the power consumption requirement of Task0 is 10 W, so the margin of G0 can satisfy the 10 W requirement of Task0. Task0 may therefore be allocated to G0, after which the target power consumption margin of G0 becomes 10 W and that of G1 remains 40 W.
Next, the remaining 10 W margin of G0 cannot satisfy the 20 W power consumption requirement of Task1, but the 40 W target power consumption margin of G1 can, so Task1 may be allocated to G1. After this, the target power consumption margin of G0 remains 10 W and that of G1 becomes 20 W.
Finally, the 30 W power consumption requirement of Task2 is greater than both the 10 W target power consumption margin of G0 and the 20 W target power consumption margin of G1, so the data of Task2 may be split into two parts. One part has a power consumption requirement of 10 W, which matches the 10 W margin of G0, so that part of the data may be allocated to G0 for computation. The other part has a power consumption requirement of 20 W, which matches the 20 W margin of G1, so that part of the data may be allocated to G1 for computation.
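The FIG. 8 walkthrough can be reproduced with a short sketch. The function name is hypothetical, units are tried in listed order (an assumption), and a task that fits no single unit is split so that the larger share goes to the unit with the larger margin, in the spirit of step S6. How the data itself is partitioned to yield those power shares is not modelled here.

```python
def assign_with_split(demands_w, margins_w):
    """Allocate tasks by power requirement, splitting when no unit fits.

    demands_w: list of (task name, watts) in descending priority order.
    margins_w: dict of unit name -> target power consumption margin in W;
               updated in place.
    Returns a list of (task, unit, watts) allocations.
    """
    plan = []
    for task, need in demands_w:
        # Try to place the whole task on one unit
        fit = next((u for u, m in margins_w.items() if m >= need), None)
        if fit is not None:
            margins_w[fit] -= need
            plan.append((task, fit, need))
            continue
        if sum(margins_w.values()) < need:
            raise RuntimeError(f"not enough total margin for {task}")
        # Split: larger share to the unit with the larger margin
        remaining = need
        for unit in sorted(margins_w, key=margins_w.get, reverse=True):
            share = min(margins_w[unit], remaining)
            if share > 0:
                margins_w[unit] -= share
                plan.append((task, unit, share))
                remaining -= share
    return plan


margins = {"G0": 20, "G1": 40}  # initial margins from FIG. 8
demands = [("Task0", 10), ("Task1", 20), ("Task2", 30)]
for row in assign_with_split(demands, margins):
    print(row)
```

Run on the FIG. 8 numbers, this places Task0 on G0, Task1 on G1, and splits Task2 into a 20 W part for G1 and a 10 W part for G0, leaving both margins at zero.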
The scheduling of non-delay-sensitive tasks provided by the embodiments of the present invention is described below with reference to the accompanying drawings. Referring to FIG. 9, the non-delay-sensitive tasks to be executed are, in descending order of priority level, Task0, Task1, and Task2, with power consumption requirements of 10 W, 20 W, and 30 W respectively. The in-vehicle device has two arithmetic units, G0 and G1. Arithmetic unit G0 has a maximum power consumption limit of 40 W and a target power consumption of 30 W; arithmetic unit G1 has a maximum power consumption limit of 60 W and a target power consumption of 40 W. In the initial state, G0 is already consuming 10 W, so its target power consumption margin is 20 W; G1 is idle, so its target power consumption margin is 40 W.
Referring to FIG. 9, for non-delay-sensitive tasks, tasks with lower delay requirements may be merged and the merged task allocated to an arithmetic unit. For example, the data of Task0 and Task1 may be merged into one task. The power consumption requirement of the merged data is 10 W + 20 W = 30 W, which falls within the 40 W target power consumption margin of G1, so the merged task may be allocated to G1. The target power consumption margin of G1 then becomes 10 W, and that of G0 remains 20 W. Neither margin satisfies the 30 W power consumption requirement of Task2, but because Task2 is also a non-delay-sensitive task, no arithmetic unit need be allocated to it for the time being; an arithmetic unit is allocated to Task2 once sufficient computing resources are available.
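A minimal sketch of the merge-and-defer behaviour in the FIG. 9 example follows. The greedy rule used here (merge tasks in priority order while the merged block still fits the unit with the largest margin) is an assumption chosen for illustration; the application does not prescribe a particular merging rule, and the function name is hypothetical.

```python
def merge_and_assign(tasks_w, margins_w):
    """Merge non-delay-sensitive tasks into one block and defer the rest.

    tasks_w: list of (task name, watts) in descending priority order.
    margins_w: dict of unit name -> target power consumption margin in W;
               updated in place.
    Returns (unit or None, merged task names, deferred task names).
    """
    best = max(margins_w, key=margins_w.get)  # unit with the largest margin
    block, block_w, deferred = [], 0, []
    for name, need in tasks_w:
        if block_w + need <= margins_w[best]:
            block.append(name)
            block_w += need
        else:
            deferred.append(name)  # wait until resources are released
    margins_w[best] -= block_w
    return (best if block else None), block, deferred


margins = {"G0": 20, "G1": 40}  # initial margins from FIG. 9
unit, merged, waiting = merge_and_assign(
    [("Task0", 10), ("Task1", 20), ("Task2", 30)], margins
)
print(unit, merged, waiting, margins)
```

On the FIG. 9 numbers, Task0 and Task1 are merged into a 30 W block for G1, Task2 is deferred, and the margins end at 20 W for G0 and 10 W for G1.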
An embodiment of the present invention provides an apparatus, which may be the scheduling module in the in-vehicle device according to the embodiments of the present invention, such as the scheduling module 102 of the in-vehicle device 10 shown in FIG. 2. Where functional modules are divided according to their respective functions, FIG. 10 shows a possible schematic structural diagram of the foregoing apparatus. As shown in FIG. 10, the apparatus includes a determining unit 1001, a calculation unit 1002, and an allocation unit 1003.
The determining unit 1001 is configured to support the apparatus in performing steps 401 and 402 in the foregoing embodiments, and/or other processes of the technology described herein.
The calculation unit 1002 is configured to support the apparatus in performing step 403 in the foregoing embodiments, and/or other processes of the technology described herein.
The allocation unit 1003 is configured to support the apparatus in performing step 405 in the foregoing embodiments, and/or other processes of the technology described herein.
It should be noted that all relevant content of the steps in the foregoing method embodiments may be cited in the functional descriptions of the corresponding functional modules, and is not repeated here.
For example, where an integrated unit is used, a schematic structural diagram of the apparatus provided in the embodiments of this application is shown in FIG. 11. In FIG. 11, the apparatus includes a processing module 1101 and a communication module 1102. The processing module 1101 is configured to control and manage the actions of the apparatus, for example, to perform the steps performed by the determining unit 1001, the calculation unit 1002, and the allocation unit 1003, and/or other processes of the technology described herein. The communication module 1102 is configured to support interaction between the apparatus and other devices. As shown in FIG. 11, the apparatus may further include a storage module 1103, which is configured to store the program code and data of the apparatus.
The task scheduling method provided by the embodiments of the present invention may be applied to the apparatus shown in FIG. 12, which may be the scheduling module 102 described in the embodiments of the present invention. As shown in FIG. 12, the apparatus may include at least one processor 1201, a memory 1202, a transceiver 1203, and a communication bus 1204.
The components of the apparatus are described in detail below with reference to FIG. 12:
The processor 1201 is the control center of the apparatus, and may be a single processor or a collective term for multiple processing elements. For example, the processor 1201 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention, such as one or more digital signal processors (DSPs) or one or more field programmable gate arrays (FPGAs).
The processor 1201 may perform the various functions of the apparatus by running or executing software programs stored in the memory 1202 and invoking data stored in the memory 1202.
In a specific implementation, as an embodiment, the processor 1201 may include one or more CPUs, such as CPU0 and CPU1 shown in FIG. 12.
In a specific implementation, as an embodiment, the apparatus may include multiple processors, such as the processor 1201 and the processor 1205 shown in FIG. 12. Each of these processors may be a single-core processor (single-CPU) or a multi-core processor (multi-CPU). A processor here may refer to one or more devices, circuits, and/or processing cores for processing data (such as computer program instructions).
The memory 1202 may be a read-only memory (ROM) or another type of static storage device capable of storing static information and instructions; a random access memory (RAM) or another type of dynamic storage device capable of storing information and instructions; an electrically erasable programmable read-only memory (EEPROM); a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, and the like); a magnetic disk storage medium or other magnetic storage device; or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory 1202 may exist independently and be connected to the processor 1201 through the communication bus 1204, or it may be integrated with the processor 1201.
The memory 1202 is configured to store the software program that implements the solution of the present invention, and execution is controlled by the processor 1201.
The transceiver 1203, which may be any transceiver-type device, is configured to communicate with other nodes in the system shown in FIG. 1, such as other relay nodes, core nodes, target nodes, or source nodes, or to implement communication between the apparatus and the base station in FIG. 1. It may also be used to communicate with communication networks such as Ethernet, a radio access network (RAN), or a wireless local area network (WLAN). The transceiver 1203 may include a receiving unit that implements the receiving function and a sending unit that implements the sending function.
The communication bus 1204 may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is used in FIG. 12, but this does not mean that there is only one bus or one type of bus.
The device structure shown in FIG. 12 does not limit the apparatus, which may include more or fewer components than shown, combine certain components, or use a different arrangement of components.
In the embodiments of the present invention, the processor 1201 may run the code in the memory 1202 to perform the methods shown in FIG. 4 and FIG. 7 of the embodiments of the present invention.
From the description of the foregoing implementations, those skilled in the art will clearly understand that, for convenience and brevity of description, only the division into the functional modules described above is used as an example. In practical applications, the above functions may be assigned to different functional modules as needed; that is, the internal structure of the apparatus may be divided into different functional modules to perform all or some of the functions described above.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into modules or units is merely a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another apparatus, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may be one physical unit or multiple physical units; that is, they may be located in one place or distributed across multiple places. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a readable storage medium. Based on this understanding, the technical solutions of the embodiments of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions that cause a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to perform all or some of the steps of the methods described in the embodiments of this application. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disc.
The foregoing is merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any change or replacement within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (16)

  1. A task scheduling method, comprising:
    determining a motion state of a mobile device, determining, according to the motion state of the mobile device, M tasks to be executed, and determining N delay-sensitive tasks among the M tasks, wherein M is an integer greater than or equal to 1 and N is an integer less than or equal to M;
    for each of the N delay-sensitive tasks, calculating a power consumption requirement of the delay-sensitive task, determining, as a target arithmetic unit for executing the delay-sensitive task, an arithmetic unit whose target power consumption margin range matches the power consumption requirement of the delay-sensitive task and which is able to finish executing the delay-sensitive task within a delay requirement duration of the delay-sensitive task, and allocating the delay-sensitive task to the target arithmetic unit of the delay-sensitive task;
    wherein the target power consumption margin range is determined according to a target power consumption range of an arithmetic unit of the mobile device and a current power consumption of that arithmetic unit; the power consumption requirement of the delay-sensitive task is the power consumption required to execute the delay-sensitive task; and the delay requirement duration of the delay-sensitive task is the time required to execute the delay-sensitive task.
  2. The method according to claim 1, wherein the determining N delay-sensitive tasks among the M tasks specifically comprises:
    determining a delay sensitivity of each of the M tasks according to the motion state;
    determining a priority level of each of the M tasks according to the delay sensitivity of each of the M tasks, wherein the priority level is directly proportional to the delay sensitivity; and
    determining the N tasks with the higher priority levels among the M tasks as the N delay-sensitive tasks.
  3. The method according to claim 1 or 2, wherein the calculating a power consumption requirement of the delay-sensitive task and determining, as a target arithmetic unit for executing the delay-sensitive task, an arithmetic unit whose target power consumption margin range matches the power consumption requirement of the delay-sensitive task and which is able to finish executing the delay-sensitive task within the delay requirement duration of the delay-sensitive task specifically comprises:
    calculating the power consumption requirement of the delay-sensitive task;
    for each arithmetic unit of the mobile device, determining a duration for which the arithmetic unit executes the delay-sensitive task and the target power consumption margin range of the arithmetic unit;
    determining whether the power consumption requirement of the delay-sensitive task falls within the target power consumption margin range of the arithmetic unit, and whether the duration for which the arithmetic unit executes the delay-sensitive task does not exceed the delay requirement duration of the delay-sensitive task; and
    if the power consumption requirement of the delay-sensitive task falls within the target power consumption margin range of the arithmetic unit and the duration for which the arithmetic unit executes the delay-sensitive task does not exceed the delay requirement duration of the delay-sensitive task, determining the arithmetic unit as the target arithmetic unit of the delay-sensitive task.
  4. The method according to claim 3, wherein the method further comprises:
    if the power consumption requirement of the delay-sensitive task matches none of the target power consumption margin ranges of the arithmetic units of the mobile device, splitting input data of the delay-sensitive task into at least two sub-data blocks;
    for each of the at least two sub-data blocks, determining the power consumption required to compute the sub-data block, and, if the target power consumption margin range of an arithmetic unit of the mobile device matches that power consumption, determining the arithmetic unit as the target arithmetic unit corresponding to the sub-data block; and
    determining the target arithmetic units corresponding to all the sub-data blocks as the target arithmetic units corresponding to the delay-sensitive task.
  5. The method according to any one of claims 1 to 4, wherein the allocating the delay-sensitive task to the target arithmetic unit of the delay-sensitive task specifically comprises:
    calculating execution durations of the N delay-sensitive tasks, and determining the delay requirement duration of each delay-sensitive task according to the execution duration of that delay-sensitive task; and
    generating a task list, wherein the task list records the target arithmetic unit corresponding to each of the N delay-sensitive tasks and the delay requirement duration corresponding to each target arithmetic unit, so that an arithmetic unit of the mobile device determines, according to the task list, a delay-sensitive task to be executed and finishes executing it within the delay requirement duration of that delay-sensitive task.
  6. The method according to claim 2, wherein the method further comprises:
    after each of the N delay-sensitive tasks has been allocated to its corresponding target arithmetic unit, for each of the M tasks other than the N delay-sensitive tasks, calculating a power consumption requirement of the task, determining, as a target arithmetic unit for executing the task, an arithmetic unit whose target power consumption margin range matches the power consumption requirement of the task and which is able to finish executing the task within a delay requirement duration of the task, and allocating the task to the target arithmetic unit of the task;
    wherein the power consumption requirement of the task is the power consumption required to execute the task, and the delay requirement duration of the task is the time required to execute the task.
  7. The method according to claim 3, wherein the duration T for which the arithmetic unit executes the delay-sensitive task satisfies:
    Figure PCTCN2019088450-appb-100001
    wherein $N_c$ and $N_m$ are numbers of basic computing units inside the arithmetic unit; $f_1$ is the frequency of the cores of the arithmetic unit; $f_2$ is the frequency of the memory of the arithmetic unit; $C_i$ is the number of double-precision floating-point operations required to finish executing the delay-sensitive task; and $m_j$ is the memory access frequency required to finish executing the delay-sensitive task, the memory access frequency being the number of memory accesses per second.
  8. The method according to any one of claims 1 to 7, wherein the power consumption requirement $W$ of the delay-sensitive task satisfies $W = n \cdot \hat{w}$, wherein $n$ is the number of basic operations required to finish executing the delay-sensitive task, and $\hat{w}$ is the power consumed by the arithmetic unit in performing one basic operation.
  9. An apparatus, comprising:
    a determining unit, configured to determine a motion state of a mobile device, determine, according to the motion state of the mobile device, M tasks to be executed, and determine N delay-sensitive tasks among the M tasks, wherein M is an integer greater than or equal to 1 and N is an integer less than or equal to M;
    a calculation unit, configured to calculate, for each of the N delay-sensitive tasks, a power consumption requirement of the delay-sensitive task;
    wherein the determining unit is further configured to determine, as a target arithmetic unit for executing the delay-sensitive task, an arithmetic unit whose target power consumption margin range matches the power consumption requirement of the delay-sensitive task and which is able to finish executing the delay-sensitive task within a delay requirement duration of the delay-sensitive task; and
    an allocation unit, configured to allocate the delay-sensitive task to the target arithmetic unit of the delay-sensitive task;
    wherein the target power consumption margin range is determined according to a target power consumption range of an arithmetic unit of the mobile device and a current power consumption of that arithmetic unit; the power consumption requirement of the delay-sensitive task is the power consumption required to execute the delay-sensitive task; and the delay requirement duration of the delay-sensitive task is the time required to execute the delay-sensitive task.
  10. The device according to claim 9, wherein the determining unit is specifically configured to: determine, according to the motion state, the delay sensitivity of each of the M tasks;
    determine the priority level of each of the M tasks according to the delay sensitivity of that task, where the priority level is directly proportional to the delay sensitivity; and
    determine the N tasks with the highest priority levels among the M tasks as the N delay-sensitive tasks.
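The selection in claim 10 can be sketched as a ranking step. This is a minimal sketch under stated assumptions: the `Task` class, the task names, and the sensitivity values are hypothetical, and the priority level is taken to equal the delay sensitivity (a proportionality constant of 1).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    name: str
    delay_sensitivity: float  # assumed to be already derived from the motion state


def select_delay_sensitive(tasks: List[Task], n: int) -> List[Task]:
    """Return the n tasks with the highest priority level.

    Priority is proportional to delay sensitivity, so ranking by
    sensitivity gives the same order as ranking by priority.
    """
    if not 0 <= n <= len(tasks):
        raise ValueError("n must satisfy 0 <= n <= M")
    ranked = sorted(tasks, key=lambda t: t.delay_sensitivity, reverse=True)
    return ranked[:n]


# Hypothetical task set for a moving vehicle.
tasks = [
    Task("obstacle_detection", 0.9),
    Task("map_update", 0.2),
    Task("lane_keeping", 0.7),
]
selected = select_delay_sensitive(tasks, 2)
```

With these values the two selected tasks are `obstacle_detection` and `lane_keeping`, in that order.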
  11. The device according to claim 9 or 10, wherein:
    the calculation unit is specifically configured to calculate the power consumption requirement of the delay-sensitive task, and, for each arithmetic unit of the mobile device, determine the duration for which the arithmetic unit executes the delay-sensitive task and the target power consumption margin range of the arithmetic unit; and
    the determining unit is specifically configured to judge whether the power consumption requirement of the delay-sensitive task falls within the target power consumption margin range of the arithmetic unit and whether the duration for which the arithmetic unit executes the delay-sensitive task does not exceed the delay requirement duration of the delay-sensitive task; and, if the power consumption requirement of the delay-sensitive task falls within the target power consumption margin range of the arithmetic unit and the duration for which the arithmetic unit executes the delay-sensitive task does not exceed the delay requirement duration of the delay-sensitive task, determine that the arithmetic unit is the target arithmetic unit of the delay-sensitive task.
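The two conditions of claim 11 can be sketched as a per-unit check. This is an illustrative sketch only: the `ArithmeticUnit` class, its field names, and the sample numbers are assumptions, and returning the first unit that passes both checks is a simplification the claim does not prescribe.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ArithmeticUnit:
    name: str
    margin_low: float   # lower bound of the target power consumption margin range
    margin_high: float  # upper bound of the target power consumption margin range
    exec_time: float    # time this unit needs to execute the task


def find_target_unit(units: List[ArithmeticUnit],
                     power_req: float,
                     delay_requirement: float) -> Optional[ArithmeticUnit]:
    """Return the first unit that satisfies both conditions, or None."""
    for unit in units:
        # Condition 1: the power requirement falls within the margin range.
        fits_power = unit.margin_low <= power_req <= unit.margin_high
        # Condition 2: the execution time does not exceed the delay requirement.
        meets_delay = unit.exec_time <= delay_requirement
        if fits_power and meets_delay:
            return unit
    return None


# Hypothetical units and task figures.
units = [
    ArithmeticUnit("cpu", margin_low=0.0, margin_high=5.0, exec_time=12.0),
    ArithmeticUnit("gpu", margin_low=5.0, margin_high=20.0, exec_time=4.0),
]
target = find_target_unit(units, power_req=8.0, delay_requirement=10.0)
```

Here `cpu` fails both checks and `gpu` passes both, so `gpu` is chosen. When no unit passes, the function returns `None`, which corresponds to the no-match case handled in claim 12.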
  12. The device according to claim 11, wherein the determining unit is further configured to: if the power consumption requirement of the delay-sensitive task matches none of the target power consumption margin ranges of all the arithmetic units of the mobile device, split the input data of the delay-sensitive task into at least two sub-data blocks;
    for each of the at least one sub-data block, determine the power consumption required to compute the sub-data block, and, if the target power consumption margin range of an arithmetic unit of the mobile device matches that power consumption, determine that the arithmetic unit is the target arithmetic unit corresponding to the sub-data block; and
    determine the target arithmetic units corresponding to all the sub-data blocks as the target arithmetic units corresponding to the delay-sensitive task.
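The fallback in claim 12 can be sketched as follows. Several points are assumptions made only for illustration: the input data is a list split into nearly equal contiguous blocks, the power needed for a block is proportional to its length (`power_per_item`), and each arithmetic unit takes at most one sub-data block. The names `split_blocks` and `assign_blocks` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def split_blocks(data: list, parts: int) -> List[list]:
    """Split data into `parts` nearly equal contiguous sub-data blocks."""
    size, rem = divmod(len(data), parts)
    blocks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < rem else 0)
        blocks.append(data[start:end])
        start = end
    return blocks


def assign_blocks(data: list,
                  units: Dict[str, Tuple[float, float]],
                  power_per_item: float,
                  parts: int = 2) -> Optional[List[Tuple[list, str]]]:
    """Map each sub-data block to a unit whose margin range matches its power need.

    units maps a unit name to its (low, high) target power margin range.
    Returns None if some block cannot be placed.
    """
    free = dict(units)  # units not yet given a block (assumption: one block per unit)
    plan = []
    for block in split_blocks(data, parts):
        need = len(block) * power_per_item
        match = next((name for name, (lo, hi) in free.items() if lo <= need <= hi), None)
        if match is None:
            return None
        plan.append((block, match))
        del free[match]
    return plan


# Hypothetical case: the whole task needs 10 power units, above every margin range.
units = {"cpu": (0.0, 6.0), "gpu": (0.0, 6.0)}
plan = assign_blocks(list(range(10)), units, power_per_item=1.0, parts=2)
```

In this example each half needs 5 power units, so one block goes to `cpu` and the other to `gpu`; both units together then serve as the target arithmetic units of the task.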
  13. The device according to any one of claims 9 to 12, wherein the allocation unit is specifically configured to: calculate the execution durations of the N delay-sensitive tasks, and determine, according to the execution duration of the delay-sensitive task, the delay requirement duration for the execution of the delay-sensitive task; and
    generate a task list, where the task list records the target arithmetic unit corresponding to each of the N delay-sensitive tasks and the delay requirement duration corresponding to each target arithmetic unit, so that an arithmetic unit of the mobile device determines, according to the task list, the delay-sensitive task to be executed and finishes executing it within the delay requirement duration of that task.
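One possible shape for the task list of claim 13 is sketched below. The record layout, the field names, and the helper `tasks_for_unit` are assumptions; in particular, ordering a unit's entries by delay requirement duration is an illustrative choice, not something the claim states.

```python
from typing import Dict, Iterable, List, Tuple

Entry = Dict[str, object]


def build_task_list(assignments: Iterable[Tuple[str, str, float]]) -> List[Entry]:
    """Record, for each task, its target arithmetic unit and delay requirement.

    assignments yields (task_name, unit_name, delay_requirement_ms) tuples.
    """
    return [
        {"task": task, "unit": unit, "delay_requirement_ms": delay}
        for task, unit, delay in assignments
    ]


def tasks_for_unit(task_list: List[Entry], unit_name: str) -> List[Entry]:
    """Entries a given unit should execute, tightest delay requirement first."""
    mine = [e for e in task_list if e["unit"] == unit_name]
    return sorted(mine, key=lambda e: e["delay_requirement_ms"])


# Hypothetical assignments produced by the earlier matching step.
task_list = build_task_list([
    ("lane_keeping", "gpu", 20.0),
    ("obstacle_detection", "gpu", 10.0),
    ("speed_estimation", "cpu", 15.0),
])
gpu_queue = tasks_for_unit(task_list, "gpu")
```

Each arithmetic unit reads only its own entries, which is enough for it to know which delay-sensitive task to run and by when it must finish.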
  14. The device according to claim 10, wherein:
    the calculation unit is further configured to, after each of the N delay-sensitive tasks has been allocated to its corresponding target arithmetic unit, calculate, for each of the M tasks other than the N delay-sensitive tasks, the power consumption requirement of the task; and
    the allocation unit is further configured to determine, as the target arithmetic unit for executing the task, an arithmetic unit whose target power consumption margin range matches the power consumption requirement of the task and which is able to finish executing the task within the delay requirement duration of the task, and to allocate the task to the target arithmetic unit of the delay-sensitive task;
    wherein the power consumption requirement of the task is the power consumption required to execute the task, and the delay requirement duration of the task is the duration required to execute the task.
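The ordering described in claim 14, delay-sensitive tasks first and the remaining tasks afterwards, can be sketched as a two-phase loop. This sketch simplifies the margin range to a single remaining headroom value per unit and reduces that headroom after each allocation; both are assumptions for illustration, as are all names and numbers.

```python
from typing import Dict, List, Tuple

TaskSpec = Tuple[str, float, float]  # (task name, power requirement, delay requirement)


def schedule_two_phase(sensitive: List[TaskSpec],
                       others: List[TaskSpec],
                       headroom: Dict[str, float],
                       exec_time: Dict[Tuple[str, str], float]) -> Dict[str, str]:
    """Allocate delay-sensitive tasks first, then the remaining tasks.

    headroom  -- remaining power margin per unit (assumed single upper bound)
    exec_time -- execution time for each (task, unit) pair
    Returns a mapping task -> unit; tasks that fit nowhere are left out.
    """
    remaining = dict(headroom)  # work on a copy so the caller's dict is untouched
    plan: Dict[str, str] = {}
    for task, need, delay in list(sensitive) + list(others):
        for unit, free in remaining.items():
            if need <= free and exec_time[(task, unit)] <= delay:
                plan[task] = unit
                remaining[unit] = free - need  # the unit now has less margin left
                break
    return plan


# Hypothetical workload.
plan = schedule_two_phase(
    sensitive=[("brake_assist", 4.0, 10.0)],
    others=[("trip_logging", 3.0, 100.0)],
    headroom={"cpu": 5.0, "gpu": 8.0},
    exec_time={
        ("brake_assist", "cpu"): 5.0, ("brake_assist", "gpu"): 3.0,
        ("trip_logging", "cpu"): 50.0, ("trip_logging", "gpu"): 20.0,
    },
)
```

In this example `brake_assist` takes `cpu` first, which leaves too little headroom there for `trip_logging`, so the non-sensitive task is placed on `gpu`.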
  15. The device according to claim 11, wherein the duration T for which the arithmetic unit executes the delay-sensitive task satisfies:
    Figure PCTCN2019088450-appb-100002
    where Nc and Nm are the numbers of basic computing units inside the arithmetic unit; f_1 is the frequency of the cores of the arithmetic unit; f_2 is the frequency of the memory of the arithmetic unit; C_i is the number of double-precision floating-point operations required to complete the delay-sensitive task; and m_j is the memory access frequency required to complete the delay-sensitive task, the memory access frequency being the number of memory accesses per second.
  16. The device according to any one of claims 9 to 15, wherein the power consumption requirement W of the delay-sensitive task satisfies W = n * w^, where n is the number of basic operations required to complete the delay-sensitive task, and w^ is the power consumed by an arithmetic unit in performing one basic operation.
PCT/CN2019/088450 2018-05-28 2019-05-25 Task scheduling method and device WO2019228285A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810525810.7A CN110543148B (en) 2018-05-28 2018-05-28 Task scheduling method and device
CN201810525810.7 2018-05-28

Publications (1)

Publication Number Publication Date
WO2019228285A1 true WO2019228285A1 (en) 2019-12-05

Family

ID=68696622

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/088450 WO2019228285A1 (en) 2018-05-28 2019-05-25 Task scheduling method and device

Country Status (2)

Country Link
CN (1) CN110543148B (en)
WO (1) WO2019228285A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112257486B (en) * 2019-12-23 2023-12-29 北京国家新能源汽车技术创新中心有限公司 Computing force distribution method, computing force distribution device, computer equipment and storage medium
CN111258235A (en) * 2020-01-10 2020-06-09 浙江吉利汽车研究院有限公司 Method, device, equipment and storage medium for realizing vehicle-mounted function

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103096445A (en) * 2013-02-05 2013-05-08 清华大学 Method and system of wireless sensor network task scheduling based on actual battery model
US20130262891A1 (en) * 2012-03-30 2013-10-03 Verizon Patent And Licensing Inc. Method and system for managing power of a mobile device
CN105893126A (en) * 2016-03-29 2016-08-24 华为技术有限公司 Task scheduling method and device
CN107450982A (en) * 2017-06-07 2017-12-08 上海交通大学 A kind of method for scheduling task based on system mode

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101604264B (en) * 2009-07-08 2012-07-25 深圳先进技术研究院 Task scheduling method and system for supercomputer
JP5582016B2 (en) * 2010-12-15 2014-09-03 ソニー株式会社 Task management apparatus, task management method, and program
CN102508708B (en) * 2011-11-30 2014-04-23 湖南大学 Heterogeneous multi-core energy-saving task schedule method based on improved genetic algorithm
US8943252B2 (en) * 2012-08-16 2015-01-27 Microsoft Corporation Latency sensitive software interrupt and thread scheduling
CN103902379A (en) * 2012-12-25 2014-07-02 中国移动通信集团公司 Task scheduling method and device and server cluster
US9323574B2 (en) * 2014-02-21 2016-04-26 Lenovo Enterprise Solutions (Singapore) Pte. Ltd. Processor power optimization with response time assurance
CN104965754A (en) * 2015-03-31 2015-10-07 腾讯科技(深圳)有限公司 Task scheduling method and task scheduling apparatus
CN105260005B (en) * 2015-09-22 2018-09-14 浙江工商大学 Cloud workflow schedule optimization method towards energy consumption
CN107145388B (en) * 2017-05-25 2020-10-30 深信服科技股份有限公司 Task scheduling method and system under multi-task environment
CN107678850B (en) * 2017-10-17 2020-07-07 合肥工业大学 Relay satellite task scheduling method and device

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113759833A (en) * 2020-06-05 2021-12-07 航天科工惯性技术有限公司 Multi-sensor collection task scheduling method
CN113759833B (en) * 2020-06-05 2023-06-06 航天科工惯性技术有限公司 Multi-sensor acquisition task scheduling method

Also Published As

Publication number Publication date
CN110543148A (en) 2019-12-06
CN110543148B (en) 2021-04-09

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 19810048; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 19810048; Country of ref document: EP; Kind code of ref document: A1)