CN111752700B - Hardware selection method and device on processor - Google Patents

Info

Publication number
CN111752700B
Authority
CN
China
Prior art keywords
hardware
sub
service
target
scheme
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910239753.0A
Other languages
Chinese (zh)
Other versions
CN111752700A (en)
Inventor
周智强
叶挺群
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN201910239753.0A
Publication of CN111752700A
Application granted
Publication of CN111752700B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application provides a hardware selection method and device on a processor, and an electronic device, wherein a service to be executed by the processor is divided into a plurality of sub-services, and a plurality of pieces of hardware on the processor are divided into a plurality of hardware groups. The method comprises the following steps: determining the plurality of hardware groups divided on the processor; allocating hardware groups to the sub-services a plurality of times, the hardware groups allocated to the sub-services each time forming one hardware scheme; for each hardware scheme, executing each sub-service with the hardware group allocated to it in that hardware scheme, and calculating the total time consumption of executing the sub-services; and selecting, from the plurality of hardware schemes, the hardware scheme with the least total time consumption as the final hardware scheme. The method can be used to allocate hardware to a service processed on the processor.

Description

Hardware selection method and device on processor
Technical Field
The present application relates to the field of computers, and in particular, to a method and apparatus for selecting hardware on a processor.
Background
As the demands placed on computing devices for service processing keep growing, the processors of computing devices have also evolved rapidly.
To accommodate today's demanding service processing, a processor on a computing device may include multiple pieces of hardware. These may be divided into different hardware groups, each of which is a computing unit that can perform service processing independently.
How to select hardware suited to the service processing, so that the hardware resources on the processor are allocated reasonably, has therefore become a long-standing problem in the industry.
Disclosure of Invention
In view of this, the present application provides a method and apparatus for hardware selection on a processor.
Specifically, the application is realized by the following technical scheme:
According to a first aspect of the present application, there is provided a hardware selection method on a processor, wherein a service to be executed by the processor is divided into a plurality of sub-services and a plurality of pieces of hardware on the processor are divided into a plurality of hardware groups, the method comprising:
determining the plurality of hardware groups divided on the processor;
allocating hardware groups to the sub-services a plurality of times, the hardware groups allocated to the sub-services each time forming one hardware scheme;
for each hardware scheme, executing each sub-service with the hardware group allocated to it in that hardware scheme, and calculating the total time consumption of executing the sub-services;
selecting, from the plurality of hardware schemes, the hardware scheme with the least total time consumption as the final hardware scheme.
Optionally, the plurality of hardware groups are two hardware groups;
the steps of allocating hardware groups to the sub-services a plurality of times to form hardware schemes, executing the sub-services under each hardware scheme and calculating the total time consumption, and selecting the hardware scheme with the least total time consumption comprise:
determining a main hardware group and an auxiliary hardware group among the two hardware groups;
allocating the main hardware group to every sub-service to form a default hardware scheme, executing each sub-service with the hardware group allocated to it in the default hardware scheme, and recording both the sub-service time consumption of each sub-service and the total time consumption of executing the sub-services under the default hardware scheme;
sorting the sub-services a plurality of times, each time in a different specified order, based on the recorded sub-service time consumption of each sub-service;
for each sort, selecting target sub-services one after another in the sorted order, replacing the hardware group of each selected target sub-service with the auxiliary hardware group while leaving the hardware groups of the other sub-services unchanged, thereby forming a plurality of target hardware schemes; for each target hardware scheme, executing each sub-service with the hardware group allocated to it in that target hardware scheme, and calculating the total time consumption of executing the sub-services; and selecting, from the default hardware scheme and the plurality of target hardware schemes, the hardware scheme with the least total time consumption as the candidate hardware scheme for the current sort;
selecting, from the candidate hardware schemes determined for the plurality of sorts, the candidate hardware scheme with the least total time consumption as the final hardware scheme.
Optionally, recording the total time consumption of executing the sub-services comprises:
setting the value of each of a plurality of preset total time consumption variables N to the total time consumption of executing the sub-services under the default hardware scheme, each sort corresponding to one total time consumption variable N;
and the step of forming the target hardware schemes for each sort, calculating their total time consumption, and selecting the candidate hardware scheme for the current sort comprises:
for each sort, determining the total time consumption variable N corresponding to that sort;
selecting the first sub-service in the sort as the target sub-service;
replacing the hardware group of the target sub-service with the auxiliary hardware group while leaving the hardware groups of the other sub-services unchanged, so as to form a target hardware scheme;
executing each sub-service with the hardware group allocated to it in the formed target hardware scheme, and calculating the target total time consumption of executing the sub-services under the target hardware scheme;
if the target total time consumption is less than the value of N corresponding to the current sort, updating the value of N corresponding to the current sort to the target total time consumption, updating the recorded default hardware scheme to the target hardware scheme, removing the target sub-service from the current sort, and returning to the step of selecting the first sub-service in the sort as the target sub-service;
if the target total time consumption is greater than the value of N corresponding to the current sort, determining the recorded default hardware scheme as the candidate hardware scheme for the current sort.
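The loop described above can be sketched in Python as follows. This is only an illustrative sketch, not the applicant's implementation: the function name `candidate_for_ordering`, the group labels `"main"` and `"aux"`, and the toy cost table are all hypothetical. The toy `total_time` simply sums per-sub-service costs, which is an assumption made for the demonstration; the text does not say how total time consumption relates to per-sub-service times. The text also leaves the "equal to N" case and the "sort is exhausted" case open; the sketch stops in both.

```python
def candidate_for_ordering(ordering, default_scheme, default_total,
                           auxiliary_group, total_time):
    """Greedy refinement for one sort; returns (candidate scheme, its total time)."""
    scheme = dict(default_scheme)  # the "recorded default hardware scheme"
    n = default_total              # total time consumption variable N for this sort
    remaining = list(ordering)
    while remaining:               # assumption: stop when the sort is exhausted
        target = remaining[0]      # first sub-service left in the sort
        trial = dict(scheme)
        trial[target] = auxiliary_group  # other sub-services keep their groups
        trial_total = total_time(trial)
        if trial_total < n:
            n = trial_total        # keep the improvement and continue
            scheme = trial
            remaining.pop(0)
        else:
            break                  # assumption: "equal" is treated like "greater"
    return scheme, n


# Hypothetical per-sub-service costs, used only to exercise the sketch.
COST = {
    "s1": {"main": 4.0, "aux": 2.0},
    "s2": {"main": 3.0, "aux": 5.0},
    "s3": {"main": 6.0, "aux": 1.0},
}


def toy_total_time(scheme):
    # Assumed cost model: total time is the sum of per-sub-service costs.
    return sum(COST[name][group] for name, group in scheme.items())


default = {"s1": "main", "s2": "main", "s3": "main"}
default_total = toy_total_time(default)
ascending = candidate_for_ordering(["s2", "s1", "s3"], default, default_total,
                                   "aux", toy_total_time)
descending = candidate_for_ordering(["s3", "s1", "s2"], default, default_total,
                                    "aux", toy_total_time)
```

With these made-up costs the ascending sort stops at its first trial and keeps the default scheme, while the descending sort moves two sub-services to the auxiliary group, which illustrates why the text evaluates more than one sort before choosing the final scheme.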
Optionally, determining the main hardware group and the auxiliary hardware group among the two hardware groups comprises:
determining the computing power of each of the two hardware groups;
selecting the hardware group with the greater computing power as the main hardware group, and the hardware group with the lesser computing power as the auxiliary hardware group.
Optionally, the different specified orders include: ascending order of the sub-service time consumption of the sub-services, and descending order of the sub-service time consumption of the sub-services.
Optionally, the processor is a GPU, and the hardware on the GPU includes: a CUDA Core chip, a Tensor Core chip, and a DLA chip;
one of the two hardware groups includes a CUDA Core chip and a Tensor Core chip; the other hardware group includes a CUDA Core chip and a DLA chip.
According to a second aspect of the present application, there is provided a hardware selection device on a processor, wherein a service to be executed by the processor is divided into a plurality of sub-services and a plurality of pieces of hardware on the processor are divided into a plurality of hardware groups, the device comprising:
a determining unit configured to determine the plurality of hardware groups divided on the processor;
a selecting unit configured to allocate hardware groups to the sub-services a plurality of times, the hardware groups allocated to the sub-services each time forming one hardware scheme; for each hardware scheme, to execute each sub-service with the hardware group allocated to it in that hardware scheme and calculate the total time consumption of executing the sub-services; and to select, from the plurality of hardware schemes, the hardware scheme with the least total time consumption as the final hardware scheme.
Optionally, the plurality of hardware groups are two hardware groups;
the selecting unit is specifically configured to determine a main hardware group and an auxiliary hardware group among the two hardware groups;
to allocate the main hardware group to every sub-service to form a default hardware scheme, execute each sub-service with the hardware group allocated to it in the default hardware scheme, and record both the sub-service time consumption of each sub-service and the total time consumption of executing the sub-services under the default hardware scheme; to sort the sub-services a plurality of times, each time in a different specified order, based on the recorded sub-service time consumption of each sub-service; for each sort, to select target sub-services one after another in the sorted order and replace the hardware group of each selected target sub-service with the auxiliary hardware group while leaving the hardware groups of the other sub-services unchanged, thereby forming a plurality of target hardware schemes; for each target hardware scheme, to execute each sub-service with the hardware group allocated to it in that target hardware scheme and calculate the total time consumption of executing the sub-services; to select, from the default hardware scheme and the plurality of target hardware schemes, the hardware scheme with the least total time consumption as the candidate hardware scheme for the current sort; and to select, from the candidate hardware schemes determined for the plurality of sorts, the candidate hardware scheme with the least total time consumption as the final hardware scheme.
Optionally, the selecting unit is specifically configured to set the value of each of a plurality of preset total time consumption variables N to the total time consumption of executing the service under the default hardware scheme, each sort corresponding to one total time consumption variable N;
when forming the target hardware schemes for each sort, calculating their total time consumption, and selecting the candidate hardware scheme for the current sort, the selecting unit is further configured to perform the following:
for each sort, determining the total time consumption variable N corresponding to that sort;
selecting the first sub-service in the sort as the target sub-service;
replacing the hardware group of the target sub-service with the auxiliary hardware group while leaving the hardware groups of the other sub-services unchanged, so as to form a target hardware scheme;
executing each sub-service with the hardware group allocated to it in the formed target hardware scheme, and calculating the target total time consumption of executing the sub-services under the target hardware scheme;
if the target total time consumption is less than the value of N corresponding to the current sort, updating the value of N corresponding to the current sort to the target total time consumption, updating the recorded default hardware scheme to the target hardware scheme, removing the target sub-service from the current sort, and returning to the step of selecting the first sub-service in the sort as the target sub-service;
if the target total time consumption is greater than the value of N corresponding to the current sort, determining the recorded default hardware scheme as the candidate hardware scheme for the current sort.
Optionally, when determining the main hardware group and the auxiliary hardware group among the two hardware groups, the selecting unit is specifically configured to determine the computing power of each of the two hardware groups, and to select the hardware group with the greater computing power as the main hardware group and the hardware group with the lesser computing power as the auxiliary hardware group.
Optionally, the different specified orders include: ascending order of the sub-service time consumption of the sub-services, and descending order of the sub-service time consumption of the sub-services.
Optionally, the processor is a GPU, and the hardware on the GPU includes: a CUDA Core chip, a Tensor Core chip, and a DLA chip;
one of the two hardware groups includes a CUDA Core chip and a Tensor Core chip; the other hardware group includes a CUDA Core chip and a DLA chip.
As can be seen from the above description, the present application does not allocate a single hardware group to a whole service. Instead, it divides the service into a plurality of sub-services and allocates to each sub-service a hardware group suited to it, so that the total time consumption of executing the sub-services with their allocated hardware groups is minimized. The hardware resources of the processor can therefore be fully utilized when a service is executed.
Drawings
FIG. 1a is a schematic diagram of a service architecture according to an exemplary embodiment of the present application;
FIG. 1b is a schematic diagram of a service division according to an exemplary embodiment of the present application;
FIG. 1c is a block diagram of a processor in accordance with an exemplary embodiment of the present application;
FIG. 2 is a flow chart illustrating a method of hardware selection on a processor in accordance with an exemplary embodiment of the present application;
FIG. 3 is a flowchart illustrating another method of hardware selection on a processor in accordance with an exemplary embodiment of the present application;
FIG. 4a is a schematic diagram of another service division according to an exemplary embodiment of the present application;
FIG. 4b is a block diagram of another processor in accordance with an exemplary embodiment of the present application;
FIG. 4c is a schematic diagram of another method of hardware selection on a processor, according to an exemplary embodiment of the application;
FIG. 5 is a hardware configuration diagram of an electronic device according to an exemplary embodiment of the present application;
FIG. 6 is a block diagram of a hardware selection device on a processor according to an exemplary embodiment of the application.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the application. Rather, they are merely examples of apparatus and methods consistent with aspects of the application as detailed in the accompanying claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the application. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to a determination".
Before describing the hardware selection method on the processor provided by the application, several concepts proposed by the application are described in detail.
1) Service
The services addressed by the present application are mainly services that need to be processed on devices such as computers and servers. For example, the service may be a deep-learning-based video and picture analysis service; it may also be a message analysis service, a dangerous event recognition service, and so on. These are given only as examples and are not specifically limited.
2) Sub-services
Typically, a service is made up of a number of algorithm modules.
For example, as shown in FIG. 1a, the deep-learning-based video and picture analysis service may include: a decoding algorithm module, an algorithm module for the inference computation of algorithm A, an algorithm module for the inference computation of algorithm B, …, and an algorithm module for the inference computation of algorithm N.
Within a service, different algorithm modules may be suited to different hardware groups on the processor. In the embodiments of the present application, a service is therefore divided into a plurality of sub-services, each of which is one algorithm module, so that the hardware group on the processor suited to each algorithm module can be determined.
For the division, the present application may adopt a pipeline structure. Specifically, the service may be divided into a multi-stage pipeline in which each stage is one algorithm module (that is, each stage is one sub-service).
A buffer area is configured between two adjacent sub-services, and the buffer area is used for exchanging data between the two adjacent sub-services.
For example, FIG. 1b is a schematic diagram of a deep-learning-based video and picture analysis service divided into a plurality of sub-services.
The present application assigns the decoding algorithm module to sub-service 1 (also called pipeline 1), the algorithm module for the inference computation of algorithm A to sub-service 2 (also called pipeline 2), …, and the algorithm module for the inference computation of algorithm N to sub-service N+1 (also called pipeline N+1). Sub-service 1, sub-service 2, …, sub-service N+1 are connected in series.
A buffer is configured between every two adjacent sub-services, for example between sub-service 1 and sub-service 2. Sub-service 1 may write the data it generates into the buffer, and sub-service 2 may read that data from the buffer to carry out its own processing.
With a buffer configured between adjacent sub-services, a sub-service does not have to wait for the previous sub-service to hand over its processing result directly; it obtains that result from the buffer instead, so each sub-service can execute its own processing independently.
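One way to picture the serial sub-services and the buffers between them is a chain of threads connected by bounded queues. The Python sketch below is illustrative only: the names `run_stage`, `run_pipeline`, and `SENTINEL`, the buffer size, and the use of threads with `queue.Queue` are assumptions made for the example, not details given in the text.

```python
import queue
import threading

SENTINEL = object()  # hypothetical end-of-stream marker


def run_stage(work, inbox, outbox):
    """One sub-service: read from its input buffer, write to its output buffer."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # propagate shutdown to the next sub-service
            return
        outbox.put(work(item))


def run_pipeline(stages, inputs, buffer_size=4):
    """Connect sub-services in series with one bounded buffer between neighbours."""
    buffers = [queue.Queue(maxsize=buffer_size) for _ in range(len(stages) + 1)]

    def feed():
        for item in inputs:
            buffers[0].put(item)
        buffers[0].put(SENTINEL)

    threads = [threading.Thread(target=feed)]
    threads += [
        threading.Thread(target=run_stage,
                         args=(work, buffers[i], buffers[i + 1]))
        for i, work in enumerate(stages)
    ]
    for thread in threads:
        thread.start()

    results = []
    while True:  # drain the last buffer so upstream stages never stall for good
        item = buffers[-1].get()
        if item is SENTINEL:
            break
        results.append(item)
    for thread in threads:
        thread.join()
    return results


# Stand-ins for a decoding stage and one inference stage.
outputs = run_pipeline([lambda x: x + 1, lambda x: x * 2], range(20))
```

Each stage blocks only on its own input and output buffers, so a stage can start on the next item while the following stage is still busy with the previous one.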
The present application aims to select, for each sub-service, the hardware group on the processor suited to it, so that the total time consumption of executing the sub-services with the selected hardware groups is minimized.
Furthermore, it should be noted that the plurality of sub-services may be implemented as multiple threads or as multiple processes.
3) Hardware group
To accommodate today's demanding service processing, a computer processor may include multiple pieces of hardware. These may be divided into different hardware groups, each of which is a computing unit that can perform service processing independently.
For example, the processors on a computer may include a GPU (Graphics Processing Unit), a CPU (Central Processing Unit), and so on.
As shown in FIG. 1c, when the processor is a GPU, the hardware on the GPU may include: CUDA Core (NVIDIA's name for the stream processors in its GPUs), Tensor Core (a newer type of processing core), a DLA chip (a dedicated chip suited to deep learning), and so on. The hardware included on the GPU is described here only by way of example and is not specifically limited.
The hardware on the GPU may be divided into two hardware groups, one of which (hardware group A in FIG. 1c) consists of CUDA Core and Tensor Core, and the other (hardware group B in FIG. 1c) consists of CUDA Core and DLA. The hardware may of course be grouped in other ways; the grouping is described here only by way of example and is not specifically limited.
The present application aims to provide a hardware selection method for performing service processing. The method divides the service into a plurality of sub-services and allocates hardware groups to the sub-services a plurality of times, the hardware groups allocated each time forming one hardware scheme. The service is then executed under each of the hardware schemes formed, and the hardware scheme with the least total time consumption is selected, using the total time consumption of executing the service as the criterion. The hardware that finally executes each sub-service is the hardware group recorded for that sub-service in the selected hardware scheme.
The present application does not allocate a single hardware group to a whole service. Instead, it divides the service into a plurality of sub-services and allocates to each sub-service a hardware group suited to it, so that the total time consumption of executing the sub-services with their allocated hardware groups is minimized and the hardware resources of the processor can be fully utilized when a service is executed.
Referring to FIG. 2, FIG. 2 is a flowchart of a hardware selection method for performing service processing according to an exemplary embodiment of the present application. The method is applicable to an electronic device and may include the following steps.
Step 201: the electronic device may determine the plurality of hardware groups divided on the processor.
Step 202: the electronic device may allocate hardware groups to the sub-services a plurality of times, the hardware groups allocated to the sub-services each time forming one hardware scheme; for each hardware scheme, execute each sub-service with the hardware group allocated to it in that hardware scheme and calculate the total time consumption of executing the sub-services; and select, from the plurality of hardware schemes, the hardware scheme with the least total time consumption as the final hardware scheme.
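As a rough illustration of step 202 in its general form, the Python sketch below forms hardware schemes, evaluates each one, and keeps the one with the least total time. Enumerating every possible assignment with `itertools.product` is an assumption chosen for simplicity; the text only says that hardware groups are allocated "a plurality of times". The function name `select_scheme`, the sub-service and group labels, and the cost table are hypothetical, and the toy `total_time` stands in for actually executing the sub-services and timing them.

```python
import itertools


def select_scheme(sub_services, hardware_groups, total_time):
    """Return (scheme, total time) for the scheme with the least total time."""
    best_scheme, best_time = None, float("inf")
    for groups in itertools.product(hardware_groups, repeat=len(sub_services)):
        scheme = dict(zip(sub_services, groups))  # sub-service -> hardware group
        elapsed = total_time(scheme)              # execute under this scheme
        if elapsed < best_time:
            best_scheme, best_time = scheme, elapsed
    return best_scheme, best_time


# Hypothetical costs in arbitrary time units, used only for the demonstration.
COST = {
    ("decode", "A"): 5.0, ("decode", "B"): 3.0,
    ("infer", "A"): 2.0, ("infer", "B"): 6.0,
}


def toy_total_time(scheme):
    # Assumed cost model: total time is the sum of per-sub-service costs.
    return sum(COST[(name, group)] for name, group in scheme.items())


best, best_total = select_scheme(["decode", "infer"], ["A", "B"], toy_total_time)
```

Exhaustive enumeration grows exponentially with the number of sub-services, which is presumably one motivation for the optional greedy procedure that the text describes next for the two-group case.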
In an optional implementation, the electronic device may first determine a default hardware scheme, then, taking the default hardware scheme as the baseline, successively replace the hardware groups recorded for sub-services in it to form a plurality of target hardware schemes, and finally select the hardware scheme with the least total time consumption from the default hardware scheme and the target hardware schemes.
The details are as follows.
Step 202 is described in detail below through steps 2021 to 2025, taking as an example the case where the hardware on the processor is divided into two hardware groups.
Step 2021: the electronic device may determine the main hardware group and the auxiliary hardware group among the two hardware groups.
In implementation, the electronic device may determine the computing power of each of the two hardware groups, then take the hardware group with the greater computing power as the main hardware group and the hardware group with the lesser computing power as the auxiliary hardware group.
The computing power may be measured in FLOPS (floating-point operations per second) or by other parameters; the parameter is given here only by way of example and is not specifically limited.
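Step 2021 amounts to comparing one number per hardware group. A minimal Python sketch, assuming the computing power of each group is already available as a FLOPS figure (the function name and the figures below are placeholders, not measured values or values from the text):

```python
def pick_main_and_auxiliary(flops_by_group):
    """Split exactly two hardware groups into (main, auxiliary) by computing power."""
    if len(flops_by_group) != 2:
        raise ValueError("expected exactly two hardware groups")
    # Sort by computing power, strongest first; on a tie the original order is kept.
    ordered = sorted(flops_by_group, key=flops_by_group.get, reverse=True)
    return ordered[0], ordered[1]


# Placeholder figures, for illustration only.
main_group, auxiliary_group = pick_main_and_auxiliary(
    {"hardware group A": 100e12, "hardware group B": 10e12}
)
```

Any other capability metric could be substituted for FLOPS without changing the structure, in line with the note above that the parameter is not limited.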
Step 2022: the electronic equipment distributes the main hardware group to each sub-service respectively to form a default hardware scheme, executes each sub-service by adopting the hardware group distributed for each sub-service in the default hardware scheme, records the time consumption of executing the sub-service of each sub-service by adopting the default hardware scheme, and records the total time consumption of executing each sub-service.
When the method is implemented, the electronic equipment can allocate a main hardware group to each sub-service to form a default hardware scheme. The default hardware scheme includes: and a main hardware group allocated for each sub-service. Of course, other content may be included in the default hardware scheme, which is only exemplary and not specifically limited.
For example, the service is divided into 3 sub-services, namely sub-service 1, sub-service 2 and sub-service 3, and the electronic device may allocate a main hardware group to sub-service 1, a main hardware group to sub-service 2 and a main hardware group to sub-service 3 to form a default hardware scheme. The default hardware scheme comprises the following steps: sub-service 1-main hardware group, sub-service 2-main hardware group, sub-service 3-main hardware group.
Then, the electronic device may execute each sub-service using the hardware group allocated to it as recorded in the default hardware scheme (in other words, execute the service using the default hardware scheme). That is, the electronic device may execute sub-service 1 using the main hardware group, execute sub-service 2 using the main hardware group, and execute sub-service 3 using the main hardware group. It should be noted that, in the present application, executing the service using a hardware scheme means executing each sub-service using the hardware group that the scheme records for it; this is not repeated below.
The electronic device may record the sub-service time consumption of each sub-service executed under the default hardware scheme, as well as the total time consumption of executing all sub-services under the default hardware scheme.
For recording, the electronic device creates a plurality of total time consumption variables N in advance and sets the initial value of each to the total time consumption of executing all sub-services under the default hardware scheme. Each total time consumption variable corresponds to one ordering.
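Step 2022 can be sketched in Python as follows. This is a minimal illustration, not the application's implementation: `measure_sub` and `measure_total` are hypothetical callables standing in for actually running and timing the sub-services, and the function name, the group label `"main"`, the ordering labels, and the timing numbers are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

Scheme = Dict[str, str]  # sub-service name -> hardware group name


def profile_default_scheme(
    sub_services: List[str],
    main_group: str,
    measure_sub: Callable[[str, str], float],
    measure_total: Callable[[Scheme], float],
    ordering_names: List[str],
) -> Tuple[Scheme, Dict[str, float], Dict[str, float]]:
    """Form the default scheme, record timings, and initialise one N per ordering."""
    # Default hardware scheme: every sub-service gets the main hardware group.
    default_scheme = {s: main_group for s in sub_services}
    # Sub-service time consumption of each sub-service under the default scheme.
    sub_times = {s: measure_sub(s, default_scheme[s]) for s in sub_services}
    # Total time consumption of the whole service under the default scheme.
    total = measure_total(default_scheme)
    # One total time consumption variable N per ordering, all starting at that total.
    n_vars = {name: total for name in ordering_names}
    return default_scheme, sub_times, n_vars


# Hypothetical stand-ins for real measurements (illustrative numbers only).
fake_sub_cost = {"sub1": 6.0, "sub2": 18.0, "sub3": 10.0, "sub4": 2.0}
scheme, sub_times, n_vars = profile_default_scheme(
    ["sub1", "sub2", "sub3", "sub4"],
    "main",
    measure_sub=lambda s, g: fake_sub_cost[s],
    measure_total=lambda sch: sum(fake_sub_cost[s] for s in sch),
    ordering_names=["ascending", "descending"],
)
```

In a real system the two callables would launch the sub-services on the processor and time them; passing them in as parameters keeps the sketch independent of any particular runtime.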
Step 2023: the electronic device may sort the sub-services multiple times according to different specified orders based on the recorded sub-service time consumption of each sub-service.
The different specified orders may include: ascending order of sub-service time consumption (from small to large) and descending order of sub-service time consumption (from large to small). These orders are merely illustrative and not specifically limiting.
The following takes these two orders, ascending and descending sub-service time consumption, as an example.
In implementation, the electronic device may sort the sub-services once in ascending order of sub-service time consumption, and sort them once more in descending order of sub-service time consumption.
The ascending ordering corresponds to one total time consumption variable N, and the descending ordering corresponds to another total time consumption variable N.
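The two orderings of Step 2023 can be produced with a pair of sorts. The sketch below is illustrative only; the function name, the ordering labels, and the timing values are assumptions.

```python
from typing import Dict, List


def build_orderings(sub_times: Dict[str, float]) -> Dict[str, List[str]]:
    """Return the sub-services sorted by recorded time consumption, both ways."""
    ascending = sorted(sub_times, key=lambda s: sub_times[s])
    descending = sorted(sub_times, key=lambda s: sub_times[s], reverse=True)
    return {"ascending": ascending, "descending": descending}


# Illustrative timings only (not measured values).
orderings = build_orderings({"sub1": 6.0, "sub2": 18.0, "sub3": 10.0, "sub4": 2.0})
```

Each key of the returned dictionary can then be paired with its own total time consumption variable N.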
Step 2024: for each ordering, target sub-services are selected in turn according to the ordering; the hardware group corresponding to the selected target sub-service is replaced with the auxiliary hardware group while the hardware groups corresponding to the other sub-services remain unchanged, forming a plurality of target hardware schemes; for each target hardware scheme, each sub-service is executed using the hardware group allocated to it in the target hardware scheme, and the total time consumption of executing all sub-services is calculated; from the default hardware scheme and the plurality of target hardware schemes, the hardware scheme with the least total time consumption is selected as the candidate hardware scheme for the current ordering.
How one ordering yields its candidate hardware scheme is described in detail below through steps A to J.
Step A: the electronic device may assign a corresponding total time consumption variable N to the current ordering.
Step B: the electronic device may select the first-ranked sub-service in the current ordering as the target sub-service.
Step C: the electronic device may replace the hardware group corresponding to the target sub-service with the auxiliary hardware group while leaving the hardware groups corresponding to the other sub-services unchanged, thereby forming a target hardware scheme.
Step D: the electronic device may execute the service using the target hardware scheme (in other words, execute each sub-service using the target hardware scheme), and then calculate the target total time consumption of executing the service using the target hardware scheme.
Step E: the electronic device may compare the target total time consumption with the value of the total time consumption variable N assigned to the current ordering.
Step F: if the electronic device determines that the target total time consumption is less than the value of N corresponding to the current ordering, it updates the value of N corresponding to the current ordering to the target total time consumption and updates the recorded default hardware scheme to the target hardware scheme. The electronic device may then remove the target sub-service from the current ordering and return to step B.
Step J: if the electronic device determines that the target total time consumption is greater than the value of N corresponding to the current ordering, it determines the recorded default hardware scheme as the candidate hardware scheme corresponding to the current ordering.
It should be noted that the candidate hardware schemes for the other orderings are obtained in the same way as described above, which is not repeated here.
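Steps A to J amount to a greedy loop over one ordering, which can be sketched as follows. This is a hedged illustration: `measure_total` is a hypothetical callable that runs a scheme and returns its total time consumption, and the toy timing model used to exercise it (two groups running in parallel, each processing its own sub-services serially) is an assumption, since the application obtains the times by actual execution. The text only states what happens when the target total is less than or greater than N; the sketch assumes that a tie also ends the loop, and that the loop ends when the ordering is exhausted.

```python
from typing import Callable, Dict, List, Tuple

Scheme = Dict[str, str]  # sub-service name -> hardware group name


def candidate_for_ordering(
    ordering: List[str],
    default_scheme: Scheme,
    default_total: float,
    aux_group: str,
    measure_total: Callable[[Scheme], float],
) -> Tuple[Scheme, float]:
    """Walk one ordering and return (candidate scheme, its total time)."""
    n = default_total                # Step A: the N assigned to this ordering
    recorded = dict(default_scheme)  # the recorded default hardware scheme
    remaining = list(ordering)
    while remaining:                 # assumption: stop when the ordering is empty
        target = remaining[0]                    # Step B: first-ranked sub-service
        trial = {**recorded, target: aux_group}  # Step C: swap only the target
        trial_total = measure_total(trial)       # Step D: run and time the scheme
        if trial_total < n:                      # Steps E and F: improvement
            n, recorded = trial_total, trial
            remaining.pop(0)
        else:                                    # Step J (ties also stop here)
            break
    return recorded, n


# Toy timing model (an assumption, not from the application): each entry is
# (time on "main", time on "aux"); the two groups run in parallel and each
# processes its own sub-services one after another.
COST = {"a": (5.0, 6.0), "b": (3.0, 4.0)}


def toy_total(scheme: Scheme) -> float:
    main = sum(COST[s][0] for s, g in scheme.items() if g == "main")
    aux = sum(COST[s][1] for s, g in scheme.items() if g == "aux")
    return max(main, aux)


default = {"a": "main", "b": "main"}
candidate, n = candidate_for_ordering(
    ["b", "a"], default, toy_total(default), "aux", toy_total
)
```

Under the toy model, moving `b` to the auxiliary group lowers the total from 8.0 to 5.0 and is kept; moving `a` as well raises it to 10.0, so the loop stops and the scheme with only `b` moved becomes the candidate.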
Step 2025: from the candidate hardware schemes determined by the multiple orderings, the candidate hardware scheme with the least total time consumption is selected as the final hardware scheme.
As seen in step 2024, each ordering yields one candidate hardware scheme. The electronic device may select, from the candidate hardware schemes determined by the multiple orderings, the candidate hardware scheme with the least total time consumption, and determine the hardware that finally executes each sub-service to be the hardware group recorded for that sub-service in the selected candidate hardware scheme.
The hardware selection method provided by the embodiment of the application is described in detail below, taking as an example two specified orders: ascending sub-service time consumption and descending sub-service time consumption.
Referring to fig. 3, fig. 3 is a flowchart illustrating another hardware selection method for performing service processing, which is applicable to an electronic device, according to an exemplary embodiment of the present application, and may include the following steps.
It is assumed that, as shown in fig. 4a, the service is divided into 4 sub-services, sub-service 1, sub-service 2, sub-service 3 and sub-service 4, respectively. A buffer area is configured between the sub-service 1 and the sub-service 2, a buffer area is configured between the sub-service 2 and the sub-service 3, and a buffer area is configured between the sub-service 3 and the sub-service 4.
Assume, as shown in FIG. 4b, that the hardware on the processor includes Cuda Core, Tensor Core, and DLA chips, and that this hardware is divided into two hardware groups, hardware group 1 and hardware group 2. Hardware group 1 is Cuda Core + Tensor Core, and hardware group 2 is Cuda Core + DLA.
It is assumed that two total time consumption variables, a first total time consumption variable N1 and a second total time consumption variable N2, are created in advance on the electronic device. N1 corresponds to the ascending order of sub-service time consumption (herein abbreviated as the first order), and N2 corresponds to the descending order of sub-service time consumption (herein abbreviated as the second order).
Step 301: of the two hardware groups, the one with the greater computing power is determined as the main hardware group, and the one with the lesser computing power is determined as the auxiliary hardware group.
Assuming that the computing power of the hardware group 1 is greater than that of the hardware group 2, the hardware group 1 is determined as a main hardware group, and the hardware group 2 is determined as an auxiliary hardware group.
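Step 301 reduces to comparing two numbers. In the sketch below, the computing-power scores are hypothetical; the application does not say how computing power is quantified, so any single comparable metric is assumed.

```python
from typing import Dict, Tuple


def pick_main_and_aux(power: Dict[str, float]) -> Tuple[str, str]:
    """Return (main group, auxiliary group) for exactly two hardware groups."""
    if len(power) != 2:
        raise ValueError("expected exactly two hardware groups")
    # Sort ascending by computing power: the weaker group comes first.
    aux, main = sorted(power, key=lambda g: power[g])
    return main, aux


# Hypothetical scores, chosen so that hardware group 1 is the stronger one.
main_group, aux_group = pick_main_and_aux({"group1": 100.0, "group2": 60.0})
```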
Step 302: the main hardware group is allocated to each sub-service to form scheme 1; scheme 1 is recorded as the first default hardware scheme corresponding to the first order and as the second default hardware scheme corresponding to the second order.
For example, the electronic device may assign hardware group 1 to each of sub-service 1 through sub-service 4, forming scheme 1.
The first default hardware scheme records: sub-service 1-hardware group 1, sub-service 2-hardware group 1, sub-service 3-hardware group 1, and sub-service 4-hardware group 1.
Step 303: each sub-service is executed using the first default hardware scheme, the sub-service time consumption of each sub-service and the first total time consumption of executing all sub-services are determined, and the values of N1 and N2 are set to the determined first total time consumption.
For example, the electronic device may execute each sub-service using the first default hardware scheme. Specifically, the electronic device may use hardware group 1 to execute sub-service 1, hardware group 1 to execute sub-service 2, hardware group 1 to execute sub-service 3, and hardware group 1 to execute sub-service 4.
The electronic device may record sub-service time consumption t1 of sub-service 1, sub-service time consumption t2 of sub-service 2, sub-service time consumption t3 of sub-service 3, and sub-service time consumption t4 of sub-service 4.
The electronic device may also record a first total time T0 for executing each sub-service using the first default hardware scheme, and update the values of N1 and N2 to T0.
Step 304: the sub-services are sorted in ascending order of the recorded sub-service time consumption.
Step 305: the first-ranked sub-service is selected as the first target sub-service, and the hardware group corresponding to the first target sub-service is replaced with the auxiliary hardware group while the hardware groups corresponding to the other sub-services remain unchanged, forming a first target hardware scheme.
Step 306: each sub-service is executed using the first target hardware scheme, and the second total time consumption of executing all sub-services using the first target hardware scheme is calculated.
Step 307: it is detected whether the second total time consumption is less than the value of N1.
Step 308: if the second total time consumption is less than the value of N1, the first target sub-service is removed from the ordering, the value of N1 is updated to the second total time consumption, the recorded first default hardware scheme is updated to the first target hardware scheme, and the process returns to step 305.
Step 309: if the second total time consumption is greater than the value of N1, the currently recorded first default hardware scheme is determined as the first candidate hardware scheme.
For example, assume that t2 > t3 > t1 > t4.
Sorted in ascending order of sub-service time consumption, the sub-services are: sub-service 4, sub-service 1, sub-service 3, sub-service 2.
The electronic device may then select sub-service 4 as the first target sub-service. Then, the electronic device may replace the hardware group of the sub-service 4 with the hardware group 2, and the sub-services 1 to 3 still correspond to the hardware group 1, so as to form a first target hardware scheme.
Wherein, the first target hardware scheme records: sub-service 1-hardware group 1, sub-service 2-hardware group 1, sub-service 3-hardware group 1, sub-service 4-hardware group 2.
The electronic device may execute each sub-service using the first target hardware scheme, i.e. execute sub-service 1 using hardware group 1, execute sub-service 2 using hardware group 1, execute sub-service 3 using hardware group 1, and execute sub-service 4 using hardware group 2. The electronic device may then calculate the second total time consumption of executing all sub-services using the first target hardware scheme. Assume that the second total time consumption is T1 and that T1 < T0.
The electronic device may compare T1 with the value of N1 (i.e., T0). In this example, since T1 < T0, the electronic device may update the value of N1 to T1, update the recorded first default hardware scheme to the first target hardware scheme (i.e. sub-service 1-hardware group 1, sub-service 2-hardware group 1, sub-service 3-hardware group 1, sub-service 4-hardware group 2), and then remove the first target sub-service (i.e. sub-service 4) from the above ordering. After sub-service 4 is removed, the ordering is: sub-service 1, sub-service 3, sub-service 2.
The electronic device may then select the first-ranked sub-service (i.e., sub-service 1) as the first target sub-service at the current ranking (i.e., sub-service 1, sub-service 3, sub-service 2). Then, the electronic device may replace the hardware group of the sub-service 1 with the hardware group 2, where the hardware groups corresponding to the sub-services 2 to 4 are unchanged, so as to form a first target hardware scheme.
Wherein, the first target hardware scheme records: sub-service 1-hardware group 2, sub-service 2-hardware group 1, sub-service 3-hardware group 1, sub-service 4-hardware group 2.
The electronic device may execute each sub-service using the first target hardware scheme, i.e. execute sub-service 1 using hardware group 2, execute sub-service 2 using hardware group 1, execute sub-service 3 using hardware group 1, and execute sub-service 4 using hardware group 2. The electronic device may then calculate the second total time consumption of executing all sub-services using the first target hardware scheme. Assume that the second total time consumption is T2 and that T2 > T1.
The electronic device may compare T2 with the value of N1 (i.e., T1). In this example, since T2 > T1, the electronic device may take the currently recorded first default hardware scheme (i.e., sub-service 1-hardware group 1, sub-service 2-hardware group 1, sub-service 3-hardware group 1, sub-service 4-hardware group 2) as the first candidate hardware scheme.
Step 310: the sub-services are sorted in descending order of the recorded sub-service time consumption.
Step 311: the first-ranked sub-service is selected as the second target sub-service, and the hardware group corresponding to the second target sub-service is replaced with the auxiliary hardware group while the hardware groups corresponding to the other sub-services remain unchanged, forming a second target hardware scheme.
Step 312: each sub-service is executed using the second target hardware scheme, and the third total time consumption of executing all sub-services using the second target hardware scheme is calculated.
Step 313: it is detected whether the third total time consumption is less than the value of N2.
Step 314: if the third total time consumption is less than the value of N2, the second target sub-service is removed from the ordering, the value of N2 is updated to the third total time consumption, the recorded second default hardware scheme is updated to the second target hardware scheme, and the process returns to step 311.
Step 315: if the third total time consumption is greater than the value of N2, the currently recorded second default hardware scheme is determined as the second candidate hardware scheme.
For example, assume that t2 > t3 > t1 > t4.
Sorted in descending order of sub-service time consumption, the sub-services are: sub-service 2, sub-service 3, sub-service 1, sub-service 4.
The electronic device may then select sub-service 2 as the second target sub-service. Then, the electronic device may replace the hardware group of sub-service 2 with hardware group 2, while sub-service 1, sub-service 3, and sub-service 4 still correspond to hardware group 1, forming a second target hardware scheme.
Wherein, the second target hardware scheme records: sub-service 1-hardware group 1, sub-service 2-hardware group 2, sub-service 3-hardware group 1, sub-service 4-hardware group 1.
The electronic device may execute each sub-service using the second target hardware scheme, i.e. execute sub-service 1 using hardware group 1, execute sub-service 2 using hardware group 2, execute sub-service 3 using hardware group 1, and execute sub-service 4 using hardware group 1. The electronic device may then calculate the third total time consumption of executing all sub-services using the second target hardware scheme. Assume that the third total time consumption is T3 and that T3 < T0.
The electronic device may compare T3 with the value of N2 (i.e., T0). In this example, since T3 < T0, the electronic device may update the value of N2 to T3, update the recorded second default hardware scheme to the second target hardware scheme (i.e. sub-service 1-hardware group 1, sub-service 2-hardware group 2, sub-service 3-hardware group 1, sub-service 4-hardware group 1), and then remove the second target sub-service (i.e. sub-service 2) from the above ordering. After sub-service 2 is removed, the ordering is: sub-service 3, sub-service 1, sub-service 4.
The electronic device may then select the first ranked sub-service (i.e., sub-service 3) as the second target sub-service at the current ranking (i.e., sub-service 3, sub-service 1, sub-service 4). Then, the electronic device may replace the hardware group of the sub-service 3 with the hardware group 2, where the hardware groups corresponding to the sub-service 1, the sub-service 2, and the sub-service 4 are unchanged, to form a second target hardware scheme.
Wherein, the second target hardware scheme records: sub-service 1-hardware group 1, sub-service 2-hardware group 2, sub-service 3-hardware group 2, sub-service 4-hardware group 1.
The electronic device may execute each sub-service using the second target hardware scheme, i.e. execute sub-service 1 using hardware group 1, execute sub-service 2 using hardware group 2, execute sub-service 3 using hardware group 2, execute sub-service 4 using hardware group 1, and then the electronic device may calculate the total time consumption for executing each sub-service using the second target hardware scheme as T4, assuming T4 > T3.
The electronic device may compare T4 with the value of N2 (i.e., T3). In this example, since T4 > T3, the electronic device may take the currently recorded second default hardware scheme (i.e., sub-service 1-hardware group 1, sub-service 2-hardware group 2, sub-service 3-hardware group 1, sub-service 4-hardware group 1) as the second candidate hardware scheme.
Step 316: the electronic device may select, from the first candidate hardware scheme and the second candidate hardware scheme, the hardware scheme with the least total time consumption, and determine the hardware that finally executes each sub-service to be the hardware group recorded for that sub-service in the selected hardware scheme.
For example, from the above examples it can be seen that:
the total time consumption of the first candidate hardware scheme (sub-service 1-hardware group 1, sub-service 2-hardware group 1, sub-service 3-hardware group 1, sub-service 4-hardware group 2) is T1;
the total time consumption of the second candidate hardware scheme (sub-service 1-hardware group 1, sub-service 2-hardware group 2, sub-service 3-hardware group 1, sub-service 4-hardware group 1) is T3;
assuming T1 < T3, the electronic device may determine that the hardware that finally executes each sub-service is the hardware group recorded for that sub-service in the first candidate hardware scheme.
As shown in fig. 4c, the hardware that finally executes the sub-service 1 is the hardware group 1, the hardware that finally executes the sub-service 2 is the hardware group 1, the hardware that finally executes the sub-service 3 is the hardware group 1, and the hardware that finally executes the sub-service 4 is the hardware group 2.
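The final selection in step 316 is a minimum over the candidates. In the sketch below, the two schemes are the ones from the example above, while the numeric totals 34.0 and 35.0 are made-up stand-ins for T1 and T3, chosen only so that T1 < T3 holds; the function name and labels are likewise assumptions.

```python
from typing import Dict, Tuple

Scheme = Dict[str, str]  # sub-service name -> hardware group name


def select_final_scheme(
    candidates: Dict[str, Tuple[Scheme, float]]
) -> Tuple[str, Scheme, float]:
    """Pick the candidate hardware scheme with the least total time consumption."""
    name = min(candidates, key=lambda k: candidates[k][1])
    scheme, total = candidates[name]
    return name, scheme, total


candidates = {
    # First candidate (ascending ordering); 34.0 is a stand-in for T1.
    "first": (
        {"sub1": "group1", "sub2": "group1", "sub3": "group1", "sub4": "group2"},
        34.0,
    ),
    # Second candidate (descending ordering); 35.0 is a stand-in for T3.
    "second": (
        {"sub1": "group1", "sub2": "group2", "sub3": "group1", "sub4": "group1"},
        35.0,
    ),
}
name, final_scheme, final_total = select_final_scheme(candidates)
```

With these stand-in values the first candidate is selected, which matches the assignment shown in fig. 4c.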
As can be seen from the above description, the present application does not allocate a single hardware group to a whole service. Instead, it divides the service into a plurality of sub-services and allocates to each sub-service a hardware group suited to it, so that the total time consumption of executing the sub-services with their allocated hardware groups is minimized. The hardware resources of the processor can therefore be fully utilized when a service is executed.
Referring to fig. 5, fig. 5 is a hardware configuration diagram of an electronic device according to an exemplary embodiment of the present application.
The electronic device includes: a communication interface 501, a processor 502, a machine-readable storage medium 503, and a bus 504; wherein the communication interface 501, the processor 502 and the machine-readable storage medium 503 communicate with each other via a bus 504. The processor 502 may perform the on-processor hardware selection method described above by reading and executing machine-executable instructions in the machine-readable storage medium 503 corresponding to the on-processor hardware selection control logic.
The machine-readable storage medium 503 referred to herein may be any electronic, magnetic, optical, or other physical storage device that may contain or store information, such as executable instructions, data, or the like. For example, a machine-readable storage medium may be volatile memory, nonvolatile memory, or a similar storage medium. In particular, the machine-readable storage medium 503 may be RAM (Random Access Memory), flash memory, a storage drive (e.g., a hard drive), a solid state drive, any type of storage disk (e.g., an optical disk, a DVD, etc.), or a similar storage medium, or a combination thereof.
Referring to fig. 6, fig. 6 is a block diagram of a hardware selection device on a processor according to an exemplary embodiment of the application. The device can be applied to the electronic equipment shown in fig. 5, and can comprise the following units.
A determining unit 601, configured to determine a plurality of hardware groups divided on a processor;
a selecting unit 602, configured to allocate a hardware group for each sub-service multiple times, where each hardware group allocated for each sub-service forms a hardware scheme; aiming at each hardware scheme, executing each sub-service by adopting a hardware group allocated to each sub-service in the hardware scheme, and calculating the total time consumption for executing each sub-service; among the plurality of hardware schemes, the hardware scheme that is least in total time consumption is selected as the final hardware scheme.
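The general behaviour of the selecting unit (form several hardware schemes, time each one, keep the fastest) can be sketched as below. How the schemes are generated is an assumption here: the sketch simply enumerates every possible assignment with `itertools.product`, which is only one way to produce "multiple" schemes and is not stated in the application. `measure_total` and the toy timing model (two groups running in parallel, each processing its own sub-services serially) are hypothetical stand-ins for real execution.

```python
from itertools import product
from typing import Callable, Dict, List, Optional, Tuple

Scheme = Dict[str, str]  # sub-service name -> hardware group name


def best_scheme_exhaustive(
    sub_services: List[str],
    groups: List[str],
    measure_total: Callable[[Scheme], float],
) -> Tuple[Optional[Scheme], float]:
    """Time every possible assignment and return the fastest one."""
    best: Optional[Scheme] = None
    best_total = float("inf")
    for assignment in product(groups, repeat=len(sub_services)):
        scheme = dict(zip(sub_services, assignment))  # one hardware scheme
        total = measure_total(scheme)                 # run and time it
        if total < best_total:
            best, best_total = scheme, total
    return best, best_total


# Toy timing model (an assumption): (time on "main", time on "aux") per sub-service.
COST = {"a": (5.0, 6.0), "b": (3.0, 4.0)}


def toy_total(scheme: Scheme) -> float:
    main = sum(COST[s][0] for s, g in scheme.items() if g == "main")
    aux = sum(COST[s][1] for s, g in scheme.items() if g == "aux")
    return max(main, aux)


best, best_total = best_scheme_exhaustive(["a", "b"], ["main", "aux"], toy_total)
```

Enumeration requires `len(groups) ** len(sub_services)` timed runs, so it is practical only for a handful of sub-services; the ordering-based procedure described for the selecting unit below times far fewer schemes.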
The selecting unit 602 is specifically configured to: determine a main hardware group and an auxiliary hardware group from the two hardware groups;
allocate the main hardware group to each sub-service to form a default hardware scheme, execute each sub-service using the hardware group allocated to it in the default hardware scheme, record the sub-service time consumption of each sub-service executed under the default hardware scheme, and record the total time consumption of executing all sub-services;
sort the sub-services multiple times according to different specified orders based on the recorded sub-service time consumption of each sub-service; for each ordering, select target sub-services in turn according to the ordering, and replace the hardware group corresponding to the selected target sub-service with the auxiliary hardware group while leaving the hardware groups corresponding to the other sub-services unchanged, so as to form a plurality of target hardware schemes;
for each target hardware scheme, execute each sub-service using the hardware group allocated to it in the target hardware scheme, and calculate the total time consumption of executing all sub-services;
select, from the default hardware scheme and the plurality of target hardware schemes, the hardware scheme with the least total time consumption as the candidate hardware scheme for the current ordering;
and select, from the candidate hardware schemes determined by the multiple orderings, the candidate hardware scheme with the least total time consumption as the final hardware scheme.
Optionally, the selecting unit 602 is specifically configured to set the values of a plurality of preset total time consumption variables N to the total time consumption of executing the service using the default hardware scheme, where each ordering corresponds to one total time consumption variable N;
when, for each ordering, selecting target sub-services in turn according to the ordering, replacing the hardware group corresponding to the selected target sub-service with the auxiliary hardware group while leaving the hardware groups corresponding to the other sub-services unchanged so as to form a plurality of target hardware schemes, executing, for each target hardware scheme, each sub-service using the hardware group allocated to it in the target hardware scheme and calculating the total time consumption of executing all sub-services, and selecting, from the default hardware scheme and the plurality of target hardware schemes, the hardware scheme with the least total time consumption as the candidate hardware scheme for the current ordering, the selecting unit 602 is further configured to:
for each ordering, determine the total time consumption variable N corresponding to the ordering;
select the first-ranked sub-service in the ordering as the target sub-service;
replace the hardware group corresponding to the target sub-service with the auxiliary hardware group while leaving the hardware groups corresponding to the other sub-services unchanged, so as to form a target hardware scheme;
execute each sub-service using the hardware group allocated to it in the formed target hardware scheme, and calculate the target total time consumption of executing all sub-services using the target hardware scheme;
if the target total time consumption is less than the value of N corresponding to the current ordering, update the value of N corresponding to the current ordering to the target total time consumption, update the recorded default hardware scheme to the target hardware scheme, remove the target sub-service from the current ordering, and return to the step of selecting the first-ranked sub-service in the ordering as the target sub-service;
and if the target total time consumption is greater than the value of N corresponding to the current ordering, determine the recorded default hardware scheme as the candidate hardware scheme for the current ordering.
Optionally, the selecting unit 602 is specifically configured to determine computing capabilities of two hardware groups when determining the main hardware group and the auxiliary hardware group in the two hardware groups; and selecting a hardware group with large computing power as a main hardware group, and selecting a hardware group with small computing power as an auxiliary hardware group.
Optionally, the different specified sequence includes: the time consumption of the sub-services corresponding to each sub-service is from small to large, and the time consumption of the sub-services corresponding to each sub-service is from large to small.
Optionally, the processor is a GPU, and the hardware on the GPU includes: a Cuda Core chip, a Tensor Core chip, and a DLA chip;
one of the two hardware groups includes: a Cuda Core chip and a Tensor Core chip; the other hardware group includes: a Cuda Core chip and a DLA chip.
The implementation process of the functions and roles of each unit in the above device is specifically shown in the implementation process of the corresponding steps in the above method, and will not be described herein again.
For the device embodiments, reference is made to the description of the method embodiments for the relevant points, since they essentially correspond to the method embodiments. The apparatus embodiments described above are merely illustrative, wherein the elements illustrated as separate elements may or may not be physically separate, and the elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purposes of the present application. Those of ordinary skill in the art will understand and implement the present application without undue burden.
The foregoing describes only preferred embodiments of the application and is not intended to limit it; any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the application shall fall within its scope of protection.

Claims (10)

1. A method of hardware selection on a processor, wherein a service that the processor needs to execute is divided into a plurality of sub-services, and wherein a plurality of hardware on the processor is divided into two hardware groups, the method comprising:
determining a main hardware group and an auxiliary hardware group from the two hardware groups;
allocating the main hardware group to each sub-service to form a default hardware scheme, executing each sub-service using the hardware group allocated to it in the default hardware scheme, recording the sub-service time consumption of each sub-service executed under the default hardware scheme, and recording the total time consumption of executing all sub-services;
sorting the sub-services multiple times according to different specified orders based on the recorded sub-service time consumption of each sub-service;
for each ordering, selecting target sub-services in turn according to the ordering, and replacing the hardware group corresponding to the selected target sub-service with the auxiliary hardware group while leaving the hardware groups corresponding to the other sub-services unchanged, so as to form a plurality of target hardware schemes; for each target hardware scheme, executing each sub-service using the hardware group allocated to it in the target hardware scheme, and calculating the total time consumption of executing all sub-services; selecting, from the default hardware scheme and the plurality of target hardware schemes, the hardware scheme with the least total time consumption as the candidate hardware scheme for the current ordering;
and selecting, from the candidate hardware schemes determined by the multiple orderings, the candidate hardware scheme with the least total time consumption as the final hardware scheme.
2. The method of claim 1, wherein recording the total time consumed for executing the sub-services comprises:
setting the value of each of a plurality of preset total-time-consumption variables N to the total time consumed for executing the sub-services using the default hardware scheme, wherein each ordering corresponds to one total-time-consumption variable N;
and wherein, for each ordering, sequentially selecting target sub-services according to the ordering, and replacing the hardware group corresponding to each selected target sub-service with the auxiliary hardware group while leaving the hardware groups corresponding to the other sub-services unchanged, so as to form a plurality of target hardware schemes; for each target hardware scheme, executing each sub-service using the hardware group allocated to that sub-service in the target hardware scheme, and calculating the total time consumed for executing the sub-services; and selecting, from the default hardware scheme and the plurality of target hardware schemes, the hardware scheme with the least total time consumption as the candidate hardware scheme for the current ordering, comprises:
for each ordering, determining the total-time-consumption variable N corresponding to the ordering;
selecting, according to the ordering, the first sub-service in the ordering as the target sub-service;
replacing the hardware group corresponding to the target sub-service with the auxiliary hardware group while leaving the hardware groups corresponding to the other sub-services unchanged, so as to form a target hardware scheme;
executing each sub-service using the hardware group allocated to that sub-service in the formed target hardware scheme, and calculating the target total time consumption for executing the sub-services using the target hardware scheme;
if the target total time consumption is less than the value of N corresponding to the current ordering, updating the value of N corresponding to the current ordering to the target total time consumption, updating the recorded default hardware scheme to the target hardware scheme, removing the target sub-service from the current ordering, and returning to the step of selecting, according to the ordering, the first sub-service in the ordering as the target sub-service;
and if the target total time consumption is greater than the value of N corresponding to the current ordering, determining the recorded default hardware scheme as the candidate hardware scheme for the current ordering.
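The loop recited in claim 2 can be sketched as a greedy search that stops at the first replacement that fails to improve on N. This is an illustrative sketch only: `greedy_candidate`, `MEASURED` and `measured_total` are hypothetical names, the totals in the table are made-up values rather than measurements, and stopping when the target total equals N is an assumption, since the claim only addresses the "less than" and "greater than" cases.

```python
from typing import Callable, Dict, Sequence, Tuple

# A hardware scheme maps each sub-service name to "main" or "aux".
Scheme = Dict[str, str]


def greedy_candidate(
    ordering: Sequence[str],
    default_scheme: Scheme,
    total_time: Callable[[Scheme], float],
) -> Tuple[Scheme, float]:
    """Return the candidate scheme for one ordering and its total time N."""
    n = total_time(default_scheme)   # variable N for this ordering
    recorded = dict(default_scheme)  # the "recorded default hardware scheme"
    remaining = list(ordering)
    while remaining:
        target = remaining[0]        # first sub-service left in the ordering
        trial = dict(recorded)
        trial[target] = "aux"        # only the target changes group
        target_total = total_time(trial)
        if target_total < n:
            n, recorded = target_total, trial
            remaining.pop(0)         # remove the target, then loop again
        else:
            # Assumption: equality is treated like "greater than" and stops.
            break
    return recorded, n


# Hypothetical totals, keyed by the set of sub-services moved to "aux".
MEASURED = {frozenset(): 6.0, frozenset({"a"}): 5.0, frozenset({"a", "b"}): 6.0}


def measured_total(scheme: Scheme) -> float:
    return MEASURED[frozenset(s for s, g in scheme.items() if g == "aux")]


default = {"a": "main", "b": "main", "c": "main"}
candidate, n = greedy_candidate(["a", "b", "c"], default, measured_total)
print(candidate, n)
```

Here moving "a" lowers the total from 6.0 to 5.0 and is kept; additionally moving "b" raises it back to 6.0, so the search stops and the scheme with only "a" on the auxiliary group becomes the candidate for this ordering.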
3. The method of claim 1, wherein determining the main hardware group and the auxiliary hardware group among the two hardware groups comprises:
determining the computing power of each of the two hardware groups;
and selecting the hardware group with the greater computing power as the main hardware group, and selecting the hardware group with the lesser computing power as the auxiliary hardware group.
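The selection in claim 3 reduces to comparing two numbers. A minimal sketch follows; the function name `split_main_aux`, the group labels and the computing-power figures are hypothetical, and breaking a tie in favor of the first group is an assumption the claim does not address.

```python
from typing import Dict, Tuple


def split_main_aux(computing_power: Dict[str, float]) -> Tuple[str, str]:
    """Return (main_group, aux_group) given the computing power of two groups."""
    if len(computing_power) != 2:
        raise ValueError("exactly two hardware groups are expected")
    (g1, p1), (g2, p2) = computing_power.items()
    # Assumption: on a tie, the first group listed becomes the main group.
    return (g1, g2) if p1 >= p2 else (g2, g1)


# Hypothetical figures in arbitrary units.
main_group, aux_group = split_main_aux({"group_a": 10.0, "group_b": 4.0})
print(main_group, aux_group)
```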
4. The method of claim 1, wherein the different specified orders comprise: ascending order of the sub-service time consumption of each sub-service, and descending order of the sub-service time consumption of each sub-service.
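The two orderings named in claim 4 can be produced directly from the recorded per-sub-service times. The sketch below is illustrative; `build_orderings` and the sub-service names and times are hypothetical.

```python
from typing import Dict, List


def build_orderings(sub_service_time: Dict[str, float]) -> List[List[str]]:
    """Return [ascending, descending] orderings of sub-service names by time."""
    ascending = sorted(sub_service_time, key=sub_service_time.get)
    descending = sorted(sub_service_time, key=sub_service_time.get, reverse=True)
    return [ascending, descending]


# Hypothetical times recorded under the default hardware scheme.
recorded = {"decode": 3.0, "detect": 9.0, "track": 1.5}
ascending, descending = build_orderings(recorded)
print(ascending, descending)
```

Each ordering then gets its own search for a candidate hardware scheme, so trying both directions costs roughly twice the measurement runs of a single ordering.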
5. The method of claim 1, wherein the processor is a GPU, and the hardware on the GPU comprises: a CUDA Core chip, a Tensor Core chip, and a DLA chip;
one of the two hardware groups comprises the CUDA Core chip and the Tensor Core chip, and the other hardware group comprises the CUDA Core chip and the DLA chip.
6. A hardware selection device on a processor, wherein a service that the processor needs to execute is divided into a plurality of sub-services, and a plurality of pieces of hardware on the processor are divided into two hardware groups, the device comprising:
a determining unit, configured to determine a main hardware group and an auxiliary hardware group among the two hardware groups;
a selecting unit, configured to: allocate the main hardware group to each sub-service to form a default hardware scheme, execute each sub-service using the hardware group allocated to that sub-service in the default hardware scheme, record the sub-service time consumption of each sub-service executed under the default hardware scheme, and record the total time consumed for executing the sub-services;
sort the sub-services a plurality of times, according to different specified orders, based on the recorded sub-service time consumption of each sub-service;
for each ordering, sequentially select target sub-services according to the ordering, and replace the hardware group corresponding to each selected target sub-service with the auxiliary hardware group while leaving the hardware groups corresponding to the other sub-services unchanged, so as to form a plurality of target hardware schemes; for each target hardware scheme, execute each sub-service using the hardware group allocated to that sub-service in the target hardware scheme, and calculate the total time consumed for executing the sub-services; and select, from the default hardware scheme and the plurality of target hardware schemes, the hardware scheme with the least total time consumption as the candidate hardware scheme for the current ordering;
and select, from the candidate hardware schemes determined for the plurality of orderings, the candidate hardware scheme with the least total time consumption as the final hardware scheme.
7. The device according to claim 6, wherein the selecting unit, when recording the total time consumed for executing the sub-services, is configured to set the value of each of a plurality of preset total-time-consumption variables N to the total time consumed for executing the sub-services using the default hardware scheme, wherein each ordering corresponds to one total-time-consumption variable N;
and wherein the selecting unit, when, for each ordering, sequentially selecting target sub-services according to the ordering, and replacing the hardware group corresponding to each selected target sub-service with the auxiliary hardware group while leaving the hardware groups corresponding to the other sub-services unchanged, so as to form a plurality of target hardware schemes; for each target hardware scheme, executing each sub-service using the hardware group allocated to that sub-service in the target hardware scheme, and calculating the total time consumed for executing the sub-services; and selecting, from the default hardware scheme and the plurality of target hardware schemes, the hardware scheme with the least total time consumption as the candidate hardware scheme for the current ordering, is further configured to perform:
for each ordering, determining the total-time-consumption variable N corresponding to the ordering;
selecting, according to the ordering, the first sub-service in the ordering as the target sub-service;
replacing the hardware group corresponding to the target sub-service with the auxiliary hardware group while leaving the hardware groups corresponding to the other sub-services unchanged, so as to form a target hardware scheme;
executing each sub-service using the hardware group allocated to that sub-service in the formed target hardware scheme, and calculating the target total time consumption for executing the sub-services using the target hardware scheme;
if the target total time consumption is less than the value of N corresponding to the current ordering, updating the value of N corresponding to the current ordering to the target total time consumption, updating the recorded default hardware scheme to the target hardware scheme, removing the target sub-service from the current ordering, and returning to the step of selecting, according to the ordering, the first sub-service in the ordering as the target sub-service;
and if the target total time consumption is greater than the value of N corresponding to the current ordering, determining the recorded default hardware scheme as the candidate hardware scheme for the current ordering.
8. The device according to claim 6, wherein the determining unit, when determining the main hardware group and the auxiliary hardware group among the two hardware groups, is specifically configured to: determine the computing power of each of the two hardware groups; and select the hardware group with the greater computing power as the main hardware group, and select the hardware group with the lesser computing power as the auxiliary hardware group.
9. The device of claim 6, wherein the different specified orders comprise: ascending order of the sub-service time consumption of each sub-service, and descending order of the sub-service time consumption of each sub-service.
10. The device of claim 6, wherein the processor is a GPU, and the hardware on the GPU comprises: a CUDA Core chip, a Tensor Core chip, and a DLA chip;
one of the two hardware groups comprises the CUDA Core chip and the Tensor Core chip, and the other hardware group comprises the CUDA Core chip and the DLA chip.
CN201910239753.0A 2019-03-27 2019-03-27 Hardware selection method and device on processor Active CN111752700B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910239753.0A CN111752700B (en) 2019-03-27 2019-03-27 Hardware selection method and device on processor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910239753.0A CN111752700B (en) 2019-03-27 2019-03-27 Hardware selection method and device on processor

Publications (2)

Publication Number Publication Date
CN111752700A CN111752700A (en) 2020-10-09
CN111752700B CN111752700B (en) 2023-08-25

Family

ID=72672216

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910239753.0A Active CN111752700B (en) 2019-03-27 2019-03-27 Hardware selection method and device on processor

Country Status (1)

Country Link
CN (1) CN111752700B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1503150A (en) * 2002-11-19 2004-06-09 株式会社东芝 Task allocation method in multiprocessor system, and multiprocessor system
US20070038987A1 (en) * 2005-08-10 2007-02-15 Moriyoshi Ohara Preprocessor to improve the performance of message-passing-based parallel programs on virtualized multi-core processors
CN103488531A (en) * 2013-09-26 2014-01-01 中国船舶重工集团公司第七一六研究所 Software and hardware mixing real-time task scheduling method based on multi-core processor and FPGA

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8479214B2 (en) * 2008-09-30 2013-07-02 Microsoft Corporation Hardware throughput saturation detection
US8650431B2 (en) * 2010-08-24 2014-02-11 International Business Machines Corporation Non-disruptive hardware change

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
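Heuristic algorithm for static task scheduling in multi-core systems; Song Yukun et al.; Journal of Electronic Measurement and Instrumentation; Vol. 32 (No. 5); 134-141 *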

Also Published As

Publication number Publication date
CN111752700A (en) 2020-10-09

Similar Documents

Publication Publication Date Title
US9384053B2 (en) Task allocation optimization system, task allocation optimization method, and non-transitory computer readable medium storing task allocation optimization program
CN103226467A (en) Data parallel processing method and system as well as load balancing scheduler
US20060026052A1 (en) Scheduling system
US8869149B2 (en) Concurrency identification for processing of multistage workflows
CN112232439B (en) Pseudo tag updating method and system in unsupervised ReID
US10831738B2 (en) Parallelized in-place radix sorting
JP6908643B2 (en) Systems and methods for scheduling a collection of non-preemptive tasks in a multi-robot environment
JP2017016541A (en) Information processing apparatus, parallel computing system, job schedule setting program, and job schedule setting method
CN112085644A (en) Multi-column data sorting method and device, readable storage medium and electronic equipment
CN107977275B (en) Task processing method based on message queue and related equipment
US9003419B2 (en) Network balancing procedure that includes redistributing flows on arcs incident on a batch of vertices
CN111767023A (en) Data sorting method and data sorting system
CN112035234B (en) Distributed batch job distribution method and device
CN111752700B (en) Hardware selection method and device on processor
CN108268316A (en) The method and device of job scheduling
CN111736959A (en) Spark task scheduling method considering data affinity under heterogeneous cluster
US20230127869A1 (en) Method and apparatus with process scheduling
US8032439B2 (en) System and method for process scheduling
US8201023B2 (en) Test optimization
CN115421926A (en) Task scheduling method, distributed system, electronic device and storage medium
CN114399228A (en) Task scheduling method and device, electronic equipment and medium
CN113268539A (en) Big data mining task processing method based on cloud computing and big data mining system
JP6753521B2 (en) Computational resource management equipment, computational resource management methods, and programs
US20180365063A1 (en) System allocating links for data packets in an electronic system
CN112100446A (en) Search method, readable storage medium and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant