CN114780241A

CN114780241A - Acceleration card setting method, device and medium applied to server

Info

Publication number: CN114780241A
Application number: CN202210469708.6A
Authority: CN
Inventors: 樊嘉恒; 阚宏伟; 王彦伟; 黄宬
Original assignee: Guangdong Inspur Smart Computing Technology Co Ltd
Current assignee: Guangdong Inspur Smart Computing Technology Co Ltd
Priority date: 2022-04-30
Filing date: 2022-04-30
Publication date: 2022-07-22

Abstract

The invention discloses an accelerator card setting method, device and medium applied to a server, and relates to the technical field of servers. If it is determined that the computing power support data currently required by the server exceeds the computing power support data that the server's current accelerator cards can provide, a corresponding target accelerator card is applied for in an accelerator card resource pool; the IP address of the target accelerator card is sent to the server; and a communication connection between the server and the target accelerator card is established according to the IP address, so that the target accelerator card is put into the computing power support service of the server. This avoids the high maintenance cost of the existing approach, in which a server with insufficient computing power must be replaced by a server with greater computing power support and the entire software environment must be migrated. The accelerator card is applied for dynamically without changing the software environment of the server, which saves maintenance cost and improves the computing power of the server.

Description

Acceleration card setting method, device and medium applied to server

Technical Field

The present invention relates to the field of server technologies, and in particular, to a method, an apparatus, and a medium for setting an accelerator card applied to a server.

Background

In the internet industry, with the spread of informatization and the explosive growth of data volume, higher requirements are placed on computing power. Meanwhile, with the rise of fields such as machine learning, artificial intelligence, autonomous driving and industrial simulation, the Central Processing Unit (CPU) encounters more and more performance bottlenecks when processing massive computations and massive data or images. To meet the demand for diversified computing, more and more scenarios introduce hardware such as Graphics Processing Units (GPUs) and Field-Programmable Gate Arrays (FPGAs) for acceleration, which gives rise to heterogeneous computing. Heterogeneous computing is a special way of computing in which computing units that use different types of instruction sets and different architectures, such as a CPU, a GPU, an Application Specific Integrated Circuit (ASIC), a coprocessor and an FPGA, form a hybrid system.

At present, heterogeneous computing uses accelerator cards in a machine-card binding mode: the required accelerator card device is inserted into the server and is accessed through a Peripheral Component Interconnect Express (PCIE) bus. Software running on the server uses the corresponding accelerator card devices according to its own computing power and algorithm requirements. Fig. 1 is a schematic diagram of an existing accelerator card application. As shown in fig. 1, in server group 1, each of server host1 to server host6 performs computation with a combination of accelerator cards chosen according to the amount of data it currently needs to process. For example, an FPGA, a GPU and a dedicated processor for artificial intelligence computing (XPU) provide the computing power support required by host1, and different applications require different types of accelerator cards. In the existing hardware binding mode, when the algorithm or the computing power requirement of a host changes, the combination of accelerator cards cannot be changed in time. For example, if the computing power requirement on host1 increases, a new accelerator card needs to be added, and field workers have to insert the accelerator card into host1 by hand, so the maintenance cost is high. If the accelerator cards of host1 provide too little computing power and host1 holds too few accelerator cards to match the requirement, another server host with greater computing power support has to be found, the entire software environment of host1 has to be migrated, and the software architecture environment has to be rebuilt, which also results in high maintenance cost.

Therefore, how to reduce the maintenance cost of the host is a problem that those skilled in the art urgently need to solve.

Disclosure of Invention

The invention aims to provide a method, a device and a medium for setting an accelerator card applied to a server, which realize dynamic application of the accelerator card, save maintenance cost and improve computing power capability of the server.

In order to solve the above technical problem, the present invention provides an acceleration card setting method applied to a server, including:

if it is determined that the computing power support data currently required by the server exceeds the computing power support data that the server's current accelerator cards can provide, applying for a corresponding target accelerator card in an accelerator card resource pool, wherein the accelerator card resource pool comprises resources of a plurality of accelerator cards, and the target accelerator card is an accelerator card determined from among the accelerator cards in the accelerator card resource pool;

sending the IP address of the target accelerator card to a server;

and establishing a communication connection between the server and the target accelerator card according to the IP address, so that the target accelerator card is put into the computing power support service of the server.

Preferably, determining that the computing power support data currently required by the server exceeds the computing power support data that the server's current accelerator cards can provide comprises:

receiving data, sent by the server, indicating that increased computing power support is required, so as to determine that the computing power support data currently required by the server exceeds the computing power support data that the server's current accelerator cards can provide.

Preferably, applying for the corresponding target accelerator card in the accelerator card resource pool includes:

determining the type of accelerator card corresponding to the data for which the server requires increased computing power support;

selecting a target resource pool corresponding to the type from a plurality of resource pools of an accelerator card resource pool, wherein the target resource pool comprises a target accelerator card;

determining the idle state of each accelerator card in a target resource pool;

and selecting an accelerator card currently matched with the server from the accelerator cards in the idle state as a target accelerator card.

Preferably, the establishing of the accelerator card resource pool comprises the following steps:

obtaining each acceleration card in advance;

establishing each resource pool according to the type of each accelerator card;

controlling each resource pool and a local switch corresponding to each resource pool to establish connection, wherein each resource pool and each local switch are in one-to-one correspondence;

and controlling the serial connection of each local switch to complete the establishment of the accelerator card resource pool.
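The pool-establishment steps above can be sketched as follows. This is only an illustrative model, not the patent's implementation: the class names, the switch naming scheme (`switch-fpga` and so on) and the IP addresses are hypothetical, and the series connection of local switches is represented simply as a list of neighbouring switch pairs.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class AcceleratorCard:
    card_id: str
    card_type: str  # e.g. "FPGA", "GPU", "XPU"
    ip: str         # hypothetical address of the card


@dataclass
class ResourcePool:
    card_type: str
    local_switch: str  # hypothetical switch name, exactly one per pool
    cards: list = field(default_factory=list)


def build_resource_pools(cards):
    """Group cards by type, give each pool its own local switch, chain the switches."""
    by_type = defaultdict(list)
    for card in cards:
        by_type[card.card_type].append(card)

    # One resource pool per card type, one local switch per resource pool.
    pools = {
        card_type: ResourcePool(card_type, f"switch-{card_type.lower()}", members)
        for card_type, members in by_type.items()
    }
    # Series connection: every local switch is linked to the next one.
    switches = [pool.local_switch for pool in pools.values()]
    uplinks = list(zip(switches, switches[1:]))
    return pools, uplinks


cards = [
    AcceleratorCard("f0", "FPGA", "10.0.1.10"),
    AcceleratorCard("g0", "GPU", "10.0.2.10"),
    AcceleratorCard("f1", "FPGA", "10.0.1.11"),
    AcceleratorCard("x0", "XPU", "10.0.3.10"),
]
pools, uplinks = build_resource_pools(cards)
```

With this input there are three pools (FPGA, GPU, XPU), the FPGA pool holds two cards, and the three local switches are linked in a chain.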

Preferably, the types of the accelerator cards at least include an FPGA accelerator card type, a GPU accelerator card type, and an XPU accelerator card type.

Preferably, the method further comprises the following steps:

when the target accelerator card is put into the computing power support service of the server, acquiring the computing power support data provided by the accelerator cards actually in use by the server, wherein the computing power support data currently required by the server is greater than or equal to the computing power support data provided by the accelerator cards actually in use;

determining the corresponding unused accelerator cards according to the relation between the computing power support data provided by the accelerator cards actually in use and the computing power support data currently required by the server;

and releasing the resources of the unused acceleration card so as to apply for use by the next server.
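One possible reading of the release steps above is sketched below. It assumes, purely for illustration, that computing power support data can be modelled as a single number per card, that the cards kept are chosen largest-first until the requirement is covered, and that a released card is marked idle with a flag value of 0. The function names and the capacity figures are hypothetical.

```python
def split_unused(assigned, required):
    """Return (kept, unused): keep cards until their summed capacity covers `required`."""
    kept, unused, covered = [], [], 0.0
    # Largest cards first, so that fewer cards stay bound to the server.
    for card_id, capacity in sorted(assigned.items(), key=lambda kv: -kv[1]):
        if covered < required:
            kept.append(card_id)
            covered += capacity
        else:
            unused.append(card_id)
    return kept, unused


def release(unused, in_use_flags):
    """Mark released cards as idle (flag 0) so that the next server can apply for them."""
    for card_id in unused:
        in_use_flags[card_id] = 0


# Hypothetical capacities of the cards currently assigned to one server.
assigned = {"fpga-0": 4.0, "gpu-0": 10.0, "xpu-0": 6.0}
flags = {card_id: 1 for card_id in assigned}

kept, unused = split_unused(assigned, required=12.0)
release(unused, flags)
```

Here the GPU and XPU cards together cover the requirement of 12.0, so the FPGA card is identified as unused and returned to the pool.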

Preferably, the method further comprises the following steps:

and polling the idle state of each accelerator card in the accelerator card resource pool at a time interval, so that an accelerator card in the idle state can be applied for on behalf of the server.

In order to solve the above technical problem, the present invention further provides an accelerator card setting apparatus applied to a server, including:

the application module is used for applying for a corresponding target accelerator card in an accelerator card resource pool if it is determined that the computing power support data currently required by the server exceeds the computing power support data that the server's current accelerator cards can provide, wherein the accelerator card resource pool comprises resources of a plurality of accelerator cards, and the target accelerator card is an accelerator card determined from among the accelerator cards in the accelerator card resource pool;

the sending module is used for sending the IP address of the target accelerator card to the server;

and the establishing module is used for establishing a communication connection between the server and the target accelerator card according to the IP address, so that the target accelerator card is put into the computing power support service of the server.

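In order to solve the above technical problem, the present invention further provides another accelerator card setting apparatus applied to a server, including:

a memory for storing a computer program;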

and a processor for implementing the steps of the acceleration card setting method applied to the server as described above when executing the computer program.

In order to solve the above technical problem, the present invention further provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of the acceleration card setting method applied to the server as described above.

The invention provides an accelerator card setting method applied to a server. If it is determined that the computing power support data currently required by the server exceeds the computing power support data that the server's current accelerator cards can provide, a corresponding target accelerator card is applied for in an accelerator card resource pool, wherein the accelerator card resource pool comprises resources of a plurality of accelerator cards, and the target accelerator card is determined from among the accelerator cards in the accelerator card resource pool; the IP address of the target accelerator card is sent to the server; and a communication connection between the server and the target accelerator card is established according to the IP address, so that the target accelerator card is put into the computing power support service of the server. When the computing power of the current server needs to be increased, the corresponding accelerator card can be called directly, which avoids the high maintenance cost of the existing hardware binding mode, in which field workers have to insert the card by hand. It also avoids the high maintenance cost of the existing approach in which a server with insufficient computing power must be replaced by a server with greater computing power support and the entire software environment must be migrated. The software environment of the server does not need to be changed, the accelerator card is applied for dynamically, maintenance cost is saved, and the computing power of the server is improved.

In addition, the invention also provides an accelerator card setting device and medium applied to the server, and the accelerator card setting device and medium have the same beneficial effects as the accelerator card setting method applied to the server.

Drawings

In order to more clearly illustrate the embodiments of the present invention, the drawings required for the embodiments will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained by those skilled in the art without inventive effort.

FIG. 1 is a schematic diagram of a conventional accelerator card application;

fig. 2 is a flowchart of an acceleration card setting method applied to a server according to an embodiment of the present invention;

fig. 3 is a structural diagram of an accelerator card setting apparatus applied to a server according to an embodiment of the present invention;

fig. 4 is a structural diagram of another accelerator card setting apparatus applied to a server according to an embodiment of the present invention;

fig. 5 is a schematic view of an application scenario of an accelerator card setting method applied to a server according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without any creative work belong to the protection scope of the present invention.

The core of the invention is to provide a method, a device and a medium for setting an accelerator card applied to a server, so as to realize dynamic application of the accelerator card, save maintenance cost and improve computing power capability of the server.

In order that those skilled in the art will better understand the disclosure, reference will now be made in detail to the embodiments of the disclosure as illustrated in the accompanying drawings.

It should be noted that the accelerator card setting method applied to a server provided by the present invention is applicable to the technical field of servers. Currently, accelerator cards are used in a machine-card binding mode: the required accelerator card device is inserted into the server and is accessed through the PCIE bus, and software running on the server uses the corresponding accelerator card devices according to its own computing power and algorithm requirements. Compared with ordinary display cards, graphics cards and computing cards, an accelerator card has higher system power consumption and heat output and stronger performance, and it integrates a plurality of processing cores on the same display card substrate. The server can combine several accelerator cards of different types according to the algorithm requirements of the host. For example, host2 shown in fig. 1 combines 2 FPGAs with 1 GPU, host3 combines 2 GPUs with 1 FPGA, and so on. The computing power of the server is not specifically limited and is set according to actual conditions.

Fig. 2 is a flowchart of an accelerator card setting method applied to a server according to an embodiment of the present invention. As shown in fig. 2, the method includes:

S11: if it is determined that the computing power support data currently required by the server exceeds the computing power support data that the server's current accelerator cards can provide, applying for a corresponding target accelerator card in an accelerator card resource pool, wherein the accelerator card resource pool comprises resources of a plurality of accelerator cards, and the target accelerator card is an accelerator card determined from among the accelerator cards in the accelerator card resource pool.

It can be understood that the computing power of the server describes how much data, for example how many characters and how much text, the server can process within 1 s, and how much memory and video memory each item occupies. Computing power is the server's ability to output results after processing data. On a server mainboard, data passes in sequence through the CPU, the internal memory, the hard disk and the network card.

Specifically, if it is determined that the computing power support data currently required by the server exceeds the computing power support data that the server's current accelerator cards can provide, the accelerator card devices running on the server cannot meet the server's current computing power requirement, and an accelerator card device needs to be added. To add an accelerator card device, a corresponding target accelerator card is applied for in the accelerator card resource pool. The server can be in one of two running states. In the first, the running server has no accelerator card device at all and applies to the accelerator card resource pool according to its actual computing power support data. In the second, the server is running a workload that needs greater computing power support than its current accelerator card devices can provide, so additional accelerator card devices are applied for from the accelerator card resource pool on top of the original ones.

There are several ways to determine that the computing power support data currently required by the server exceeds the computing power support data that the server's current accelerator cards can provide. In the first, the server reports to the management service center the computing power support data that it needs to add, and the management service center receives that data. In the second, the management service center actively polls the server's computing power support data at a preset time interval. The management service center stores the computing power support data of the accelerator card devices currently running on the server; when the difference between the actual computing power support data obtained from the server and the stored computing power support data of the currently running accelerator card devices meets a threshold, the currently running accelerator cards cannot satisfy the actual computing power requirement, and after this preliminary judgment the management service center confirms the current running condition with the server. The third way combines the previous two: only after both determinations agree is the corresponding target accelerator card applied for in the accelerator card resource pool.
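The three determination modes can be illustrated with a minimal sketch. It assumes that computing power support data can be compared as plain numbers and that "meets a threshold" means the shortfall is strictly greater than the threshold; the names `Mode` and `needs_more_cards` and the default threshold are hypothetical, and the confirmation exchange with the server is left out.

```python
from enum import Enum


class Mode(Enum):
    REPORT = "report"      # the server reports that it needs more support
    POLL = "poll"          # the management service center polls and compares
    COMBINED = "combined"  # both determinations must agree


def needs_more_cards(mode, reported_need, required, provided, threshold=0.0):
    """Decide whether to apply to the resource pool for a target accelerator card.

    reported_need: True if the server itself reported a need for more support.
    required / provided: polled requirement and stored capacity of current cards.
    """
    polled = (required - provided) > threshold
    if mode is Mode.REPORT:
        return reported_need
    if mode is Mode.POLL:
        return polled
    return reported_need and polled


# The server reported a need, but polling shows only a small shortfall.
by_report = needs_more_cards(Mode.REPORT, True, required=10.5, provided=10.0, threshold=1.0)
by_poll = needs_more_cards(Mode.POLL, True, required=10.5, provided=10.0, threshold=1.0)
by_both = needs_more_cards(Mode.COMBINED, True, required=10.5, provided=10.0, threshold=1.0)
```

In this example the report-only mode triggers an application, while the polling and combined modes do not, because the shortfall of 0.5 is below the assumed threshold of 1.0.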

The method applies for a corresponding target accelerator card from the accelerator card resource pool only when the computing power support data currently required by the server exceeds the computing power support data that the server's current accelerator cards can provide. When it does not, the accelerator card devices currently running on the server already satisfy the server's computing power requirement, and no other accelerator card device needs to be added. This covers two cases. In one, the accelerator card devices currently running on the server exactly meet the server's computing power requirement. In the other, the server has surplus accelerator card devices, and the computing power support data of its accelerator cards is far greater than what the server requires; to make the best use of resources, the surplus accelerator cards can be released back to the accelerator card resource pool so that other servers can apply for them.

When an accelerator card device needs to be added, an application is made to the accelerator card resource pool. In the accelerator card resource pool, accelerator cards of different types are pooled and managed in a unified manner. The accelerator card resource pool contains a plurality of resource pools. A resource pool may be established from accelerator cards of the same type; alternatively, servers with the same attribute may be assigned to a specific resource pool according to the attributes of the computing power support data of different servers, so that the accelerator cards used by different servers are isolated from one another. No specific limitation is made here.

A corresponding target accelerator card is applied for in the accelerator card resource pool. The term target accelerator card is used to distinguish it from the other accelerator cards in the pool: it is the accelerator card determined from among the accelerator cards in the accelerator card resource pool. Specifically, the accelerator card resource pool contains accelerator cards that are currently in service for a server and accelerator cards that are not. To distinguish these states, each accelerator card may carry a status flag; the specific value of the status flag is not limited as long as the states can be told apart.

S12: sending the IP address of the target accelerator card to the server.

After the corresponding target accelerator card has been applied for in step S11, its Internet Protocol (IP) address is obtained. An IP address is comparable to a person's identity card: it identifies an address under the TCP/IP communication protocol. An IP configuration consists of four parts, namely an IP address, a subnet mask, a default gateway, and a Domain Name System (DNS) server. Subnetting is commonly used together with Virtual Local Area Networks (VLANs), which reduces network traffic, improves network performance, simplifies management, and makes the network easy to expand. Because the switches in the accelerator card resource pool need to be networked, IP addresses are configured on the VLAN interfaces of the switches. A VLAN switch can belong to multiple LANs at the same time, and a VLAN virtual interface of a switch is carried on physical ports; that is, once a physical port belonging to a certain VLAN is UP, the VLAN virtual interface is UP. For a layer-3 switch, IP addresses can be configured for multiple VLAN interfaces, and each VLAN interface is reachable by default; for a layer-2 switch, the IP address configured for the VLAN interface can only be used for management.
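As a small illustration of how an IP address, a subnet mask and a default gateway relate, the following sketch uses Python's standard `ipaddress` module. The concrete addresses are hypothetical and are not taken from the patent; DNS settings are not modelled here.

```python
import ipaddress

# Hypothetical address of a target accelerator card, written with its prefix length.
iface = ipaddress.ip_interface("192.168.10.21/24")
# Hypothetical default gateway of the same subnet.
gateway = ipaddress.ip_address("192.168.10.1")

card_ip = str(iface.ip)            # the address that would be sent to the server
netmask = str(iface.netmask)       # subnet mask derived from the /24 prefix
subnet = str(iface.network)        # the subnet the card belongs to
gateway_in_subnet = gateway in iface.network

print(card_ip)            # 192.168.10.21
print(netmask)            # 255.255.255.0
print(subnet)             # 192.168.10.0/24
print(gateway_in_subnet)  # True
```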

S13: establishing a communication connection between the server and the target accelerator card according to the IP address, so that the target accelerator card is put into the computing power support service of the server.

A communication connection between the server and the target accelerator card is established according to the configured IP address. In particular, a link between switches needs to be able to carry data of multiple VLANs at the same time. So that the switches can identify data frames from different VLANs, a VLAN Tag must be inserted into the frame header before the frame leaves the local switch. Once the data frame carries the VLAN Tag, the VLAN to which the frame belongs can be identified even if the transmission path passes through several switches. When a switch forwards the frame to the destination host, the VLAN Tag is removed and the frame is converted back into a standard Ethernet frame. The established communication connection allows the target accelerator card to be put into the computing power support service of the server.
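The tagging and untagging of frames described above can be sketched at byte level. The sketch assumes the VLAN Tag is a standard IEEE 802.1Q tag (the patent does not name the standard): 4 bytes inserted after the destination and source MAC addresses, consisting of the tag protocol identifier 0x8100 and a 16-bit field holding a 3-bit priority, a 1-bit drop-eligible indicator and a 12-bit VLAN ID. The function names and the sample frame are hypothetical.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an IEEE 802.1Q tag


def add_vlan_tag(frame: bytes, vlan_id: int, priority: int = 0) -> bytes:
    """Insert an 802.1Q tag after the destination and source MAC addresses."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be in 1..4094")
    if not 0 <= priority <= 7:
        raise ValueError("priority must be in 0..7")
    tci = (priority << 13) | vlan_id  # priority (3 bits) | DEI (1 bit, 0) | VLAN ID (12 bits)
    tag = struct.pack("!HH", TPID_8021Q, tci)
    return frame[:12] + tag + frame[12:]


def strip_vlan_tag(frame: bytes):
    """Remove the 802.1Q tag; return (vlan_id, standard Ethernet frame)."""
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != TPID_8021Q:
        raise ValueError("frame carries no 802.1Q tag")
    return tci & 0x0FFF, frame[:12] + frame[16:]


# Hypothetical untagged frame: destination MAC, source MAC, EtherType (IPv4), payload.
untagged = bytes(6) + bytes.fromhex("020000000001") + b"\x08\x00" + b"payload"
tagged = add_vlan_tag(untagged, vlan_id=100)
vlan_id, restored = strip_vlan_tag(tagged)
```

The tagged frame is 4 bytes longer than the original, and stripping the tag at the last switch yields the original frame together with the VLAN ID it travelled under.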

For example, host1 applies for 1 FPGA, 1 GPU and 1 XPU. The management service center then tells server host1 the IP addresses of these 3 cards and, by means of a VLAN, groups host1 and the 3 accelerator cards into a closed virtual network. This isolates them from other devices on the network and ensures that other devices cannot access them and interfere with their operation and execution.
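The isolation in this example can be modelled with a simple membership table. This is a conceptual sketch only: real isolation is enforced by the switches, and the class name `VlanTable`, the VLAN IDs and the IP addresses are hypothetical.

```python
class VlanTable:
    """Toy model of VLAN membership: two endpoints can talk only if they share a VLAN."""

    def __init__(self):
        self._members = {}  # VLAN ID -> set of endpoint names or IP addresses

    def create(self, vlan_id, endpoints):
        self._members[vlan_id] = set(endpoints)

    def can_reach(self, a, b):
        return any(a in members and b in members for members in self._members.values())


table = VlanTable()

# Hypothetical IP addresses of the FPGA, GPU and XPU cards granted to host1.
card_ips = ["10.0.1.10", "10.0.2.10", "10.0.3.10"]
table.create(100, ["host1", *card_ips])

# Another server with its own card sits in a different VLAN.
table.create(101, ["host2", "10.0.2.11"])

host1_reaches_gpu = table.can_reach("host1", "10.0.2.10")
host2_reaches_gpu = table.can_reach("host2", "10.0.2.10")
```

In this model host1 can reach all three of its cards, while host2 cannot reach any of them.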

In the accelerator card setting method applied to a server provided by this embodiment, if it is determined that the computing power support data currently required by the server exceeds the computing power support data that the server's current accelerator cards can provide, a corresponding target accelerator card is applied for in an accelerator card resource pool, wherein the accelerator card resource pool comprises resources of a plurality of accelerator cards, and the target accelerator card is an accelerator card determined from among the accelerator cards in the accelerator card resource pool; the IP address of the target accelerator card is sent to the server; and a communication connection between the server and the target accelerator card is established according to the IP address, so that the target accelerator card is put into the computing power support service of the server. When the computing power of the current server needs to be increased, the corresponding accelerator card can be called directly, which avoids the high maintenance cost of the existing hardware binding mode, in which field workers have to insert the card by hand. It also avoids the high maintenance cost of the existing approach in which a server with insufficient computing power must be replaced by a server with greater computing power support and the entire software environment must be migrated. The accelerator card is applied for dynamically without changing the software environment of the server, which saves maintenance cost and improves the computing power of the server.

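On the basis of the above embodiment, determining that the computing power support data currently required by the server exceeds the computing power support data that the server's current accelerator cards can provide specifically includes:

receiving data, sent by the server, indicating that increased computing power support is required, so as to determine that the computing power support data currently required by the server exceeds the computing power support data that the server's current accelerator cards can provide.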

Specifically, in this embodiment the management service center passively receives data sent by the server; in other words, the server actively sends the data. The data indicates that the server needs increased computing power support. When the server sends such data, it is determined that the computing power support data currently required by the server exceeds the computing power support data that the server's current accelerator cards can provide.

Besides the determination mode of this embodiment, the management service center may actively acquire the server's computing power support data: it polls the server's computing power support data at a preset time interval and stores the computing power support data corresponding to the accelerator card devices currently running on the server. When the difference between the actual computing power support data obtained from the server and the computing power support data of the currently running accelerator card devices meets a threshold, the currently running accelerator cards cannot satisfy the actual computing power requirement, and the current running condition is confirmed once the judgment is complete.

Another determination mode combines the two modes above and applies to the accelerator card resource pool for the corresponding target accelerator card only after this double determination. The present invention is not particularly limited in this respect, and the mode may be set according to actual conditions.

In this embodiment, data sent by the server indicating that increased computing power support is required is received, so as to determine that the computing power support data currently required by the server exceeds the computing power support data that the server's current accelerator cards can provide. This simplifies the determination: the management service center acts directly on the data received from the server, which speeds up the setting of the accelerator card.

On the basis of the foregoing embodiment, the applying for the corresponding target accelerator card in the accelerator card resource pool in step S11 specifically includes:

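determining the type of accelerator card corresponding to the data for which the server requires increased computing power support;

selecting a target resource pool corresponding to the type from a plurality of resource pools of the accelerator card resource pool, wherein the target resource pool comprises the target accelerator card;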

determining the idle state of each accelerator card in a target resource pool;

and selecting the acceleration card which is currently adapted to the server from the acceleration cards in the idle state as a target acceleration card.

Specifically, when it is determined that the server needs an additional accelerator card device, the type of accelerator card corresponding to the data for which the server requires increased computing power support is determined, because different requirements are served by different resource pools. For example, if the data corresponds to the FPGA accelerator card type, the corresponding FPGA accelerator card is obtained from the FPGA resource pool within the accelerator card resource pool. Once the type of the accelerator card device to be added is determined, the target resource pool corresponding to that type is selected from the plurality of resource pools of the accelerator card resource pool; it can be understood that the target resource pool contains the target accelerator card.

After the target resource pool is selected, the working state of each accelerator card in the target resource pool is determined; the working state is either a use state or an idle state. The target resource pool holds a plurality of accelerator cards in different working states, which gives the management service center different choices. For example, suppose the target resource pool holds 6 accelerator cards of the same type. If 3 of them are currently providing computing power support for a server, they are in the use state, and the management service center cannot call them. The other 3 are not providing computing power support for any server, so they are in the idle state and can be called. From the idle accelerator cards, the one currently adapted to the server is then selected as the target accelerator card.

The working state of an accelerator card can be identified by a status flag bit. When the accelerator card is in the use state, the status flag bit may be set to 1; when it is in the idle state, the status flag bit may be set to 0. The specific values are set according to the actual situation, as long as the use state of each accelerator card in the resource pool can be distinguished.
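Selecting a target accelerator card by type and status flag might look like the following sketch. It uses the flag values from the text (1 for the use state, 0 for the idle state) and assumes, for simplicity, that the first idle card in the target resource pool counts as adapted to the server; how adaptation is actually judged is not specified here. The pool contents and the function name are hypothetical.

```python
IDLE, IN_USE = 0, 1  # status flag values; any distinguishable pair would do

# Hypothetical accelerator card resource pool: one sub-pool per card type,
# each mapping a card identifier to its status flag.
pools = {
    "FPGA": {"fpga-0": IN_USE, "fpga-1": IDLE, "fpga-2": IDLE},
    "GPU": {"gpu-0": IN_USE},
}


def select_target_card(pools, card_type):
    """Pick an idle card from the pool matching `card_type` and mark it as in use.

    Returns the card identifier, or None if the target pool has no idle card.
    """
    target_pool = pools[card_type]  # target resource pool for the requested type
    for card_id, flag in target_pool.items():
        if flag == IDLE:
            target_pool[card_id] = IN_USE
            return card_id
    return None


first = select_target_card(pools, "FPGA")
second = select_target_card(pools, "FPGA")
none_left = select_target_card(pools, "GPU")
```

Two successive FPGA requests receive different cards because the first call flips the flag of the card it hands out, and the GPU request returns None because the only GPU card is already in use.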

In the method provided by this embodiment, the type of accelerator card corresponding to the data for which the server needs increased computing power support is determined; a target resource pool corresponding to the type is selected from the plurality of resource pools of the accelerator card resource pool, wherein the target resource pool contains the target accelerator card; the idle state of each accelerator card in the target resource pool is determined; and an accelerator card currently adapted to the server is selected from the accelerator cards in the idle state as the target accelerator card. This makes it easy to distinguish the working states of the accelerator cards in the accelerator card resource pool: an idle accelerator card is selected as the target accelerator card, applied for dynamically and called in real time, and the computing power of the server is improved without changing the software environment of the server.
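The selection steps of this embodiment can be illustrated with a minimal sketch. The names `AcceleratorCard`, `select_target_card`, the `compute` field, and the rule "pick the smallest idle card that covers the request" are assumptions made only for illustration; the specification does not define how a card is judged to be adapted to the server.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

IDLE = 0    # status flag bit for the idle state
IN_USE = 1  # status flag bit for the use state


@dataclass
class AcceleratorCard:
    ip: str
    card_type: str       # e.g. "FPGA", "GPU", "XPU"
    compute: int         # hypothetical unit of computing power
    status: int = IDLE


def select_target_card(
    pools: Dict[str, List[AcceleratorCard]],
    card_type: str,
    required_compute: int,
) -> Optional[AcceleratorCard]:
    # Step 1: pick the target resource pool that matches the requested type.
    target_pool = pools.get(card_type, [])
    # Step 2: keep only the cards whose status flag marks them as idle.
    idle_cards = [c for c in target_pool if c.status == IDLE]
    # Step 3 (assumed adaptation rule): the card must cover the request.
    fitting = [c for c in idle_cards if c.compute >= required_compute]
    if not fitting:
        return None
    # Choose the smallest fitting card and mark it as in use.
    target = min(fitting, key=lambda c: c.compute)
    target.status = IN_USE
    return target
```

In this sketch a card that is already in the use state is never returned, which mirrors the rule that the management service center cannot call an accelerator card in the use state.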

On the basis of the above embodiment, as a preferred embodiment, the establishing of the accelerator card resource pool includes the following steps:

obtaining each acceleration card in advance;

establishing each resource pool according to the type of each accelerator card;

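controlling each resource pool and a local switch corresponding to each resource pool to establish connection, wherein each resource pool corresponds to each local switch one by one;

and controlling the serial connection of the local switches to complete the establishment of the accelerator card resource pool.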

It should be noted that the accelerator card resource pool is established in advance, so each accelerator card is obtained in advance. These accelerator cards differ from the accelerator cards in the target resource pool in the above embodiment in that they have not yet been subjected to pooling processing. The obtained accelerator cards are managed by classification according to their types: for example, accelerator cards of the FPGA type are placed in one resource pool and accelerator cards of the GPU type in another, so that accelerator cards of the same type are managed centrally and later maintenance is facilitated.

Each resource pool is controlled to establish a connection with its corresponding local switch, with the resource pools and the local switches in one-to-one correspondence. Continuing the above example, the resource pool of the FPGA accelerator card type corresponds to one switch, and the resource pool of the GPU accelerator card type corresponds to another switch. Each resource pool is connected to its switch through the network, and finally the local switches are connected in series to form an integral network in which the resources of all accelerator cards exist. The local switch serves as both the physical channel and the data interaction channel between the server and the accelerator card.

The accelerator card resource pool can also be established according to both the type of each accelerator card and the attribute of the accelerator card adapted to the server. For example, if the accelerator card devices used by the host1 server and the host2 server have the same attribute but differ in type, the resource pools for these accelerator cards are still established according to the type of the accelerator card, and the established resource pools are further isolated from all servers other than host1 and host2, so that the resource pools established for the host1 server and the host2 server can only be called by these two servers. The resource pool may also be created according to the attribute of the accelerator card adapted to the server; the present invention is not specifically limited in this respect.

The method comprises the steps of obtaining each acceleration card in advance; establishing each resource pool according to the type of each accelerator card; controlling each resource pool and a local switch corresponding to each resource pool to establish connection, wherein each resource pool and each local switch are in one-to-one correspondence; and controlling the serial connection of each local switch to complete the establishment of the accelerator card resource pool. The accelerator cards of the same type are managed in a centralized mode, so that the maintenance of the accelerator cards in the later period is facilitated, and the maintenance cost is saved.
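The establishment steps can be sketched as follows. This is only a bookkeeping model: the function names, the `switch-<type>` naming scheme, and the representation of a serial connection as a list of adjacent switch pairs are assumptions for illustration, not something the specification prescribes.

```python
from collections import defaultdict


def build_resource_pools(cards):
    """Group pre-acquired cards, given as (ip, card_type) pairs, by type."""
    pools = defaultdict(list)
    for ip, card_type in cards:
        pools[card_type].append(ip)
    return dict(pools)


def connect_switches(pools):
    """Assign one local switch per pool and chain the switches in series."""
    # One-to-one correspondence between resource pools and local switches.
    switch_of = {t: f"switch-{t.lower()}" for t in pools}
    switches = list(switch_of.values())
    # Serial connection: each switch is linked to the next one in the list.
    serial_links = list(zip(switches, switches[1:]))
    return switch_of, serial_links
```

For three pools (FPGA, GPU, XPU) this yields three switches and two serial links, so that all accelerator card resources end up in one integral network.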

On the basis of the above embodiment, the types of the accelerator cards at least include an FPGA accelerator card type, a GPU accelerator card type, and an XPU accelerator card type.

Specifically, the types of the accelerator cards include at least the FPGA accelerator card type, the GPU accelerator card type, and the XPU accelerator card type, as well as other commonly used accelerator cards, for example a Secure Sockets Layer (SSL) accelerator card. Each type of accelerator card has its own advantages: the FPGA accelerator card is programmable, so its processing algorithms are more flexible; the GPU accelerator card is mainly used for image processing and artificial intelligence algorithms; the XPU accelerator card has certain specific algorithms fixed in it and has its own advantages in specific acceleration fields. Which accelerator cards are obtained in advance can be decided according to the actual situation.

The types of the accelerator cards provided by the embodiment of the invention at least comprise an FPGA accelerator card type, a GPU accelerator card type and an XPU accelerator card type, so that software operated by a server can conveniently use corresponding accelerator card equipment according to the calculation power of the software and the algorithm requirement.

On the basis of the above embodiment, in order to save resources maximally and avoid resource waste, the method further includes:

when the target accelerator card is put into the computing power support service of the server, obtaining the data of the computing power support provided by the actual accelerator card of the server;

when the data of the calculation power support currently required by the server is larger than the data of the calculation power support provided by the actual accelerator card, determining the corresponding unused accelerator card according to the relationship between the data of the calculation power support provided by the actual accelerator card and the data of the calculation power support currently required by the server;

and releasing the resources of the unused accelerator card.

Specifically, after the target accelerator card has been added to the computing power support of the server, a corresponding instruction is sent to the management service center, and the data of the computing power support provided by the actual accelerator card of the server then needs to be acquired. Because the server applies for the target accelerator card in the accelerator card resource pool according to the data of the computing power support it currently requires, this data is greater than or equal to the data of the computing power support provided by the actual accelerator card.

The corresponding unused accelerator card is determined according to the relationship between the two pieces of data. When the data of the computing power support currently required by the server is greater than the data of the computing power support provided by the actual accelerator card, this indicates that some of the target accelerator cards newly added to the server are surplus; the unused accelerator card is then determined and its resources are released. When the data of the computing power support currently required by the server is equal to the data of the computing power support provided by the actual accelerator card, the newly added target accelerator card matches the computing power support of the current server, and no processing is needed.

In addition, after the unused accelerator card is determined, the management service center can poll the unused accelerator cards of the server at a preset time interval, so as to avoid the situation in which an unused accelerator card fails to release its resources and thereby affects the application and use by the next server.

When a target accelerator card is put into a calculation power support service of a server, data of calculation power support provided by an actual accelerator card of the server are obtained; when the data of the calculation power support currently required by the server is larger than the data of the calculation power support provided by the actual accelerator card, determining the corresponding unused accelerator card according to the relationship between the data of the calculation power support provided by the actual accelerator card and the data of the calculation power support currently required by the server; and the unused acceleration card is subjected to resource release so as to facilitate the application and use of the next server, so that the resource maximization is facilitated, and the resource waste is avoided.
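The comparison and release described above can be sketched as follows. The function names, the per-card `compute` figures, and the assumption that cards are filled in the order in which they were added are hypothetical; the specification only states that unused cards are determined from the relationship between the two pieces of data.

```python
def find_unused_cards(cards, required, actual):
    """Return the IPs of cards that are not needed to cover the actual load.

    cards    -- list of (ip, compute) pairs in the order they were added
    required -- computing power support the server applied for
    actual   -- computing power support actually provided by the cards
    """
    # Equal values mean the added cards match the server; nothing to release.
    if required <= actual:
        return []
    unused, covered = [], 0
    for ip, compute in cards:
        if covered >= actual:
            # The actual load is already covered, so this card is surplus.
            unused.append(ip)
        else:
            covered += compute
    return unused


def release(states, unused):
    """Mark every unused card as idle so that the next server can apply for it."""
    for ip in unused:
        states[ip] = "idle"
    return states
```

For example, with three cards of 4 units each, 12 units applied for, and 5 units actually provided, the first two cards cover the load and the third is reported as unused.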

On the basis of the above embodiment, the method further comprises:

polling the idle state of each accelerator card in the accelerator card resource pool according to a time interval, so as to apply for an accelerator card in the idle state for the server.

Specifically, the management service center can poll the idle state of each accelerator card in the accelerator card resource pool according to the time interval, so that when the server host suddenly needs to increase computing power, the time spent searching for and determining the idle state of the accelerator cards in the target resource pool is shortened, and the target accelerator card can be selected and called directly.

In addition, in order to shorten the time taken to apply for the target accelerator card from the accelerator card resource pool, the IP addresses, algorithm support data, corresponding types, and working states of the remaining accelerator cards in the plurality of resource pools of the accelerator card resource pool can be arranged into a table for easy lookup. The IP addresses, algorithm support data, corresponding types, and working states of the accelerator card devices currently operated by the server can be arranged together with this resource pool table into a database, so that the management service center can call the accelerator cards conveniently.
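One possible way to hold such a table is sketched below using Python's built-in `sqlite3` module with an in-memory database. The table name `cards`, the column names, and the helper functions are assumptions for illustration; the specification does not name a database or a schema.

```python
import sqlite3


def build_card_table(rows):
    """Create an in-memory table of (ip, algo_support, card_type, state) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cards ("
        "ip TEXT PRIMARY KEY, algo_support TEXT, card_type TEXT, state TEXT)"
    )
    conn.executemany("INSERT INTO cards VALUES (?, ?, ?, ?)", rows)
    return conn


def idle_cards(conn, card_type):
    """Look up the IP addresses of idle cards of the given type."""
    cur = conn.execute(
        "SELECT ip FROM cards WHERE card_type = ? AND state = 'idle' ORDER BY ip",
        (card_type,),
    )
    return [row[0] for row in cur.fetchall()]


def set_state(conn, ip, state):
    """Record a change of working state for one card."""
    conn.execute("UPDATE cards SET state = ? WHERE ip = ?", (state, ip))
```

With such a table, finding an idle card of a given type becomes a single indexed lookup rather than a scan over every card in every resource pool.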

To facilitate checking by the relevant staff, the usage of the accelerator card devices currently operated by the server can be output as a result, so that the usage of different servers and the addition of accelerator card devices can be conveniently checked.

In the method provided by the embodiment of the present invention, the idle state of each accelerator card in the accelerator card resource pool is polled according to a time interval so as to apply for an accelerator card in the idle state for the server. Thus, when the server host suddenly needs to increase computing power, the time spent searching for and determining the idle state of the accelerator cards in the target resource pool is shortened, and the target accelerator card can be selected and called directly.
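The polling can be sketched as a simple loop. The `probe` callback (which reports whether a card is idle), the fixed number of rounds, and the injectable `sleep` function are assumptions introduced so that the sketch is self-contained; a real management service center would presumably poll indefinitely and query the cards over the network.

```python
import time


def poll_once(cards, probe):
    """Query every card once; probe(ip) returns True if the card is idle."""
    return {ip: probe(ip) for ip in cards}


def poll_idle_states(cards, probe, interval_s, rounds, sleep=time.sleep):
    """Poll all cards `rounds` times, waiting `interval_s` between rounds.

    Returns the IPs that were idle in the most recent round.
    """
    cache = {}
    for i in range(rounds):
        cache = poll_once(cards, probe)
        if i < rounds - 1:
            sleep(interval_s)
    return [ip for ip, idle in cache.items() if idle]
```

Because the latest idle states are cached between rounds, a sudden request for more computing power can be answered from the cache without first querying every card.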

The above describes in detail the embodiments corresponding to the accelerator card setting method applied to the server. On this basis, the present invention further discloses an accelerator card setting device applied to the server corresponding to the above method. Fig. 3 is a structural diagram of the accelerator card setting device applied to the server provided by an embodiment of the present invention.

As shown in fig. 3, the accelerator card setting apparatus applied to the server includes:

an application module 11, configured to apply for a corresponding target accelerator card in an accelerator card resource pool if it is determined that data supported by computing power currently required by a server exceeds data supported by computing power that can be provided by a current accelerator card of the server, where the accelerator card resource pool includes resources of multiple accelerator cards, and the target accelerator card is an accelerator card determined in each accelerator card in the accelerator card resource pool;

a sending module 12, configured to send the IP address of the target accelerator card to a server;

and the establishing module 13 is used for establishing communication connection between the server and the target acceleration card according to the IP address so that the target acceleration card is put into the computing support service of the server.

Since the embodiment of the device portion corresponds to the embodiment of the method portion, for the description of the device embodiment please refer to the description of the method embodiment, which is not repeated here.

With the accelerator card setting device applied to the server provided by this embodiment, if it is determined that the data of the computing power support currently required by the server exceeds the data of the computing power support that can be provided by the current accelerator card of the server, a corresponding target accelerator card is applied for in the accelerator card resource pool, where the accelerator card resource pool includes resources of a plurality of accelerator cards and the target accelerator card is an accelerator card determined from among the accelerator cards in the accelerator card resource pool; the IP address of the target accelerator card is sent to the server; and a communication connection between the server and the target accelerator card is established according to the IP address, so that the target accelerator card is put into the computing power support service of the server. When the computing power of the current server needs to be increased, the corresponding accelerator card can be called directly, which solves the problem of high maintenance cost caused by on-site staff having to insert cards manually under the existing hardware binding approach. It also solves the problem of high maintenance cost that arises when a server with insufficient computing power must be replaced by a server with greater computing power support and the entire software environment must be migrated. The accelerator card is applied for dynamically without changing the software environment of the server, which saves maintenance cost and improves the computing power of the server.
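The cooperation of the application module, the sending module, and the establishing module can be sketched as three functions called in sequence. All class, function, and field names below are hypothetical, and the connection is modelled only as a list entry; the specification does not define how the communication connection is established at the protocol level.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Server:
    name: str
    current_compute: int                      # power of the current cards
    received_ips: List[str] = field(default_factory=list)
    connections: List[str] = field(default_factory=list)


def apply_for_card(server, required_compute, pool):
    """Application module: apply only if the demand exceeds current power."""
    if required_compute <= server.current_compute:
        return None
    shortfall = required_compute - server.current_compute
    for card in pool:
        if card["idle"] and card["compute"] >= shortfall:
            card["idle"] = False
            return card
    return None


def send_ip(server, card):
    """Sending module: tell the server the IP address of the target card."""
    server.received_ips.append(card["ip"])
    return card["ip"]


def establish_connection(server, ip, card):
    """Establishing module: connect by IP and add the card's power."""
    server.connections.append(ip)
    server.current_compute += card["compute"]


def set_accelerator_card(server, required_compute, pool):
    card = apply_for_card(server, required_compute, pool)
    if card is None:
        return None
    ip = send_ip(server, card)
    establish_connection(server, ip, card)
    return ip
```

In this sketch nothing on the server other than its list of connections changes, which reflects the stated aim of adding computing power without changing the software environment of the server.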

Fig. 4 is a structural diagram of another accelerator card setting apparatus applied to a server according to an embodiment of the present invention, and as shown in fig. 4, the apparatus includes:

a memory 21 for storing a computer program;

and a processor 22 for implementing the steps of the acceleration card setting method applied to the server when executing the computer program.

The accelerator card setting device applied to the server provided by the embodiment may include, but is not limited to, a smart phone, a tablet computer, a notebook computer, or a desktop computer.

The processor 22 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 22 may be implemented in at least one hardware form among a Digital Signal Processor (DSP), an FPGA, and a Programmable Logic Array (PLA). The processor 22 may also include a main processor and a coprocessor: the main processor, also referred to as a CPU, is a processor for processing data in the awake state, and the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor 22 may be integrated with a GPU that is responsible for rendering and drawing the content that the display screen needs to display. In some embodiments, the processor 22 may also include an Artificial Intelligence (AI) processor for processing computational operations related to machine learning.

The memory 21 may include one or more computer-readable storage media, which may be non-transitory. The memory 21 may also include high-speed random access memory as well as non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In this embodiment, the memory 21 is at least used for storing the following computer program 211, which, after being loaded and executed by the processor 22, can implement the relevant steps of the accelerator card setting method applied to the server disclosed in any of the foregoing embodiments. In addition, the resources stored in the memory 21 may also include an operating system 212, data 213, and the like, and the storage manner may be transient storage or permanent storage. The operating system 212 may include Windows, Unix, Linux, and the like. The data 213 may include, but is not limited to, data related to the accelerator card setting method applied to the server.

In some embodiments, the accelerator card setting device applied to the server may further include a display screen 23, an input/output interface 24, a communication interface 25, a power supply 26, and a communication bus 27.

Those skilled in the art will appreciate that the configuration shown in FIG. 4 does not constitute a limitation of the accelerator card setting means applied to the server and may include more or fewer components than those shown.

The processor 22 calls the instructions stored in the memory 21 to implement the acceleration card setting method applied to the server provided in any of the above embodiments.

With the accelerator card setting device applied to the server provided by this embodiment, if it is determined that the data of the computing power support currently required by the server exceeds the data of the computing power support that can be provided by the current accelerator card of the server, a corresponding target accelerator card is applied for in the accelerator card resource pool, where the accelerator card resource pool includes resources of a plurality of accelerator cards and the target accelerator card is an accelerator card determined from among the accelerator cards in the accelerator card resource pool; the IP address of the target accelerator card is sent to the server; and a communication connection between the server and the target accelerator card is established according to the IP address, so that the target accelerator card is put into the computing power support service of the server. When the computing power of the current server needs to be increased, the corresponding accelerator card can be called directly, which solves the problem of high maintenance cost caused by on-site staff having to insert cards manually under the existing hardware binding approach. It also solves the problem of high maintenance cost that arises when a server with insufficient computing power must be replaced by a server with greater computing power support and the entire software environment must be migrated. The accelerator card is applied for dynamically without changing the software environment of the server, which saves maintenance cost and improves the computing power of the server.

Further, the present invention also provides a computer-readable storage medium, on which a computer program is stored, which, when being executed by the processor 22, realizes the steps of the acceleration card setting method applied to the server as described above.

It is to be understood that if the method in the above embodiments is implemented in the form of software functional units and sold or used as a stand-alone product, it can be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and performs all or part of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a portable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

For the description of the computer-readable storage medium provided by the present invention, please refer to the above method embodiment, which is not repeated here; the medium has the same beneficial effects as the above accelerator card setting method applied to the server.

Fig. 5 is a schematic view of an application scenario of the accelerator card setting method applied to the server according to the embodiment of the present invention. As shown in fig. 5, the FPGA, GPU, and XPU accelerator cards are pooled, accelerator cards of the same type are managed centrally and connected to a local switch 33 through the network, and finally the switches are connected to form an integral network, completing the establishment of the accelerator card resource pool, which includes an FPGA accelerator card resource pool 35, a GPU accelerator card resource pool 36, and an XPU accelerator card resource pool 37. After networking is completed, the management server 34 is responsible for managing the accelerator cards and allocating resources. When the server host1 31 needs accelerator card devices, it applies to the management server 34. If host1 applies for 1 FPGA, 1 GPU, and 1 XPU accelerator card, the management server 34 tells the server host1 31 the IP addresses of the 3 accelerator cards, and host1 and the 3 accelerator cards form a closed virtual small network through VLAN communication. That is, the accelerator cards applied for, indicated by the light-colored arrows, are network-isolated from other devices, so that other devices cannot access this network and interference with operation and execution is avoided. The server host2 32 likewise applies for resources according to its own needs.
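The scenario of fig. 5 can be modelled with a small bookkeeping sketch. The function names, the VLAN identifier, and the in-memory `vlan_table` are assumptions for illustration; in a real deployment the VLAN membership would be configured on the switches rather than kept in a Python dictionary.

```python
def allocate_with_vlan(host, request, idle_pools, vlan_table, vlan_id):
    """Grant the requested cards to `host` and place them in one VLAN.

    request    -- mapping of card type to count, e.g. {"FPGA": 1, "GPU": 1}
    idle_pools -- mapping of card type to a list of idle card IP addresses
    """
    # Check the whole request first so that nothing is granted partially.
    for card_type, count in request.items():
        if len(idle_pools.get(card_type, [])) < count:
            raise RuntimeError(f"not enough idle {card_type} cards")
    granted = []
    for card_type, count in request.items():
        for _ in range(count):
            granted.append(idle_pools[card_type].pop(0))
    # The host and its cards form a closed virtual network.
    vlan_table[vlan_id] = {host, *granted}
    return granted


def can_reach(vlan_table, a, b):
    """Two devices can communicate only if they share a VLAN."""
    return any(a in members and b in members for members in vlan_table.values())
```

Here host1 receives the IP addresses of one FPGA, one GPU, and one XPU card, and any device outside the VLAN, such as host2, is reported as unable to reach those cards.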

The application scenarios of the method for setting an accelerator card applied to a server according to the embodiment of the present invention are introduced above, and the method has the same beneficial effects as the above-mentioned method for setting an accelerator card applied to a server.

The present invention provides a method for setting an accelerator card applied to a server, an apparatus for setting an accelerator card applied to a server, and a medium thereof. The embodiments are described in a progressive manner in the specification, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description. It should be noted that, for those skilled in the art, without departing from the principle of the present invention, it is possible to make various improvements and modifications to the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.

It is further noted that, in the present specification, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

Claims

1. An acceleration card setting method applied to a server is characterized by comprising the following steps:

if the situation that data of computing power support required by the server currently exceeds data of computing power support which can be provided by a current accelerator card of the server is determined, applying for a corresponding target accelerator card in an accelerator card resource pool, wherein the accelerator card resource pool comprises a plurality of accelerator card resources, and the target accelerator card is the accelerator card determined in each accelerator card in the accelerator card resource pool;

sending the IP address of the target accelerator card to the server;

and establishing communication connection between the server and the target accelerator card according to the IP address so that the target accelerator card can be put into computing power support service of the server.

2. The method as claimed in claim 1, wherein the step of determining that the data of the computing power support currently required by the server exceeds the data of the computing power support that can be provided by the current acceleration card of the server comprises:

and receiving data which is sent by the server and needs increased computing power support so as to determine that the current computing power support data of the server exceeds the computing power support data which can be provided by the current accelerator card of the server.

3. The method as claimed in claim 2, wherein the step of applying for the corresponding target accelerator card from within the accelerator card resource pool comprises:

determining the type of an accelerator card corresponding to the data which needs to be increased in computational power support by the server;

selecting a target resource pool corresponding to the type from a plurality of resource pools of the accelerator card resource pool, wherein the target resource pool comprises the target accelerator card;

determining an idle state of each accelerator card in the target resource pool;

and selecting an accelerator card currently adapted to the server from the accelerator cards in the idle state as the target accelerator card.

4. The method for setting the accelerator card applied to the server according to claim 3, wherein the establishing of the accelerator card resource pool comprises the following steps:

obtaining each acceleration card in advance;

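creating each resource pool according to the type of each acceleration card;

controlling each resource pool and a local switch corresponding to each resource pool to establish connection, wherein each resource pool corresponds to each local switch one by one;

and controlling the serial connection of the local switches to complete the establishment of the accelerator card resource pool.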

5. The setting method for accelerator cards applied to servers of claim 4, wherein the types of each accelerator card at least comprise an FPGA accelerator card type, a GPU accelerator card type, and an XPU accelerator card type.

6. The setting method of the accelerator card applied to the server according to any one of claims 1 to 5, further comprising:

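when the target accelerator card is put into the computing power support service of the server, obtaining the data of the calculation power support provided by the actual accelerator card of the server;

when the data of the calculation power support currently required by the server is larger than the data of the calculation power support provided by the actual accelerator card, determining a corresponding unused accelerator card according to the relationship between the data of the calculation power support provided by the actual accelerator card and the data of the calculation power support currently required by the server;

and releasing the resources of the unused accelerator card.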

7. The setting method of the accelerator card applied to the server according to claim 4, further comprising:

polling the idle state of each accelerator card in the accelerator card resource pool according to a time interval so as to apply the accelerator card corresponding to the idle state for the server.

8. An acceleration card setting device applied to a server, comprising:

an application module, configured to apply for a corresponding target accelerator card in an accelerator card resource pool if it is determined that data of computing power support currently required by a server exceeds data of computing power support that can be provided by a current accelerator card of the server, where the accelerator card resource pool includes resources of multiple accelerator cards, and the target accelerator card is an accelerator card determined in each accelerator card in the accelerator card resource pool;

a sending module, configured to send the IP address of the target accelerator card to the server;

and the establishing module is used for establishing the communication connection between the server and the target accelerator card according to the IP address so that the target accelerator card can be put into the computing power support service of the server.

9. An acceleration card setting device applied to a server is characterized by comprising:

a memory for storing a computer program;

a processor for implementing the steps of the acceleration card setting method applied to the server according to any one of claims 1 to 7 when executing the computer program.

10. A computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, carries out the steps of the acceleration card setting method applied to a server according to any one of claims 1 to 7.