CN113312182B - Cloud computing node, file processing method and device

Info

Publication number: CN113312182B
Application number: CN202110852505.0A
Authority: CN (China)
Other versions: CN113312182A (in Chinese)
Inventor: 朴君
Assignee: Alibaba Cloud Computing Ltd
Legal status: Active (granted)

Classifications

    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU] (under G06F9/00 Arrangements for program control; G06F Electric digital data processing; G06 Computing; G Physics)
    • G06F9/5016 Allocation of resources to service a request, the resource being the memory
    • G06F9/5027 Allocation of resources to service a request, the resource being a machine, e.g. CPUs, Servers, Terminals

Abstract

One or more embodiments of the present specification provide a cloud computing node, a file processing method, and an apparatus. The cloud computing node comprises a host, an expansion board card and a coprocessor, where the expansion board card and the coprocessor are mounted on the host; an application program for scheduling computing tasks runs on the host and applies for a memory space in the coprocessor in advance. The application program receives a computing task for a target file and calls a disk driver pre-deployed on the host to issue a read instruction for the target file to the expansion board card; the read instruction comprises the source address and the destination address of the target file, the destination address pointing to the memory space that the application program has applied for in the coprocessor. The expansion board card receives the read instruction, acquires the target file based on the source address, stores it into the memory space based on the destination address, and notifies the coprocessor. The coprocessor reads the target file and executes the computing task on it.

Description

Cloud computing node, file processing method and device
Technical Field
One or more embodiments of the present disclosure relate to the field of cloud computing technologies, and in particular, to a cloud computing node, a file processing method, and an apparatus.
Background
Thanks to the widespread use of cloud technology, storage and computing resources in a network can be shared among different enterprise and individual users in an efficient manner. To meet the development needs of AI and blockchain technologies, cloud computing nodes capable of heterogeneous computing have gradually come into view.
In heterogeneous computing, a coprocessor such as a GPU or an FPGA is combined with the host's original CPU to execute computing tasks. A coprocessor is a processor that assists the host CPU with computation; because the instruction sets of the host CPU and of coprocessors such as GPUs and FPGAs differ, this mode of computation is called heterogeneous computing.
When a computing task is executed by a coprocessor, the target file first needs to be stored in the host memory and then copied from the host memory into the coprocessor memory for processing.
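For illustration only, this related-art two-hop path can be sketched in C with the CUDA runtime, taking a GPU as the coprocessor; the file path, size handling and error handling are simplified placeholders that do not come from this specification:

    #include <cuda_runtime.h>
    #include <fcntl.h>
    #include <stdlib.h>
    #include <unistd.h>

    /* Related-art path: the file is first read into host memory (hop 1),
     * then copied again over the bus into coprocessor memory (hop 2). */
    int baseline_load(const char *path, size_t size, void **dev_buf) {
        void *host_buf = malloc(size);
        int fd = open(path, O_RDONLY);
        if (fd < 0) { free(host_buf); return -1; }
        if (read(fd, host_buf, size) != (ssize_t)size) {
            close(fd); free(host_buf); return -1;
        }
        close(fd);
        cudaMalloc(dev_buf, size);                     /* coprocessor (GPU) memory */
        cudaMemcpy(*dev_buf, host_buf, size, cudaMemcpyHostToDevice);
        free(host_buf);                                /* host copy was only a staging buffer */
        return 0;
    }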
Disclosure of Invention
In view of this, one or more embodiments of the present disclosure provide a cloud computing node, a file processing method and a file processing apparatus.
In order to achieve the above purpose, one or more embodiments of the present disclosure provide the following technical solutions:
According to a first aspect of one or more embodiments of the present disclosure, a cloud computing node is provided, comprising a host, an expansion board and a coprocessor, where the expansion board and the coprocessor are mounted on the host, an application program for scheduling computing tasks runs on the host, and the application program applies for a memory space in the coprocessor in advance; wherein:
an application program running on the host receives a computing task for a target file;
the application program calls a disk driver pre-deployed on the host to issue a read instruction for the target file to the expansion board card; the read instruction comprises a source address and a destination address of the target file, and the destination address points to the memory space that the application program has applied for in the coprocessor in advance;
the expansion board card receives the read instruction and acquires the target file based on the source address;
the expansion board card transfers the target file into the memory space based on the destination address and notifies the coprocessor;
and the coprocessor reads the target file and executes the computing task for the target file.
According to a second aspect of one or more embodiments of the present disclosure, a file processing method is provided, applied to a host equipped with an expansion board and a coprocessor, where the host, the expansion board and the coprocessor form a cloud computing node deployed in a cloud computing network, an application program for scheduling computing tasks runs on the host, and the application program applies for a memory space in the coprocessor in advance; the method comprises the following steps:
an application program running on the host receives a computing task for a target file;
the application program calls a disk driver pre-deployed on the host to issue a read instruction for the target file to the expansion board card, where the read instruction comprises a source address and a destination address of the target file, and the destination address points to the memory space that the application program has applied for in the coprocessor in advance, so that the expansion board card acquires the target file based on the source address and transfers it into the memory space based on the destination address, and the coprocessor reads the target file and executes the computing task for the target file.
According to a third aspect of one or more embodiments of the present disclosure, a file processing method is provided, applied to an expansion board mounted on a host, where the host is further equipped with a coprocessor, the host, the expansion board and the coprocessor form a cloud computing node deployed in a cloud computing network, an application program for scheduling computing tasks runs on the host, and the application program applies for a memory space in the coprocessor in advance; the method comprises the following steps:
receiving a read instruction for a target file issued by an application program running on the host; the read instruction is sent by the application program after it receives a computing task for the target file, the read instruction comprises a source address and a destination address of the target file, and the destination address points to the memory space that the application program has applied for in the coprocessor in advance;
acquiring the target file based on the source address;
and transferring the target file into the memory space based on the destination address and notifying the coprocessor, so that the coprocessor reads the target file and executes the computing task for the target file.
According to a fourth aspect of one or more embodiments of the present disclosure, a file processing apparatus is provided, which runs on a host equipped with an expansion board and a coprocessor, where the host, the expansion board and the coprocessor form a cloud computing node deployed in a cloud computing network, and the apparatus is used to schedule computing tasks and applies for a memory space in the coprocessor in advance; the apparatus comprises a task receiving unit and an instruction issuing unit:
the task receiving unit is used to receive a computing task for a target file;
the instruction issuing unit calls a disk driver pre-deployed on the host to issue a read instruction for the target file to the expansion board card, where the read instruction comprises a source address and a destination address of the target file, and the destination address points to the memory space applied for in the coprocessor in advance, so that the expansion board card acquires the target file based on the source address and transfers it into the memory space based on the destination address, and the coprocessor reads the target file and executes the computing task for the target file.
According to a fifth aspect of one or more embodiments of the present disclosure, a file processing apparatus is provided, which runs on an expansion board mounted on a host, where the host is further equipped with a coprocessor, the host, the expansion board and the coprocessor form a cloud computing node deployed in a cloud computing network, an application program for scheduling computing tasks runs on the host, and the application program applies for a memory space in the coprocessor in advance; the apparatus comprises an instruction receiving unit, a file acquiring unit and a file dump unit:
the instruction receiving unit receives a read instruction for a target file issued by an application program running on the host; the read instruction is sent by the application program after it receives a computing task for the target file, the read instruction comprises a source address and a destination address of the target file, and the destination address points to the memory space that the application program has applied for in the coprocessor in advance;
the file acquiring unit acquires the target file based on the source address;
and the file dump unit is used to transfer the target file into the memory space based on the destination address and notify the coprocessor, so that the coprocessor reads the target file and executes the computing task for the target file.
According to a sixth aspect of one or more embodiments of the present specification, a computer-readable storage medium is provided, on which computer instructions are stored; when executed by a processor, the computer instructions implement the steps of the methods according to the second and third aspects.
As can be seen from the above description, in this specification an application program that runs on a host and schedules computing tasks applies for a memory space in a coprocessor in advance. After receiving a computing task for a target file, the application program issues to an expansion board card a read instruction carrying the source address and the destination address of the target file, so that the expansion board card transfers the target file, acquired based on the source address, into the coprocessor memory space pointed to by the destination address, where the coprocessor performs the computation. With the technical solution provided by this specification, the expansion board card can transfer the target file into the memory space of the coprocessor directly after acquiring it, without a copy through the host memory; the data transmission link is thereby shortened, data transmission latency is reduced, host resources and bus bandwidth are saved, and file processing efficiency is improved.
Drawings
Fig. 1 is an architectural diagram of a cloud computing node, shown in an exemplary embodiment.
Fig. 2 is a flowchart illustrating a method for implementing file processing based on a cloud computing node according to an exemplary embodiment.
Fig. 3 is a schematic diagram illustrating a data transmission link of a target file when file processing is implemented based on a cloud computing node according to an exemplary embodiment.
Fig. 4 is an architectural diagram of a cloud computing node, shown in another example embodiment.
Fig. 5 is a schematic diagram illustrating a data transmission link of a target file when file processing is implemented based on a cloud computing node according to another exemplary embodiment.
Fig. 6 is a flowchart illustrating a file processing method according to an exemplary embodiment.
Fig. 7 is a flowchart illustrating a file processing method according to another exemplary embodiment.
Fig. 8 is a schematic structural diagram of an electronic device in which an apparatus for implementing file processing based on a cloud computing node according to an exemplary embodiment is located.
FIG. 9 is a block diagram of a file processing apparatus according to an example embodiment.
Fig. 10 is a block diagram of a file processing apparatus according to another exemplary embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with one or more embodiments of the present specification. Rather, they are merely examples of apparatus and methods consistent with certain aspects of one or more embodiments of the specification, as detailed in the claims which follow.
It should be noted that: in other embodiments, the steps of the corresponding methods are not necessarily performed in the order shown and described herein. In some other embodiments, the method may include more or fewer steps than those described herein. Moreover, a single step described in this specification may be broken down into multiple steps for description in other embodiments; multiple steps described in this specification may be combined into a single step in other embodiments.
A cloud computing network widely used by enterprise and personal users in practice comprises a number of cloud computing nodes. A cloud computing node may be implemented as a node instance on any physical device in the cloud computing network, and one or more cloud computing nodes may be implemented on the same physical device. A user can send tasks to a cloud computing node through a client or a browser running on a terminal device such as a smartphone or a personal computer, thereby using the storage and computing resources of the cloud computing network; after completing a task, the cloud computing node feeds the result back to the user.
To improve the efficiency with which cloud computing nodes process computing tasks in technical fields such as AI (Artificial Intelligence) and blockchain, heterogeneous computing is adopted, and a coprocessor for executing computing tasks is introduced into the original architecture of the cloud computing node. An application program for scheduling computing tasks runs on the host of the cloud computing node; it receives computing tasks for target files sent by users and schedules them to the corresponding coprocessor for execution.
In the related art, a target file acquired by the expansion board card is first transferred into the host memory; the coprocessor, upon notification from the application program, copies the target file from the host memory into its own memory and then executes the computing task on it. This transmission mode, in which the target file is transferred into the host memory and then copied into the coprocessor memory, wastes host resources and bus bandwidth and has a negative impact on file processing efficiency.
In view of this, the present specification provides a cloud computing node, a file processing method and a file processing apparatus to solve the deficiencies in the related art.
Referring to fig. 1, fig. 1 is a schematic diagram illustrating an architecture of a cloud computing node according to an exemplary embodiment of the present disclosure.
The cloud computing node adopts a master-slave architecture and comprises a host 110, an expansion board card 120 and a coprocessor 130, where the expansion board card 120 and the coprocessor 130 are mounted on the host 110. The mounting may be detachable (for example by insertion or snap fastening) and/or fixed (for example by soldering); this specification does not limit the mounting manner.
The host 110, that is, the main device body of the cloud computing node, may be composed of various components such as a CPU, a memory, and a power supply, which are not described in detail herein.
In this embodiment, the host 110 is the core of the cloud computing node and runs one or more application programs for scheduling computing tasks; for example, the application programs may be built on Caffe (Convolutional Architecture for Fast Feature Embedding), TensorFlow (Google's machine learning library) or the like. They are triggered by computing tasks sent by users and schedule the computing tasks to the corresponding coprocessor for execution. The computing task is executed by the coprocessor 130 on a target file that is pre-stored in a local storage device 140 or a remote storage device 150 accessible to the expansion board 120. The same application program may schedule one or more computing tasks; in particular, the same application program may schedule one or more computing tasks for execution by the same coprocessor. It will be appreciated that the host 110 may also run application programs that undertake other computing or storage tasks.
An application program may run directly on the physical host 110, or in a Virtual Machine (VM) on the host 110; multiple application programs may run in the same virtual machine or in different virtual machines. This specification places no particular limitation on whether application programs run in virtual machines, nor on their number or distribution across virtual machines.
The expansion board 120 may be a MoC (MicroServer on Card), i.e., a physical board device that includes a CPU, memory and a network card and exchanges data with the host 110 through a PCIe (Peripheral Component Interconnect Express) bus. The expansion board 120 is generally used to attach external devices, such as storage devices, to the host 110, and can present them to the host 110 as virtual devices for management, thereby providing expansion and management functions for the host 110.
In this embodiment, the expansion board 120 may access local storage devices 140 mounted on it, such as an HDD (Hard Disk Drive, also called a mechanical hard disk) or an SSD (Solid State Drive), and may also access remote storage devices 150 such as a cloud disk.
The coprocessor 130 may be a GPU, an FPGA, an ASIC or the like; besides the processor chip, the coprocessor 130 includes components such as memory, which are not described here. The coprocessor 130 exchanges data with the host 110 and the expansion board 120 through a PCIe bus, is controlled by the application program on the host 110, and is used to execute specific computing tasks. For example, a GPU (Graphics Processing Unit) is a coprocessor for graphics operations; an FPGA (Field-Programmable Gate Array) is a software-programmable coprocessor that implements various computing tasks with pre-built logic cells and allocable resources; an ASIC (Application-Specific Integrated Circuit) is a specialized coprocessor designed and customized for specific computing tasks.
It should be noted that the host 110, the expansion board 120, and the coprocessor 130 are only main components of the cloud computing node in this embodiment, and the cloud computing node may also include other components, which is not limited in this specification.
The type and number of the expansion board cards mounted on the host 110 are not specifically limited in this specification; on the basis of the expansion board card 120, the host 110 may further be equipped with other expansion board cards, and the types of the other expansion board cards may be the same as the expansion board card 120 or different from the expansion board card 120. The type and number of coprocessors mounted on the host 110 are not specifically limited in this description; on the basis of the coprocessor 130, the host 110 may be equipped with other coprocessors, and the types of the other coprocessors may be the same as the coprocessor 130 or different from the coprocessor 130.
Referring to fig. 2, fig. 2 is a schematic flow chart of a method for implementing file processing based on the cloud computing node shown in fig. 1.
The method for implementing file processing based on the cloud computing node includes the following steps:
at step 202, an application running on a host computer receives a computing task for a target file.
Step 204, the application program calls a disk driver pre-deployed on the host to issue a read instruction for the target file to the expansion board card; the read instruction comprises a source address and a destination address of the target file, and the destination address points to the memory space that the application program has applied for in the coprocessor in advance.
Through a client or a browser running on a terminal device, a user accesses the application program that schedules computing tasks on the host 110 of the cloud computing node and sends it the computing task for the target file.
After receiving the computing task for the target file, the application program running on the host 110 calls the disk driver pre-deployed on the host to issue a read instruction for the target file to the expansion board 120. The read instruction issued by the application program is transmitted from the host 110 to the expansion board 120 via the PCIe bus between them.
The read instruction carries the source address and the destination address of the target file: the source address indicates where the target file is stored, and the destination address indicates where it is to be transferred.
The source address of the target file in the read instruction can be determined from a file identifier carried in the computing task sent by the user.
In an alternative implementation, the target file is stored in a file system running on the host. The file system includes, but is not limited to, EXT4 (the fourth extended file system), EXT2 (the second extended file system) and the like; the file system manages files in the form of directories and file names.
When the target file is stored in the file system, the file identifier may be a file path of the target file in the file system, and specifically, the file path may be composed of a directory, a subdirectory, and a file name of the target file in the file system.
Before sending the computing task for the target file, the user accesses the file system to store the target file and obtains the file path of the target file in the file system. The file path is carried in the computing task for the target file, and the application program can query the file system based on the file path to determine the source address of the target file.
Each file stored in the file system is actually stored in a local storage device 140 or a remote storage device 150 accessible to the expansion board 120. The expansion board 120 may present the local storage device 140 and the remote storage device 150 to the host 110 in the form of block devices; for the specifics of creating block devices, refer to the related art, which is not repeated here. In this implementation, the source address of the target file that the application program running on the host 110 determines by querying the file system is actually the start address and data length of the target file in the block device.
To improve the space utilization of the storage device, the target file may also be stored in segments in the same block device; in that case, the source address of the target file determined by the application program's query of the file system may include multiple start addresses and corresponding data lengths in the block device, i.e., the addresses of multiple segments.
For example, the file path of the target file carried in the computing task may be: directory A/subdirectory B/filename C. Based on this file path, the source address determined by the application program's query includes the addresses of 3 segments, namely segment 1: start address 1 + data length 1; segment 2: start address 2 + data length 2; segment 3: start address 3 + data length 3. That is, the target file is divided into parts 1, 2 and 3, which are stored on segments 1, 2 and 3, respectively.
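This specification does not prescribe how the file system query is performed. As one concrete illustration only, on a Linux host the segment addresses (start address + data length pairs) of a file can be obtained with the FIEMAP ioctl; the minimal C sketch below assumes a Linux environment and at most 32 segments per file:

    #include <linux/fiemap.h>
    #include <linux/fs.h>      /* FS_IOC_FIEMAP */
    #include <sys/ioctl.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    /* Resolve a file path to its on-device extents, analogous to how the
     * application program determines the source address of the target file. */
    int print_extents(const char *file_path) {
        int fd = open(file_path, O_RDONLY);
        if (fd < 0) return -1;
        size_t sz = sizeof(struct fiemap) + 32 * sizeof(struct fiemap_extent);
        struct fiemap *fm = calloc(1, sz);
        fm->fm_length = FIEMAP_MAX_OFFSET;   /* map the whole file */
        fm->fm_extent_count = 32;
        if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) { free(fm); close(fd); return -1; }
        for (unsigned i = 0; i < fm->fm_mapped_extents; i++)
            printf("segment %u: start address %llu, data length %llu\n", i + 1,
                   (unsigned long long)fm->fm_extents[i].fe_physical,
                   (unsigned long long)fm->fm_extents[i].fe_length);
        free(fm);
        close(fd);
        return 0;
    }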
The destination address of the target file in the read instruction is the address of the memory space in the coprocessor 130 that the application program has applied for in advance.
The application program applies to the corresponding coprocessor 130 for a memory space in advance according to the computing tasks it schedules. The application program may apply for the memory space from the coprocessor 130 once after starting up and reuse it cyclically when scheduling computing tasks, or it may apply for a memory space from the coprocessor 130 each time it receives a computing task, for that task's use.
Specifically, the application program calls a coprocessor driver pre-deployed on the host 110 to issue a memory application instruction to the coprocessor 130, and receives the address of the allocated memory space returned by the coprocessor 130.
The coprocessor driver is deployed on the host 110 after the coprocessor is mounted and is loaded into the operating system of the host 110, enabling the host 110 to manage and control the coprocessor 130.
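As a minimal sketch, assuming for illustration that the coprocessor 130 is a GPU and that the coprocessor driver is exposed through the CUDA runtime (this specification does not name a specific driver), the memory application step could look as follows; the returned device address is what the read instruction later carries as the destination address:

    #include <cuda_runtime.h>
    #include <stddef.h>

    /* Apply for a memory space in the coprocessor; returns its address,
     * or NULL if the allocation fails. */
    void *apply_coprocessor_memory(size_t bytes) {
        void *dev_addr = NULL;
        if (cudaMalloc(&dev_addr, bytes) != cudaSuccess)
            return NULL;
        return dev_addr;   /* address of the memory space in the coprocessor */
    }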
The application program generates the read instruction based on the determined source address and the address of the memory space it has applied for, and calls the disk driver to issue the instruction to the expansion board 120.
The disk driver may be an NVMe SSD driver (NVMe, Non-Volatile Memory Express, is a host controller interface specification for non-volatile storage), which manages SSDs of different specifications through a standardized interface and has good transfer-rate and latency characteristics. In this embodiment, the disk driver is used to implement the host 110's management and control of the local storage device 140 and the remote storage device 150 accessible to the expansion board 120.
Based on the foregoing, when the target file is stored in segments in the same device, the application program sends multiple read instructions to the expansion board 120, based on the multiple start addresses and corresponding data lengths determined by the query and on the address of the pre-applied memory space of the coprocessor 130, so that the expansion board 120 acquires the parts of the target file stored in the different segments from the local storage device 140 or the remote storage device 150 over multiple reads, based on the start address and data length contained in each read instruction.
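A hypothetical sketch of such a read instruction and of the per-segment issuing loop is given below. The struct layout and the helper issue_to_expansion_board() are invented for illustration; this specification does not define the concrete command format exchanged between the disk driver and the expansion board:

    #include <stdint.h>
    #include <stddef.h>

    /* One read instruction per stored segment of the target file. */
    struct read_instruction {
        uint64_t src_start;   /* source: start address of the segment on the block device */
        uint64_t src_length;  /* source: data length of the segment */
        uint64_t dst_addr;    /* destination: pre-applied coprocessor memory address */
    };

    /* Assumed helper that hands one command to the disk driver, e.g. as an
     * NVMe submission queue entry; its existence is an assumption. */
    void issue_to_expansion_board(const struct read_instruction *cmd);

    void issue_reads(const struct read_instruction *segments, size_t n_segments) {
        for (size_t i = 0; i < n_segments; i++)
            issue_to_expansion_board(&segments[i]);
    }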
Step 206, the expansion board card receives the read instruction and acquires the target file based on the source address.
The expansion board 120 acquires the target file based on the source address in the read instruction. Specifically, when the source address points to the local storage device 140, the expansion board 120 reads the target file from the local storage device 140; when the source address points to the remote storage device 150, the expansion board 120 sends an acquisition packet for the target file to the remote storage device 150 through the network card on the expansion board 120 and obtains the target file returned by the remote storage device 150.
It should be noted that when the expansion board 120 acquires the target file based on the source address, it acquires it in the form of data rather than as a whole file. Based on the foregoing, when the target file is stored in segments in the same device, the expansion board 120 may acquire the parts of the target file stored in different segments over multiple operations, based on the multiple read instructions issued by the application program: it may read each part of the target file located in a different segment of the local storage device 140, or receive each part, in a different segment, returned by the remote storage device 150. Assuming the target file is divided into parts 1, 2 and 3 stored in segments 1, 2 and 3 respectively, the application program issues read instructions 1, 2 and 3 to the expansion board 120 based on segments 1, 2 and 3 in step 204, and the expansion board 120 performs data acquisition based on read instructions 1, 2 and 3 respectively, obtaining parts 1, 2 and 3 of the target file. It can be understood that the acquisitions for read instructions 1 and 2 need not be consecutive; for example, after acquiring part 1 from segment 1, the expansion board 120 may first serve a read instruction issued by another application program before acquiring part 2 from segment 2.
Step 208, the expansion board transfers the target file into the memory space of the coprocessor based on the destination address in the read instruction and notifies the coprocessor.
The target file acquired from the local storage device 140 or the remote storage device 150 is transferred by the expansion board 120 into the memory space of the coprocessor 130 through the PCIe bus between the expansion board 120 and the host 110 and the PCIe bus between the host 110 and the coprocessor 130.
Specifically, the chips on the expansion board 120 that acquire the target file include an SoC chip, which acquires the target file from the local storage device 140, and a network-card FPGA chip, which acquires the target file from the remote storage device 150. They may transfer the target file into the memory space of the coprocessor 130 using DMA (Direct Memory Access) and notify the coprocessor 130 to read and process the target file by, for example, modifying a flag register.
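The sketch below shows one possible shape of this board-side dump-and-notify step, assuming memory-mapped DMA and flag registers; every register and field name here is invented for illustration and does not come from this specification:

    #include <stdint.h>

    /* Assumed memory-mapped DMA engine on the expansion board. */
    struct dma_engine {
        volatile uint64_t src;    /* board-local buffer holding the fetched data */
        volatile uint64_t dst;    /* destination address in coprocessor memory */
        volatile uint64_t len;    /* bytes to transfer */
        volatile uint32_t start;  /* write 1 to kick off the transfer */
    };

    static void dump_to_coprocessor(struct dma_engine *dma,
                                    volatile uint32_t *coproc_flag_reg,
                                    uint64_t local_buf, uint64_t dst_addr,
                                    uint64_t len) {
        dma->src = local_buf;   /* data already fetched from local/remote storage */
        dma->dst = dst_addr;    /* destination address carried in the read instruction */
        dma->len = len;
        dma->start = 1;         /* DMA the data straight into coprocessor memory */
        /* ... wait for DMA completion (omitted) ... */
        *coproc_flag_reg = 1;   /* modify the flag register to notify the coprocessor */
    }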
Based on the foregoing, the expansion board 120 forwards the target file to the coprocessor 130 in the form of data rather than as a whole file. When the target file is stored in segments in the same device, the expansion board 120 acquires part of the file data of the target file based on the source address in a read instruction in step 206, and transfers that part into the memory space of the coprocessor 130 based on the destination address in the same read instruction. For multiple received read instructions, the expansion board 120 may execute the data acquisition and data transfer corresponding to each read instruction in the order in which the instructions were received; that is, it acquires data based on the source address in each read instruction in turn, and transfers the acquired data into the memory of the coprocessor 130 based on the destination address in each read instruction.
Step 210, the coprocessor reads the target file and executes the calculation task for the target file.
After the coprocessor 130 learns that the target file to be processed has been stored in the memory space, it reads the target file and executes the computing task for the target file, so that a task result can be fed back to the user.
Based on the foregoing example, the coprocessor 130 may process the target file in the form of data rather than as a whole file. When the target file is stored in segments in the same device, after the expansion board 120 transfers part of the file data of the target file into the memory space of the coprocessor 130 based on the destination address in a read instruction in step 208, the coprocessor 130 may read and process that part of the file data; for file data transferred by the expansion board 120 over multiple operations, the coprocessor 130 may read and process it in the order in which it was transferred into its memory space.
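Assuming again, for illustration, a GPU coprocessor programmed with CUDA, the coprocessor-side processing can be sketched as follows; compute_task is a placeholder for the actual computing task, and dev_buf is the pre-applied memory space that the expansion board has already filled:

    #include <cuda_runtime.h>
    #include <stddef.h>

    /* Placeholder for the task-specific processing of the file data. */
    __global__ void compute_task(const unsigned char *file_data, size_t len,
                                 float *result) {
        /* ... computing task over file_data (placeholder) ... */
    }

    /* The computation runs directly on the buffer the expansion board filled;
     * no host-side copy of the target file is involved. */
    void run_task(const unsigned char *dev_buf, size_t len, float *dev_result) {
        compute_task<<<256, 256>>>(dev_buf, len, dev_result);
        cudaDeviceSynchronize();   /* wait for the task to finish */
    }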
Referring to fig. 3, fig. 3 is a schematic diagram of a data transmission link of a target file in a cloud computing node when file processing is implemented based on the cloud computing node shown in fig. 1.
Taking the case where the target file is located in the local storage device 140 as an example: the target file acquired from the local storage device 140 is first transmitted to the system chip of the expansion board 120, then transmitted by the system chip to the host 110 through the PCIe bus between the expansion board 120 and the host 110, and finally transmitted into the memory space of the coprocessor 130 through the PCIe bus between the host 110 and the coprocessor 130.
Taking the case where the target file is located in the remote storage device 150 as an example: the target file acquired from the remote storage device 150 is first transmitted to the network card of the expansion board 120, then transmitted by the network-card FPGA to the host 110 through the PCIe bus between the expansion board 120 and the host 110, and finally transmitted into the memory space of the coprocessor 130 through the PCIe bus between the host 110 and the coprocessor 130.
As can be seen from the above description, in this specification an application program that runs on a host and schedules computing tasks applies for a memory space in a coprocessor in advance. After receiving a computing task for a target file, the application program issues to an expansion board card a read instruction carrying the source address and the destination address of the target file, so that the expansion board card transfers the target file, acquired based on the source address, into the coprocessor memory space pointed to by the destination address, where the coprocessor performs the computation. With the technical solution provided by this specification, the expansion board card can transfer the target file into the memory space of the coprocessor directly after acquiring it, without a copy through the host memory; the data transmission link is thereby shortened, data transmission latency is reduced, host resources and bus bandwidth are saved, and file processing efficiency is improved.
To further shorten the data transmission link of the target file within the cloud computing node, this specification proposes another architecture for the cloud computing node. Referring to fig. 4, fig. 4 is a schematic architecture diagram of a cloud computing node shown in another exemplary embodiment of this specification, in which the coprocessor 430 is mounted on the expansion board 420, and the expansion board 420 presents the coprocessor 430 to the host 410 in the form of an external device. The coprocessor 430 may be mounted on the expansion board 420 detachably (for example by insertion or snap fastening) and/or fixedly (for example by soldering); this specification does not limit the mounting manner.
When file processing is implemented based on the cloud computing node shown in fig. 4, the transfer in step 208, in which the expansion board 420 moves the target file into the memory space of the coprocessor 430 based on the destination address in the read instruction, may be carried out over the PCIe bus between the expansion board 420 and the coprocessor 430.
Referring to fig. 5, fig. 5 is a schematic diagram of a data transmission link of a target file in the cloud computing node when the cloud computing node shown in fig. 4 is used to implement file processing.
Taking the case where the target file is located in the local storage device 440 as an example: the target file acquired from the local storage device 440 is first transmitted to the system chip of the expansion board 420, and then transmitted by the system chip into the memory space of the coprocessor 430 through the PCIe bus between the expansion board 420 and the coprocessor 430.
Taking the case where the target file is located in the remote storage device 450 as an example: the target file acquired from the remote storage device 450 is first transmitted to the network card of the expansion board 420, and then transmitted by the network-card FPGA into the memory space of the coprocessor 430 through the PCIe bus between the expansion board 420 and the coprocessor 430.
Compared with the data transmission link shown in fig. 3, implemented based on the cloud computing node shown in fig. 1, the data transmission link shown in fig. 5, implemented based on the cloud computing node shown in fig. 4, further shortens the path over which the target file travels from the expansion board card to the coprocessor, reduces the transmission latency of the target file, and improves file processing efficiency.
To help those skilled in the art better understand the technical solution provided by this specification, the file processing method implemented based on the cloud computing node shown in fig. 2 is described below in further detail. The embodiments described below are only some of the possible embodiments given by way of example, not all of them.
On a certain cloud computing node of the cloud computing network, an application program that schedules AI image processing tasks for face recognition runs on the host, and the AI image processing tasks are executed by a GPU. The application program calls a GPU driver pre-deployed on the host to apply to the GPU for memory, thereby obtaining the address of the memory space allocated by the GPU.
A user accesses the application program through a cloud client running on his terminal device and sends an AI image processing task for a target image file. The target image file has been stored in a cloud disk in advance through the file system running on the host, and the AI image processing task sent by the user carries the file path of the target image file in that file system.
After receiving the AI image processing task for the target image file, the application program queries the file system based on the file path carried in the task to determine the source address of the target image file. The target image file is stored in segments, and the source address includes 3 start addresses and corresponding data lengths.
According to the determined source address and the address of the pre-applied memory space in the GPU, the application program generates 3 read instructions for the target image file and calls the NVMe SSD driver to issue them to the expansion board card in sequence; the read instructions are transmitted to the expansion board card through the PCIe bus between the host and the expansion board card.
After the expansion board card of the cloud computing node receives each read instruction, it sends, based on the source address in that read instruction, an acquisition packet for part of the file data of the target image file to the cloud disk through the network card on the expansion board card.
After the network card receives the part of the file data of the target image file returned by the cloud disk, the DMA controller in the network-card FPGA chip transfers it into the memory space of the GPU based on the destination address in the read instruction.
The GPU is plugged into the expansion board card, so the part of the file data of the target image file is transmitted into the memory space of the GPU through the PCIe bus between the expansion board card and the GPU. When the part of the file data has been written into the GPU memory space, the expansion board card changes the flag-bit register to notify the GPU to read and process it.
After the GPU learns from the change of the flag-bit register that file data to be processed has been stored in its memory space, it reads the file data from the memory space and executes the corresponding AI image processing task.
By analogy, the expansion board card completes the data acquisition corresponding to the 3 read instructions in sequence and transfers the acquired parts of the file data of the target image file into the memory space of the GPU; correspondingly, the GPU reads each part of the file data of the target file from the memory space in turn, until all the file data of the target image file has been processed.
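Tying the pieces together, and assuming again the illustrative CUDA runtime plus the hypothetical issue_to_expansion_board() and wait_for_flag_register() helpers (stand-ins for the NVMe SSD driver path and the flag-register notification described above), the host-side flow of this example could be sketched as:

    #include <cuda_runtime.h>
    #include <stddef.h>
    #include <stdint.h>

    struct read_instruction { uint64_t src_start, src_length, dst_addr; };

    /* Assumed helpers; both are illustrative stand-ins. */
    void issue_to_expansion_board(const struct read_instruction *cmd);
    void wait_for_flag_register(void);

    /* Placeholder for the AI image processing task executed by the GPU. */
    __global__ void ai_image_task(const unsigned char *img, size_t len) {
        /* ... face-recognition processing (placeholder) ... */
    }

    void process_target_image(struct read_instruction segs[3], size_t total_len) {
        void *gpu_buf = NULL;
        cudaMalloc(&gpu_buf, total_len);              /* pre-applied GPU memory space */
        uint64_t off = 0;
        for (int i = 0; i < 3; i++) {                 /* one read instruction per segment */
            segs[i].dst_addr = (uint64_t)(uintptr_t)gpu_buf + off;
            off += segs[i].src_length;
            issue_to_expansion_board(&segs[i]);
            wait_for_flag_register();                 /* GPU notified per transferred part */
        }
        ai_image_task<<<256, 256>>>((const unsigned char *)gpu_buf, total_len);
        cudaDeviceSynchronize();
        cudaFree(gpu_buf);
    }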
Based on the above embodiments, this specification provides a file processing method, which is applied to a host equipped with an expansion board card and a coprocessor, where the host, the expansion board card, and the coprocessor form a cloud computing node deployed in a cloud computing network, an application program for scheduling a computing task is run on the host, and the application program applies for a memory space in the coprocessor in advance. The host, the expansion board card and the coprocessor can be the host 110, the expansion board card 120 and the coprocessor 130 shown in fig. 1, or the host 410, the expansion board card 420 and the coprocessor 430 shown in fig. 4, respectively.
Referring to fig. 6, fig. 6 is a flowchart illustrating a file processing method according to an exemplary embodiment of the present disclosure.
The file processing method may specifically include the steps of:
at step 602, an application running on a host computer receives a computing task for a target file.
Step 604, the application program calls a disk driver pre-deployed on the host to issue a read instruction for the target file to the expansion board card, so that the expansion board card acquires the target file and stores it into the memory space, and the coprocessor reads the target file and executes the computing task for the target file; the read instruction comprises a source address and a destination address of the target file, and the destination address points to the memory space that the application program has applied for in the coprocessor in advance.
Accordingly, the present specification provides a file processing method, which is applied to an expansion board card assembled on a host, the host is further assembled with a coprocessor, the host, the expansion board card and the coprocessor form a cloud computing node deployed in a cloud computing network, an application program for scheduling a computing task is run on the host, and the application program applies for a memory space in the coprocessor in advance. The host, the expansion board card and the coprocessor can be the host 110, the expansion board card 120 and the coprocessor 130 shown in fig. 1, or the host 410, the expansion board card 420 and the coprocessor 430 shown in fig. 4, respectively.
Referring to fig. 7, fig. 7 is a flowchart illustrating a file processing method according to another exemplary embodiment of the present disclosure.
The file processing method may specifically include the steps of:
step 702, the expansion board receives a reading instruction for a target file issued by an application program running on a host; the reading instruction is sent by the application program after receiving a calculation task aiming at the target file, the reading instruction comprises a source address and a destination address of the target file, and the destination address points to a memory space which is applied in a coprocessor by the application program in advance.
Step 704, the expansion board obtains the target file based on the source address.
Step 706, the expansion board transfers the target file to the memory space based on the destination address and notifies the coprocessor, so that the coprocessor reads the target file and executes the calculation task for the target file.
The specific implementation manner of the file processing method shown in fig. 6 and fig. 7 may refer to the implementation steps in the method for implementing file processing based on the cloud computing node shown in fig. 2, and is not described herein again.
As can be seen from the above description, in this specification an application program that runs on a host and schedules computing tasks applies for a memory space in a coprocessor in advance. After receiving a computing task for a target file, the application program issues to an expansion board card a read instruction carrying the source address and the destination address of the target file, so that the expansion board card transfers the target file, acquired based on the source address, into the coprocessor memory space pointed to by the destination address, where the coprocessor performs the computation. With the technical solution provided by this specification, the expansion board card can transfer the target file into the memory space of the coprocessor directly after acquiring it, without a copy through the host memory; the data transmission link is thereby shortened, data transmission latency is reduced, host resources and bus bandwidth are saved, and file processing efficiency is improved.
Fig. 8 is a schematic structural diagram of an electronic device in which an apparatus for implementing file processing based on a cloud computing node according to an exemplary embodiment is located. Referring to fig. 8, at the hardware level the device includes a processor 802, an internal bus 804, a network interface 806, a memory 808 and a non-volatile storage 810, and may also include hardware required for other services. One or more embodiments of this specification may be implemented in software, for example by the processor 802 reading the corresponding computer program from the non-volatile storage 810 into the memory 808 and then executing it. Of course, in addition to software implementations, one or more embodiments of this specification do not exclude other implementations, such as logic devices or combinations of software and hardware; that is, the execution subject of the processing flow is not limited to logic units and may also be hardware or logic devices.
Referring to fig. 9, the file processing apparatus shown in fig. 9 can be applied to the electronic device shown in fig. 8 to implement the technical solution of this specification. The file processing apparatus runs on a host equipped with an expansion board card and a coprocessor; the host, the expansion board card and the coprocessor form a cloud computing node deployed in a cloud computing network; the apparatus is used to schedule computing tasks and applies for a memory space in the coprocessor in advance. The apparatus comprises a task receiving unit 910 and an instruction issuing unit 920:
the task receiving unit 910 receives a computing task for a target file;
the instruction issuing unit 920 calls a disk driver pre-deployed on the host to issue a read instruction for the target file to the expansion board, where the read instruction comprises a source address and a destination address of the target file, and the destination address points to the memory space applied for in the coprocessor in advance, so that the expansion board acquires the target file based on the source address and transfers it into the memory space based on the destination address, and the coprocessor reads the target file and executes the computing task for the target file.
Optionally, a file system for managing files also runs on the host;
the instruction issuing unit 920 queries the file system after receiving the computing task, and determines a source address of the target file.
Optionally, the file processing apparatus further includes a memory application unit 930;
the memory application unit 930 invokes a coprocessor driver pre-deployed on the host to issue a memory application instruction to the coprocessor, and receives an address of the allocated memory space returned by the coprocessor.
Referring to fig. 10, another file processing apparatus shown in fig. 10 can also be applied to the electronic device shown in fig. 8 to implement the technical solution of this specification. The file processing apparatus runs on an expansion board card mounted on a host; the host is further equipped with a coprocessor; the host, the expansion board card and the coprocessor form a cloud computing node deployed in a cloud computing network; an application program for scheduling computing tasks runs on the host, and the application program applies for a memory space in the coprocessor in advance. The apparatus comprises an instruction receiving unit 1010, a file acquiring unit 1020 and a file dump unit 1030:
the instruction receiving unit 1010 receives a read instruction for a target file issued by an application program running on the host; the read instruction is sent by the application program after it receives a computing task for the target file, the read instruction comprises a source address and a destination address of the target file, and the destination address points to the memory space that the application program has applied for in the coprocessor in advance;
the file acquiring unit 1020 acquires the target file based on the source address;
the file dump unit 1030 transfers the target file into the memory space based on the destination address and notifies the coprocessor, so that the coprocessor reads the target file and executes the computing task for the target file.
Optionally, the coprocessor is mounted on the expansion board card;
the file dump unit 1030 transferring the target file into the memory space based on the destination address includes:
transferring the target file into the memory space through the PCIe bus on the expansion board card, based on the destination address.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.
In a typical configuration, a computer includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read-Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic disk storage, quantum memory, graphene-based storage media or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises", "comprising" or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article or apparatus. Without further limitation, an element preceded by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article or apparatus that comprises the element.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the one or more embodiments of the present specification. As used in one or more embodiments of the present specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in one or more embodiments of the present description to describe various information, such information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of one or more embodiments herein. The word "if" as used herein may be interpreted as "upon" or "when" or "in response to determining," depending on the context.
The above description covers only preferred embodiments of the one or more embodiments of the present specification and is not intended to limit their scope; any modifications, equivalent substitutions, improvements, and the like made within the spirit and principles of the one or more embodiments of the present specification shall be included in their scope of protection.

Claims (13)

1. A cloud computing node comprising a host, an expansion board card and a coprocessor, wherein the expansion board card and the coprocessor are assembled on the host, an application program for scheduling computing tasks runs on the host, and the application program applies for a memory space in the coprocessor in advance; wherein:
an application program running on the host receives a computing task for a target file; the target file is stored in advance in a local storage device of the expansion board card or in a remote storage device accessed by the expansion board card;
the application program calls a disk driver pre-deployed on the host to issue a reading instruction for the target file to the expansion board card; the reading instruction comprises a source address and a destination address of the target file, and the destination address points to the memory space applied for in advance by the application program in the coprocessor;
the expansion board card receives the reading instruction and acquires the target file based on the source address;
the expansion board card transfers the target file into the memory space based on the destination address and notifies the coprocessor;
and the coprocessor reads the target file and executes the computing task for the target file.
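Read as a pipeline, the host side of claim 1 might look as follows in C. Every helper here is a hypothetical placeholder for a step the claim names, not an API the patent defines; the individual steps are sketched under claims 2, 3 and 6 below:

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholders for the steps claim 1 names; sketches for the individual
 * steps appear under claims 2, 3 and 6 below. */
extern void *apply_for_coprocessor_memory(size_t bytes);
extern long long first_physical_extent(const char *path);
extern int issue_reading_instruction(uint64_t src, uint64_t dst, uint64_t len);

/* Host-side pipeline: the target file travels storage -> expansion board
 * -> coprocessor memory without ever passing through host memory. */
int run_computing_task(const char *target_file, size_t len)
{
    void *dst = apply_for_coprocessor_memory(len);      /* applied in advance */
    if (!dst)
        return -1;

    long long src = first_physical_extent(target_file); /* source address */
    if (src < 0)
        return -1;

    /* The board acquires the file, dumps it into the coprocessor memory
     * and notifies the coprocessor, which then executes the task. */
    return issue_reading_instruction((uint64_t)src, (uint64_t)(uintptr_t)dst,
                                     (uint64_t)len);
}
```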
2. The cloud computing node of claim 1, wherein a file system that manages files further runs on the host;
the process of determining the source address of the target file comprises:
the application program querying the file system after receiving the computing task to determine the source address of the target file.
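On a Linux host, one plausible way for an application to ask the file system where a file physically lives is the FIEMAP ioctl. The sketch below assumes, for illustration only, that the first physical extent can serve as the source address of the reading instruction; the claim itself does not fix a mechanism:

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>
#include <linux/fiemap.h>

/* Ask the file system for the first physical extent of a file, returned
 * as a byte offset on the underlying device. */
long long first_physical_extent(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Room for the fiemap header plus a single extent record. */
    struct fiemap *fm = calloc(1, sizeof(*fm) + sizeof(struct fiemap_extent));
    if (!fm) {
        close(fd);
        return -1;
    }
    fm->fm_start = 0;
    fm->fm_length = FIEMAP_MAX_OFFSET;  /* map the whole file      */
    fm->fm_extent_count = 1;            /* only the first extent   */

    long long phys = -1;
    if (ioctl(fd, FS_IOC_FIEMAP, fm) == 0 && fm->fm_mapped_extents > 0)
        phys = (long long)fm->fm_extents[0].fe_physical;

    free(fm);
    close(fd);
    return phys;
}
```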
3. The cloud computing node of claim 1, wherein the application program applying for the memory space in the coprocessor comprises:
the application program calling a coprocessor driver pre-deployed on the host to issue a memory application instruction to the coprocessor, and receiving the address of the allocated memory space returned by the coprocessor.
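If the coprocessor is a GPU, this memory application step is analogous to a device allocation through the vendor's driver. A sketch using the CUDA runtime purely as a stand-in; the claim itself is driver-agnostic:

```c
#include <stddef.h>
#include <stdio.h>
#include <cuda_runtime.h>

/* Apply for a memory space in the coprocessor ahead of the computing
 * task.  With a CUDA-capable GPU, the driver hands back a device
 * pointer; in this scheme, that address is what the reading
 * instruction's destination address would point to. */
void *apply_for_coprocessor_memory(size_t bytes)
{
    void *dev_ptr = NULL;
    if (cudaMalloc(&dev_ptr, bytes) != cudaSuccess) {
        fprintf(stderr, "memory application in the coprocessor failed\n");
        return NULL;
    }
    return dev_ptr;  /* address of the allocated memory space */
}
```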
4. The cloud computing node of claim 1, wherein the coprocessor is mounted on the expansion board card;
the expansion board card transferring the target file into the memory space based on the destination address comprises:
the expansion board card transferring, based on the destination address, the target file into the memory space through a PCIe bus on the expansion board card.
5. The cloud computing node of claim 1, wherein the coprocessor comprises any of a graphics processing unit (GPU), a field programmable gate array (FPGA), or an application specific integrated circuit (ASIC).
6. A file processing method, applied to a host equipped with an expansion board card and a coprocessor, wherein the host, the expansion board card and the coprocessor form a cloud computing node deployed in a cloud computing network, an application program for scheduling computing tasks runs on the host, and the application program applies for a memory space in the coprocessor in advance; the method comprising:
an application program running on the host receiving a computing task for a target file; the target file being stored in advance in a local storage device of the expansion board card or in a remote storage device accessed by the expansion board card;
the application program calling a disk driver pre-deployed on the host to issue a reading instruction for the target file to the expansion board card, the reading instruction comprising a source address and a destination address of the target file, and the destination address pointing to the memory space applied for in advance by the application program in the coprocessor, so that the expansion board card acquires the target file based on the source address and transfers the target file into the memory space based on the destination address, and the coprocessor reads the target file and executes the computing task for the target file.
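On the host, such a reading instruction could plausibly be delivered to the disk driver as an ioctl on a character device. The sketch below assumes a hypothetical device node `/dev/xboard0` and command `XB_IOC_READ_TO_COPROC`, neither of which the patent names:

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/ioctl.h>

/* Hypothetical layout of the reading instruction handed to the driver. */
struct xb_read_cmd {
    uint64_t src_addr;  /* source address of the target file           */
    uint64_t dst_addr;  /* destination: pre-applied coprocessor memory */
    uint64_t length;    /* number of bytes to transfer                 */
};

/* Illustrative ioctl command number; not defined by any real driver. */
#define XB_IOC_READ_TO_COPROC _IOW('x', 1, struct xb_read_cmd)

int issue_reading_instruction(uint64_t src, uint64_t dst, uint64_t len)
{
    int fd = open("/dev/xboard0", O_RDWR);  /* hypothetical device node */
    if (fd < 0)
        return -1;

    struct xb_read_cmd cmd = { .src_addr = src, .dst_addr = dst, .length = len };
    int rc = ioctl(fd, XB_IOC_READ_TO_COPROC, &cmd);  /* hand off to the board */

    close(fd);
    return rc;
}
```

The notable design point is that the host only issues the instruction; the data path itself is owned by the expansion board card, which is why the destination can be an address inside the coprocessor rather than a host buffer.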
7. The method of claim 6, wherein a file system that manages files further runs on the host;
the process of determining the source address of the target file comprises:
the application program querying the file system after receiving the computing task to determine the source address of the target file.
8. The method of claim 6, wherein the application program applying for the memory space in the coprocessor comprises:
the application program calling a coprocessor driver pre-deployed on the host to issue a memory application instruction to the coprocessor, and receiving the address of the allocated memory space returned by the coprocessor.
9. A file processing method, applied to an expansion board card assembled on a host, wherein the host is further equipped with a coprocessor, the host, the expansion board card and the coprocessor form a cloud computing node deployed in a cloud computing network, an application program for scheduling computing tasks runs on the host, and the application program applies for a memory space in the coprocessor in advance; the method comprising:
receiving a reading instruction for a target file issued by an application program running on the host; the reading instruction being sent by the application program after receiving a computing task for the target file, the reading instruction comprising a source address and a destination address of the target file, and the destination address pointing to the memory space applied for in advance by the application program in the coprocessor; the target file being stored in advance in a local storage device of the expansion board card or in a remote storage device accessed by the expansion board card;
acquiring the target file based on the source address;
transferring the target file into the memory space based on the destination address and notifying the coprocessor, so that the coprocessor reads the target file and executes the computing task for the target file.
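Mirroring this method on the board side, a hypothetical handler might look like the sketch below; `fetch_target_file` and `dump_and_notify` are stand-ins for the acquisition and dump steps (the latter is elaborated in the sketch after the device embodiment above):

```c
#include <stdint.h>

/* Stand-ins for the two steps of the method: acquisition from local or
 * remote storage, and the dump-and-notify step sketched earlier. */
extern int fetch_target_file(uint64_t src_addr, uint64_t *local_buf,
                             uint64_t *len);
extern int dump_and_notify(uint64_t local_buf, uint64_t dst_addr,
                           uint64_t len);

/* Handle one reading instruction received from the host application. */
int handle_reading_instruction(uint64_t src_addr, uint64_t dst_addr)
{
    uint64_t buf = 0, len = 0;
    if (fetch_target_file(src_addr, &buf, &len) != 0)
        return -1;  /* target file unreachable on local or remote storage */
    return dump_and_notify(buf, dst_addr, len);
}
```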
10. The method of claim 9, wherein the coprocessor is mounted on the expansion board card;
the transferring the target file into the memory space based on the destination address comprises:
transferring, based on the destination address, the target file into the memory space through a PCIe bus on the expansion board card.
11. A file processing device, running on a host equipped with an expansion board card and a coprocessor, wherein the host, the expansion board card and the coprocessor form a cloud computing node deployed in a cloud computing network, the device is configured to schedule computing tasks, and the device applies for a memory space in the coprocessor in advance; the device comprising a task receiving unit and an instruction issuing unit, wherein:
the task receiving unit receives a computing task for a target file; the target file is stored in advance in a local storage device of the expansion board card or in a remote storage device accessed by the expansion board card;
the instruction issuing unit calls a disk driver pre-deployed on the host to issue a reading instruction for the target file to the expansion board card, the reading instruction comprising a source address and a destination address of the target file, and the destination address pointing to the memory space applied for in advance in the coprocessor, so that the expansion board card acquires the target file based on the source address and transfers the target file into the memory space based on the destination address, and the coprocessor reads the target file and executes the computing task for the target file.
12. A file processing device, running on an expansion board card assembled on a host, wherein the host is further equipped with a coprocessor, the host, the expansion board card and the coprocessor form a cloud computing node deployed in a cloud computing network, an application program for scheduling computing tasks runs on the host, and the application program applies for a memory space in the coprocessor in advance; the device comprising an instruction receiving unit, a file acquiring unit and a file dump unit, wherein:
the instruction receiving unit receives a reading instruction for the target file issued by an application program running on the host; the reading instruction being sent by the application program after receiving a computing task for the target file, the reading instruction comprising a source address and a destination address of the target file, and the destination address pointing to the memory space applied for in advance by the application program in the coprocessor; the target file being stored in advance in a local storage device of the expansion board card or in a remote storage device accessed by the expansion board card;
the file acquiring unit acquires the target file based on the source address;
and the file dump unit transfers the target file into the memory space based on the destination address and notifies the coprocessor, so that the coprocessor reads the target file and executes the computing task for the target file.
13. A computer-readable storage medium on which a computer program is stored, the program, when executed by a processor, carrying out the steps of the method according to any one of claims 6-10.
CN202110852505.0A 2021-07-27 2021-07-27 Cloud computing node, file processing method and device Active CN113312182B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110852505.0A CN113312182B (en) 2021-07-27 2021-07-27 Cloud computing node, file processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110852505.0A CN113312182B (en) 2021-07-27 2021-07-27 Cloud computing node, file processing method and device

Publications (2)

Publication Number Publication Date
CN113312182A CN113312182A (en) 2021-08-27
CN113312182B true CN113312182B (en) 2022-01-11

Family

ID=77382326

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110852505.0A Active CN113312182B (en) 2021-07-27 2021-07-27 Cloud computing node, file processing method and device

Country Status (1)

Country Link
CN (1) CN113312182B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114579776B (en) * 2022-03-14 2023-02-07 武汉工程大学 Optical field data storage method and device, electronic equipment and computer medium
CN115202892B (en) * 2022-09-15 2022-12-23 粤港澳大湾区数字经济研究院(福田) Memory expansion system and memory expansion method of cryptographic coprocessor

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101159729A (en) * 2007-09-05 2008-04-09 杭州华三通信技术有限公司 Method and device for fast processing packet
CN101790030A (en) * 2010-03-16 2010-07-28 中山大学 Digital set top box based on Java processor
US9483344B2 (en) * 2012-04-05 2016-11-01 Assurant, Inc. System, method, apparatus, and computer program product for providing mobile device support services
CN102831736B (en) * 2012-08-06 2015-03-25 无锡矽鼎科技有限公司 System-changeable modulation mobile payment terminal system
CN106294273B (en) * 2015-06-05 2020-01-10 中国石油化工股份有限公司 Communication method and system of CPU and coprocessor
CN106778249B (en) * 2017-01-23 2020-02-14 湖南文盾信息技术有限公司 Method and system for constructing trusted execution environment of Java program
CN112540941A (en) * 2019-09-21 2021-03-23 华为技术有限公司 Data forwarding chip and server
CN113032099B (en) * 2021-03-26 2023-10-03 阿里巴巴新加坡控股有限公司 Cloud computing node, file management method and device

Also Published As

Publication number Publication date
CN113312182A (en) 2021-08-27

Similar Documents

Publication Publication Date Title
US11681441B2 (en) Input/output processing in a distributed storage node with RDMA
TWI750176B (en) Electronic device performing software training on memory channel, memory channel training method thereof and system thereof
TWI473013B (en) Remote core operations in a multi-core computer
US8381230B2 (en) Message passing with queues and channels
CN113312182B (en) Cloud computing node, file processing method and device
CN111309649B (en) Data transmission and task processing method, device and equipment
CN108268385B (en) Optimized caching agent with integrated directory cache
CN110851285B (en) Resource multiplexing method, device and equipment based on GPU virtualization
CN105144120A (en) Storing data from cache lines to main memory based on memory addresses
US20110202918A1 (en) Virtualization apparatus for providing a transactional input/output interface
CN114827165B (en) Method and block link point for grouping multiple transactions
JP2022550447A (en) A customized root process for a group of applications
WO2023160083A1 (en) Method for executing transactions, blockchain, master node, and slave node
CN113204407A (en) Memory over-allocation management method and device
JP2009512948A (en) Method and apparatus for increasing throughput in a storage server
US20220253252A1 (en) Data processing method and apparatus
US20200371827A1 (en) Method, Apparatus, Device and Medium for Processing Data
CN109918381B (en) Method and apparatus for storing data
US8543722B2 (en) Message passing with queues and channels
CN112596669A (en) Data processing method and device based on distributed storage
US11853614B2 (en) Synchronous write method and device, storage system and electronic device
US20210224213A1 (en) Techniques for near data acceleration for a multi-core architecture
CN110543351B (en) Data processing method and computer device
JP2011039790A (en) Virtual machine image transfer device, method and program
CN112041817A (en) Method and node for managing requests for hardware acceleration by means of an accelerator device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40059793

Country of ref document: HK