CN111090503A - High-cost-performance cloud computing service system based on FPGA chip - Google Patents
- Publication number
- CN111090503A (application CN201811248254.XA)
- Authority
- CN
- China
- Prior art keywords
- fpga
- module
- scheduling
- unit
- computing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5072—Grid computing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/544—Buffers; Shared memory; Pipes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/547—Remote procedure calls [RPC]; Web services
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/54—Indexing scheme relating to G06F9/54
- G06F2209/549—Remote execution
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Stored Programmes (AREA)
Abstract
The invention discloses a cost-effective cloud computing service system based on an FPGA chip, belonging to the technical field of computer hardware/software and FPGA chips. The system comprises a cloud storage system, a scheduling program module and an FPGA computing unit. The cloud storage system comprises a pod node unit, through which it is connected to the scheduling program module; the pod node unit and the scheduling program module correspond to each other. The scheduling program module and the FPGA computing unit correspond to each other, and data and command interaction can be carried out between them; the data, parameters and commands from the scheduling program module are received by the PS side. The cloud parallel computing module is implemented with an ARM Linux operating system and cost-effective FPGA chips, so the FPGA cloud cluster scheme is low in cost, high in efficiency and highly configurable; a private socket service is configured in the ARM system, and different requests are served through a customized protocol.
Description
Technical Field
The invention relates to the technical field of computer hardware/software and FPGA chips, and in particular to a cost-effective cloud computing service system based on an FPGA chip.
Background
Existing FPGA-based cloud computing schemes use high-end chips that communicate through a PCIe interface. This approach has two drawbacks: the chips themselves are very expensive, and the PCIe interface additionally requires a high-end PC host, which is also costly. As a result, existing products suffer from high cost, low FPGA chip utilization, inconvenient deployment and inflexible configuration; because of these cost constraints, large-scale deployment is prohibitively expensive, and when a high-end chip is used, one server can serve only a single computing mode.
Based on the above, the invention provides a cost-effective cloud computing service system based on an FPGA chip to solve these problems.
Disclosure of Invention
The invention aims to provide a cost-effective cloud computing service system based on an FPGA chip, so as to solve the problems identified in the background art: existing products are expensive, utilize FPGA chips poorly, are inconvenient to deploy and inflexible to configure, and, because of these cost constraints, are prohibitively expensive to deploy at large scale.
In order to achieve the purpose, the invention provides the following technical scheme: a high-cost-performance cloud computing service system based on an FPGA chip comprises a cloud storage system, a scheduling program module and an FPGA computing unit;
the cloud storage system comprises a pod node unit, the cloud storage system is connected to the scheduling program module through the pod node unit, and the pod node unit and the scheduling program module correspond to each other;
the scheduling program module and the FPGA computing unit are corresponding to each other, and data and command interaction can be carried out between the scheduling program module and the FPGA computing unit;
and the scheduling program module is used for receiving data, parameters and commands by the PS terminal.
Preferably, the data transmitted by the scheduler module comprises FPGA control parameters, the weight parameters and data required by the CNN network, and a run command, transmitted in that order: the FPGA control parameters first, then the weight parameters and data required by the CNN network, and then the run command.
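The fixed transmission order (control parameters, then CNN weights and data, then the run command) can be sketched as a simple framed protocol. The opcodes and the length-prefixed framing below are illustrative assumptions; the patent only states that the protocol is private and customized.

```python
import struct

# Illustrative opcodes for the three phases; the real private protocol is
# not disclosed in the patent.
OP_CTRL, OP_WEIGHTS_DATA, OP_RUN = 0x01, 0x02, 0x03

def frame(opcode: int, payload: bytes) -> bytes:
    """Prefix a payload with a 1-byte opcode and a 4-byte big-endian length."""
    return struct.pack(">BI", opcode, len(payload)) + payload

def build_request(ctrl: bytes, weights_and_data: bytes) -> bytes:
    """Serialize one job in the mandated order: FPGA control parameters,
    then CNN weights/input data, then the (empty-payload) run command."""
    return (frame(OP_CTRL, ctrl)
            + frame(OP_WEIGHTS_DATA, weights_and_data)
            + frame(OP_RUN, b""))
```

Framing each phase separately lets the receiving side consume the control parameters before the bulk weight data arrives.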
Preferably, the scheduling program module comprises a program distribution module and a scheduling module;
the program distribution module is used for receiving request commands from the cloud, cutting the picture and passing the pieces to the scheduling module;
the program distribution module comprises an FPGA parallel-computing selection unit, which is used for selecting whether a request enters the program distribution module;
the scheduling module is used for merging the scheduled, segmented pictures and returning the result data to the user along the original path;
an FPGA (field programmable gate array) computing-unit support-type judgment unit is arranged between the scheduling module and the program distribution module and is used for judging whether a request enters the scheduling module.
Preferably, the scheduling module comprises a routing module used for data transmission of the FPGA computing unit, and the network port of each ARM system included in the FPGA computing unit corresponds to the routing module.
Preferably, there is at least one group of FPGA computing units; each group comprises at least 24 independent FPGA chips, each embedded in an ARM system.
Preferably, when the PS side sends a start command to the PL side, the FPGA chip starts computation using the previously transmitted control parameters.
Compared with the prior art, the invention has the following beneficial effects. The cloud parallel computing module is implemented with an ARM Linux operating system and cost-effective FPGA chips, so the FPGA cloud cluster scheme is low in cost, high in efficiency and highly configurable; a private socket service is configured in the ARM system, and different requests are served through a customized protocol. By using cost-effective FPGA chips together with the micro-services provided by the embedded system, the chips join the cloud computing system directly through a network cable, which solves the cost problem: one FPGA server carries 24 independently operating, cost-effective FPGA chips, at roughly one tenth the cost of a GPU server of equivalent computing power, and different algorithm structures can be configured as needed and run simultaneously. The micro-service provided by the embedded system refers to a socket server running in the embedded Linux system that handles computation requests from clients. When a request involves a large amount of computation, the scheduler distributes the task; for example, in CNN image processing a large picture can be cut into pieces that are sent to different socket servers, each server calls its FPGA to compute after receiving the request, the results are returned to the client, and the client merges the pictures. This scheme greatly reduces hardware cost: large-scale cloud deployment requires only a switch, network cables and cost-effective FPGA chips.
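The picture-cutting dispatch described above can be sketched as follows. A thread pool stands in for the 24 per-chip socket micro-services, and the `compute` callable stands in for one round trip to a socket server; the tile size and worker count mirror the figures quoted in the text, but the code is a sketch under those assumptions, not the patent's implementation.

```python
from concurrent.futures import ThreadPoolExecutor

TILE = 512  # tile side from the patent's 512×512 example

def split(image, tile=TILE):
    """Cut a 2-D image (a list of rows) into coordinate-tagged tiles."""
    tiles = []
    for r in range(0, len(image), tile):
        for c in range(0, len(image[0]), tile):
            tiles.append(((r, c), [row[c:c + tile] for row in image[r:r + tile]]))
    return tiles

def dispatch(image, compute, tile=TILE, workers=24):
    """Send tiles to workers in parallel (stand-ins for the per-chip socket
    micro-services), then merge the processed tiles back into one image."""
    out = [[None] * len(image[0]) for _ in image]
    def one(job):
        (r, c), t = job
        return (r, c), compute(t)  # in the real system: a socket round trip
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for (r, c), res in pool.map(one, split(image, tile)):
            for i, row in enumerate(res):
                out[r + i][c:c + len(row)] = row
    return out
```

With a small tile size and a toy `compute` that increments every pixel, an image round-trips intact through split, parallel processing and merge.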
Meanwhile, the cloud configuration is flexible: an FPGA cluster can be conveniently established and configured with different functions according to user needs; for example, the cluster can be configured with different network structures to run CNNs, or with network structures to run RNNs.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a general framework diagram of cloud computing according to the present invention;
FIG. 2 is a diagram of an FPGA computational unit of the present invention;
FIG. 3 is a block diagram of the present invention calling FPGA calculation modules;
FIG. 4 is a diagram of the internal computing framework of the computing unit of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1-4, the present invention provides a technical solution: a high-cost-performance cloud computing service system based on an FPGA chip comprises a cloud storage system, a scheduling program module and an FPGA computing unit;
the cloud storage system comprises a pod node unit, the cloud storage system is connected to the scheduling program module through the pod node unit, and the pod node unit and the scheduling program module correspond to each other;
the scheduling program module and the FPGA computing unit are corresponding to each other, and data and command interaction can be carried out between the scheduling program module and the FPGA computing unit;
and the scheduling program module is used for receiving data, parameters and commands by the PS terminal.
In a further embodiment, the data transmitted by the scheduler module comprises FPGA control parameters, the weight parameters and data required by the CNN network, and a run command, transmitted in that order: the FPGA control parameters first, then the weight parameters and data required by the CNN network, and then the run command.
In a further embodiment, the scheduler module comprises a program distribution module and a scheduling module;
the program distribution module is used for receiving request commands from the cloud, cutting the picture and passing the pieces to the scheduling module;
the program distribution module comprises an FPGA parallel-computing selection unit, which is used for selecting whether a request enters the program distribution module;
the scheduling module is used for merging the scheduled, segmented pictures and returning the result data to the user along the original path;
an FPGA (field programmable gate array) computing-unit support-type judgment unit is arranged between the scheduling module and the program distribution module and is used for judging whether a request enters the scheduling module.
in a further embodiment, the entry scheduling module includes a routing module, the routing module is configured to transmit wireless data of the FPGA computing unit, and an internet access of an arm system included in the FPGA computing unit corresponds to the routing module;
in a further embodiment, the number of the FPGA computing units is at least one, and each group of the FPGA computing units includes at least 24 independent FPGA chips, and each group of the FPGA chips is embedded in an arm system;
in a further implementation, when the PS terminal sends a start command to the PL terminal, the FPGA chip uses the previously transmitted control parameters to start the calculation;
as shown in the general framework diagram of fig. 1, an FPGA computing unit module is added to an existing k8s system in the cloud, a scheduler is connected to a pod node of k8s, different pod nodes correspond to different schedulers, each scheduler module corresponds to a different FPGA computing unit, and when a user request arrives at the cloud, the flow is as described in 2.1.2, and the scheduler module performs data and command interaction with the FPGA computing unit. The FPGA computing unit in the scheme exists as an independent functional module in a k8s system, and is easy to deploy.
For a single FPGA computing unit, as shown in fig. 2, each unit contains 24 independent FPGA chips, each embedded in an ARM system. The computing unit is equipped with a routing module connected to the network ports of all its ARM systems, which ensures normal communication within the server.
As shown in fig. 3, when a user sends a data request (for example, CNN picture processing) through a terminal, the picture data is sent to the cloud for processing. After receiving the request, the cloud decides whether to enter FPGA cluster computing. If so, it first checks whether the requested CNN algorithm type is supported; if not, it enters an exception-handling process, and if so, it calls the database to load the bit file supporting that type into the corresponding FPGA cluster. The scheduler is entered at the same time; its main task is to judge the picture size and the computation load. If the picture is very large, it is cut into relatively small pictures (e.g., 512×512) that are transmitted in parallel to the embedded micro-services on the FPGA side, and each FPGA writes its result into a specified memory address after computation. The micro-service on the server side returns the data to the scheduler, which merges the divided pictures and returns the merged picture to the user terminal.
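The branching in fig. 3 (enter FPGA cluster computing or not, check algorithm support, load a bit file or take the exception path) can be sketched as below. The registry and bitfile names are invented for illustration; the patent only says a database supplies the bit file for supported types.

```python
# Hypothetical registry of supported CNN algorithm types -> bitstream files.
SUPPORTED_BITFILES = {"resnet": "resnet.bit", "yolo": "yolo.bit"}

def handle_request(algo: str, use_fpga: bool):
    """Mirror fig. 3's branching: skip the cluster if FPGA computing is not
    chosen, route unsupported types to exception handling, otherwise pick
    the bit file to load into the corresponding FPGA cluster."""
    if not use_fpga:
        return ("cpu", None)                            # do not enter the FPGA cluster
    if algo not in SUPPORTED_BITFILES:
        return ("error", "unsupported algorithm type")  # exception-handling path
    return ("fpga", SUPPORTED_BITFILES[algo])
```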
As shown in the internal calculation flow diagram of the ARM-side FPGA chip in fig. 4, the PS side receives data, parameters and commands from the scheduler module and forwards them according to fixed rules dictated by the required function: the FPGA control parameters are transmitted first, then the weight parameters required by the CNN network, and then the data and the run command. When the PS side sends a start command to the PL (FPGA) side, the FPGA chip starts computation using the previously transmitted control parameters; data needed during computation is obtained from memory shared with the PS side. When the computation is complete, the PL notifies the PS to fetch the result, which is likewise obtained through the shared memory.
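A toy model of the PS/PL shared-memory handshake in fig. 4, assuming the interaction reduces to: write parameters and data, raise a start flag, wait for a done flag, read the result. The dictionary stands in for the physical shared memory, `pl_side` runs inline rather than concurrently in fabric, and none of these names come from the patent.

```python
class SharedMemory(dict):
    """A dict standing in for the memory region shared between PS and PL."""

def pl_side(mem):
    """Stand-in for the FPGA fabric: on start, consume weights/data from the
    shared memory, write the result back, and raise the done flag."""
    if mem.get("start"):
        mem["result"] = [w * d for w, d in zip(mem["weights"], mem["data"])]
        mem["done"] = True          # PL notifies PS that the result is ready

def ps_side(mem, ctrl, weights, data):
    """PS flow from fig. 4: control parameters, then weights and data, then
    the start command; finally read the result through the shared memory."""
    mem["ctrl"], mem["weights"], mem["data"] = ctrl, weights, data
    mem["start"] = True             # start command to the PL side
    pl_side(mem)                    # in hardware this runs concurrently
    while not mem.get("done"):      # poll for the PL's completion notice
        pass
    return mem["result"]
```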
In the description herein, references to the description of "one embodiment," "an example," "a specific example" or the like are intended to mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The preferred embodiments of the invention disclosed above are intended to be illustrative only. The preferred embodiments are not intended to be exhaustive or to limit the invention to the precise embodiments disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, to thereby enable others skilled in the art to best utilize the invention. The invention is limited only by the claims and their full scope and equivalents.
Claims (6)
1. A cost-effective cloud computing service system based on an FPGA chip, characterized in that: the system comprises a cloud storage system, a scheduling program module and an FPGA (field programmable gate array) computing unit;
the cloud storage system comprises a pod node unit, the cloud storage system is connected to the scheduling program module through the pod node unit, and the pod node unit and the scheduling program module correspond to each other;
the scheduling program module and the FPGA computing unit are corresponding to each other, and data and command interaction can be carried out between the scheduling program module and the FPGA computing unit;
and the scheduling program module is used for receiving data, parameters and commands by the PS terminal.
2. The FPGA chip-based cost-effective cloud computing service system of claim 1, wherein: the data transmitted by the scheduling program module comprises FPGA control parameters, the weight parameters and data required by the CNN network, and a run command, transmitted in that order: the FPGA control parameters first, then the weight parameters and data required by the CNN network, and then the run command.
3. The FPGA chip-based cost-effective cloud computing service system of claim 1, wherein: the scheduling program module comprises a program distribution module and a scheduling module;
the program distribution module is used for receiving request commands from the cloud, cutting the picture and passing the pieces to the scheduling module;
the program distribution module comprises an FPGA parallel-computing selection unit, which is used for selecting whether a request enters the program distribution module;
the scheduling module is used for merging the scheduled, segmented pictures and returning the result data to the user along the original path;
an FPGA (field programmable gate array) computing-unit support-type judgment unit is arranged between the scheduling module and the program distribution module and is used for judging whether a request enters the scheduling module.
4. The FPGA chip-based cost-effective cloud computing service system of claim 3, wherein: the scheduling module comprises a routing module used for data transmission of the FPGA computing unit, and the network port of each arm system included in the FPGA computing unit corresponds to the routing module.
5. The FPGA chip-based cost-effective cloud computing service system of claim 4, wherein: there is at least one group of FPGA computing units, each group comprises at least 24 independent FPGA chips, and each FPGA chip is embedded in an arm system.
6. The FPGA chip-based cost-effective cloud computing service system according to any one of claims 2 and 5, characterized in that: when the PS side sends a start command to the PL side, the FPGA chip starts computation using the previously transmitted control parameters.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811248254.XA CN111090503B (en) | 2018-10-24 | 2018-10-24 | High-cost-performance cloud computing service system based on FPGA chip |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811248254.XA CN111090503B (en) | 2018-10-24 | 2018-10-24 | High-cost-performance cloud computing service system based on FPGA chip |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111090503A true CN111090503A (en) | 2020-05-01 |
CN111090503B CN111090503B (en) | 2023-07-21 |
Family
ID=70392205
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811248254.XA Active CN111090503B (en) | 2018-10-24 | 2018-10-24 | High-cost-performance cloud computing service system based on FPGA chip |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111090503B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106228238A (en) * | 2016-07-27 | 2016-12-14 | 中国科学技术大学苏州研究院 | The method and system of degree of depth learning algorithm is accelerated on field programmable gate array platform |
CN108228354A (en) * | 2017-12-29 | 2018-06-29 | 杭州朗和科技有限公司 | Dispatching method, system, computer equipment and medium |
CN108304250A (en) * | 2018-03-05 | 2018-07-20 | 北京百度网讯科技有限公司 | Method and apparatus for the node for determining operation machine learning task |
US10108465B1 (en) * | 2016-06-23 | 2018-10-23 | EMC IP Holding Company LLC | Automated cloud service evaluation and workload migration utilizing standardized virtual service units |
-
2018
- 2018-10-24 CN CN201811248254.XA patent/CN111090503B/en active Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10108465B1 (en) * | 2016-06-23 | 2018-10-23 | EMC IP Holding Company LLC | Automated cloud service evaluation and workload migration utilizing standardized virtual service units |
CN106228238A (en) * | 2016-07-27 | 2016-12-14 | 中国科学技术大学苏州研究院 | The method and system of degree of depth learning algorithm is accelerated on field programmable gate array platform |
CN108228354A (en) * | 2017-12-29 | 2018-06-29 | 杭州朗和科技有限公司 | Dispatching method, system, computer equipment and medium |
CN108304250A (en) * | 2018-03-05 | 2018-07-20 | 北京百度网讯科技有限公司 | Method and apparatus for the node for determining operation machine learning task |
Non-Patent Citations (3)
Title |
---|
KAIYUAN GUO: "Angel-Eye: A Complete Design Flow for Mapping CNN Onto Embedded FPGA" * |
ZHU Jinsheng (朱金升): "Research on the Application of Specific-Target Recognition Technology for UAV Aerial Images Based on FPGA" *
SHU An (树岸); PENG Xin (彭鑫); ZHAO Wenyun (赵文耘): "An Adaptive Management Method for Cloud Computing Resources Based on Container Technology" *
Also Published As
Publication number | Publication date |
---|---|
CN111090503B (en) | 2023-07-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107018175B (en) | Scheduling method and device of mobile cloud computing platform | |
CN110602156A (en) | Load balancing scheduling method and device | |
CN100389392C (en) | Method for realizing load uniform in clustering system, system and storage controller | |
WO2016065643A1 (en) | Network card configuration method and resource management center | |
JP2015537307A (en) | Component-oriented hybrid cloud operating system architecture and communication method thereof | |
WO2006019512B1 (en) | Apparatus and method for supporting connection establishment in an offload of network protocol processing | |
CN103516744A (en) | A data processing method, an application server and an application server cluster | |
US20230275832A1 (en) | Networking processor and network device | |
CN113422842B (en) | Distributed power utilization information data acquisition system considering network load | |
CN101442493A (en) | Method for distributing IP message, cluster system and load equalizer | |
CN103176780A (en) | Binding system and method of multiple network interfaces | |
CN105472291A (en) | Digital video recorder with multiprocessor cluster and realization method of digital video recorder | |
CN113014611B (en) | Load balancing method and related equipment | |
CN105144109A (en) | Distributed data center technology | |
CN102571568A (en) | Method and device for processing task | |
JP2016531372A (en) | Memory module access method and apparatus | |
CN117312229B (en) | Data transmission device, data processing equipment, system, method and medium | |
CN117041147B (en) | Intelligent network card equipment, host equipment, method and system | |
CN108259605B (en) | Data calling system and method based on multiple data centers | |
CN109525443B (en) | processing method and device for distributed pre-acquisition communication link and computer equipment | |
CN103299298A (en) | Service processing method and system | |
CN113259408A (en) | Data transmission method and system | |
CN114816651A (en) | Communication method, device and system | |
CN111090503A (en) | High-cost-performance cloud computing service system based on FPGA chip | |
CN109831467B (en) | Data transmission method, equipment and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |