CN111090503B - High-cost-performance cloud computing service system based on FPGA chip - Google Patents
High-cost-performance cloud computing service system based on FPGA chip
- Publication number
- CN111090503B (application CN201811248254.XA)
- Authority
- CN
- China
- Prior art keywords
- fpga
- module
- entering
- data
- cloud
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5072—Grid computing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/544—Buffers; Shared memory; Pipes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/547—Remote procedure calls [RPC]; Web services
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/54—Indexing scheme relating to G06F9/54
- G06F2209/549—Remote execution
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Stored Programmes (AREA)
Abstract
The invention discloses a cost-effective cloud computing service system based on FPGA chips, which belongs to the technical field of computer hardware/software and FPGA chips and comprises a cloud storage system, a scheduler module and an FPGA computing unit. The cloud storage system comprises pod node units, through which it connects to the scheduler modules, each pod node unit corresponding to one scheduler module; each scheduler module corresponds to an FPGA computing unit, with which it exchanges data and commands; the scheduler module is used to receive data, parameters and commands at the PS end. The invention uses an ARM Linux operating system and cost-effective FPGA chips to realize a cloud parallel computing module; it is a low-cost, efficient and highly configurable FPGA cloud cluster scheme, in which a private socket service is configured in the ARM system and a customized protocol is used to serve different requests.
Description
Technical Field
The invention relates to the technical field of computer hardware/software and FPGA chips, and in particular to a cost-effective cloud computing service system based on FPGA chips.
Background
Existing FPGA-based cloud computing schemes use high-end FPGA chips that communicate over a PCIe interface. This approach has two drawbacks: the chips themselves are very expensive, and the PCIe interface additionally requires a highly configured host PC, which is also costly. As a result, existing products are expensive, FPGA chip utilization is low, deployment is inconvenient and configuration is inflexible. Because of these cost constraints, large-scale deployment of existing products is prohibitively expensive, and with a high-end chip a single server can run only one computation mode.
To address these problems, the invention provides a cost-effective cloud computing service system based on FPGA chips.
Disclosure of Invention
The invention aims to provide a cost-effective cloud computing service system based on FPGA chips, to solve the problems identified in the background art: existing products are expensive, FPGA chip utilization is low, deployment is inconvenient, configuration is inflexible, and cost constraints make large-scale deployment very expensive.
To achieve the above purpose, the present invention provides the following technical solution: a cost-effective cloud computing service system based on FPGA chips, comprising a cloud storage system, a scheduler module and an FPGA computing unit;
the cloud storage system comprises pod node units, through which it connects to the scheduler modules, each pod node unit corresponding to one scheduler module;
each scheduler module corresponds to an FPGA computing unit, and the scheduler module and the FPGA computing unit exchange data and commands;
the scheduler module is used to receive data, parameters and commands at the PS end.
Preferably, the data transmitted by the scheduler module include FPGA control parameters, weight parameters required by the CNN network, and data and operation commands, transmitted in that order: FPGA control parameters first, then the CNN weight parameters, then the data and operation commands.
Preferably, the scheduler module comprises a program distribution module and a scheduling module;
the program distribution module is used to receive request commands from the cloud and to transmit the cut picture tiles to the scheduling module;
the program distribution module comprises an FPGA parallel computation selection unit, which selects whether the program distribution module needs to be entered;
the scheduling module is used to merge the scheduled, segmented pictures and return the result data to the user along the original path;
an FPGA-computing-unit supported-type judging unit is arranged between the scheduling module and the program distribution module, and is used to judge whether to enter the scheduling module.
Preferably, the scheduling module comprises a routing module used for the wireless data transmission of the FPGA computing unit, and a network port of the ARM system included in the FPGA computing unit corresponds to the routing module.
Preferably, there is at least one FPGA computing unit; each FPGA computing unit comprises at least 24 independent FPGA chips, and each FPGA chip is embedded in an ARM system.
Preferably, when the PS end sends a start command to the PL end, the FPGA chip starts the calculation using the previously transmitted control parameters.
Compared with the prior art, the invention has the following beneficial effects. The invention uses an ARM Linux operating system and cost-effective FPGA chips to realize a cloud parallel computing module; it is a low-cost, efficient and highly configurable FPGA cloud cluster scheme, in which a private socket service is configured in the ARM system and a customized protocol is used to serve different requests. The cost-effective FPGA chips, together with the micro-services provided by the embedded system, are attached to the cloud computing system directly through network cables, which solves the problem of high cost: one FPGA server carries 24 independently running cost-effective FPGA chips, at roughly one tenth of the cost of a GPU server of the same computing power, and the chips can be configured into different algorithm structures as required and run simultaneously. The micro-service provided by the embedded system mainly refers to a socket server running inside the embedded Linux system that processes calculation requests from clients. When the requested calculation is large, the dispatcher performs task allocation; for example, when CNN-based image processing is requested, the picture can be cut up and the resulting tiles sent to different socket servers. Each server calls its FPGA to compute after receiving the request and returns the calculation result to the client, which then merges the tiles back into a complete picture. This scheme greatly reduces hardware cost: only a switch, network cables and cost-effective FPGA chips are needed to deploy it in the cloud.
At the same time, configuration in the cloud is flexible: an FPGA cluster can be established conveniently, and the cluster can be configured with different functions according to users' needs, for example running different CNN (convolutional neural network) structures in the cluster, or configuring it as an RNN network structure.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a diagram of a cloud computing overall framework of the present invention;
FIG. 2 is a diagram of an FPGA computational cell of the present invention;
FIG. 3 is a diagram of a framework of a call FPGA computing module of the present invention;
FIG. 4 is a diagram of an internal computing framework of a computing unit of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1-4, the present invention provides a technical solution: a cost-effective cloud computing service system based on FPGA chips, comprising a cloud storage system, a scheduler module and an FPGA computing unit;
the cloud storage system comprises pod node units, through which it connects to the scheduler modules, each pod node unit corresponding to one scheduler module;
each scheduler module corresponds to an FPGA computing unit, and the scheduler module and the FPGA computing unit exchange data and commands;
the scheduler module is used to receive data, parameters and commands at the PS end.
In a further embodiment, the data transmitted by the scheduler module include FPGA control parameters, weight parameters required by the CNN network, and data and operation commands, transmitted in that order: FPGA control parameters first, then the CNN weight parameters, then the data and operation commands;
in a further embodiment, the scheduler module includes a program distribution module and a scheduling module;
the program distribution module is used to receive request commands from the cloud and to transmit the cut picture tiles to the scheduling module;
the program distribution module comprises an FPGA parallel computation selection unit, which selects whether the program distribution module needs to be entered;
the scheduling module is used to merge the scheduled, segmented pictures and return the result data to the user along the original path;
an FPGA-computing-unit supported-type judging unit is arranged between the scheduling module and the program distribution module, and is used to judge whether to enter the scheduling module;
in a further embodiment, the scheduling module includes a routing module used for the wireless data transmission of the FPGA computing unit, and a network port of the ARM system included in the FPGA computing unit corresponds to the routing module;
in a further embodiment, there is at least one FPGA computing unit, each FPGA computing unit includes at least 24 independent FPGA chips, and each FPGA chip is embedded in an ARM system;
in a further embodiment, when the PS end sends a start command to the PL end, the FPGA chip starts the calculation using the previously transmitted control parameters.
as shown in the overall frame diagram of fig. 1, an FPGA computing unit module is added in the existing k8s system in the cloud, a scheduler is accessed to the pod node of k8s, different pod nodes correspond to different schedulers, each scheduler module corresponds to different FPGA computing units, and when a user request arrives at the cloud, the flow is as described in 2.1.2, and the scheduler module performs data and command interaction with the FPGA computing units. The FPGA computing unit in the scheme exists as an independent functional module in the k8s system, and is easy to deploy.
For a single FPGA computing unit, each unit contains 24 individual FPGA chips, each embedded in an ARM system, as shown in fig. 2. The computing unit is provided with a routing module connected to the network ports of all its ARM systems, thereby ensuring normal communication within the server.
As shown in fig. 3, when a user sends a data request (for example, CNN picture processing) through a terminal, the picture data are sent to the cloud for processing. After receiving the request, the cloud decides whether to enter FPGA cluster calculation. If so, it first judges whether the requested type of CNN algorithm is already supported; if not, it enters an exception handling flow, and if so, it calls the database to load the bit file supporting that type into the corresponding FPGA cluster. The request then enters the dispatcher, whose main task is to judge the picture size and the calculation task: if the picture is large, it is cut into relatively small tiles (such as 512 x 512), which are transmitted in parallel to the embedded micro-services on the FPGA side. After the FPGA finishes computing, the data are written to a designated memory address. The server-side micro-service returns the data to the dispatcher, which merges the divided pictures and returns the result to the user terminal along the original path.
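The dispatcher's split/parallel-send/merge flow described above can be sketched as follows. This is a simplified stand-in, assuming a plain 2D image represented as a list of rows and a `compute` callable in place of the real socket servers; all function names are illustrative, and the 512 x 512 tile size is the example given in the description.

```python
from concurrent.futures import ThreadPoolExecutor

TILE = 512  # tile edge from the description (e.g. 512 x 512)

def split(image, tile=TILE):
    """Cut a 2D image (list of rows) into (row, col, tile) triples."""
    h, w = len(image), len(image[0])
    tiles = []
    for r in range(0, h, tile):
        for c in range(0, w, tile):
            tiles.append((r, c, [row[c:c + tile] for row in image[r:r + tile]]))
    return tiles

def merge(tiles, h, w):
    """Reassemble processed tiles into an h x w image."""
    out = [[0] * w for _ in range(h)]
    for r, c, t in tiles:
        for i, row in enumerate(t):
            out[r + i][c:c + len(row)] = row
    return out

def dispatch(image, compute, tile=TILE, workers=24):
    """Send each tile to a worker (stand-in for one socket server)
    in parallel, then merge the results at their original positions."""
    h, w = len(image), len(image[0])
    parts = split(image, tile)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        done = list(pool.map(lambda p: (p[0], p[1], compute(p[2])), parts))
    return merge(done, h, w)
```

In the real system, `compute` would be the framed socket round trip to one embedded micro-service rather than a local callable.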
As shown in the internal calculation flow chart of the ARM-end FPGA chip in fig. 4, the PS end receives data, parameters and commands from the scheduler module. According to the actual functional requirements, the PS end transmits the FPGA control parameters first, then the weight parameters required by the CNN network, and then the data and operation commands. When the PS end sends the start command to the PL (FPGA) end, the FPGA chip starts the calculation using the previously transmitted control parameters; the data needed during the calculation are obtained from memory shared with the PS end. When the operation is finished, the PL notifies the PS end to fetch the result, which is likewise obtained through the shared memory.
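The PS-to-PL ordering just described (control parameters, then weights, then data, then the start command) can be modelled as a small state machine. This is only an illustrative sketch of the ordering constraint; the class name, the `Phase` enum, and the dict standing in for PS/PL shared memory are all assumptions.

```python
from enum import Enum, auto

class Phase(Enum):
    CONTROL = auto()
    WEIGHTS = auto()
    DATA = auto()

class PsToPlChannel:
    """Models the PS-side transfer order: control parameters first,
    then CNN weights, then data, and only then a start command."""
    ORDER = [Phase.CONTROL, Phase.WEIGHTS, Phase.DATA]

    def __init__(self):
        self.sent = []        # phases completed, in order
        self.shared_mem = {}  # stand-in for PS/PL shared memory

    def send(self, phase: Phase, payload) -> None:
        expected = self.ORDER[len(self.sent)]
        if phase is not expected:
            raise RuntimeError(f"expected {expected}, got {phase}")
        self.shared_mem[phase] = payload
        self.sent.append(phase)

    def start(self) -> dict:
        # The start command is only legal once all three phases have
        # been transferred; the PL side then computes using the
        # previously transmitted control parameters.
        if self.sent != self.ORDER:
            raise RuntimeError("start before transfers complete")
        return {"status": "running", "using": self.shared_mem[Phase.CONTROL]}
```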
In the description of the present specification, the descriptions of the terms "one embodiment," "example," "specific example," and the like, mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The preferred embodiments of the invention disclosed above are intended only to assist in the explanation of the invention. They are not intended to be exhaustive or to limit the invention to the precise forms disclosed; obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and its practical application, thereby enabling others skilled in the art to understand and utilize the invention. The invention is limited only by the claims and their full scope and equivalents.
Claims (5)
1. A cost-effective cloud computing service system based on FPGA chips, characterized in that: the system comprises a cloud storage system, a scheduler module and an FPGA computing unit;
the cloud storage system comprises pod node units, through which it connects to the scheduler modules, each pod node unit corresponding to one scheduler module;
each scheduler module corresponds to an FPGA computing unit, and the scheduler module and the FPGA computing unit exchange data and commands;
the scheduler module is used to receive data, parameters and commands at the PS end;
the scheduler module comprises a program distribution module and a scheduling module;
the program distribution module is used to receive request commands from the cloud and to transmit the cut picture tiles to the scheduling module;
the program distribution module comprises an FPGA parallel computation selection unit, which selects whether the program distribution module needs to be entered;
the scheduling module is used to merge the scheduled, segmented pictures and return the result data to the user along the original path;
an FPGA-computing-unit supported-type judging unit is arranged between the scheduling module and the program distribution module, and is used to judge whether to enter the scheduling module.
2. The cost-effective cloud computing service system based on FPGA chips of claim 1, characterized in that: the data transmitted by the scheduler module comprise FPGA control parameters, weight parameters required by the CNN network, and data and operation commands, transmitted in that order: FPGA control parameters first, then the CNN weight parameters, then the data and operation commands.
3. The cost-effective cloud computing service system based on FPGA chips of claim 1, characterized in that: the scheduling module comprises a routing module used for the wireless data transmission of the FPGA computing unit, and a network port of the ARM system included in the FPGA computing unit corresponds to the routing module.
4. The cost-effective cloud computing service system based on FPGA chips of claim 3, characterized in that: there is at least one FPGA computing unit, each FPGA computing unit comprises at least 24 independent FPGA chips, and each FPGA chip is embedded in an ARM system.
5. The cost-effective cloud computing service system based on FPGA chips as defined in any one of claims 1 and 4, characterized in that: when the PS end sends a start command to the PL end, the FPGA chip starts the calculation using the previously transmitted control parameters.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811248254.XA CN111090503B (en) | 2018-10-24 | 2018-10-24 | High-cost-performance cloud computing service system based on FPGA chip |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811248254.XA CN111090503B (en) | 2018-10-24 | 2018-10-24 | High-cost-performance cloud computing service system based on FPGA chip |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111090503A CN111090503A (en) | 2020-05-01 |
CN111090503B true CN111090503B (en) | 2023-07-21 |
Family
ID=70392205
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811248254.XA Active CN111090503B (en) | 2018-10-24 | 2018-10-24 | High-cost-performance cloud computing service system based on FPGA chip |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111090503B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106228238A (en) * | 2016-07-27 | 2016-12-14 | 中国科学技术大学苏州研究院 | The method and system of degree of depth learning algorithm is accelerated on field programmable gate array platform |
CN108228354A (en) * | 2017-12-29 | 2018-06-29 | 杭州朗和科技有限公司 | Dispatching method, system, computer equipment and medium |
CN108304250A (en) * | 2018-03-05 | 2018-07-20 | 北京百度网讯科技有限公司 | Method and apparatus for the node for determining operation machine learning task |
US10108465B1 (en) * | 2016-06-23 | 2018-10-23 | EMC IP Holding Company LLC | Automated cloud service evaluation and workload migration utilizing standardized virtual service units |
- 2018-10-24: application CN201811248254.XA filed; granted as CN111090503B (active)
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10108465B1 (en) * | 2016-06-23 | 2018-10-23 | EMC IP Holding Company LLC | Automated cloud service evaluation and workload migration utilizing standardized virtual service units |
CN106228238A (en) * | 2016-07-27 | 2016-12-14 | 中国科学技术大学苏州研究院 | The method and system of degree of depth learning algorithm is accelerated on field programmable gate array platform |
CN108228354A (en) * | 2017-12-29 | 2018-06-29 | 杭州朗和科技有限公司 | Dispatching method, system, computer equipment and medium |
CN108304250A (en) * | 2018-03-05 | 2018-07-20 | 北京百度网讯科技有限公司 | Method and apparatus for the node for determining operation machine learning task |
Non-Patent Citations (3)
Title |
---|
Kaiyuan Guo et al. Angel-Eye: A Complete Design Flow for Mapping CNN Onto Embedded FPGA. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2017. *
Zhu Jinsheng. Applied research on FPGA-based recognition of specific targets in UAV aerial images. China Master's Theses Full-text Database, Information Science and Technology, 2018, No. 01. *
Shu An; Peng Xin; Zhao Wenyun. Adaptive management of cloud computing resources based on container technology. Computer Science, 2017, No. 07. *
Also Published As
Publication number | Publication date |
---|---|
CN111090503A (en) | 2020-05-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107087019B (en) | Task scheduling method and device based on end cloud cooperative computing architecture | |
CN102447624B (en) | Load balancing method in server cluster, as well as node server and cluster | |
CN110602156A (en) | Load balancing scheduling method and device | |
US10609125B2 (en) | Method and system for transmitting communication data | |
CN112631788B (en) | Data transmission method and data transmission server | |
CN101442493A (en) | Method for distributing IP message, cluster system and load equalizer | |
CN110830574B (en) | Method for realizing intranet load balance based on docker container | |
CN105472291A (en) | Digital video recorder with multiprocessor cluster and realization method of digital video recorder | |
WO2021120633A1 (en) | Load balancing method and related device | |
CN103441937A (en) | Sending method and receiving method of multicast data | |
WO2017050036A1 (en) | Resource allocation information transmission and data distribution method and device | |
US11736403B2 (en) | Systems and methods for enhanced autonegotiation | |
EP3631639B1 (en) | Communications for field programmable gate array device | |
CN109245926A (en) | Intelligent network adapter, intelligent network adapter system and control method | |
CN114710571B (en) | Data packet processing system | |
WO2013189069A1 (en) | Load sharing method and device, and single board | |
CN104104736A (en) | Cloud server and use method thereof | |
CN112422251B (en) | Data transmission method and device, terminal and storage medium | |
CN111090503B (en) | High-cost-performance cloud computing service system based on FPGA chip | |
CN109525443B (en) | processing method and device for distributed pre-acquisition communication link and computer equipment | |
CN111147603A (en) | Method and device for networking reasoning service | |
CN109831467B (en) | Data transmission method, equipment and system | |
CN111245878A (en) | Method for computing and offloading communication network based on hybrid cloud computing and fog computing | |
CN110166368B (en) | Cloud storage network bandwidth control system and method | |
US11303524B2 (en) | Network bandwidth configuration |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |