CN111090503A - High-cost-performance cloud computing service system based on FPGA chip - Google Patents

Info

Publication number
CN111090503A
CN111090503A (application CN201811248254.XA)
Authority
CN
China
Prior art keywords
fpga
module
scheduling
unit
computing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811248254.XA
Other languages
Chinese (zh)
Other versions
CN111090503B (en)
Inventor
张强
杨付收
赵小吾
龙瞻
田志明
荣义然
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Xuehu Information Technology Co Ltd
Original Assignee
Shanghai Xuehu Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Xuehu Information Technology Co Ltd filed Critical Shanghai Xuehu Information Technology Co Ltd
Priority to CN201811248254.XA
Publication of CN111090503A
Application granted
Publication of CN111090503B
Active legal status
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5072Grid computing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/544Buffers; Shared memory; Pipes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/547Remote procedure calls [RPC]; Web services
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/54Indexing scheme relating to G06F9/54
    • G06F2209/549Remote execution
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a high-cost-performance cloud computing service system based on FPGA chips, belonging to the technical field of computer hardware/software FPGA chips. The system comprises a cloud storage system, a scheduling program module, and FPGA computing units. The cloud storage system comprises pod node units, through which it connects to the scheduling program modules, each pod node unit corresponding to a scheduling program module. Each scheduling program module corresponds to an FPGA computing unit and can exchange data and commands with it; the PS side of the FPGA computing unit receives data, parameters, and commands from the scheduling program module. The cloud's parallel computing module is realized with an arm Linux operating system and high-cost-performance FPGA chips, giving an FPGA cloud cluster scheme that is low-cost, efficient, and highly configurable; a private socket service is configured in the arm system, and different requests are handled through a customized protocol.

Description

High-cost-performance cloud computing service system based on FPGA chip
Technical Field
The invention relates to the technical field of computer hardware and software FPGA chips, in particular to a high-cost-performance cloud computing service system based on an FPGA chip.
Background
Existing FPGA-based cloud computing schemes use high-end FPGA chips communicating over a PCIe interface. This has two drawbacks: the chips themselves are very expensive, and the PCIe interface additionally requires a high-end PC host, which is also costly. Consequently, existing products are expensive and utilize their FPGA chips poorly; cost constraints make large-scale deployment prohibitively expensive, and with a high-end chip one server can serve only a single computing mode. What is needed is a scheme that is low-cost, convenient to deploy, and flexible to configure.
In view of this, the invention designs a high-cost-performance cloud computing service system based on FPGA chips to solve the above problems.
Disclosure of Invention
The invention aims to provide a high-cost-performance cloud computing service system based on FPGA chips, in order to solve the problems identified in the background: existing products are expensive and utilize FPGA chips poorly, and their cost makes large-scale deployment very expensive, while deployment should be convenient and configuration flexible.
In order to achieve the purpose, the invention provides the following technical scheme: a high-cost-performance cloud computing service system based on an FPGA chip comprises a cloud storage system, a scheduling program module and an FPGA computing unit;
the cloud storage system comprises a pod node unit, the cloud storage system is connected to the scheduling program module through the pod node unit, and the pod node unit and the scheduling program module correspond to each other;
the scheduling program module and the FPGA computing unit are corresponding to each other, and data and command interaction can be carried out between the scheduling program module and the FPGA computing unit;
and the scheduling program module is used for receiving data, parameters and commands by the PS terminal.
Preferably, the data transmitted by the scheduling program module comprises FPGA control parameters, the weight parameters and data required by the CNN network, and operation commands, transmitted in that order: first the FPGA control parameters, then the weight parameters and data required by the CNN network, and finally the operation commands.
Preferably, the scheduling program module comprises an entry distribution module and an entry scheduling module;
The entry distribution module is used for receiving request commands from the cloud, cutting the picture into pieces, and passing them to the entry scheduling module;
The entry distribution module comprises an FPGA parallel-computing selection unit, which selects whether a request enters the entry distribution module;
The entry scheduling module is used for merging the scheduled, segmented pictures and returning the result data to the user along the original path;
An FPGA (field programmable gate array) computing-unit support-type judgment unit is arranged between the entry distribution module and the entry scheduling module and judges whether a request enters the entry scheduling module.
Preferably, the entry scheduling module includes a routing module that handles data transmission for the FPGA computing units, and the network port of each arm system contained in an FPGA computing unit corresponds to the routing module.
Preferably, there is at least one group of FPGA computing units; each group comprises at least 24 independent FPGA chips, and each FPGA chip is embedded in an arm system.
Preferably, when the PS side sends a start command to the PL side, the FPGA chip starts computation using the previously transmitted control parameters.
Compared with the prior art, the invention has the following beneficial effects. The cloud's parallel computing module is realized with an arm Linux operating system and high-cost-performance FPGA chips; the FPGA cloud cluster scheme is low-cost, efficient, and highly configurable, with a private socket service configured in the arm system and different requests handled through a customized protocol. By using high-cost-performance FPGA chips and the microservices provided by the embedded system, nodes join the cloud computing system directly over a network cable, which solves the cost problem: one FPGA server holds 24 independently operating, high-cost-performance FPGA chips at roughly 1/10 the cost of a GPU server of equal computing power, and different algorithm structures can be configured as needed and run concurrently. The microservice provided by the embedded system is a socket server running in the embedded Linux system that handles computation requests from clients. When a request involves a large amount of computation, the scheduling program distributes the task; for example, in CNN image processing the picture can be cut into pieces, the pieces sent to different socket servers, each server invoking its FPGA for computation upon receiving the request and returning the result to the client, which then merges the pieces. This scheme greatly reduces hardware cost: large-scale cloud deployment requires only switches, network cables, and high-cost-performance FPGA chips.
Meanwhile, configuration at the cloud end is flexible: an FPGA cluster can be conveniently established and configured with different functions according to user needs; for example, the cluster can be configured with different network structures to run CNNs, or with network structures to run RNNs.
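As a rough illustration of the embedded microservice described above, the following Python sketch models a socket server that accepts one framed tile, runs a computation, and returns the result. It is a sketch under assumptions, not the patent's implementation: the 4-byte length-prefixed framing, the `fpga_compute` stub, and all names are hypothetical stand-ins for the customized protocol and the real PL-side computation.

```python
import socket
import socketserver
import struct
import threading

def fpga_compute(tile: bytes) -> bytes:
    """Stub for the PL-side computation (hypothetical; the real system
    would hand the tile to the FPGA through shared memory)."""
    return tile[::-1]  # placeholder transform so the round trip is visible

def _read_exact(f, n: int) -> bytes:
    """Read exactly n bytes from a socket or file-like object."""
    buf = b""
    while len(buf) < n:
        chunk = f.recv(n - len(buf)) if hasattr(f, "recv") else f.read(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed early")
        buf += chunk
    return buf

class TileHandler(socketserver.StreamRequestHandler):
    """One request = 4-byte big-endian length prefix + tile payload."""
    def handle(self):
        (length,) = struct.unpack(">I", _read_exact(self.rfile, 4))
        tile = _read_exact(self.rfile, length)
        result = fpga_compute(tile)
        self.wfile.write(struct.pack(">I", len(result)) + result)

def request_tile(host: str, port: int, tile: bytes) -> bytes:
    """Client side: send one framed tile and receive the computed result."""
    with socket.create_connection((host, port)) as s:
        s.sendall(struct.pack(">I", len(tile)) + tile)
        (length,) = struct.unpack(">I", _read_exact(s, 4))
        return _read_exact(s, length)

if __name__ == "__main__":
    # One server here; the patent's scheme would run one per arm system.
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), TileHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    print(request_tile("127.0.0.1", server.server_address[1], b"abcdef"))
    server.shutdown()
```

A threading server is used because each tile request is independent, mirroring the independently operating FPGA chips behind each socket service.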
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the invention; those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a general framework diagram of cloud computing according to the present invention;
FIG. 2 is a diagram of an FPGA computational unit of the present invention;
FIG. 3 is a block diagram of the present invention calling FPGA calculation modules;
FIG. 4 is a diagram of the internal computing framework of the computing unit of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1-4, the present invention provides a technical solution: a high-cost-performance cloud computing service system based on an FPGA chip comprises a cloud storage system, a scheduling program module and an FPGA computing unit;
the cloud storage system comprises a pod node unit, the cloud storage system is connected to the scheduling program module through the pod node unit, and the pod node unit and the scheduling program module correspond to each other;
the scheduling program module and the FPGA computing unit correspond to each other, and data and commands can be exchanged between them;
and the PS side of the FPGA computing unit receives data, parameters, and commands from the scheduling program module.
In a further embodiment, the data transmitted by the scheduler module comprises FPGA control parameters, the weight parameters and data required by the CNN network, and operation commands, transmitted in that order: first the FPGA control parameters, then the weight parameters and data required by the CNN network, and finally the operation commands;
in a further embodiment, the scheduler module comprises an entry program distribution module and an entry scheduling module;
the program entering distribution module is used for receiving a request command from a cloud end and transmitting the picture cutting pair entering scheduling module;
the program entering distribution module comprises an FPGA parallel computing selection unit, and the FPGA parallel computing selection unit is used for selecting whether to enter the program entering distribution module;
the scheduling module is used for merging the scheduled and segmented pictures and returning the result data to the user in an original way;
the system comprises an entry scheduling module, an entry program distribution module and an FPGA (field programmable gate array) calculation unit, wherein an FPGA calculation unit support type judgment unit is arranged between the entry scheduling module and the entry program distribution module and used for judging whether to enter the entry scheduling module or not;
in a further embodiment, the entry scheduling module includes a routing module, the routing module is configured to transmit wireless data of the FPGA computing unit, and an internet access of an arm system included in the FPGA computing unit corresponds to the routing module;
in a further embodiment, the number of the FPGA computing units is at least one, and each group of the FPGA computing units includes at least 24 independent FPGA chips, and each group of the FPGA chips is embedded in an arm system;
in a further implementation, when the PS terminal sends a start command to the PL terminal, the FPGA chip uses the previously transmitted control parameters to start the calculation;
as shown in the general framework diagram of fig. 1, an FPGA computing unit module is added to an existing k8s system in the cloud, a scheduler is connected to a pod node of k8s, different pod nodes correspond to different schedulers, each scheduler module corresponds to a different FPGA computing unit, and when a user request arrives at the cloud, the flow is as described in 2.1.2, and the scheduler module performs data and command interaction with the FPGA computing unit. The FPGA computing unit in the scheme exists as an independent functional module in a k8s system, and is easy to deploy.
For a single FPGA computing unit, as shown in fig. 2, each unit contains 24 independent FPGA chips, each embedded in an arm system. The computing unit is provided with a routing module connected to the network ports of all of its arm systems, ensuring normal communication with the server.
As shown in fig. 3, when a user sends a data request (for example, CNN picture processing) from a terminal, the picture data is sent to the cloud for processing. After receiving the request, the cloud decides whether it enters FPGA cluster computing. If so, it first checks whether the requested CNN algorithm type is supported; if not, it enters exception handling, and if so, it calls the database to load a bit file supporting that type into the corresponding FPGA cluster. The request then enters the scheduling program, whose main task is to assess the picture size and the computation: if the picture is very large, it is cut into smaller tiles (e.g., 512 × 512), which are transmitted in parallel to the embedded microservices on the FPGA side; each FPGA writes its result to a specified memory address after computing. The microservice on the server side returns the data to the scheduler, which merges the tiles and returns the result to the user terminal.
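The cut-dispatch-merge flow just described can be sketched as follows. This is an illustrative model under assumptions, not the patented scheduler: `cut`, `merge`, and `schedule` are hypothetical names, the thread pool stands in for the 24 parallel socket servers, and the `compute` callback stands in for a remote FPGA call.

```python
from concurrent.futures import ThreadPoolExecutor

TILE = 512  # tile edge length mentioned in the description

def cut(picture, tile=TILE):
    """Cut a 2-D picture (list of rows) into (position, tile) pairs."""
    tiles = []
    for y in range(0, len(picture), tile):
        for x in range(0, len(picture[0]), tile):
            block = [row[x:x + tile] for row in picture[y:y + tile]]
            tiles.append(((y, x), block))
    return tiles

def merge(tiles, height, width):
    """Reassemble computed tiles into a height x width picture."""
    out = [[None] * width for _ in range(height)]
    for (y, x), block in tiles:
        for dy, row in enumerate(block):
            out[y + dy][x:x + len(row)] = row
    return out

def schedule(picture, compute, workers=24, tile=TILE):
    """Cut the picture, dispatch tiles in parallel (the pool stands in
    for the 24 socket servers), then merge the computed tiles."""
    height, width = len(picture), len(picture[0])
    with ThreadPoolExecutor(max_workers=workers) as pool:
        done = list(pool.map(lambda t: (t[0], compute(t[1])), cut(picture, tile)))
    return merge(done, height, width)
```

Because each tile carries its `(y, x)` origin, the merge step reassembles the picture correctly regardless of the order in which the parallel computations finish.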
As shown in the internal computation flow diagram of the arm-side FPGA chip in fig. 4, the PS side receives data, parameters, and commands from the scheduler module and forwards them according to fixed rules determined by the actual function: the FPGA control parameters are transmitted first, then the weight parameters required by the CNN network, and then the data and an operation command. When the PS side sends a start command to the PL (FPGA) side, the FPGA chip starts computation using the previously transmitted control parameters; data needed during computation is obtained through memory shared with the PS side. When the computation completes, the PL notifies the PS to fetch the data, likewise through the shared memory.
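The fixed transmission order and the PS/PL shared-memory handshake described above can be modeled in miniature as follows. This is a hedged sketch: `PsSide`, `PlSide`, the `Phase` enum, and the dictionary standing in for shared memory are illustrative assumptions, not the actual PS/PL interface.

```python
from enum import Enum, auto

class Phase(Enum):
    CONTROL = auto()   # FPGA control parameters (sent first)
    WEIGHTS = auto()   # CNN weight parameters
    DATA = auto()      # input data
    RUN = auto()       # operation/start command (sent last)

# The fixed transmission order stated in the description.
ORDER = [Phase.CONTROL, Phase.WEIGHTS, Phase.DATA, Phase.RUN]

class PlSide:
    """Models the PL (FPGA) side: computes using the previously
    transmitted control parameters and writes the result back."""
    def start(self, shared):
        ctrl = shared[Phase.CONTROL]
        # Placeholder computation: scale the data by a control parameter.
        result = [v * ctrl["scale"] for v in shared[Phase.DATA]]
        shared["result"] = result  # PL notifies PS via shared memory
        return result

class PsSide:
    """Models the PS side: buffers each stage into 'shared memory',
    enforces the transmission order, then starts the PL side."""
    def __init__(self, pl):
        self.pl = pl
        self.shared = {}              # stand-in for PS/PL shared memory
        self._expect = iter(ORDER)

    def receive(self, phase, payload=None):
        assert phase is next(self._expect), "out-of-order transmission"
        if phase is Phase.RUN:
            return self.pl.start(self.shared)  # start command to PL
        self.shared[phase] = payload           # write to shared memory
```

The assertion makes the protocol's ordering constraint explicit: any attempt to send data before control parameters, or a run command before the data, fails immediately.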
In the description herein, references to the description of "one embodiment," "an example," "a specific example" or the like are intended to mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The preferred embodiments of the invention disclosed above are intended to be illustrative only. The preferred embodiments are not intended to be exhaustive or to limit the invention to the precise embodiments disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, to thereby enable others skilled in the art to best utilize the invention. The invention is limited only by the claims and their full scope and equivalents.

Claims (6)

1. A high-cost-performance cloud computing service system based on an FPGA chip, characterized in that: the system comprises a cloud storage system, a scheduling program module and an FPGA (field programmable gate array) computing unit;
the cloud storage system comprises a pod node unit, the cloud storage system is connected to the scheduling program module through the pod node unit, and the pod node unit and the scheduling program module correspond to each other;
the scheduling program module and the FPGA computing unit are corresponding to each other, and data and command interaction can be carried out between the scheduling program module and the FPGA computing unit;
and the scheduling program module is used for receiving data, parameters and commands by the PS terminal.
2. The FPGA chip-based high-cost-performance cloud computing service system of claim 1, wherein: the data transmitted by the scheduling program module comprises FPGA control parameters, the weight parameters and data required by the CNN network, and operation commands, transmitted in that order: first the FPGA control parameters, then the weight parameters and data required by the CNN network, and finally the operation commands.
3. The FPGA chip-based high-cost-performance cloud computing service system of claim 1, wherein: the scheduling program module comprises an entry distribution module and an entry scheduling module;
The entry distribution module is used for receiving request commands from the cloud, cutting the picture into pieces, and passing them to the entry scheduling module;
The entry distribution module comprises an FPGA parallel-computing selection unit, which selects whether a request enters the entry distribution module;
The entry scheduling module is used for merging the scheduled, segmented pictures and returning the result data to the user along the original path;
An FPGA (field programmable gate array) computing-unit support-type judgment unit is arranged between the entry distribution module and the entry scheduling module and judges whether a request enters the entry scheduling module.
4. The FPGA chip-based high-cost-performance cloud computing service system of claim 3, wherein: the entry scheduling module comprises a routing module that handles data transmission for the FPGA computing unit, and the network port of each arm system contained in the FPGA computing unit corresponds to the routing module.
5. The FPGA chip-based high-cost-performance cloud computing service system of claim 4, wherein: there is at least one group of FPGA computing units; each group comprises at least 24 independent FPGA chips, and each FPGA chip is embedded in an arm system.
6. The FPGA chip-based high-cost-performance cloud computing service system according to any one of claims 2 and 5, wherein: when the PS side sends a start command to the PL side, the FPGA chip starts computation using the previously transmitted control parameters.
CN201811248254.XA 2018-10-24 2018-10-24 High-cost-performance cloud computing service system based on FPGA chip Active CN111090503B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811248254.XA CN111090503B (en) 2018-10-24 2018-10-24 High-cost-performance cloud computing service system based on FPGA chip


Publications (2)

Publication Number Publication Date
CN111090503A true CN111090503A (en) 2020-05-01
CN111090503B CN111090503B (en) 2023-07-21

Family

ID=70392205

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811248254.XA Active CN111090503B (en) 2018-10-24 2018-10-24 High-cost-performance cloud computing service system based on FPGA chip

Country Status (1)

Country Link
CN (1) CN111090503B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106228238A (en) * 2016-07-27 2016-12-14 中国科学技术大学苏州研究院 The method and system of degree of depth learning algorithm is accelerated on field programmable gate array platform
CN108228354A (en) * 2017-12-29 2018-06-29 杭州朗和科技有限公司 Dispatching method, system, computer equipment and medium
CN108304250A (en) * 2018-03-05 2018-07-20 北京百度网讯科技有限公司 Method and apparatus for the node for determining operation machine learning task
US10108465B1 (en) * 2016-06-23 2018-10-23 EMC IP Holding Company LLC Automated cloud service evaluation and workload migration utilizing standardized virtual service units

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
KAIYUAN GUO: "Angel-Eye: A Complete Design Flow for Mapping CNN Onto Embedded FPGA" *
朱金升: "Research on the application of FPGA-based recognition of specific targets in UAV aerial imagery" *
树岸; 彭鑫; 赵文耘: "An adaptive management method for cloud computing resources based on container technology" *

Also Published As

Publication number Publication date
CN111090503B (en) 2023-07-21

Similar Documents

Publication Publication Date Title
CN107087019B (en) Task scheduling method and device based on end cloud cooperative computing architecture
US10305823B2 (en) Network interface card configuration method and resource management center
CN109218355B (en) Load balancing engine, client, distributed computing system and load balancing method
CN107018175B (en) Scheduling method and device of mobile cloud computing platform
CN110602156A (en) Load balancing scheduling method and device
WO2006019512B1 (en) Apparatus and method for supporting connection establishment in an offload of network protocol processing
CN103516744A (en) A data processing method, an application server and an application server cluster
US20230275832A1 (en) Networking processor and network device
CN101442493A (en) Method for distributing IP message, cluster system and load equalizer
CN105472291A (en) Digital video recorder with multiprocessor cluster and realization method of digital video recorder
CN113422842B (en) Distributed power utilization information data acquisition system considering network load
CN105144109A (en) Distributed data center technology
CN102571568A (en) Method and device for processing task
JP2016531372A (en) Memory module access method and apparatus
CN109525443B (en) processing method and device for distributed pre-acquisition communication link and computer equipment
CN103299298A (en) Service processing method and system
CN113259408A (en) Data transmission method and system
CN111090503A (en) High-cost-performance cloud computing service system based on FPGA chip
CN109831467B (en) Data transmission method, equipment and system
US20150120815A1 (en) Remote multi-client accommodating system and host computer
CN114020417A (en) Virtual load balancing system and working method thereof
CN113608861A (en) Software load computing resource virtualization distribution method and device
CN105874757A (en) Data processing method and multi-core processor system
CN114816651A (en) Communication method, device and system
JP2002342193A (en) Method, device and program for selecting data transfer destination server and storage medium with data transfer destination server selection program stored therein

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant