CN110059050A - AI supercomputer based on high-performance reconfigurable elastic computing - Google Patents

AI supercomputer based on high-performance reconfigurable elastic computing

Info

Publication number
CN110059050A
Authority
CN
China
Prior art keywords
rpu
supercomputer
hec
compiling
control
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910350877.6A
Other languages
Chinese (zh)
Other versions
CN110059050B (en)
Inventor
向志宏
杨延辉
吴君安
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Meilian Dongqing Technology Co ltd
Original Assignee
Beijing Super Dimension Computing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Super Dimension Computing Technology Co Ltd
Priority to CN201910350877.6A
Publication of CN110059050A
Application granted
Publication of CN110059050B
Active (current legal status)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/78 Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F15/7867 Architectures of general purpose stored program computers comprising a single central processing unit with reconfigurable architecture
    • G06F15/7871 Reconfiguration support, e.g. configuration loading, configuration switching, or hardware OS
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/78 Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F15/7867 Architectures of general purpose stored program computers comprising a single central processing unit with reconfigurable architecture
    • G06F15/7885 Runtime interface, e.g. data exchange, runtime control
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

An AI supercomputer based on high-performance reconfigurable elastic computing. In an embodiment, the AI supercomputer comprises: a machine perceptron; a reconfigurable computing unit (RPU) array which, under the control of configuration information, performs the AI supercomputer's reconfigurable data computation on reconfigurable data; a machine behavior device; a high-performance elastic computing master control system based on reconfigurable computing, which connects the machine perceptron, the machine behavior device, and the RPU array; a high-performance elastic connection system (HEC_Link) based on reconfigurable computing, which connects the high-performance elastic computing master control system and the RPU array under the control of control information; and an AI supercomputer compiling system based on reconfigurable elastic computing, which compiles an application program into control code for the master controller, control information for the HEC_Link, and the items of configuration information for the RPU array. Embodiments of this specification can achieve stronger computing power deployment, more parallel computation, and simultaneous execution of several different tasks.

Description

AI supercomputer based on high-performance reconfigurable elastic computing
Technical field
Embodiments of this specification relate to a supercomputer, and more particularly to an AI (artificial intelligence) supercomputer (AISC) based on high-performance reconfigurable elastic computing (HEC).
Background art
Artificial intelligence (AI) is advancing rapidly, but most of its computing platforms are still built on CPUs, GPUs, FPGAs, ASICs, and combinations thereof. These platforms cause developers and users many difficulties when deploying AI products:
(1) CPUs are the most flexible, but their energy efficiency is very low in scenarios such as AI that require massive parallel computation. GPUs and FPGAs solve part of the parallel computing problem, but power consumption and cost remain important factors limiting their deployment. ASICs have good energy efficiency, but they can only serve fixed algorithms and cannot keep up with algorithm evolution.
(2) Platforms composed of one or more of CPUs, GPUs, FPGAs, and ASICs are unsatisfactory in terms of system architecture complexity, computing power scalability, system power consumption, cost, and so on.
Summary of the invention
Embodiments of this specification propose an AI supercomputer based on high-performance reconfigurable elastic computing. The computer can flexibly deploy computing power according to product requirements and the operating environment; can support edge computing, large-scale computing, and ultra-large-scale computing; can support various neural network computations that are not instruction-driven; supports online training and online algorithm iteration; and offers high versatility, flexibility, and energy efficiency.
The AI supercomputer based on high-performance reconfigurable elastic computing according to embodiments of this specification comprises: a machine perceptron for providing environment perception information or device input information as reconfigurable data; a reconfigurable computing unit (RPU) array; a machine behavior device for outputting the results of the AI supercomputer's computation or inference, or for executing instructions related to the AI supercomputer; a high-performance elastic computing master control system (HEC) based on reconfigurable computing; a high-performance elastic connection system (HEC_Link) based on reconfigurable computing; and an AI supercomputer compiling system based on reconfigurable elastic computing, for compiling an application program into control code for the master controller, control information for the HEC_Link, and the items of configuration information for the RPU array, so that the master control system connects the machine perceptron and the machine behavior device under the control of the control code, the HEC_Link connects the HEC master control system and the RPU array under the control of the control information, and the RPU array, under the control of the configuration information, performs the AI supercomputer's reconfigurable data computation on the reconfigurable data.
In a possible embodiment, the HEC comprises a system-level reconfigurable data path composed of at least one of a multilayer system bus, a configuration bus, an on-chip DMA controller, an on-chip memory, an on-chip memory controller, a peripheral controller, and an off-chip memory.
In a possible embodiment, the high-performance elastic connection system (HEC_Link), under the control of the control information, implements high-speed data transmission between RPUs and between the RPUs and the master control chip system, as well as communication of configuration status and configuration information between the master control chip and the RPUs.
In a possible embodiment, the RPU array comprises one or more RPUs, and the RPUs are connected to one another through the HEC_Link; each RPU obtains its configuration information through the HEC_Link.
In a possible embodiment, the AI supercomputer comprises an AI supercomputer operating system based on reconfigurable elastic computing, which manages the software and hardware resources and peripheral resources of the AI supercomputer, executes the compiled files output by the compiling system, obtains the information acquired by the machine perceptron, controls the machine behavior device to act on the results of AI computation, and performs online compilation.
In a possible embodiment, the compiling system is adapted according to the characteristics of the neural network.
In a possible embodiment, the compiling system compiles control information for extending the RPU array by columns and/or rows using the HEC_Link, according to the characteristics of the HEC_Link and the RPU array.
In a possible embodiment, the compiling system compiles control information for changing the connection relationships of the RPUs through the HEC_Link.
In a possible embodiment, the control information compiled by the compiling system causes one or more RPUs to: simultaneously execute one or more different training tasks; simultaneously execute one or more different inference tasks; simultaneously execute multiple training tasks and multiple inference tasks; or feed the same data to a training network and an inference network at the same time.
In a possible embodiment, when the HEC_Link uses distributed deployment or cloud deployment, the compiling system compiles according to the HEC_Link connection information and deploys the tasks of the master control system, the HEC_Link, and each RPU.
In a possible embodiment, the compiling system operates in an offline compilation mode, in which the output compiled files are passed directly to the operating system for execution; or in an online compilation mode that runs on the operating system, in which, after compilation finishes, the result is executed directly by the operating system as needed.
In a possible embodiment, the compiling system deploys an online-trained network structure in real time through online compilation, thereby updating the neural network online in real time.
Embodiments of this specification can achieve stronger computing power deployment, more parallel computation, and simultaneous execution of several different tasks.
Brief description of the drawings
To make the technical solutions and advantages of the embodiments of this specification clearer, exemplary embodiments are described in more detail below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of this specification, not an exhaustive list of all embodiments.
Fig. 1 is a system architecture diagram of the AI supercomputer based on high-performance reconfigurable elastic computing.
Fig. 2(a) and Fig. 2(b) illustrate computing power expansion of the AI supercomputer; Fig. 2(a) shows the computing power before expansion, and Fig. 2(b) shows the computing power after expansion.
Fig. 3 is a schematic diagram of the AI supercomputer flexibly adjusting its computing power to execute two tasks in parallel.
Detailed description of the embodiments
The solutions provided in this specification are described below with reference to the accompanying drawings.
Fig. 1 is a system architecture diagram of the AI supercomputer based on high-performance reconfigurable elastic computing. As shown in Fig. 1, the AI supercomputer (AISC) based on high-performance reconfigurable elastic computing comprises a high-performance elastic computing master control system (HEC) based on reconfigurable computing, a high-performance elastic connection system (HEC_Link) based on reconfigurable computing, a reconfigurable computing unit array (RPU array), at least one machine perceptron 1-P, at least one machine behavior device 1-Q, and an AI supercomputer compiling system based on reconfigurable elastic computing. P and Q are natural numbers.
The HEC master control system connects the machine perceptron, the machine behavior device, and the reconfigurable computing unit (RPU) array under the control of the control code. The HEC is the central processing unit of the AI supercomputer (AISuperComputer, AISC) and provides the hardware platform on which the online compiling system runs; it is the control platform that connects the RPU array; and it is the peripheral control platform that connects the machine perceptrons and the machine behavior devices.
In one example, the HEC may include a system-level reconfigurable data path composed of one or more of a multilayer system bus, a configuration bus, a peripheral controller, an on-chip DMA controller, an on-chip memory, an on-chip memory controller, an off-chip memory, and an off-chip memory controller. When the system-level reconfigurable data path operates, the DMA controller, once set up by the master controller of the system, can access the off-chip memory through the off-chip memory controller, reading data (computing objects) from the off-chip memory and writing them to the on-chip memory, or reading data (computation results) from the on-chip memory and writing them to the off-chip memory.
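For illustration only, the data movement described above can be modeled in a few lines of Python. This is a sketch under stated assumptions, not a structure defined in this specification: the class `DmaController`, its `configure` and `copy` methods, and the list-based memories are hypothetical stand-ins, and the arithmetic is a placeholder for the actual computation.

```python
# Illustrative model only: DmaController and the list-based memories are
# hypothetical names, not taken from the specification.
class DmaController:
    def __init__(self):
        self.configured = False

    def configure(self):
        # Stands in for the master controller setting up the DMA controller.
        self.configured = True

    def copy(self, src, src_off, dst, dst_off, length):
        if not self.configured:
            raise RuntimeError("DMA controller must be set up by the master controller first")
        dst[dst_off:dst_off + length] = src[src_off:src_off + length]


off_chip = [3, 1, 4, 1, 5, 9, 0, 0]  # operands at 0..5, room for results at 6..7
on_chip = [0] * 8

dma = DmaController()
dma.configure()
dma.copy(off_chip, 0, on_chip, 0, 6)   # computing objects: off-chip -> on-chip
on_chip[6] = sum(on_chip[:6])          # placeholder for the real computation
on_chip[7] = max(on_chip[:6])
dma.copy(on_chip, 6, off_chip, 6, 2)   # computation results: on-chip -> off-chip
```

The sketch only captures the ordering stated in the text: the DMA controller is set up first, operands move on-chip, and results move back off-chip.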
In one example, the HEC may include a system-level reconfiguration controller composed of a master controller and a configuration bus. The system-level reconfiguration controller issues system-level control tasks so as to control, through the multilayer system bus, the peripheral controller, the DMA controller, and the off-chip memory controller in the system, thereby completing system-level control.
In one example, the master controller may also take on some of the functions of a general-purpose CPU.
The high-performance elastic connection system (HEC_Link) based on reconfigurable computing connects the HEC master control system and the RPU array under the control of the control information. The HEC_Link is the auxiliary control unit and the main carrier through which the AI supercomputer achieves elastic configuration of computing power; a corresponding HEC_Link is configured according to the computing power expansion mode and the application scenario. Through the HEC_Link, the master control system can configure the connection relationships of the RPU array, and the control information is what defines those connection relationships.
In one example, the HEC_Link connects the HEC master control system and the RPU array, and implements high-speed data transmission between RPUs and between the RPUs and the master control chip system, as well as communication of configuration status and configuration information between the master control chip and the RPUs.
In one example, for an RPU array that has already been determined, the HEC_Link can change the connection relationships as needed so that the RPUs can be combined in different ways. The combinations include, but are not limited to, grouping the RPUs so that the groups receive different reconfigurable data and execute different tasks, receive different data and execute the same task, receive the same data and execute different tasks, or receive the same data and execute the same task.
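The combinations listed above can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: the function `run_groups`, the group names, the task functions `total` and `largest`, and the sample data do not come from this specification. The only property it models is that each RPU belongs to at most one group and that each group pairs a task with an input.

```python
# Hypothetical sketch: group names, task functions and data are illustrative
# assumptions, not taken from the specification.
def run_groups(groups):
    """groups maps a group name to (rpu_ids, task, data); returns each group's result."""
    seen = set()
    results = {}
    for name, (rpu_ids, task, data) in groups.items():
        overlap = seen & set(rpu_ids)
        if overlap:
            raise ValueError(f"RPUs {sorted(overlap)} assigned to more than one group")
        seen |= set(rpu_ids)
        results[name] = task(data)
    return results


def total(xs):
    return sum(xs)


def largest(xs):
    return max(xs)


data_a = [1, 2, 3]
data_b = [10, 20]

# Different data, different tasks.
r1 = run_groups({"g0": ([0, 1], total, data_a), "g1": ([2, 3], largest, data_b)})
# Same data, different tasks.
r2 = run_groups({"g0": ([0, 1], total, data_a), "g1": ([2, 3], largest, data_a)})
```

The other two combinations (different data with the same task, same data with the same task) follow by passing the same task function to both groups.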
In one example, the HEC_Link can be used to extend the RPU array in computation depth and computation width according to the computing power requirement, giving it the ability to execute larger or more programs.
In one example, the HEC_Link can be used to deploy the RPU array as an array deployment, a distributed deployment, or a cloud deployment according to the computing power requirement.
In one example, the HEC_Link takes many forms, which include protocol controllers (for protocol conversion) and/or bridge controllers and bridge circuits. These controllers or circuits may reside in the master control chip, in a separate chip, or in the RPU chip, or may exist in other forms.
The reconfigurable computing unit array (RPU array) performs the AI supercomputer's reconfigurable data computation on reconfigurable data under the control of the configuration information. The RPU array is the core computing unit of the AI supercomputer's elastic computing, and all of the AI supercomputer's reconfigurable data computation can be completed on the RPU array. RPUs can be added or removed, or the arrangement and network architecture of the RPU array can be changed, as required, to configure the computing power of the reconfigurable computing.
In one example, the RPU array includes multiple RPUs arranged in M rows and N columns. The RPUs are connected to one another through the HEC_Link, and each RPU can obtain its corresponding configuration information through the HEC_Link.
In one example, the reconfigurable data input to an RPU comes from other RPUs or, through the HEC_Link, from the master control system; the computation result of an RPU is output to other RPUs or, through the HEC_Link, to the master control system. The input source and the output destination are determined by the control of the master control system and the HEC_Link.
The reconfigurable data paths in the RPU array can exchange data through the off-chip memory. The reconfigurable data paths in the RPU array and the system-level reconfigurable data path together form the main body of the AI supercomputer's reconfigurable data path.
A reconfiguration controller can be provided in each RPU of the RPU array. The system-level reconfiguration controller, the HEC_Link controller, and the reconfiguration controllers in the individual RPUs together form the main body of the AI supercomputer's reconfiguration controller.
At least one machine perceptron 1-P provides environment perception information or device input information as reconfigurable data. The machine perceptron is a peripheral of the AI supercomputer and provides it with environment perception information, such as vision, hearing, touch, taste, geographic location, and pose changes, or with device input information.
In one example, the machine perceptron includes terminal sensors, which collect information about the terminal, its surroundings, and its own state. Terminal sensors include, but are not limited to, image sensors (Camera), millimeter-wave radar (Radar), ultrasonic radar (Ultrasonic), lidar (Lidar), inertial measurement units (IMU), microphones (MIC), global navigation satellite systems (GNSS), touch panels (Touch Panel), and stress sensors.
In one example, the machine perceptron further includes sensor modules with perception analysis and computing capability, which perform secondary analysis and computation on the data collected by the terminal sensors to generate new environment perception information. Sensor modules include, but are not limited to, RGB-D depth cameras, binocular depth cameras, and VIO three-dimensional reconstruction cameras; their perception analysis and computing capability may be based on a CPU, GPU, FPGA, or DSP, or on an RPU.
At least one machine behavior device 1-Q is a peripheral of the AI supercomputer and is used to output the results of the AI supercomputer's computation or inference, or to execute the AI supercomputer's instructions.
In one example, the machine behavior device based on reconfigurable elastic computing includes, but is not limited to, one or more of a communication unit, a human-machine interface, a servo mechanism, a control unit, and the like.
The AI supercomputer compiling system based on reconfigurable elastic computing compiles an application program into control code for the master controller, control information for the HEC_Link, and the items of configuration information for the RPU array.
Specifically, the AI supercomputer compiling system can mark and preprocess the application program and decompose it into master control system execution code and RPU execution code. According to the RPU array, it then performs code conversion and optimization, temporal task partitioning, task-to-RPU partitioning, and task configuration information generation on the RPU execution code. The final compilation produces the control code for the master controller, the control information for the HEC_Link, and the items of configuration information for the RPU array.
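The compilation flow above can be sketched as follows. This is a toy model under stated assumptions, not the compiling system itself: the names `CompileResult` and `compile_app`, the `(target, name)` input format, and the round-robin placement of RPU operations are all hypothetical. The sketch only shows the shape of the three outputs named in the text.

```python
from dataclasses import dataclass, field


@dataclass
class CompileResult:
    # All three fields are hypothetical containers for the outputs named in the text.
    control_code: list = field(default_factory=list)   # for the master controller
    link_control: list = field(default_factory=list)   # for HEC_Link: (src_rpu, dst_rpu)
    rpu_config: dict = field(default_factory=dict)     # rpu_id -> [(time_slot, op), ...]


def compile_app(ops, num_rpus):
    """ops is a list of (target, name) pairs, where target is 'host' or 'rpu'."""
    result = CompileResult()
    rpu_ops = []
    for target, name in ops:                 # marking and decomposition
        if target == "host":
            result.control_code.append(name)
        else:
            rpu_ops.append(name)
    for i, name in enumerate(rpu_ops):
        # Assumed policy: round-robin placement; slot is the temporal partition,
        # rpu is the spatial partition.
        slot, rpu = divmod(i, num_rpus)
        result.rpu_config.setdefault(rpu, []).append((slot, name))
        if i > 0:
            # Route the previous operation's output to this RPU.
            result.link_control.append(((i - 1) % num_rpus, rpu))
    return result


app = [("host", "load"), ("rpu", "conv1"), ("rpu", "relu1"),
       ("rpu", "conv2"), ("host", "report")]
out = compile_app(app, num_rpus=2)
```

With two RPUs, the three RPU operations occupy two time slots, and the link control list records how each result is handed to the next RPU.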
In one example, the input of the AI supercomputer's compiling system is an application program written in a high-level programming language such as C or Python. The compiling system also covers the application frameworks that the high-level language relies on, such as TensorFlow and Caffe. The compiling system outputs the control code of the master controller for reconfigurable computing, the control information of the HEC_Link, the items of configuration information of the RPU array, the general-purpose executable program of the master control system, and process state information of the compilation tasks.
Further, the compiling system supports neural networks that are driven without instructions.
In one example, support for neural networks driven without instructions includes:
(1) The neural network executes its task without being driven by instructions. The compiled files produced once compilation of the neural network is complete include the control code of the master controller, the control information of the HEC_Link, and the items of configuration information of the RPU array. When these files are executed, no instruction control is needed to get from the reconfigurable data input to the output; the result obtained by the master control system is the end-to-end result of the neural network.
(2) Various neural networks can be supported by extending the compiling system. The training and inference processes of a neural network involve a large amount of parallel and repeated computation, and the compiling system can be adapted to the characteristics of each type of neural network.
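As a loose software analogy for item (1), the sketch below fixes a chain of operators once from configuration information and then streams data through it. It is only an analogy: Python itself executes instructions, and the operator table `OPERATORS` and the functions `configure` and `stream` are hypothetical names chosen for illustration, not part of this specification.

```python
# Hypothetical operator table; the names and operations are illustrative only.
OPERATORS = {
    "scale2": lambda x: 2 * x,
    "add1": lambda x: x + 1,
    "relu": lambda x: max(0, x),
}


def configure(config):
    """Turn configuration information (operator names in order) into fixed stages."""
    return [OPERATORS[name] for name in config]


def stream(stages, inputs):
    """Push each input through the configured stages; nothing is decoded per item."""
    for x in inputs:
        for stage in stages:
            x = stage(x)
        yield x


stages = configure(["scale2", "add1", "relu"])
outputs = list(stream(stages, [-3, 0, 2]))
```

The point of the analogy is that the behavior is determined entirely by the configuration loaded beforehand, so the data goes from input to end-to-end result without any per-item control.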
Further, at compile time the compiling system can cooperate with the HEC_Link to achieve multi-mode elastic deployment of the RPU array. The multi-mode elastic deployment modes include, but are not limited to:
(1) According to the characteristics of the HEC_Link and the RPU array, the RPU array is extended by columns and rows using the HEC_Link. Column extension of the RPU array helps support more parallel computation, and row extension helps speed up the pipeline for longer computation sequences. The HEC_Link is an open architecture and can extend the RPU array according to the computing power and environmental requirements of the AI supercomputer.
Fig. 2(a) and Fig. 2(b) illustrate one kind of computing power expansion of the AI supercomputer. Fig. 2(a) shows the computing power before expansion. In Fig. 2(a), the RPU array includes 6 RPUs. Configuration information from the master control system is transmitted over the configuration bus to the HEC_Link and the RPU array, so that the RPU array forms the following structure: the 6 RPUs are arranged in 2 parallel columns and 3 rows; RPUs in the same row cannot exchange reconfigurable data with each other; according to the configuration information or the protocol, the Crossbar passes the reconfigurable data output by any RPU in one row to any RPU in the next row, and returns the reconfigurable data output by any RPU in the last row to any RPU in the first row.
Fig. 2(b) shows the computing power after expansion. In Fig. 2(b), the RPU array is extended to 16 RPUs, arranged in 4 parallel columns and 4 rows; RPUs in the same row cannot exchange reconfigurable data with each other; according to the configuration information or the protocol, the Crossbar passes the reconfigurable data output by any RPU in one row to any RPU in the next row, and returns the reconfigurable data output by any RPU in the last row to any RPU in the first row.
(2) The following setting can be made at compile time: for an RPU array that has already been determined, the connection relationships of the RPUs are changed through the HEC_Link to adjust the width or the depth of the array, so as to suit tasks with more parallel computation or with longer computation cycles, respectively.
(3) The following setting can be made at compile time: the connection relationships of the RPUs are changed through the HEC_Link so that one or several RPUs in the RPU array form an independent RPU group. Multiple RPU groups can be formed in one RPU array; they execute multiple tasks asynchronously and in parallel without interfering with one another, thereby achieving MIMD and MISD multitasking, a high degree of parallelism, and efficient operation.
Fig. 3 is a schematic diagram of another way in which the AI supercomputer flexibly adjusts its computing power, executing two tasks in parallel. As shown in the upper half of Fig. 3, under the control of the configuration information that the master control system transmits over the configuration bus, the cross bridge (CrossBridge) of the HEC_Link forms the following structure: the data of task A passes through an HEC_Link interface, then through node A and an HEC_Link interface, and is input to RPU0; the result of RPU0 enters RPU1 through an HEC_Link interface and the CrossBridge; the result of RPU1 likewise enters RPU2 through an HEC_Link interface and the CrossBridge. Task A is thereby executed.
Similarly, the data flow of task B passes through RPU3, RPU4, RPU5, and RPU6, and finally reaches RPU7, thereby executing task B.
As can be seen from Fig. 3, task A and task B proceed essentially in parallel. The lower half of Fig. 3 briefly shows the progress of the two tasks.
Furthermore, each independent RPU group can simultaneously execute one or more different training tasks, can simultaneously execute one or more different inference tasks, and can also simultaneously execute multiple training tasks and multiple inference tasks, with the same data being fed to the training network and the inference network at the same time.
(4) When the HEC_Link uses distributed deployment or cloud deployment, compilation can be performed according to the HEC_Link connection information, and the tasks of the master control system, the HEC_Link, and each RPU can be deployed, so as to suit ultra-large-scale computing power deployment and parallel computation.
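The grouping of mode (3), as drawn in Fig. 3, can be imitated in software with two worker threads. The sketch is illustrative only: the `ROUTES` table plays the role of the control information, the RPU functions merely append their own id to a trace, and `make_rpu` and `run_task` are hypothetical names. The RPU chains for task A (RPU0 to RPU2) and task B (RPU3 to RPU7) are taken from the description of Fig. 3.

```python
from concurrent.futures import ThreadPoolExecutor


def make_rpu(rpu_id):
    # Placeholder computation: each hypothetical RPU appends its id to the trace.
    def rpu(trace):
        return trace + [rpu_id]
    return rpu


RPUS = {i: make_rpu(i) for i in range(8)}

# Assumed form of the control information: each task owns a disjoint RPU chain.
ROUTES = {"A": [0, 1, 2], "B": [3, 4, 5, 6, 7]}


def run_task(name):
    data = []
    for rpu_id in ROUTES[name]:
        data = RPUS[rpu_id](data)
    return data


with ThreadPoolExecutor(max_workers=2) as pool:
    futures = {name: pool.submit(run_task, name) for name in ROUTES}
    results = {name: fut.result() for name, fut in futures.items()}
```

Because the two chains share no RPU, neither task waits on the other, which is the non-interference property the text describes.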
Further, when the RPU array changes, including but not limited to changes in the number of RPUs or in how the RPUs are connected, tasks can be redeployed partially or globally by recompiling.
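As an example of what recompilation has to regenerate, consider the expansion from the 6-RPU array of Fig. 2(a) to the 16-RPU array of Fig. 2(b). The sketch below enumerates the Crossbar connections permitted in each case, assuming row-major numbering of the RPUs and reading the figure description as allowing transfers only from one row to the next, with the last row feeding back to the first. The function name `crossbar_links` and the numbering scheme are assumptions made for illustration.

```python
def crossbar_links(rows, cols):
    """Allowed (src, dst) RPU pairs for a rows x cols array, RPUs numbered row-major."""
    links = set()
    for r in range(rows):
        nxt = (r + 1) % rows                  # the last row feeds back to the first
        for c_src in range(cols):
            for c_dst in range(cols):
                links.add((r * cols + c_src, nxt * cols + c_dst))
    return links


links_before = crossbar_links(rows=3, cols=2)   # Fig. 2(a): 6 RPUs
links_after = crossbar_links(rows=4, cols=4)    # Fig. 2(b): 16 RPUs
```

The link set grows from 12 to 64 entries, so control information compiled for the smaller array cannot simply be reused after the expansion.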
Further, the compiling system can support multi-mode compilation.
In one example, the compilation process can use an offline compilation mode, in which the output compiled files are passed directly to the operating system for execution.
In one example, the compilation process can also use an online compilation mode that runs on the operating system; after compilation finishes, the result is executed directly by the operating system as needed.
In one example, the online compiling system can deploy an online-trained network structure in real time through online compilation, thereby updating the neural network online in real time.
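One way to picture the online update is a deployed configuration that is swapped while inference continues to be served. The sketch below is an assumption-laden illustration, not the mechanism of this specification: the class `DeployedNetwork`, its `infer` and `redeploy` methods, the lock-based swap, and the weight vectors are all hypothetical.

```python
import threading


class DeployedNetwork:
    """Holds the configuration currently loaded on a (hypothetical) RPU array."""

    def __init__(self, weights):
        self._lock = threading.Lock()
        self._weights = list(weights)
        self.version = 1

    def infer(self, xs):
        with self._lock:
            weights = self._weights        # take a consistent snapshot
        return sum(w * x for w, x in zip(weights, xs))

    def redeploy(self, new_weights):
        # Stands in for online compilation followed by loading the new configuration.
        compiled = list(new_weights)
        with self._lock:
            self._weights = compiled
            self.version += 1


net = DeployedNetwork([1.0, 2.0])
old_out = net.infer([3.0, 4.0])    # served by version 1
net.redeploy([0.5, 0.5])           # result of an online training step
new_out = net.infer([3.0, 4.0])    # served by version 2
```

Preparing the new configuration outside the lock and swapping it in one step keeps every inference call on either the old or the new network, never a mixture.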
Optionally, the AI supercomputer includes an AI supercomputer operating system based on reconfigurable elastic computing, which manages the hardware and software resources of the AI supercomputer, manages its peripheral resources, and, according to the output of the compiling system, executes the control code, outputs the configuration information and reconfigurable data to the RPU array, controls execution of the reconfigurable computing program, and returns the results. The HEC master control system provides the hardware platform on which the AI supercomputer's operating system runs.
In one example, the operating system obtains the information acquired by the machine perceptron and controls the machine behavior device to act on the results of AI computation.
In one example, the operating system performs online compilation.
The specific embodiments described above further explain the objectives, technical solutions, and beneficial effects of the present invention in detail. It should be understood that the foregoing is merely specific embodiments of the present invention and is not intended to limit its scope of protection; any modification, equivalent substitution, improvement, or the like made on the basis of the technical solutions of the present invention shall fall within the scope of protection of the present invention.

Claims (12)

1. An AI supercomputer based on high-performance reconfigurable elastic computing, characterized by comprising:
a machine perceptron for providing environment perception information or device input information as reconfigurable data;
a reconfigurable computing unit (RPU) array;
a machine behavior device for outputting the results of the AI supercomputer's computation or inference, or for executing instructions related to the AI supercomputer;
a high-performance elastic computing master control system (HEC) based on reconfigurable computing;
a high-performance elastic connection system (HEC_Link) based on reconfigurable computing; and
an AI supercomputer compiling system based on reconfigurable elastic computing, for compiling an application program into control code for the master controller, control information for the HEC_Link, and the items of configuration information for the RPU array, so that the master control system connects the machine perceptron and the machine behavior device under the control of the control code, the HEC_Link connects the high-performance elastic computing master control system (HEC) and the RPU array under the control of the control information, and the RPU array, under the control of the configuration information, performs the AI supercomputer's reconfigurable data computation on the reconfigurable data.
2. AI supercomputer according to claim 1, which is characterized in that the HEC includes:
By multilayer system bus, configuration bus, on-chip DMA controller, on-chip memory, on piece storage control, peripheral hardware control The system-level reconfigurable data access of at least one of device and chip external memory composition.
3. AI supercomputer according to claim 1, it is characterised in that: the high-performance elastic connects system (HEC_Link) high speed between RPU and RPU and between RPU and Master control chip system is being realized under the control of control information Data transmission and the configuration status between Master control chip and RPU are communicated with configuration information.
4. AI supercomputer according to claim 1, which is characterized in that RPU array includes one or more RPU, It is linked together between RPU by HEC_Link;Each RPU passes through HEC_Link and obtains configuration information.
5. The AI supercomputer according to claim 1, characterized by comprising an AI supercomputer operating system based on reconfigurable elastic computing, configured to manage the software and hardware resources and the peripheral resources of the AI supercomputer, to execute the compiled files output by the compiling system, to obtain the information acquired from the machine perceptron, to control the machine behavior device to carry out the AI computation results, and to perform online compilation.
6. The AI supercomputer according to claim 1, characterized in that the compiling system is adapted according to the characteristics of the neural network.
7. The AI supercomputer according to claim 1, characterized in that the compiling system compiles control information for extending the RPU array by columns and/or rows using the HEC_Link, according to the characteristics of the HEC_Link and the RPU array.
8. The AI supercomputer according to claim 1, characterized in that the compiling system compiles control information for changing the connection relationships of the RPUs through the HEC_Link.
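Claims 7 and 8 describe control information that extends the RPU array by rows or columns and that changes how RPUs are connected. A small Python sketch can make this concrete by treating control information as sets of links to add or remove. The mesh and chain topologies, and the function names `mesh_links`, `chain_links`, and `rewire`, are assumptions for illustration; the patent does not name specific topologies.

```python
def mesh_links(rows, cols):
    """Return the neighbor links of a rows x cols grid of RPUs."""
    links = set()
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                links.add(((r, c), (r, c + 1)))  # link to the right neighbor
            if r + 1 < rows:
                links.add(((r, c), (r + 1, c)))  # link to the neighbor below
    return links


def chain_links(rows, cols):
    """Return links that connect the same RPUs as one row-major chain."""
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    return set(zip(cells, cells[1:]))


def rewire(current, target):
    """Return (links to add, links to remove) to move from current to target."""
    return target - current, current - target


base = mesh_links(2, 2)
extended = mesh_links(2, 3)      # the array grown by one column
new_links = extended - base      # links the extension has to create
to_add, to_remove = rewire(base, chain_links(2, 2))
```

Growing the 2 x 2 mesh by one column requires three new links, while turning the same mesh into a chain adds one link and removes two.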
9. The AI supercomputer according to claim 1, characterized in that the control information compiled by the compiling system causes the one or more RPUs to:
Simultaneously execute one or more different training tasks;
Simultaneously execute one or more different inference tasks;
Simultaneously execute multiple training tasks and multiple inference tasks; or
Input the same data simultaneously to a training network and an inference network.
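The options listed in claim 9 amount to dividing the RPUs among concurrent training and inference tasks and, in the last option, feeding both kinds of task the same input. A minimal sketch, assuming a simple contiguous split (the `partition_rpus` and `route_sample` helpers and the task names are hypothetical):

```python
def partition_rpus(rpu_ids, tasks):
    """Split RPUs into contiguous groups, one group per task."""
    if not tasks or len(tasks) > len(rpu_ids):
        raise ValueError("need at least one task and one RPU per task")
    base, extra = divmod(len(rpu_ids), len(tasks))
    assignment, start = {}, 0
    for index, task in enumerate(tasks):
        # The first `extra` tasks receive one additional RPU each.
        size = base + (1 if index < extra else 0)
        assignment[task] = rpu_ids[start:start + size]
        start += size
    return assignment


def route_sample(sample, assignment):
    """Send the same sample to every task group, training and inference alike."""
    return {task: {"rpus": rpus, "input": sample} for task, rpus in assignment.items()}


assignment = partition_rpus([0, 1, 2, 3, 4], ["train_a", "infer_b"])
routed = route_sample("frame_0", assignment)
```

Here five RPUs are split three to two between one training task and one inference task, and both groups receive the same input sample.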
10. The AI supercomputer according to claim 1, characterized in that, when the HEC_Link is deployed in a distributed manner or in the cloud, the compiling system compiles and deploys the tasks of the master control system, the HEC_Link, and each RPU according to the HEC_Link connection information.
11. The AI supercomputer according to claim 1, characterized in that the compiling system operates in an offline compilation mode, in which the output compiled file is passed directly to the operating system for execution; or in an online compilation mode running on the operating system, in which, after compilation finishes, the result is executed directly by the operating system as needed.
12. The AI supercomputer according to claim 1, characterized in that:
The compiling system deploys the network structure obtained by online training in real time through online compilation, realizing online real-time updating of the neural network.
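The difference between the offline and online compilation modes of claim 11, and the real-time update of claim 12, can be sketched as two entry points on a hypothetical operating system object. The `OperatingSystem` class, the JSON file format, and the `compile_network` stand-in are assumptions for illustration and do not describe the patent's actual file format or interfaces.

```python
import json


def compile_network(layers):
    """Stand-in compiler: wraps the layer list as a 'compiled' artifact."""
    return {"layers": list(layers)}


class OperatingSystem:
    """Hypothetical OS that executes compiled artifacts."""

    def __init__(self):
        self.active = None   # artifact currently running
        self.version = 0     # incremented on every deployment

    def _deploy(self, artifact):
        self.active = artifact
        self.version += 1

    def execute_file(self, text):
        # Offline mode: the artifact was compiled elsewhere and handed over as a file.
        self._deploy(json.loads(text))

    def compile_and_execute(self, layers):
        # Online mode: the compiler runs on the OS and the result is deployed at once.
        self._deploy(compile_network(layers))


offline_file = json.dumps(compile_network(["conv", "fc"]))  # produced ahead of time
os_ = OperatingSystem()
os_.execute_file(offline_file)
os_.compile_and_execute(["conv", "conv", "fc"])  # structure changed by online training
```

After both calls the running artifact is the one produced online, which is the behavior claim 12 describes as an online real-time update of the neural network.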
CN201910350877.6A 2019-04-28 2019-04-28 AI supercomputer based on high-performance reconfigurable elastic calculation Active CN110059050B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910350877.6A CN110059050B (en) 2019-04-28 2019-04-28 AI supercomputer based on high-performance reconfigurable elastic calculation

Publications (2)

Publication Number Publication Date
CN110059050A (en) 2019-07-26
CN110059050B (en) 2023-07-25

Family

ID=67321429

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910350877.6A Active CN110059050B (en) 2019-04-28 2019-04-28 AI supercomputer based on high-performance reconfigurable elastic calculation

Country Status (1)

Country Link
CN (1) CN110059050B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8429394B1 (en) * 2008-03-12 2013-04-23 Stone Ridge Technology Reconfigurable computing system that shares processing between a host processor and one or more reconfigurable hardware modules
CN105468568A (en) * 2015-11-13 2016-04-06 上海交通大学 High-efficiency coarse granularity reconfigurable computing system
CN105487838A (en) * 2015-11-23 2016-04-13 上海交通大学 Task-level parallel scheduling method and system for dynamically reconfigurable processor

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
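史莉雯: "A Survey of Reconfigurable Instruction Set Computers", Microprocessors (《微处理机》) *
李柏楠: "Research on Key Technologies of Reconfigurable Systems for Domain-Specific Applications", China Doctoral Dissertations Full-text Database, Information Science and Technology (《中国博士学位论文全文数据库信息科技辑》) *
魏少军 et al.: "Reconfigurable Computing Processor Technology", Scientia Sinica Informationis (《中国科学:信息科学》) *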

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110515889A (en) * 2019-07-27 2019-11-29 西南电子技术研究所(中国电子科技集团公司第十研究所) Embedded FPGA swarm intelligence computing platform hardware frame
CN110515889B (en) * 2019-07-27 2022-12-13 西南电子技术研究所(中国电子科技集团公司第十研究所) Embedded FPGA cluster intelligent computing platform hardware framework
CN110768739A (en) * 2019-09-27 2020-02-07 上海畲贡自动化科技有限公司 Synchronous serial control system and method
CN110768739B (en) * 2019-09-27 2021-04-20 上海鹰钛智能科技有限公司 Synchronous serial control system and method

Also Published As

Publication number Publication date
CN110059050B (en) 2023-07-25

Similar Documents

Publication Publication Date Title
Huang et al. Cosa: Scheduling by constrained optimization for spatial accelerators
US20200042856A1 (en) Scheduler for mapping neural networks onto an array of neural cores in an inference processing unit
Jin et al. Modeling spiking neural networks on SpiNNaker
CN109376843A (en) EEG signals rapid classification method, implementation method and device based on FPGA
Barker et al. A performance evaluation of the Nehalem quad-core processor for scientific computing
Gong et al. Multi2Sim Kepler: A detailed architectural GPU simulator
CN110059050A (en) AI supercomputer based on the restructural elastic calculation of high-performance
Smaragdos et al. BrainFrame: a node-level heterogeneous accelerator platform for neuron simulations
Pastorino et al. Hard real-time multibody simulations using ARM-based embedded systems
CN110262996A (en) A kind of supercomputer based on high-performance Reconfigurable Computation
Kuo et al. The state of the art in parallel production systems
Mudalige et al. A plug-and-play model for evaluating wavefront computations on parallel architectures
García-Quismondo et al. Implementing enzymatic numerical P systems for AI applications by means of graphic processing units
Nocetti et al. Parallel processing in digital control
KR102188044B1 (en) Framework system for intelligent application development based on neuromorphic architecture
Wehner et al. Performance of a distributed memory finite difference atmospheric general circulation model
Kerckhoffs et al. Speeding up backpropagation training on a hypercube computer
Kitajima et al. Modelling parallel program behaviour in ALPES
Savas et al. Dataflow implementation of qr decomposition on a manycore
Cai Ecological Study of the Application of Flowers Plant Real-time Observation and 3D Reconstruction based on Kinect.
Mani et al. Massively parallel real-time reasoning with very large knowledge bases: An interim report
Jendrsczok et al. Generated horizontal and vertical data parallel gca machines for the n-body force calculation
Suluhan et al. PyTorch and CEDR: Enabling Deployment of Machine Learning Models on Heterogeneous Computing Systems
Oudshoorn et al. Conditional task scheduling on loosely-coupled distributed processors
Maharjan Modeling humanoid swarm robots with petri nets

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20230703

Address after: Room 233, Floor 2, Building 10, No. 16, Caixiang East Road, Nancai, Shunyi District, Beijing 101300

Applicant after: Beijing Meilian Dongqing Technology Co.,Ltd.

Address before: 100142 907, area 1, floor 9, No. 160, North West Fourth Ring Road, Haidian District, Beijing

Applicant before: BEIJING HYPERX AI COMPUTING TECHNOLOGY Co.,Ltd.

GR01 Patent grant