CN108132908A - Concurrent computational system - Google Patents


Info

Publication number
CN108132908A
CN108132908A (application CN201611082574.3A)
Authority
CN
China
Prior art keywords
processing unit
multi-way processing
parallel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201611082574.3A
Other languages
Chinese (zh)
Other versions
CN108132908B (en)
Inventor
赵雪峰 (Zhao Xuefeng)
杜望宁 (Du Wangning)
刘辰 (Liu Chen)
张戈 (Zhang Ge)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Longxin Zhongke (Nanjing) Technology Co., Ltd
Original Assignee
Loongson Technology Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Loongson Technology Corp Ltd filed Critical Loongson Technology Corp Ltd
Priority to CN201611082574.3A priority Critical patent/CN108132908B/en
Publication of CN108132908A publication Critical patent/CN108132908A/en
Application granted
Publication of CN108132908B publication Critical patent/CN108132908B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/16Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/177Initialisation or configuration control
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/16Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/163Interprocessor communication
    • G06F15/173Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
    • G06F15/17337Direct connection machines, e.g. completely connected computers, point to point communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Multi Processors (AREA)

Abstract

The present invention provides a parallel computing system. The computer system includes a circuit board, a front-end server unit and a multi-way processing unit; the multi-way processing unit includes multiple compute nodes, and the front-end server unit is connected to the multi-way processing unit, both being arranged on the circuit board. The front-end server unit selects a target parallel mode for the multi-way processing unit from multiple preset parallel modes, and assigns, to the CPU of each compute node in the multi-way processing unit, an ID number under the target parallel mode. The multi-way processing unit boots according to the target parallel mode and the ID numbers of the CPUs of the compute nodes. With the technical solution of the present invention, the front-end server unit selects a target parallel mode for the multi-way processing unit according to actual needs, which broadens the range of application of the parallel computing system and avoids the waste of resources caused when a parallel computing system can only use a single, fixed parallel mode.

Description

Concurrent computational system
Technical field
The present invention relates to computer technology, and in particular to a parallel computing system.
Background technology
With the continuous development of computer hardware technology, processors have moved from the single-core era into the multi-core era, and parallel computers based on multiple processors (i.e., multi-way processor computers) are increasingly widely used in applications such as cloud computing and big data processing.
A multi-way processor computer system is a computer system that integrates multiple central processing units (Central Processing Unit, CPU for short). Multi-way processor computer systems are mainly divided into two-way systems and four-way systems. A two-way system is a multi-way computer system that uses two CPUs on one mainboard, and a four-way system is a multi-way computer system that uses four CPUs on one mainboard. The CPUs in a multi-way processor computer system are interconnected through off-chip high-speed buses, and share the memory subsystem and the input/output (Input/Output, IO for short) bus.
At present, a multi-way processor computer system can only be used as a single, fixed multi-way system (for example, only as a two-way system or only as a four-way system), which causes a waste of resources.
Summary of the invention
The present invention provides a parallel computing system to overcome the problem that a current multi-way processor computer system can only be used as a single, fixed multi-way system, which causes a waste of resources.
The present invention provides a parallel computing system, including: a circuit board, a front-end server unit and a multi-way processing unit, where the multi-way processing unit includes multiple compute nodes, and the front-end server unit is connected to the multi-way processing unit, both being arranged on the circuit board;
The front-end server unit is configured to select a target parallel mode for the multi-way processing unit from multiple preset parallel modes, and to assign, to the central processing unit (CPU) of each compute node in the multi-way processing unit, an ID number under the target parallel mode;
The multi-way processing unit is configured to boot according to the target parallel mode and the ID numbers of the CPUs of the compute nodes.
In another feasible implementation of the present invention, the system further includes a switch chip, and the front-end server unit includes a first network port and a second network port;
The first network port is connected to an external network, and the second network port is connected through the switch chip to the network port of each compute node in the multi-way processing unit.
In another feasible implementation of the present invention, the front-end server unit further stores a network file system and the kernels of the multi-way processing unit under the different parallel modes;
The multi-way processing unit is further configured to read, from the front-end server unit through the second network port and the switch chip, the network file system and the kernel under the target parallel mode.
In another feasible implementation of the present invention, the front-end server unit further includes a register, and the register is connected to the CPUs of the compute nodes;
The register is configured to store the target parallel mode and the ID number of the CPU of each compute node under the target parallel mode.
In another feasible implementation of the present invention, the front-end server unit is specifically configured to:
obtain a parallel mode selection input by a user;
according to the parallel mode selection, obtain from the multiple preset parallel modes a target parallel mode that matches the selection;
according to the target parallel mode, assign to the CPU of each compute node in the multi-way processing unit an ID number under the target parallel mode, and store the target parallel mode and the ID number of the CPU of each compute node under the target parallel mode into the register.
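As a concrete illustration of the save step above, the configuration could be packed into a single register value roughly as follows. This is a minimal sketch: the patent does not specify the register format, so the 2-bit field layout, the mode encoding and all names here are assumptions.

```python
from typing import List, Tuple

# Hypothetical 10-bit layout: bits [1:0] = parallel mode,
# bits [2i+3 : 2i+2] = CPU ID of compute node i (i = 0..3).
MODES = {"single-way": 0b00, "two-way": 0b01, "four-way": 0b10}

def pack_config(mode: str, node_ids: List[str]) -> int:
    """Pack the target mode and four 2-bit CPU ID numbers into one value."""
    value = MODES[mode]
    for i, nid in enumerate(node_ids):
        value |= int(nid, 2) << (2 + 2 * i)
    return value

def unpack_config(value: int) -> Tuple[str, List[str]]:
    """Recover the mode name and the per-node binary ID strings."""
    mode = {v: k for k, v in MODES.items()}[value & 0b11]
    ids = [format((value >> (2 + 2 * i)) & 0b11, "02b") for i in range(4)]
    return mode, ids

saved = pack_config("four-way", ["00", "01", "10", "11"])
assert unpack_config(saved) == ("four-way", ["00", "01", "10", "11"])
```

Whatever the real encoding, the point of the register is the same: the multi-way processing unit can read back the mode and per-CPU IDs at its next power-on.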
In another feasible implementation of the present invention, the front-end server unit further includes a first serial port and a second serial port;
The first serial port is connected to an external device and is used to output data information of the front-end server unit;
The second serial port is connected to any master compute node in the multi-way processing unit and is used to output computation information on that master compute node.
In another feasible implementation of the present invention, the system further includes a selection chip, the selection chip is connected to the second serial port, and the selection chip is used to connect to any master compute node in the multi-way processing unit according to a preset selection rule.
In another feasible implementation of the present invention, the compute nodes are interconnected by point-to-point HT buses.
Further, the front-end server unit also includes: a hard disk slot, a memory, a universal serial bus (USB) interface and a video graphics array (VGA) interface.
Optionally, the parallel modes include a single-way parallel mode, a two-way parallel mode and a four-way parallel mode.
In the parallel computing system provided by the present invention, a multi-way processing unit and a front-end server unit are connected and arranged on one circuit board, so that the front-end server unit selects a target parallel mode for the multi-way processing unit from multiple preset parallel modes and assigns, to the CPU of each compute node in the multi-way processing unit, an ID number under the target parallel mode, and the multi-way processing unit boots according to the target parallel mode and the ID numbers of the CPUs of the compute nodes. In the parallel computing system of this embodiment, the front-end server unit configures, according to actual needs, the target parallel mode and the per-CPU ID numbers for the multi-way processing unit, thereby realizing free selection of the parallel mode of the multi-way processing unit. This broadens the range of application of the parallel computing system, facilitates research in fields such as parallel computing, and avoids the waste of resources caused when a parallel computing system can only use a single, fixed parallel mode.
Description of the drawings
In order to more clearly illustrate the technical solutions of the present invention or of the prior art, the drawings needed in the description of the embodiments or of the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a schematic structural diagram of Embodiment 1 of the parallel computing system provided by the present invention;
Fig. 1a is a schematic structural diagram of the multi-way processing unit of the present invention;
Fig. 2 is a schematic structural diagram of Embodiment 2 of the parallel computing system provided by the present invention;
Fig. 3 is a schematic structural diagram of Embodiment 3 of the parallel computing system provided by the present invention;
Fig. 4 is a schematic structural diagram of Embodiment 4 of the parallel computing system provided by the present invention.
Description of reference signs:
100: parallel computing system;
101: circuit board;
10: front-end server unit;
20: multi-way processing unit;
30: compute node;
11: first network port;
12: second network port;
40: switch chip;
13: first serial port;
14: second serial port;
50: selection chip;
60: register.
Specific embodiment
To make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are only part of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present invention.
Because the parallel mode of an existing multi-way processing computer is fixed and cannot be changed, when a multi-way processing computer with another parallel mode is needed, a new one can only be purchased, which in turn causes a waste of resources.
In the parallel computing system provided by the present invention, a front-end server unit is connected to a multi-way processing unit, so that the front-end server unit selects a target parallel mode for the multi-way processing unit and assigns, to the CPU of each compute node in the multi-way processing unit, an ID number under the target parallel mode; the multi-way processing unit then boots according to the target parallel mode and the ID numbers of the CPUs of the compute nodes. This realizes flexible selection of the parallel mode of the multi-way processing unit, so that one parallel computing system can configure different parallel modes for the multi-way processing unit as needed, thereby improving the reuse of the parallel computing system and saving resources.
Meanwhile in the concurrent computational system of the present embodiment, front server unit is located at same with MCPU multiple call processing unit Side (i.e. simultaneously positioned at network layer), it is this positioned at the same side be not the group for simply using server+multichannel process device device Syntype, but front server unit is used to replace server, and can be by by preposition service unit and MCPU multiple call processing unit The mode in same mainboard or in same equipment (such as PC machine, mobile terminal) is integrated in realize so that entire parallel computation Machine system structure is simple and convenient to operate.It should be noted that:Above-mentioned this integrate is not at simple server and multichannel The superposition of device device is managed, but those skilled in the art is needed to pay performing creative labour and changed accordingly to prior art Into what can be realized.
In the following, the technical solutions of the present application are described in detail through specific embodiments. It should be noted that the following specific embodiments can be combined with each other, and the same or similar concepts or processes may not be repeated in some embodiments.
The multi-way processing unit in the parallel computing system of the present invention can include multiple compute nodes. For ease of explanation, this embodiment is described with the multi-way processing unit including four compute nodes as an example; the case where the multi-way processing unit includes another number of compute nodes follows the same principle and is not repeated here.
Fig. 1 is the structure diagram of concurrent computational system embodiment one provided by the invention.As shown in Figure 1, this implementation The concurrent computational system 100 of example can include:At one circuit board, 101, front server units 10 and a multichannel Unit 20 is managed, the MCPU multiple call processing unit 20 includes multiple calculate nodes 30, the front server unit 10 and the multichannel Processing unit 20 connects, and is arranged on the circuit board 101 so that the front server 10 and MCPU multiple call processing unit 20 It is integral;The front server unit 10, for being the MCPU multiple call processing unit 20 from preset multiple parallel schemas Selection target parallel schema, and be the central processor CPU distribution of calculate node 30 each in the MCPU multiple call processing unit 20 ID number under the target parallel pattern;The MCPU multiple call processing unit 20, for according to the target parallel pattern and described The ID number of the CPU of each calculate node 30 is started.
Specifically, as shown in Fig. 1, in the parallel computing system of this embodiment, a front-end server unit 10 and a multi-way processing unit 20 are connected and arranged on the same circuit board 101, where the front-end server unit 10 and the multi-way processing unit 20 can be connected through a corresponding bus to communicate with each other. The multi-way processing unit 20 includes multiple compute nodes 30; Fig. 1 only shows the case where the multi-way processing unit 20 includes four compute nodes 30, and this embodiment does not limit the specific number of compute nodes 30 in the multi-way processing unit 20.
Optionally, the processor of the front-end server unit 10 of this embodiment can be a single-chip processor, such as a Loongson 2H processor.
Fig. 1a is a schematic structural diagram of the multi-way processing unit of the present invention. As shown in Fig. 1a, the multi-way processing unit 20 of this embodiment can include four symmetrical compute nodes 30, where each compute node 30 includes a CPU (such as a Loongson 3A quad-core processor), a memory (such as a DDR3 memory), a flash basic input/output system (Basic Input Output System, BIOS for short), a serial port, and a network port (such as a gigabit Ethernet port). As shown in Fig. 1a, the BIOS flash of each compute node 30 is connected to the CPU through a low pin count (Low Pin Count, LPC for short) bus, the serial port is connected to the CPU through a universal asynchronous receiver/transmitter (Universal Asynchronous Receiver/Transmitter, UART for short), and the network port is connected to the CPU through a peripheral component interconnect (Peripheral Component Interconnect, PCI for short) bus. The CPUs of the compute nodes 30 are interconnected on the processing board through HyperTransport (HT for short) buses.
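The compute-node composition just described can be summarized in a small model. The classes and field names below are illustrative assumptions; only the listed components and bus types come from the text.

```python
from dataclasses import dataclass

@dataclass
class ComputeNode:
    """One of the four symmetrical compute nodes of the multi-way unit."""
    cpu: str = "quad-core CPU"   # e.g. a Loongson 3A-class part
    memory: str = "DDR3"         # local memory attached to the CPU
    bios_bus: str = "LPC"        # BIOS Flash <-> CPU bus
    serial_bus: str = "UART"     # serial port <-> CPU
    nic_bus: str = "PCI"         # gigabit Ethernet port <-> CPU
    interconnect: str = "HT"     # node-to-node HyperTransport links

# The processing board carries four such symmetrical nodes.
board = [ComputeNode() for _ in range(4)]
assert len(board) == 4 and all(n.interconnect == "HT" for n in board)
```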
In this embodiment, the front-end server is installed with software for selecting each parallel mode of the multi-way processing unit 20, software for setting the ID number of the CPU of each compute node 30 when the multi-way processing unit 20 boots under different parallel modes, and software for controlling the power on/off of the compute nodes 30.
The multi-way processing unit 20 shown in Fig. 1a has three kinds of parallel modes, respectively: four single-way parallel modes, two two-way parallel modes, and one four-way parallel mode. The front-end server unit 10 selects one parallel mode from these three kinds as the target parallel mode of the multi-way processing unit 20, so that the multi-way processing unit 20 boots in that target parallel mode. For example, when the target parallel mode selected by the front-end server unit 10 for the multi-way processing unit 20 is the two-way parallel mode, the front-end server unit 10 controls the power on/off of the corresponding compute nodes 30 under that parallel mode, so that the multi-way processing unit 20 boots in the two-way parallel mode.
Correspondingly, after the front-end server unit 10 has selected the target parallel mode for the multi-way processing unit 20, the front-end server unit 10 sets an ID number for the CPU of each compute node 30. The ID number of the CPU determines which component of the parallel computing system 100 the compute node 30 serves as under the target parallel mode, i.e., it determines whether the compute node 30 is a master compute node or a slave compute node.
Taking the multi-way processing unit 20 shown in Fig. 1a as an example, the setting of the three parallel modes of the multi-way processing unit 20 and the setting of the ID numbers of the CPUs of the compute nodes 30 under each parallel mode are introduced below:
Single-way parallel mode: according to a preset rule, each compute node 30 in the multi-way processing unit 20 serves as one single-way system, which can be regarded as a chip multiprocessor (Chip Multiprocessor, CMP for short) or a symmetric multiprocessor (Symmetrical Multi-Processing, SMP for short) system. When the multi-way processing unit 20 boots in this parallel mode, the front-end server unit 10 (specifically, the software in the front-end server unit 10) can set the ID number of the CPU of each compute node 30 to 00; the multi-way processing unit 20 is then configured as four single-way systems.
Here, the preset rule determines which compute nodes 30 in the multi-way processing unit 20 need to be interconnected and, when they are connected, which compute node 30 is determined as the master compute node and which compute nodes are determined as the slave compute nodes.
Two-way parallel mode: according to the preset rule, two of the four compute nodes 30 in the multi-way processing unit 20 are interconnected through an HT bus, and the remaining two compute nodes 30 are also interconnected through an HT bus; the multi-way processing unit 20 then constitutes two two-node cache-coherent non-uniform memory access (Cache-Coherent Non-Uniform Memory Access, CC-NUMA for short) systems. When the multi-way processing unit 20 boots in this parallel mode, the front-end server unit 10 (specifically, the software in the front-end server unit 10) configures the multi-way processing unit 20 as two two-way systems according to the preset rule. Correspondingly, the front-end server unit 10 sets the ID number of the CPU of one compute node 30 in each two-way system to 00 and sets the ID number of the CPU of the other compute node 30 to 01; in each two-way system, the compute node 30 whose CPU has ID number 00 is the master compute node, and the compute node 30 whose CPU has ID number 01 is the slave compute node of that master.
Four-way parallel mode: according to the preset rule, the four compute nodes 30 in the multi-way processing unit 20 are interconnected through HT buses to form a four-node CC-NUMA ring system. When the multi-way processing unit 20 boots in this parallel mode, the front-end server unit 10 (specifically, the software in the front-end server unit 10) configures the multi-way processing unit 20 as one four-way system according to the preset rule. Correspondingly, the front-end server unit 10 sets the ID numbers of the CPUs of the four compute nodes 30 in the four-way system to 00, 01, 10 and 11 in turn. The compute node 30 whose CPU has ID number 00 is the master compute node, and the compute nodes 30 whose CPUs have ID numbers 01, 10 and 11 are the slave compute nodes of the master.
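The three mode-dependent ID assignments above can be condensed into one short sketch. The binary ID values and node groupings follow the description; the function itself and its names are hypothetical.

```python
def assign_cpu_ids(mode: str, num_nodes: int = 4) -> list:
    """Return the per-node binary CPU ID strings for a parallel mode.

    ID 00 always marks a master compute node; any other ID marks a
    slave node grouped with the preceding master.
    """
    if mode == "single-way":   # four independent single-way systems
        return ["00"] * num_nodes
    if mode == "two-way":      # two two-node CC-NUMA systems
        return ["00", "01"] * (num_nodes // 2)
    if mode == "four-way":     # one four-node CC-NUMA ring
        return ["00", "01", "10", "11"]
    raise ValueError(f"unknown parallel mode: {mode}")

# In two-way mode, nodes 0 and 2 become the masters of the two systems.
masters = [i for i, nid in enumerate(assign_cpu_ids("two-way")) if nid == "00"]
assert masters == [0, 2]
```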
In this embodiment, the front-end server unit 10 can select, according to actual needs, the parallel mode in which the multi-way processing unit 20 boots. For example, when the problem to be handled by the parallel computing system 100 is relatively complex, the front-end server unit 10 can select the above four-way parallel mode for the multi-way processing unit 20 to complete big-data computation; when the problem to be handled is relatively simple, the front-end server unit 10 can select the above single-way parallel mode, so that one of the compute nodes 30 handles the simple problem.
It can be seen that the parallel computing system 100 of the present invention flexibly selects a target parallel mode for the multi-way processing unit 20 according to actual needs, thereby realizing free switching of the multi-way processing unit 20 between different parallel modes and greatly facilitating research in fields such as parallel computing and computer architecture. Meanwhile, the parallel computing system 100 of this embodiment connects one front-end server to each multi-way processing unit 20, thereby improving the reliability of the multi-way processing unit 20.
In actual use, after the front-end server unit 10 has chosen the target parallel mode for the multi-way processing unit 20, it assigns to the CPU of each compute node 30 in the multi-way processing unit 20 an ID number under the target parallel mode. For example, when the front-end server unit 10 selects the four-way parallel mode for the multi-way processing unit 20, the ID numbers of the CPUs of the compute nodes 30 in the multi-way processing unit 20 are set in turn to 00, 01, 10 and 11. The next time the multi-way processing unit 20 is powered on, it boots with the four-way parallel mode and the CPU ID numbers 00, 01, 10 and 11 set above.
Further, in existing parallel computing systems, the front-end server is located on the network side and the multi-way processing unit is located on the user side, which makes the whole system complicated and inconvenient to manage. Meanwhile, when the control unit sets the parallel mode for the multi-way processing unit, this is realized through combinations of DIP switches, buttons, toggle switches, etc., and the mode and serial-port switching operations are rather complicated. In contrast, the front-end server unit 10 of the present invention is located on the same side (i.e., the network layer) as the multi-way processing unit 20; it is simple in structure, convenient to operate, and realizes flexible selection of the parallel mode of the multi-way processing unit 20.
In the parallel computing system provided by the present invention, a multi-way processing unit and a front-end server unit are connected and arranged on one circuit board, so that the front-end server unit selects a target parallel mode for the multi-way processing unit from multiple preset parallel modes and assigns, to the CPU of each compute node in the multi-way processing unit, an ID number under the target parallel mode, and the multi-way processing unit boots according to the target parallel mode and the ID numbers of the CPUs of the compute nodes. In the parallel computing system of this embodiment, the front-end server unit configures, according to actual needs, the target parallel mode and the per-CPU ID numbers for the multi-way processing unit, thereby realizing free selection of the parallel mode of the multi-way processing unit. This broadens the range of application of the parallel computing system, facilitates research in fields such as parallel computing, and avoids the waste of resources caused when a parallel computing system can only use a single, fixed parallel mode.
Fig. 2 is a schematic structural diagram of Embodiment 2 of the parallel computing system provided by the present invention. On the basis of the above embodiment, as shown in Fig. 2, the parallel computing system 100 of this embodiment can further include a switch chip 40, and the front-end server unit 10 includes a first network port 11 and a second network port 12, where the first network port 11 is connected to an external network, and the second network port 12 is connected through the switch chip 40 to the network port of each compute node 30 in the multi-way processing unit 20.
Specifically, as shown in Fig. 2, in this embodiment the first network port 11 of the front-end server unit 10 is connected to an external network (for example, to an external Ethernet), the second network port 12 of the front-end server unit 10 is connected to one end of the switch chip 40, and the other end of the switch chip 40 is connected to the network port of each compute node 30 shown in Fig. 1a, thereby connecting the front-end server unit 10 with each compute node 30 in the multi-way processing unit 20, so that the front-end server unit 10 can communicate with each compute node 30 through the network ports.
Optionally, the switch chip 40 in this embodiment can be a network switch chip.
Further, the front-end server unit 10 of this embodiment also stores a network file system and the kernels of the multi-way processing unit under the different parallel modes;
The multi-way processing unit 20 reads, from the front-end server unit 10 through the second network port 12 and the switch chip 40, the network file system and the kernel under the target parallel mode.
With reference to Fig. 1a, since the multi-way processing unit 20 of this embodiment does not include a hard-disk interface, a hard disk storing an operating system and similar software cannot be connected to the multi-way processing unit 20; therefore, the multi-way processing unit 20 of this embodiment can only boot from a network file system. To solve this problem, the front-end server unit 10 of this embodiment provides the network file system and the kernel for the multi-way processing unit 20 over the network, thereby ensuring that the multi-way processing unit 20 can boot and run normally.
For example, the front-end server unit 10 builds a network file system (Network File System, NFS for short) service for the multi-way processing unit 20, so that the front-end server unit 10 serves as the NFS server and provides the network file system for the multi-way processing unit 20. Meanwhile, the front-end server unit 10 also builds a trivial file transfer protocol (Trivial File Transfer Protocol, TFTP for short) service for the multi-way processing unit 20, serving as the TFTP server and providing the kernel download service for the multi-way processing unit 20.
Specifically, the IP addresses of the multi-way processing unit 20 default to a certain network segment, and the IP address of the internal network card of the front-end server unit 10 also defaults to the same network segment. A tailored and configured operating system file is placed under the directory provided for the NFS service, and permissions are set to allow access from the network segment of the multi-way processing unit 20. When the multi-way processing unit 20 operates in the single-way parallel mode, it is equivalent to four independent single-way hosts, and each compute node 30 can log into the same NFS file system; therefore, user separation needs to be configured in the NFS file system so that the different compute nodes 30 are identified as different users and can share one NFS file system without affecting each other. The kernels of the multi-way processing unit 20 are placed under the directory provided for the TFTP service, for example a symmetric multiprocessing (Symmetrical Multi-Processing, SMP for short) 4-core kernel for the single-way parallel mode, a cache-coherent non-uniform memory access (Cache Coherence-Non Uniform Memory Access, CC-NUMA for short) 8-core kernel for the two-way parallel mode, and a CC-NUMA 16-core kernel for the four-way parallel mode. When the multi-way processing unit 20 boots in different modes, the BIOS reads the current parallel mode from hardware into a variable and, according to the value of the variable, selects and loads the kernel corresponding to that mode.
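The kernel-selection step the BIOS performs at the end of this paragraph might look roughly like the following. The kernel image file names and the TFTP directory are illustrative assumptions; only the mode-to-kernel pairing (SMP 4-core, CC-NUMA 8-core, CC-NUMA 16-core) comes from the text.

```python
# Mode-to-kernel pairing from the description; file names are assumed.
KERNEL_BY_MODE = {
    "single-way": "vmlinux-smp-4core",     # SMP kernel, one node
    "two-way":    "vmlinux-ccnuma-8core",  # CC-NUMA kernel, two nodes
    "four-way":   "vmlinux-ccnuma-16core", # CC-NUMA kernel, four nodes
}

def select_kernel(mode_variable: str, tftp_root: str = "/tftpboot") -> str:
    """Map the parallel mode the BIOS read from hardware to a kernel path."""
    if mode_variable not in KERNEL_BY_MODE:
        raise ValueError(f"unknown parallel mode: {mode_variable}")
    return f"{tftp_root}/{KERNEL_BY_MODE[mode_variable]}"

assert select_kernel("two-way") == "/tftpboot/vmlinux-ccnuma-8core"
```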
It should be noted that setting up an NFS server and a TFTP server is common knowledge for those skilled in the art and is not repeated here.
In the technical solution of this embodiment, before the multi-way processing unit 20 starts, the front server unit 10 selects a target parallel mode for the multi-way processing unit 20 according to actual needs, and assigns the CPU of each compute node 30 in the multi-way processing unit 20 an ID number under that target mode. Meanwhile, before the multi-way processing unit 20 starts, the front server unit 10 also sets up the NFS server and the TFTP server for the multi-way processing unit 20. When the multi-way processing unit 20 is powered on, it starts according to the target parallel mode and the ID numbers of the CPUs of the compute nodes 30, obtains the network file system from the NFS server built on the front server unit 10 and the kernel from the TFTP server, and thereby starts and runs normally.
Further, in the parallel computing system 100 of this embodiment, the multi-way processing unit 20 is paired with one front server unit 10, so that the front server unit 10 provides network services for the multi-way processing unit 20. This ensures that the multi-way processing unit 20 can obtain sufficient bandwidth resources and can obtain the network file system and the kernel in time, thereby improving the reliability and computing speed of the multi-way processing unit 20.
In the parallel computing system provided by the invention, a first network interface and a second network interface are provided on the front server unit. The front server unit connects to the external network through the first network interface and, through the second network interface and the switch chip, to the network interface of each compute node, so that the front server unit provides the network file system and the kernel to the multi-way processing unit over the network, thereby ensuring that the multi-way processing unit starts and runs normally.
Fig. 3 is a structural diagram of a third embodiment of the parallel computing system provided by the invention. On the basis of the above embodiments, the front server unit 10 of this embodiment may further include a register 60, which is connected to the CPUs of the compute nodes 30.
Specifically, as shown in Fig. 3, the front server unit 10 of this embodiment includes a register 60 connected to the CPU of each compute node 30 in the multi-way processing unit 20. In actual use, the front server unit 10 parses the target parallel mode entered by the user, converts it into a control signal, and writes the control signal into the register 60. Meanwhile, the front server unit 10 assigns the CPU of each compute node 30 an ID number under the target parallel mode and writes these ID numbers into the register 60. When the multi-way processing unit 20 starts, the CPU of each compute node 30 reads its target parallel mode and ID number from the register 60 of the front server unit 10 and starts with that target parallel mode and ID number.
Meanwhile start when calculate node 30 determines to be started with target parallel pattern, such as with two-way parallel schema, Then the host node in the two-way parallel schema loads the kernel under the two-way parallel schema from preposition server unit 10.
In actual use, suppose the user selects single-way parallel mode as the target parallel mode. The corresponding software on the front server unit 10 parses the user's input into [set, mode, value], resolves the mode value 1 into a hardware control signal, and writes it into the register 60. The CPU IDs of the compute nodes 30 under this target parallel mode are set in turn to 00, 00, 00, 00 and are also written into the register 60. When the multi-way processing unit 20 starts, the CPU of each compute node 30 (specifically, the CPU via its BIOS) reads its startup mode from the register 60 of the front server unit 10; upon learning that it is single-way parallel mode, each compute node 30 runs its own BIOS independently. Meanwhile, each compute node 30 loads the kernel for single-way parallel mode from the front server unit 10.
In the parallel computing system provided by the invention, a register is provided in the front server unit and connected to the CPU of each compute node, so that the CPU of each compute node reads its own startup mode and ID number from the register.
In another feasible implementation of the invention, on the basis of the above embodiments, this embodiment details the process by which the front server unit 10 selects the target parallel mode for the multi-way processing unit 20 and assigns the CPU of each compute node 30 an ID number under that target parallel mode. That is, the front server unit 10 of this embodiment is specifically configured to:
obtain the parallel mode selection entered by the user;
according to the selected parallel mode, obtain from multiple preset parallel modes the target parallel mode matching the selection;
according to the target parallel mode, assign the CPU of each compute node 30 in the multi-way processing unit 20 an ID number under the target parallel mode, and save the target parallel mode and the ID number of the CPU of each compute node 30 under that mode into the register 60.
The front server unit 10 of this embodiment is equipped with software for selecting each parallel mode of the multi-way processing unit 20, software for assigning the ID numbers of the CPUs of the compute nodes 30 when the multi-way processing unit 20 starts in the different parallel modes, and software for controlling the power on/off of the compute nodes 30. These functions are completed jointly by kernel-layer modules on the front server unit 10 and system programs in the application layer. The application-layer system program obtains the parallel mode selection entered by the user, parses it, and passes it via the ioctl mechanism to the kernel-layer module, which performs the actual configuration.
Specifically, when the user enters a parallel mode selection to the front server unit 10, the front server unit 10 matches the selection against the multiple parallel modes it stores and obtains the matching one as the target parallel mode of the multi-way processing unit 20. It then assigns the CPU of each compute node 30 an ID number under the target parallel mode, and saves the target parallel mode and the ID numbers of the CPUs of the compute nodes 30 under that mode into the register 60, so that the multi-way processing unit 20 reads from the register 60 the target parallel mode and the ID number of each compute node 30's CPU under that mode, and starts in the target parallel mode.
For example, when the user enters single-way parallel mode as the selection to the front server unit 10, the application program in the front server unit 10 parses the user's input into [set, mode, value] and passes it into the kernel of the front server unit 10. The mode value for single-way parallel mode is 1; the program resolves this value into a hardware control signal, writes it into the register 60, and sets the CPU IDs of the compute nodes 30 under this mode in turn to 00, 00, 00, 00. When the multi-way processing unit 20 starts, the BIOS reads its startup mode from the register 60; upon learning that it is starting in single-way parallel mode, each compute node 30 runs its own BIOS independently. The BIOS code configures the CPU, memory, and address windows for single-way parallel mode and loads the kernel for that mode. When the user's input is a power-on, power-off, or restart operation, the application program simply parses the input into [set, action, CPU ID] and passes it to the kernel; the kernel checks that the current mode is single-way parallel mode and then configures the corresponding compute node 30 according to the ID number.
When the user enters two-way parallel mode as the selection to the front server unit 10, the application program in the front server unit 10 parses the user's input into [set, mode, value] and passes it into the kernel of the front server unit 10. The mode value for two-way parallel mode is 2; the program resolves this value into a hardware control signal, writes it into the register 60, and sets the CPU IDs of the compute nodes under this mode in turn to 00, 01, 00, 01. When the multi-way processing unit 20 starts, the BIOS reads its startup mode from the register 60; upon learning that it is starting in two-way parallel mode, the hardware identifies the compute nodes 30 whose ID number is 00 as master nodes, and only they run the BIOS. The BIOS code configures the CPU, memory, and address windows for two-way parallel mode, performs the HT interconnect configuration between the starting master compute node and the slave compute node so that the two compute nodes are combined into one whole, and then loads the kernel for that mode. When the user's input is a power-on, power-off, or restart operation, the application program simply parses it into [set, action, CPU ID] and passes it to the kernel; the kernel checks the current parallel mode and then applies the "action" hardware configuration requested by the user to the corresponding compute node 30 according to the ID number. If the CPU ID number is 0 or 1, the kernel operates on the first two-way system; if the CPU ID number is 2 or 3, the kernel operates on the second two-way system.
When the user enters four-way parallel mode as the selection to the front server unit 10, the application program in the front server unit 10 parses the user's input into [set, mode, value] and passes it into the kernel of the front server unit 10. The mode value for four-way parallel mode is 4; the program resolves this value into a hardware control signal, writes it into the register 60, and sets the CPU IDs of the compute nodes 30 under this mode in turn to 00, 01, 11, 10. When the multi-way processing unit 20 starts, the BIOS reads its startup mode from the register 60; upon learning that it is starting in four-way parallel mode, the hardware identifies the compute node 30 whose ID number is 00 as the master node, and only it runs the BIOS. The BIOS code runs the CPU, memory, and address window configuration for four-way parallel mode, performs the HT interconnect configuration between the master compute node and the slave compute nodes so that the four compute nodes are combined into a ring, and loads the kernel for that mode. When the user's input is a power-on, power-off, or restart operation, the application program simply parses it into [set, action, CPU ID] and passes it to the kernel; the kernel checks the current mode and then applies the "action" hardware configuration requested by the user to the corresponding compute node according to the ID number. When the CPU ID number is 0, 1, 2, or 3, the kernel operates on the single four-way system.
In the parallel computing system provided by the invention, the front server unit obtains the parallel mode selection entered by the user, obtains from multiple preset parallel modes the target parallel mode matching the selection, sets the CPU of each compute node in the multi-way processing unit to an ID number under that target parallel mode, and saves the target parallel mode and the CPU ID numbers under it into the register, for the multi-way processing unit to read.
Fig. 4 is a structural diagram of a fourth embodiment of the parallel computing system provided by the invention, on the basis of the above embodiments. As shown in Fig. 4, the front server unit 10 of this embodiment may include a first serial port 13 and a second serial port 14. The first serial port 13 connects to an external device and outputs the data information of the front server unit 10; the second serial port 14 connects to any master compute node 30 in the multi-way processing unit 20 and outputs the computation information of that master compute node 30.
Specifically, as shown in Fig. 4, the first serial port 13 of the front server unit 10 of this embodiment connects to an external device and outputs the data information of the front server unit 10. For example, a display device is connected to the front server unit 10 through the first serial port 13, and the front server unit 10 outputs its own data information to the display device through the first serial port 13. Optionally, the user may also enter on the display device a request for the data information that the front server unit 10 is expected to display, and the front server unit 10 outputs data to the display device according to that request.
As shown in Fig. 4, the second serial port 14 of the front server unit 10 connects to any master compute node 30 in the multi-way processing unit 20 and outputs the computation information of the master compute node 30 connected to it.
Preferably, the parallel computing system of this embodiment may further include a selection chip 50, so that the second serial port 14 connects to the master compute node 30 through the selection chip 50. Specifically, the selection chip 50 connects to the second serial port 14 of the front server unit 10 and determines, according to a preset selection rule, which compute node 30 in the multi-way processing unit 20 to connect to (specifically, to the UART of that compute node 30).
The basic serial port functions of this embodiment are completed jointly by the kernel of the front server unit 10 and the serial port viewing software installed by default in the system. The serial port switching function is realized by the system program in the application layer. For example, the user configures the kernel of the front server unit 10 so that the default 8-wire external serial port is configured as two 4-wire serial ports. One can then be used externally as the first serial port of the front server unit 10 itself, and the other as the second serial port for interacting with the multi-way processing unit 20. By default, the system identifies the first serial port as ttyS0 and the second serial port as ttyS1, so that when the information of a compute node 30 needs to be checked, the viewing device is set to ttyS1.
In actual use, the system program for serial port switching on the front server unit 10 receives the user's input and, according to the CPU ID number of the compute node 30 that the user needs to check, configures the selection chip 50 to connect to a different compute node. For example, when the multi-way processing unit 20 starts in single-way parallel mode, the second serial port 14 can be switched arbitrarily among the 4 compute nodes of the multi-way processing unit 20. When the multi-way processing unit 20 starts in two-way parallel mode, the second serial port 14 can only be switched among the master compute nodes whose CPU ID is 00 in the 2 two-way systems. When the multi-way processing unit 20 starts in four-way parallel mode, the second serial port 14 can only output the data of the compute node whose CPU ID is 00.
Through the software settings on the front server, the parallel computing system 100 of this embodiment can switch the second serial port 14 of the front server to different compute nodes 30 and output the information of the master compute node 30 connected to the second serial port 14, thereby enabling flexible acquisition of the information on the master compute nodes 30 in the multi-way processing unit 20. For example, when the multi-way processing unit 20 performs complex computations, the user can check the data information on a master compute node 30 at any time to grasp the running state in time, improving the reliability of parallel computing.
Further, the front server unit 10 of this embodiment may also include a hard disk slot (such as an M-SATA slot), memory, flash memory (such as 256 MB NAND FLASH), a Universal Serial Bus (USB) interface, a Video Graphics Array (VGA) interface, and so on. The front server of this embodiment may be an independent and complete system and may be used as a stand-alone PC.
In the parallel computing system provided by the invention, a first serial port and a second serial port are provided on the front server unit. The first serial port connects to an external device and outputs the data information of the front server unit; the second serial port connects to any master compute node in the multi-way processing unit and outputs the computation information of that master compute node, so that the user can check the data information on the master compute node at any time, grasp the running state in time, and thereby improve the reliability of parallel computing.
One of ordinary skill in the art will appreciate that all or part of the steps of the above method embodiments may be completed by hardware under the direction of program instructions. The aforementioned program may be stored in a computer-readable storage medium; when executed, the program performs the steps of the above method embodiments. The aforementioned storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks, or optical disks.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some or all of their technical features may be replaced with equivalents; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A parallel computing system, characterized by comprising: a circuit board, a front server unit, and a multi-way processing unit, the multi-way processing unit including multiple compute nodes, the front server unit and the multi-way processing unit being connected to each other and arranged on the circuit board;
the front server unit being configured to select a target parallel mode for the multi-way processing unit from multiple preset parallel modes, and to assign the central processing unit (CPU) of each compute node in the multi-way processing unit an ID number under the target parallel mode;
the multi-way processing unit being configured to start according to the target parallel mode and the ID numbers of the CPUs of the compute nodes.
2. The system according to claim 1, characterized in that the system further comprises a switch chip, and the front server unit includes a first network interface and a second network interface;
the first network interface connecting to an external network, and the second network interface connecting, through the switch chip, to the network interface of each compute node in the multi-way processing unit.
3. The system according to claim 2, characterized in that the front server unit further includes a network file system and kernels of the multi-way processing unit under different parallel modes;
the multi-way processing unit being further configured to read, through the second network interface and the switch chip, the network file system and the kernel under the target parallel mode from the front server unit.
4. The system according to claim 3, characterized in that the front server unit further includes a register connected to the CPUs of the compute nodes;
the register being configured to store the target parallel mode and the ID number of each compute node's CPU under the target parallel mode.
5. The system according to claim 4, characterized in that the front server unit is specifically configured to:
obtain the parallel mode selection entered by the user;
according to the selected parallel mode, obtain from the multiple preset parallel modes the target parallel mode matching the selection;
according to the target parallel mode, assign the CPU of each compute node in the multi-way processing unit an ID number under the target parallel mode, and store the target parallel mode and the ID number of each compute node's CPU under the target parallel mode into the register.
6. The system according to claim 1, characterized in that the front server unit further includes a first serial port and a second serial port;
the first serial port connecting to an external device and being configured to output the data information of the front server unit;
the second serial port connecting to any master compute node in the multi-way processing unit and being configured to output the computation information of that master compute node.
7. The system according to claim 4, characterized in that the system further comprises a selection chip connected to the second serial port, the selection chip being configured to connect, according to a preset selection rule, to any master compute node in the multi-way processing unit.
8. The system according to any one of claims 1-7, characterized in that the compute nodes are connected to each other by point-to-point HT buses.
9. The system according to claim 8, characterized in that the front server unit further includes: a hard disk slot, memory, a Universal Serial Bus (USB) interface, and a Video Graphics Array (VGA) interface.
10. The system according to claim 9, characterized in that the parallel modes include a single-way parallel mode, a two-way parallel mode, and a four-way parallel mode.
CN201611082574.3A 2016-11-30 2016-11-30 Parallel computer system Active CN108132908B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611082574.3A CN108132908B (en) 2016-11-30 2016-11-30 Parallel computer system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611082574.3A CN108132908B (en) 2016-11-30 2016-11-30 Parallel computer system

Publications (2)

Publication Number Publication Date
CN108132908A true CN108132908A (en) 2018-06-08
CN108132908B CN108132908B (en) 2020-10-23

Family

ID=62388002

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611082574.3A Active CN108132908B (en) 2016-11-30 2016-11-30 Parallel computer system

Country Status (1)

Country Link
CN (1) CN108132908B (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6977520B1 (en) * 2002-08-13 2005-12-20 Altera Corporation Time-multiplexed routing in a programmable logic device architecture
CN101262218B (en) * 2008-03-11 2012-02-22 东南大学 Data multi-channel and clockwise/anticlockwise output control circuit
CN204302973U (en) * 2014-12-30 2015-04-29 龙芯中科技术有限公司 Configurable processor computing machine

Also Published As

Publication number Publication date
CN108132908B (en) 2020-10-23

Similar Documents

Publication Publication Date Title
US10126954B1 (en) Chipset and server system using the same
JP4128956B2 (en) Switch / network adapter port for cluster computers using a series of multi-adaptive processors in dual inline memory module format
CN104202194B (en) The collocation method and device of PCIe topologys
US7987348B2 (en) Instant on video
JP3628595B2 (en) Interconnected processing nodes configurable as at least one NUMA (NON-UNIFORMMOMERYACCESS) data processing system
US20150033001A1 (en) Method, device and system for control signalling in a data path module of a data stream processing engine
CN107278299A (en) The functional methods, devices and systems of secondary bus are realized via reconfigurable virtual switch
CN109582605A (en) Pass through the consistency memory devices of PCIe
US10437762B2 (en) Partitioned interconnect slot for inter-processor operation
CN104299466A (en) Remote hardware experimental method and system based on cloud computing platform
CN113872796B (en) Server and node equipment information acquisition method, device, equipment and medium thereof
US10846256B2 (en) Multi-endpoint device sideband communication system
US11055106B1 (en) Bootstrapping a programmable integrated circuit based network interface card
CN110968352B (en) Reset system and server system of PCIE equipment
CN115905094A (en) Electronic equipment and PCIe topology configuration method and device thereof
CN109426566A (en) Accelerator resource is connected using switch
CN117561505A (en) Systems, methods, apparatuses, and architectures for dynamically configuring device structures
CN116225177B (en) Memory system, memory resource adjusting method and device, electronic equipment and medium
US20090213755A1 (en) Method for establishing a routing map in a computer system including multiple processing nodes
CN216352292U (en) Server mainboard and server
CN108132908A (en) Concurrent computational system
CN206563961U (en) Concurrent computational system
CN111722930B (en) Data preprocessing system
CN113434445A (en) Management system and server for I3C to access DIMM
US10409940B1 (en) System and method to proxy networking statistics for FPGA cards

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
TA01 Transfer of patent application right

Effective date of registration: 20191223

Address after: Room 606, block B, Chuangzhi building, No. 17, Xinghuo Road, Jiangbei new district, Nanjing City, Jiangsu Province

Applicant after: Longxin Zhongke (Nanjing) Technology Co., Ltd

Address before: 100095, Beijing, Zhongguancun Haidian District environmental science and technology demonstration park, Liuzhou Industrial Park, No. 2 building

Applicant before: Longxin Zhongke Technology Co., Ltd.

GR01 Patent grant
GR01 Patent grant