CN101639814B - Input-output system facing to multi-core platform and networking operation system and method thereof - Google Patents

Publication number
CN101639814B
Authority
CN
China
Prior art keywords
access
request
special
territory
operating system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN2009100906104A
Other languages
Chinese (zh)
Other versions
CN101639814A (en
Inventor
杨亚军
王若倪
孙毓忠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS filed Critical Institute of Computing Technology of CAS
Priority to CN2009100906104A priority Critical patent/CN101639814B/en
Publication of CN101639814A publication Critical patent/CN101639814A/en
Application granted granted Critical
Publication of CN101639814B publication Critical patent/CN101639814B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Multi Processors (AREA)

Abstract

The invention provides an input/output (I/O) system for an operating system. The system comprises: a resource management module for detecting local physical I/O devices; a special I/O domain, created by the resource management module for each physical I/O device, that provides applications with the I/O service for accessing that device; and a local proxy domain for the local physical I/O device, also created by the resource management module, that receives access requests sent by a remote application for the local physical I/O device, forwards each request to the special I/O domain of that device, receives the processing result returned by the special I/O domain, and forwards that result to the remote application. The invention gives the drivers of the individual physical I/O devices good isolation from one another and handles I/O more effectively.

Description

Input/output system and method for a multi-core platform and networked operating system
Technical field
The present invention relates to operating system technology, and in particular to an input/output (I/O) system and an I/O method for multi-core platforms and networked operating systems.
Background technology
Since the CPU (processor) was first introduced, its clock frequency has risen continuously. Today, however, the path of raising clock frequency has reached an inflection point. Desktop CPU clock frequencies reached 1 GHz in 2000, 2 GHz in 2001, and 3 GHz in 2002. Yet six years later, a 4 GHz CPU has still not appeared. Voltage and heat dissipation have become the main obstacles to raising CPU clock frequency, so that the performance of desktop CPUs, and especially of notebook CPUs, can no longer be improved simply by raising the clock frequency.
At the same time, with the rapid development of computer technology and the computer market, energy consumption has become an increasingly prominent problem. The information industry has become an energy-intensive industry. According to global research by IDC, about 35 million servers are currently installed worldwide, and the electricity they consume each year costs about 50% of annual server purchasing expenditure, roughly 29 billion US dollars; energy costs account for 30% to 50% of the information technology (IT) operating costs of the industry. In 2007, the IT products of China consumed roughly 30 to 50 billion kilowatt-hours in total, close to the annual output of the Three Gorges power station. The energy consumption of modern supercomputing environments is also high. Reportedly, the energy consumed by a single Google search could light an 11-watt energy-saving lamp for 15 minutes to one hour. In 2006, the data centers of Google reportedly consumed 1.5% of the total electricity used in the United States. Google has thoroughly upgraded its servers three times; on average, each upgrade improved performance 1.8-fold while nearly doubling power consumption. Statistics from research institutions show that data center power consumption is still growing by 15% to 20% per year.
With clock-frequency scaling approaching a dead end, Intel and AMD began to look for other ways to improve processor performance; the most practical one is to increase the number of processing cores in the CPU.
A multi-core processor places two or more complete, independent processor cores on a single chip that plugs directly into one socket on the motherboard; the operating system treats each execution core as an independently controllable logical processor. Because there are several independent CPU cores, each can run at a lower frequency, which reduces heat dissipation while giving the multi-core platform powerful parallel computing capability. In addition, integrating several CPU cores on one chip, rather than inserting them into separate sockets, greatly speeds up communication between the cores and improves the parallel execution efficiency of the system.
Multi-core designs let two or more cores run simultaneously at a lower frequency with much lower heat dissipation. Combined, the cores deliver more processing power than the fastest single-core processors available today, at significantly lower power consumption. CPU manufacturers can thus keep improving server platform performance as Moore's Law predicts, without pushing physical limits to new extremes.
The emergence of multi-core platforms has greatly broadened the outlook for processor development, so that frequency and temperature are no longer the bottlenecks limiting it. It is estimated that by 2016 processors will be built on an 11-nanometer process with 128 billion transistors, and that by 2018 they will be built on an 8-nanometer process with 256 billion transistors.
Integrating a large number of processor cores on one chip solves many current difficulties in processor design, allows Moore's Law to continue, and makes TFLOPS-class processors possible (TFLOPS: tera floating-point operations per second, i.e. one trillion floating-point operations per second). However, how to use so much hardware integrated in a single chip, and how to bring out its performance, is a severe challenge for software. Existing operating systems and software cannot make full use of the parallel computing capability of this rapidly developing hardware.
I/O technology has always been a key technology in the development of computers, and it has a major impact on overall system performance. Fundamentally, both now and in the future, I/O technology constrains the application and development of computer technology, especially in high-end computing.
In a computer system, the subsystems must be connected to one another by interfaces. For example, memory needs to communicate with the CPU, and the CPU likewise needs to communicate with I/O devices. This is usually done over a bus. The two advantages of a bus are low cost and versatility. The communication link between the CPU and the I/O devices is the I/O bus, a link shared among the subsystems. Because the bus defines an interconnection scheme, a new I/O device can easily be added to it, and peripherals can be moved between computer systems that use the same general-purpose bus. Typical I/O buses include the AGP bus, the PCI bus, and the ISA bus.
Normally the operating system controls all resources of the computer system, including I/O devices. Operations on I/O devices must be carried out through the operating system. Fig. 1 shows the structure of an existing operating system.
Referring to Fig. 1, an existing operating system is divided into an application layer 10 and a kernel layer 30. An application (i.e. a user program) 101 runs in the application layer 10 and communicates with the operating system kernel layer 30 through the system call interface 20. When the application 101 operates on an I/O device 40, the operation request is passed to the kernel layer 30 through the system call interface 20. After pre-processing the request, the kernel layer 30 hands it to the device driver unit (i.e. the device driver) 301 of the kernel layer 30, which actually performs the operation on the I/O device 40. In Fig. 1, the network processing layer 303 handles the network communication of the operating system, and the file system layer 302 manages the file system.
From the perspective of multi-core platforms, the I/O system of existing operating systems has the following shortcomings:
(1) I/O access passes through many intermediate processing layers
Existing operating systems mainly target single-CPU platforms. Their designs did not take the development and technical characteristics of multi-core technology into account, so they cannot bring out the full computing capability of a multi-core platform. Taking the Linux operating system as an example, Fig. 2 shows the call flow of a typical block device read operation.
Taking block device I/O access as an example, under the I/O architecture of the existing operating system described above, a user application that accesses an I/O resource goes through the following typical process:
First, the application 101 passes the I/O access request to the operating system kernel layer 30 through the system call interface 20;
After processing the I/O access request, the kernel layer 30 passes it to the file system layer 302 of the kernel layer 30;
After processing the I/O access request, the file system layer 302 passes it to the generic block device layer (not shown) of the kernel layer 30;
After processing the I/O access request, the generic block device layer passes it to the I/O scheduling layer (not shown) of the kernel layer 30;
After processing the I/O access request, the I/O scheduling layer passes it to the device driver unit 301 of the kernel layer 30, which drives the I/O device 40 to complete the request.
Once the I/O access request has been processed, the result returns along the same route.
Referring to Fig. 2, the whole I/O processing procedure shows that in an existing operating system for single-CPU platforms, an I/O access request passes through 4 layers and 9 intermediate processing steps between being issued by the application and the application obtaining the result. This cumbersome intermediate processing greatly reduces the I/O performance of the system and thereby limits the computing capability of the whole computer system; it is not suitable for an operating system targeting multi-core platforms.
(2) Device drivers are not isolated from one another, so system reliability is low
A multi-core platform has very high parallel computing capability and typically runs many virtual machines and user applications in parallel. In the I/O system of existing operating systems for single-core platforms, however, all physical device drivers reside in the open kernel. There are many types of I/O device, each type has many manufacturers, and each product line has a complex range of models, so the number of I/O device drivers is huge and their quality is hard to guarantee. Drivers are therefore relatively likely to contain defects, and a considerable proportion of operating system crashes are caused by faulty physical device drivers. Once a physical device driver fails, it may crash the whole multi-core platform and, with it, the many user applications running on that platform, with serious consequences. This is unacceptable.
(3) Resource contention and task switching degrade system performance
A multi-core platform has very high parallel computing capability and typically runs many user applications in parallel. In the kernel layer of existing operating systems for single-core platforms, however, the threads related to I/O processing have no special priority. A block device driver, for example, does its work through a considerable number of kernel threads. These kernel threads must compete for CPU resources with many application threads, and when they do not get enough CPU resources, the I/O devices cannot deliver their full hardware performance. In addition, these threads and the interrupt handling routines may run on different CPUs or CPU cores, which incurs overhead from context switches and from moving execution between CPUs or CPU cores, and in turn reduces the performance of the whole system.
(4) Migration of I/O device access is not well supported
Multi-core platforms are increasingly used in large-scale distributed computing environments such as clusters and data centers. In such environments, user applications need to migrate between physical nodes in order to achieve dynamic load balancing and keep the utilization of the various computing resources high. In the I/O system of existing operating systems, however, the physical resource boundary is confined to the local physical node and migration of I/O access is not supported; supporting it in a distributed computing environment requires techniques such as centralized remote I/O resources. If a user application accesses resources on its local physical node, it cannot migrate to another physical node, which limits dynamic load balancing and the improvement of resource utilization in a distributed computing system.
Summary of the invention
The object of the present invention is to overcome the above shortcomings of the I/O system in existing operating systems for single-core platforms by providing an I/O system for multi-core platforms and the new networked operating system, together with a corresponding I/O method. The invention gives I/O access fewer intermediate processing layers, isolates the I/O device drivers from one another, and specializes the I/O system so as to improve the I/O performance of the whole system. It thereby realizes an I/O access mechanism for multi-core platforms and distributed computing environments that is simple and efficient, safe and reliable, and scalable; it provides an efficient I/O access mechanism for the many user applications running in parallel on a multi-core platform; and it provides good support for migrating user applications in a distributed computing environment.
To achieve the above object, the invention provides an input/output system of an operating system, the operating system running on a computing platform. The input/output system comprises:
Means for creating, through a resource management module, a special I/O domain for a physical I/O device;
Means for an application to send an I/O access request to the special I/O domain;
Means for processing the I/O access request after the special I/O domain receives it and returning the result to the application;
Wherein, when the application migrates to a remote operating system node and becomes a remote application, the input/output system further comprises:
Means for creating, through the resource management module, a local proxy domain for the I/O device;
Means for the remote application to send an I/O access request to the local proxy domain;
Means for the local proxy domain to send the I/O access request to the special I/O domain;
Means for processing the I/O access request after the special I/O domain receives it and returning the result to the local proxy domain; and
Means for the local proxy domain to return the result of the I/O access request to the remote application.
Further, the operating system runs on a computing platform having a plurality of CPU cores.
Further, the special I/O domain 51 comprises an initialization module for initializing the physical I/O device corresponding to the special I/O domain.
Further, the I/O service comprises: receiving an I/O access request for the physical I/O device sent by an application; and, after processing the I/O access request, returning the result to the corresponding application.
Further, the input/output system also comprises:
A device control interface, through which a resource association command is sent to the resource management module; the resource association command contains the correspondence between the special I/O domain and the physical I/O device.
To achieve the above object, the invention also provides an input/output method of an operating system, the operating system running on a computing platform. The input/output method comprises the following steps:
Step S10: a special I/O domain is created for a physical I/O device through a resource management module;
Step S20: an application sends an I/O access request to the special I/O domain;
Step S30: after receiving the I/O access request, the special I/O domain processes it and returns the result to the application;
Wherein, when the application migrates to a remote operating system node and becomes a remote application, the input/output method further comprises:
Step S40: a local proxy domain for the I/O device is created through the resource management module;
Step S50: the remote application sends an I/O access request to the local proxy domain;
Step S60: the local proxy domain sends the I/O access request to the special I/O domain;
Step S70: after receiving the I/O access request, the special I/O domain processes it and returns the result to the local proxy domain;
Step S80: the local proxy domain returns the result of the I/O access request to the remote application.
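Steps S10 to S80 can be sketched as follows. This is a minimal in-process illustration, not the patented implementation: the class names, the dictionary-shaped requests, the `storage` stand-in for a device, and the device names `disk0` and `nic0` are all assumptions made for the example, and ordinary method calls stand in for whatever transport a real system would use between nodes.

```python
from dataclasses import dataclass, field


@dataclass
class SpecialIODomain:
    """Handles every access request for exactly one physical I/O device."""
    device: str
    storage: dict = field(default_factory=dict)  # hypothetical stand-in for the device

    def handle(self, request: dict) -> dict:
        # S30 / S70: process the request and produce a result.
        if request["op"] == "write":
            self.storage[request["key"]] = request["data"]
            return {"status": "ok"}
        if request["op"] == "read":
            return {"status": "ok", "data": self.storage.get(request["key"])}
        return {"status": "error", "reason": "unsupported op"}


@dataclass
class LocalProxyDomain:
    """Relays requests from a migrated (remote) application."""
    target: SpecialIODomain

    def handle(self, request: dict) -> dict:
        # S60: forward to the special I/O domain; S80: relay the result back.
        return self.target.handle(request)


class ResourceManager:
    def __init__(self, devices):
        # S10: one special I/O domain per physical device.
        self.domains = {name: SpecialIODomain(name) for name in devices}
        self.proxies = {}

    def create_proxy(self, device: str) -> LocalProxyDomain:
        # S40: created once an application has migrated to a remote node.
        self.proxies[device] = LocalProxyDomain(self.domains[device])
        return self.proxies[device]


rm = ResourceManager(["disk0", "nic0"])
rm.domains["disk0"].handle({"op": "write", "key": "blk7", "data": b"abc"})  # S20, S30
proxy = rm.create_proxy("disk0")                                            # S40
remote_result = proxy.handle({"op": "read", "key": "blk7"})                 # S50 to S80
```

The remote read returns the data written locally because both paths end at the same special I/O domain, which is the property the migration support relies on.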
Further, the operating system runs on a computing platform having a plurality of CPU cores.
Further, step S10 also comprises: when the operating system starts, the resource management module detects the physical I/O devices.
Further, step S20 specifically comprises:
Step S201: the application passes the I/O access request, by means of a system call, to the runtime support environment of the application for processing;
Step S202: after processing the I/O access request, the runtime support environment sends it to the special I/O domain.
Further, before step S201, the method also comprises:
Step S201': the operating system creates the application and creates an I/O access path between the application and the special I/O domain;
Wherein, in step S202, the runtime support environment of the application sends the I/O access request to the special I/O domain over the I/O access path between the application and the special I/O domain.
Further, step S30 specifically comprises:
Step S301: after receiving the I/O access request, the special I/O domain processes it and returns the result to the runtime support environment of the application;
Step S302: the runtime support environment of the application returns the result to the application.
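The local path through steps S201', S201, S202, S301 and S302 can be sketched with a pair of queues standing in for the I/O access path. Everything here is an assumption for illustration: the text does not say how the path is implemented, and `IOAccessPath`, the request ids, the `None` shutdown marker and the upper-casing "device operation" are hypothetical.

```python
import queue
import threading


class IOAccessPath:
    """Hypothetical two-way path between a runtime environment and a domain."""
    def __init__(self):
        self.requests = queue.Queue()
        self.results = queue.Queue()


def special_io_domain(path: IOAccessPath) -> None:
    # S301: take requests off the path, handle them, send the results back.
    while True:
        request = path.requests.get()
        if request is None:  # shutdown marker used only by this sketch
            break
        # Upper-casing stands in for the real device operation.
        path.results.put({"id": request["id"], "data": request["payload"].upper()})


class RuntimeSupportEnvironment:
    def __init__(self, path: IOAccessPath):
        self.path = path
        self.next_id = 0

    def syscall_io(self, payload: str) -> str:
        # S201: pre-process the request of the application (here: tag it with an id).
        self.next_id += 1
        request = {"id": self.next_id, "payload": payload}
        # S202: send it to the special I/O domain over the access path.
        self.path.requests.put(request)
        # S301 result arrives; S302: hand it back to the application.
        result = self.path.results.get(timeout=5)
        if result["id"] != request["id"]:
            raise RuntimeError("result does not match the outstanding request")
        return result["data"]


path = IOAccessPath()  # S201': path created together with the application
worker = threading.Thread(target=special_io_domain, args=(path,), daemon=True)
worker.start()
runtime = RuntimeSupportEnvironment(path)
reply = runtime.syscall_io("read sector 12")
path.requests.put(None)
worker.join()
```

Running the domain in its own thread mirrors the idea that the application never calls driver code directly; it only exchanges messages over the path.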
Further, step S50 specifically comprises:
Step S501: the remote application passes the I/O access request, by means of a system call, to the runtime support environment of the remote application for processing;
Step S502: after processing the I/O access request, the runtime support environment of the remote application sends it to the local proxy domain.
Further, before step S501, the method also comprises:
Step S501': the system migrates the local application to a remote physical node, establishes an I/O access path between the runtime support environment of the remote application and the local proxy domain, establishes an I/O access path between the local proxy domain and the special I/O domain, and associates these two I/O access paths at the local proxy domain;
Wherein, in step S502, the runtime support environment of the remote application sends the I/O access request to the local proxy domain over the I/O access path between the remote application and the local proxy domain.
Further, step S80 specifically comprises:
Step S801: the local proxy domain returns the result of the I/O access request to the runtime support environment of the remote application;
Step S802: the runtime support environment of the remote application returns the result to the remote application.
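The association of the two I/O access paths at the local proxy domain (step S501') can be pictured as a lookup table keyed by the remote-facing path. The sketch below is an assumption-laden illustration: the path identifiers, the string-shaped requests, and the decision to raise `KeyError` for an unassociated path are not taken from the text, and direct method calls replace the network hop between the two physical nodes.

```python
class SpecialIODomain:
    def __init__(self, device):
        self.device = device

    def handle(self, request):
        # S70: stand-in for the real device handling on the original node.
        return f"{self.device} completed {request}"


class LocalProxyDomain:
    def __init__(self):
        # remote-facing path id -> special I/O domain reached by the local path
        self.associations = {}

    def associate(self, remote_path_id, domain):
        # S501': tie the remote-facing path to the path toward the domain.
        self.associations[remote_path_id] = domain

    def on_remote_request(self, remote_path_id, request):
        domain = self.associations.get(remote_path_id)
        if domain is None:
            # Assumption of this sketch: unassociated paths are rejected.
            raise KeyError(f"no special I/O domain associated with {remote_path_id!r}")
        # S60 forwards the request, S70 handles it, S801 returns the result.
        return domain.handle(request)


class RemoteRuntimeSupportEnvironment:
    def __init__(self, proxy, path_id):
        self.proxy = proxy
        self.path_id = path_id

    def syscall_io(self, request):
        # S501: pre-process the request; S502: send it to the local proxy domain.
        tagged = f"{request} [via {self.path_id}]"
        result = self.proxy.on_remote_request(self.path_id, tagged)
        # S802: return the result to the remote application.
        return result


proxy = LocalProxyDomain()
proxy.associate("path-A", SpecialIODomain("disk0"))
runtime = RemoteRuntimeSupportEnvironment(proxy, "path-A")
answer = runtime.syscall_io("read block 3")
```

Keeping the association inside the proxy means the migrated application only needs to know its own path; which special I/O domain serves it remains a detail of the original node.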
The beneficial technical effects of the invention are:
(1) I/O access passes through fewer intermediate layers and fewer intermediate processing steps
The invention significantly reduces the number of intermediate layers and intermediate processing steps that an I/O access request must pass through in the operating system. It therefore significantly reduces the overhead that handling I/O access requests imposes on the computer system and effectively improves the I/O performance of the computer system. It also significantly reduces the number of jumps between operation sets caused by jumps between layers, which significantly lowers the cache miss rate and the TLB (translation lookaside buffer) miss rate caused by frequent jumps between multiple processing sets. All of this significantly improves the overall performance of the computer system, so that the same system can support more users and serve them more efficiently.
(2) Good isolation between the drivers of the physical I/O devices
In the I/O system of the existing operating systems described above, all physical device drivers reside in the open kernel and are not isolated from one another. Once one physical I/O device driver fails, it may crash the whole multi-core platform and, with it, the many user applications running on that platform, with serious consequences. In the I/O system of the invention, by contrast, each physical I/O device has a dedicated special I/O domain that is solely responsible for handling the I/O access requests related to it. The I/O domains are well isolated from one another, which effectively improves the reliability of the whole computer system.
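The reliability argument can be illustrated with a toy fault-containment model. Note the limits of the sketch: it only models the effect (a faulty driver takes down its own domain and nothing else) by catching exceptions, whereas the isolation described in the text is between I/O domains themselves; the driver functions, status strings and device names are hypothetical.

```python
class Domain:
    """One domain per device; a driver fault marks only this domain as failed."""
    def __init__(self, device, driver):
        self.device = device
        self.driver = driver
        self.healthy = True

    def handle(self, request):
        if not self.healthy:
            return {"status": "unavailable", "device": self.device}
        try:
            return {"status": "ok", "data": self.driver(request)}
        except Exception as exc:  # the fault stays inside this domain
            self.healthy = False
            return {"status": "failed", "device": self.device, "reason": str(exc)}


def good_disk_driver(request):
    return f"disk handled {request}"


def buggy_nic_driver(request):
    raise RuntimeError("driver bug")


domains = {
    "disk0": Domain("disk0", good_disk_driver),
    "nic0": Domain("nic0", buggy_nic_driver),
}
nic_result = domains["nic0"].handle("send")
disk_result = domains["disk0"].handle("read")
```

After the network driver fails, requests to `disk0` are still served, which is the behaviour the shared-kernel arrangement cannot guarantee.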
(3) A more efficient I/O processing procedure
A multi-core platform has very high parallel computing capability and typically runs many user applications in parallel. In the kernel of existing operating systems for single-core platforms, however, the threads related to I/O processing have no special priority. A block device driver, for example, does its work through a considerable number of kernel threads. These kernel threads must compete for CPU resources with many applications, and when they do not get enough CPU resources, the I/O devices cannot deliver their full hardware performance. In addition, these threads and the interrupt handling routines may run on different CPUs or cores, which incurs overhead from context switches and from moving execution between CPUs or cores, and in turn reduces the performance of the whole system. The specialized I/O domain provided by the invention does not have these problems: it has one CPU core to itself and is solely responsible for operating one I/O device. No other kernel thread competes with it for CPU resources, the numerous context switches described above do not occur, and because its processing always stays on one fixed CPU core, the cache hit rate is very high. All of this effectively avoids the problems of the I/O system in existing operating systems for single-core platforms and improves system performance.
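On Linux, the "fixed CPU core" part of this idea can be approximated from user space with CPU affinity. The sketch below is only an approximation under stated assumptions: `pin_domain_to_core` is a hypothetical helper, `os.sched_setaffinity` is available only on some platforms (notably Linux), and affinity merely restricts where the calling process runs; it does not keep other threads off that core, which the exclusive use described above would additionally require.

```python
import os


def pin_domain_to_core(core: int) -> bool:
    """Try to restrict the calling process to a single CPU core.

    Returns True on success and False when the platform lacks the call,
    the core is not currently allowed, or the request is refused.
    """
    if not hasattr(os, "sched_setaffinity"):
        return False  # platform without the affinity API
    if core not in os.sched_getaffinity(0):
        return False  # core is not available to this process
    try:
        os.sched_setaffinity(0, {core})  # pid 0 means the calling process
    except OSError:
        return False
    return True


if hasattr(os, "sched_getaffinity"):
    allowed_before = os.sched_getaffinity(0)
    chosen_core = min(allowed_before)
    pinned = pin_domain_to_core(chosen_core)
    allowed_while_pinned = os.sched_getaffinity(0)
    os.sched_setaffinity(0, allowed_before)  # restore the original mask
else:
    pinned = pin_domain_to_core(0)
```

Staying on one core keeps the working set of the I/O code in that core's cache, which is the source of the high hit rate claimed in the text.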
(4) Good support for migrating I/O device access
In the I/O system of the existing operating systems described above, migration of I/O access is not well supported. In the I/O architecture of the invention, by contrast, after a user application has migrated to a remote physical node it can still access the I/O resources on the original physical node through the local proxy domain on that node. Migration of user applications within a distributed system is therefore well supported, load balancing within the system can be achieved more effectively, and the system can make more efficient use of the computing resources distributed across it.
Description of drawings
Fig. 1 is a schematic structural diagram of an existing operating system for single-core platforms;
Fig. 2 is a flowchart of a block device read call in the existing Linux kernel for single-core platforms;
Fig. 3 is a schematic structural diagram of the I/O system of the invention for multi-core platforms and networked operating systems;
Fig. 4 is a flowchart of the I/O method of the invention for multi-core platforms and networked operating systems;
Fig. 5 shows the routes taken by the I/O access request and by signals such as its processing result when a remote application accesses a local I/O device in the I/O method of the invention.
Embodiment
To make the objects, technical solutions, and advantages of the invention clearer, the I/O system and I/O method of the invention for multi-core platforms and networked operating systems are described further below with reference to the drawings and embodiments; identical parts carry identical reference numerals in the drawings. It should be understood that the specific embodiments described here serve only to explain the invention and do not limit it.
In the specific embodiments of the invention, the networked operating system is a modern supercomputing system oriented to multi-core computing platforms and high energy consumption. In a networked operating system, a granule (physical node) is a high-efficiency logical execution entity formed from the computing, network, storage, I/O and other resources of a multi-core-based supercomputing environment and from combinations of those resources. It is variable in scale, open, networked, intelligent, reconfigurable, and isolated, and its resource aggregation boundary can change as needed. For example, the resources of several physical nodes can together form the resource entity of one networked operating system and jointly serve a given user, whereas in existing operating systems for single-core platforms the resource entity is confined to a single physical node. The networked operating system proposes granularization of computing and resources, flattened interaction between applications and kernel resources, specialization of processing, and resource-oriented computing and distribution, among other ideas; combined with multi-core platforms, it provides many users with efficient, low-energy computing services. The I/O system and I/O method provided by the invention satisfy the requirements of such a networked operating system and of multi-core platforms.
In addition, the I/O system and I/O method provided by the invention can also run on a single-core platform; the differences between single-core and multi-core computing platforms are outside the scope of the invention.
Fig. 3 shows the I/O system for multi-core platforms and networked operating systems according to a specific embodiment of the invention, where a multi-core platform is a computing platform with a plurality of CPU cores. The I/O system comprises:
A resource management module 50 for detecting the local physical I/O devices 40; through this resource management module 50, a special I/O domain 51 is created for each physical I/O device 40. The special I/O domain 51 provides the application 101 with the I/O service for accessing the I/O device 40. Detection here means that the resource management module checks which physical I/O devices the system has and gathers related information such as their types.
In the above I/O system for multi-core platforms and networked operating systems, an I/O access request issued by the user application 101 is handled by the special I/O domain 51, which then returns the result to the user application 101. An I/O access request therefore passes through only 2 layers between being issued by the application 101 and the application 101 obtaining the result. This greatly simplifies the handling of I/O access requests and effectively improves the I/O performance of the system. In addition, having fewer intermediate layers significantly reduces the number of jumps between layers while an I/O access request is handled, which significantly lowers the cache miss rate and the TLB (translation lookaside buffer) miss rate caused by frequent jumps between multiple processing sets, and effectively improves the overall performance of the system.
In an embodiment, the special I/O domain 51 comprises an initialization module (not shown) for initializing the physical I/O device 40 corresponding to the special I/O domain 51. Initialization comprises detecting, analyzing, configuring, and starting the physical I/O device 40. Detection here means an in-depth test of the I/O device by the I/O device driver in the special I/O domain, which confirms resource information such as the specific model and size of the device and whether it can work normally. If the detected I/O device can work normally, it is then configured, and its initialization continues according to the startup procedure.
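The initialization sequence (detect, analyze, configure, start) can be sketched as a short staged function. The stage names follow the text; everything else is assumed for illustration, including the `probe` callable, the `model`, `size` and `working` fields it reports, and the choice to stop at the analysis stage when the device is not working.

```python
from enum import Enum


class InitStage(Enum):
    DETECT = "detect"
    ANALYZE = "analyze"
    CONFIGURE = "configure"
    START = "start"


def initialize_device(probe):
    """Run the four stages; `probe` is a hypothetical driver-level test that
    returns a dict of device information, or None if no device answers."""
    completed = []

    info = probe()  # detection: in-depth test by the driver
    completed.append(InitStage.DETECT)
    if info is None:
        return completed, None

    # Analysis: confirm model, size, and whether the device works normally.
    completed.append(InitStage.ANALYZE)
    if not info.get("working", False):
        return completed, None

    # Configuration: only reached for a device that works normally.
    config = {"model": info["model"], "size": info["size"]}
    completed.append(InitStage.CONFIGURE)

    # Startup: continue initialization according to the startup procedure.
    config["started"] = True
    completed.append(InitStage.START)
    return completed, config


ok_stages, ok_config = initialize_device(
    lambda: {"model": "X1", "size": 512, "working": True}
)
bad_stages, bad_config = initialize_device(
    lambda: {"model": "X1", "size": 512, "working": False}
)
```

Returning the list of completed stages makes it easy to see where initialization stopped for a device that failed its test.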
In this embodiment, after 40 initialization of I/O equipment were finished, special I/O territory 51 began to provide I/O service to using 101, and this I/O service comprises: receive and use the 101 I/O request of access to physical I/O equipment 40 of sending; After this I/O request of access handled, the result is returned to corresponding application 101.Need to prove, as an embodiment of the present invention, the I/O system of this operating system of the present invention is the characteristics that are more suitable for multi-core platform, but those skilled in the art are to be understood that, the I/O system of operating system of the present invention can not only operate on the multi-core platform, and can operate on the monokaryon platform.
In this embodiment, the I/O system for a multi-core platform and networked operating system further comprises:
Device management interface 52, through which resource association commands are sent to resource management module 50. A resource association command specifies the correspondence between a dedicated I/O domain 51 and a physical I/O device 40; that is, it determines that physical I/O device 40 is managed, maintained and operated by the corresponding I/O domain 51.
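A resource association command can be pictured as a small device-to-domain mapping held by the resource management module. The sketch below is illustrative only: `ResourceManager`, `DeviceManagementInterface`, `associate`, `send_association` and `domain_for` are hypothetical names, and rejecting a second owner for the same device is an assumption made here to reflect the one-domain-per-device arrangement, not a rule stated in the text.

```python
class ResourceManager:
    """Hypothetical model of resource management module 50."""

    def __init__(self):
        self.domain_of = {}  # device name -> dedicated I/O domain name

    def associate(self, device, domain):
        # Assumption: a device is managed by exactly one dedicated I/O domain.
        owner = self.domain_of.get(device)
        if owner is not None and owner != domain:
            raise ValueError(f"{device} is already managed by {owner}")
        self.domain_of[device] = domain

    def domain_for(self, device):
        """Return the domain that manages, maintains and operates the device."""
        return self.domain_of[device]


class DeviceManagementInterface:
    """Hypothetical model of device management interface 52."""

    def __init__(self, manager):
        self.manager = manager

    def send_association(self, device, domain):
        # The command carries only the device-to-domain correspondence.
        self.manager.associate(device, domain)
```

With this shape, any component that needs to reach a device first asks the resource manager which domain owns it, so no component talks to a device driver directly.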
A physical node here is a complete computer that has multiple CPU cores, that is, a multi-core platform, and the whole computing system may contain many such physical nodes. So that a user application 101 running on the local physical node can still perform local I/O access after it has been migrated to a remote physical node for system-wide reasons, the I/O system for a multi-core platform and networked operating system according to this embodiment of the invention further comprises:
Local proxy domain 53 of local physical I/O device 40, created by resource management module 50. It receives the access requests that a remote application issues for local physical I/O device 40 and forwards them to dedicated I/O domain 51 of that device. It then receives the result that dedicated I/O domain 51 returns after processing the I/O access request and forwards that result to the remote application.
In this way, even if a user application on the local physical node is migrated, it can continue to access local I/O device resources normally.
In the I/O system for a multi-core platform and networked operating system according to this embodiment of the invention, every physical I/O device has a corresponding dedicated I/O domain. These I/O domains are built on a very simple and efficient running environment in which only I/O-processing code runs. This removes a considerable number of intermediate processing levels, provides good isolation, meets the networked operating system's requirements for a flat, fine-grained and specialized design, and improves both the reliability of the system and the efficiency of I/O processing.
Fig. 4 illustrates the input-output method for a multi-core platform and networked operating system according to an embodiment of the invention, using block device I/O as an example of how the I/O system of the invention processes an I/O access request. In the typical running environment of this embodiment, the resource management module receives the management commands sent through the device management interface, associates physical I/O devices with I/O domains, and issues an I/O channel creation command, which creates an I/O access channel between a user application and the I/O domain that manages the I/O device assigned to that application. Once the I/O access channel has been established, the user application interacts with the I/O domain through it. In the scenario of this embodiment, user application app1 is running and needs to access physical block device dev1, whose dedicated I/O domain is io_dom1. The I/O method for a multi-core platform and networked operating system according to this embodiment comprises the following steps:
Step S10: the resource management module creates dedicated I/O domain io_dom1 for physical I/O device dev1;
In this step, an administrator issues a command to the resource management module through the device management interface to create dedicated I/O domain io_dom1 for physical I/O device dev1.
Preferably, step S10 further comprises: when the operating system starts, the resource management module detects physical block device dev1.
Step S20: user application app1 sends an I/O access request to the dedicated I/O domain;
Preferably, step S20 specifically comprises:
Step S201: the I/O access request issued by user application app1 is first handed, through a system call, to the run-time support environment (run-time) of app1 for processing;
Step S202: after processing the I/O access request, the run-time sends it to dedicated I/O domain io_dom1.
Preferably, before step S201, the method further comprises:
Step S201': the system creates user application app1 and creates an I/O access channel between app1 and dedicated I/O domain io_dom1.
In step S202, the run-time of app1 sends the I/O access request to io_dom1 through the I/O access channel between app1 and io_dom1.
Step S30: after receiving the I/O access request, dedicated I/O domain io_dom1 processes it and returns the result to user application app1.
Preferably, step S30 specifically comprises:
Step S301: after receiving the I/O access request, dedicated I/O domain io_dom1 processes it and returns the result to the run-time of app1 through the I/O access channel between app1 and io_dom1;
Step S302: the run-time of app1 returns the result to app1.
Throughout the I/O access processing described above, the request issued by the application passes through few levels. It first reaches the run-time (run-time support environment) level that supports the application; the run-time processes it and hands it directly to the dedicated I/O domain; the dedicated I/O domain processes it and returns the result to the run-time; and the run-time returns the result to the application. From issue to result, the request therefore passes through only two levels and three intermediate processing steps. This greatly simplifies the processing of I/O access requests and effectively improves the I/O performance of the system. The reduction in intermediate levels and intermediate processing steps also markedly reduces the number of level-to-level jumps made while a request is processed, so the cache miss rate and the translation lookaside buffer (TLB) miss rate caused by frequent jumps between multiple processing sets drop significantly, which greatly improves overall system performance. The improved I/O performance and the lower cache and TLB miss rates allow the computing power of the whole computer system to be fully exploited.
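The local request path of steps S10 to S30 can be traced with a small in-process model. Everything named here (`Channel`, `DedicatedIODomain`, `Runtime`, the `trace` list and the tuple-shaped request) is hypothetical and chosen only to make the level count visible; a real run-time and I/O domain would be separate execution environments, not Python objects.

```python
class Channel:
    """Hypothetical I/O access channel from a run-time to an I/O domain."""

    def __init__(self, endpoint):
        self.endpoint = endpoint

    def send(self, request, trace):
        return self.endpoint.handle(request, trace)


class DedicatedIODomain:
    """Hypothetical stand-in for io_dom1, backed by an in-memory block map."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def handle(self, request, trace):
        trace.append(self.name)            # S30/S301: the domain processes it
        op, block, data = request
        if op == "write":
            self.blocks[block] = data
            return "ok"
        return self.blocks.get(block)


class Runtime:
    """Hypothetical run-time support environment of one application."""

    def __init__(self, name, channel):
        self.name = name
        self.channel = channel

    def syscall(self, request, trace):
        trace.append(self.name)            # S201: run-time handles the request
        result = self.channel.send(request, trace)  # S202 out, S301 back
        trace.append(self.name)            # S302: run-time returns the result
        return result


if __name__ == "__main__":
    io_dom1 = DedicatedIODomain("io_dom1")              # S10
    rt_app1 = Runtime("rt_app1", Channel(io_dom1))      # S201'
    trace = []
    print(rt_app1.syscall(("write", 0, b"hi"), trace), trace)
```

The trace records three processing steps (run-time, domain, run-time) spread over two distinct levels, which is the count the paragraph above gives.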
So that a user application running on the local physical node (for example app1) can still access local I/O devices after it has been migrated to a remote physical node for system-wide reasons, the I/O method for a multi-core platform and networked operating system according to this embodiment further comprises the following steps. (In this embodiment, for ease of description, the remote user application running on the other physical node is denoted app2, and the physical device it needs to access is still dev1.)
Step S40: the resource management module creates the local proxy domain of the I/O device;
Step S50: remote user application app2 sends an I/O access request to the local proxy domain;
Preferably, step S50 specifically comprises:
Step S501: remote user application app2 hands the I/O access request, through a system call, to the run-time of app2 for processing;
Step S502: after processing the I/O access request, the run-time of app2 sends it to the local proxy domain.
Preferably, before step S501, the method further comprises:
Step S501': the system migrates local user application app1 to the remote physical node, establishes an I/O access channel between the run-time of the migrated user application app2 and the local proxy domain, establishes an I/O access channel between the local proxy domain and dedicated I/O domain io_dom1, and links the two I/O access channels at the local proxy domain.
In step S502, after processing the I/O access request, the run-time of app2 sends it to the local proxy domain through the I/O access channel between app2 and the local proxy domain.
Step S60: the local proxy domain sends the I/O access request to dedicated I/O domain io_dom1.
Here the local proxy domain sends the I/O access request to io_dom1 through the I/O access channel between itself and io_dom1.
Step S70: after receiving the I/O access request, dedicated I/O domain io_dom1 processes it and returns the result to the local proxy domain. The result is returned through the I/O access channel between the local proxy domain and io_dom1.
Step S80: the local proxy domain returns the result of the I/O access request to remote user application app2.
Preferably, step S80 specifically comprises:
Step S801: the local proxy domain returns the result of the I/O access request to the run-time of app2;
Here the local proxy domain returns the result through the I/O access channel between itself and the run-time of app2;
Step S802: the run-time of app2 returns the result to remote user application app2.
Fig. 5 illustrates the signal paths of the I/O access request and of its result when a remote application accesses a local I/O device.
Through the above processing, a user application that has been migrated to a remote physical node in the system can still access the physical device originally assigned to it. Migration of user applications within the distributed system is therefore well supported, load balancing within the system can be achieved more effectively, and the system can make more efficient use of the computing resources distributed across it.
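The remote path of steps S40 to S80 can be sketched in the same spirit. The names `LocalProxyDomain`, `RemoteRuntime`, `link` and `forward`, and the use of a channel id string, are hypothetical; the two nodes are modelled inside one process, so the network transport that a real system would need between the remote run-time and the proxy domain is deliberately left out.

```python
class DedicatedIODomain:
    """Hypothetical stand-in for io_dom1, which owns block device dev1."""

    def __init__(self):
        self.blocks = {}

    def handle(self, request):
        op, block, data = request
        if op == "write":
            self.blocks[block] = data
            return "ok"
        return self.blocks.get(block)


class LocalProxyDomain:
    """Hypothetical local proxy domain on the node that hosts dev1."""

    def __init__(self):
        self.links = {}  # inbound channel id -> dedicated I/O domain

    def link(self, channel_id, domain):
        # S501': associate the app2-side channel with the io_dom1-side channel.
        self.links[channel_id] = domain

    def forward(self, channel_id, request):
        domain = self.links[channel_id]   # S50: request arrives on this channel
        result = domain.handle(request)   # S60: forward; S70: domain processes
        return result                     # S80: result goes back to the sender


class RemoteRuntime:
    """Hypothetical run-time of migrated application app2 on the remote node."""

    def __init__(self, channel_id, proxy):
        self.channel_id = channel_id
        self.proxy = proxy

    def syscall(self, request):
        # S501/S502 on the way out, S801/S802 on the way back.
        return self.proxy.forward(self.channel_id, request)


if __name__ == "__main__":
    io_dom1 = DedicatedIODomain()
    io_dom1.handle(("write", 7, b"written before migration"))  # app1, local
    proxy = LocalProxyDomain()                                  # S40
    proxy.link("app2", io_dom1)                                 # S501'
    rt_app2 = RemoteRuntime("app2", proxy)
    print(rt_app2.syscall(("read", 7, None)))
```

Data written by app1 before migration is read back by app2 through the proxy, and the dedicated I/O domain itself is unchanged: only the proxy knows that the caller is remote.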
From the above detailed description of embodiments of the invention, the beneficial technical effects of the invention are as follows:
(1) Fewer intermediate levels and intermediate processing steps for I/O access
The invention markedly reduces the number of intermediate levels and intermediate processing steps that an I/O access request must pass through in the operating system. This markedly reduces the overhead that processing I/O access requests imposes on the computer system and effectively improves its I/O performance. It also markedly reduces the number of jumps between operation sets that level-to-level transitions cause, so the cache miss rate and the translation lookaside buffer (TLB) miss rate caused by frequent jumps between multiple processing sets drop significantly. All of this can significantly improve the overall performance of the computer system, allowing the same system to support more users and to serve them more efficiently.
(2) Physical I/O device drivers are isolated from one another, which effectively improves system reliability
In the I/O system of the existing operating system described earlier, all physical device drivers reside in the open kernel with no isolation between them. If one physical I/O device driver fails, it may crash the entire multi-core platform and, with it, the large number of user applications running on that platform, with serious consequences. In the I/O system of the invention, by contrast, every physical I/O device has a dedicated I/O domain that is solely responsible for handling the I/O access requests related to it, and the I/O domains are well isolated from one another, which effectively improves the reliability of the whole computer system.
(3) More efficient I/O access processing
A multi-core platform has very high parallel computing capability and usually runs numerous user applications in parallel. In existing operating system kernels designed for single-core platforms, however, the threads involved in I/O processing have no special priority. The work of a block device driver, for example, is carried out by a considerable number of kernel threads. These kernel threads must compete with numerous applications for CPU resources, so I/O device processing cannot exploit the full performance of the hardware because it lacks sufficient CPU resources. These threads and the interrupt handlers may also run on different CPUs or cores, which incurs performance overhead from context switches and from moving execution between CPUs or cores, and in turn lowers the performance of the whole system. The specialized I/O domain provided by the invention has none of these problems: it has exclusive use of one CPU core and is solely responsible for operating one I/O device. No other kernel thread contends with it for CPU resources, the numerous context switches described above do not occur, and because its processing always stays on a fixed CPU core it achieves a very high cache hit rate. All of this effectively avoids the problems that the I/O systems of existing operating systems designed for single-core platforms bring, and improves system performance.
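On a general-purpose operating system, the closest everyday analogue of keeping I/O processing on one fixed core is CPU affinity. The sketch below uses Python's `os.sched_setaffinity` and `os.sched_getaffinity`, which are available on Linux only; the helper name `pin_to_core` is hypothetical. It is only an approximation of the arrangement described above: affinity keeps the caller on one core, but it does not by itself stop other threads from being scheduled on that core, so it does not give the exclusive ownership the dedicated I/O domain has.

```python
import os


def pin_to_core(core=None):
    """Pin the calling thread to one CPU core.

    Returns the chosen core, or None on platforms without affinity support.
    If no core is given, the highest-numbered allowed core is used.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # e.g. macOS or Windows: leave scheduling unchanged
    allowed = sorted(os.sched_getaffinity(0))
    if core is None:
        core = allowed[-1]
    if core not in allowed:
        raise ValueError(f"core {core} is not in the allowed set {allowed}")
    # pid 0 means "the caller"; the mask holds exactly one core.
    os.sched_setaffinity(0, {core})
    return core


if __name__ == "__main__":
    chosen = pin_to_core()
    print("pinned to core:", chosen)
```

An I/O worker would call `pin_to_core()` once at start-up, before entering its request loop, so that its working set stays in the caches of a single core.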
(4) Good support for migration of I/O device access
In the I/O system of the existing operating system described earlier, migration of I/O access is not well supported. In the I/O architecture of the invention, after a user application is migrated to a remote physical node, it can still access the I/O resources on the original physical node through the proxy domain on that node. Migration of user applications within the distributed system is therefore well supported, load balancing within the system can be achieved more effectively, and the system can make more efficient use of the computing resources distributed across it.
The above describes only specific embodiments of the invention, but the scope of protection of the invention is not limited thereto. Any variation or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall fall within the scope of protection of the invention.

Claims (14)

1. An input-output system of an operating system, the operating system running on a computing platform, characterized in that the input-output system comprises:
means for creating, by a resource management module, a dedicated I/O domain for a physical I/O device;
means for an application to send an I/O access request to the dedicated I/O domain;
means for the dedicated I/O domain, after receiving the I/O access request, to process the I/O access request and return the result to the application;
wherein, when the application is migrated to a remote operating system node and becomes a remote application, the input-output system further comprises:
means for creating, by the resource management module, a local proxy domain of the I/O device;
means for the remote application to send an I/O access request to the local proxy domain;
means for the local proxy domain to send the I/O access request to the dedicated I/O domain;
means for the dedicated I/O domain, after receiving the I/O access request, to process the I/O access request and return the result to the local proxy domain; and
means for the local proxy domain to return the result of the I/O access request to the remote application.
2. The input-output system of an operating system according to claim 1, characterized in that the operating system runs on a computing platform having a plurality of CPU cores.
3. The input-output system of an operating system according to claim 1 or 2, characterized in that the dedicated I/O domain comprises an initialization module for initializing the physical I/O device corresponding to the dedicated I/O domain.
4. The input-output system of an operating system according to claim 1 or 2, characterized in that said I/O service comprises: receiving the I/O access request that the application issues for the physical I/O device; and, after processing the I/O access request, returning the result to the corresponding application.
5. The input-output system of an operating system according to claim 1 or 2, characterized by further comprising:
a device management interface through which a resource association command is sent to the resource management module, the resource association command comprising the correspondence between the dedicated I/O domain and the physical I/O device.
6. An input-output method of an operating system, the operating system running on a computing platform, characterized by comprising the following steps:
Step S10: a resource management module creates a dedicated I/O domain for a physical I/O device;
Step S20: an application sends an I/O access request to the dedicated I/O domain;
Step S30: after receiving the I/O access request, the dedicated I/O domain processes it and returns the result to the application;
wherein, when the application is migrated to a remote operating system node and becomes a remote application, the input-output method further comprises:
Step S40: the resource management module creates a local proxy domain of the I/O device;
Step S50: the remote application sends an I/O access request to the local proxy domain;
Step S60: the local proxy domain sends the I/O access request to the dedicated I/O domain;
Step S70: after receiving the I/O access request, the dedicated I/O domain processes it and returns the result to the local proxy domain;
Step S80: the local proxy domain returns the result of the I/O access request to the remote application.
7. The input-output method of an operating system according to claim 6, characterized in that the operating system runs on a computing platform having a plurality of CPU cores.
8. The input-output method of an operating system according to claim 6 or 7, characterized in that step S10 further comprises: when the operating system starts, the resource management module detects the physical I/O device.
9. The input-output method of an operating system according to claim 6 or 7, characterized in that step S20 specifically comprises:
Step S201: the I/O access request issued by the application is handed, through a system call, to the run-time support environment of the application for processing;
Step S202: after processing the I/O access request, the run-time support environment sends it to the dedicated I/O domain.
10. The input-output method of an operating system according to claim 9, characterized in that, before step S201, the method further comprises:
Step S201': the operating system creates the application and creates an I/O access channel between the application and the dedicated I/O domain;
wherein, in step S202, the run-time support environment of the application sends the I/O access request to the dedicated I/O domain through the I/O access channel between the application and the dedicated I/O domain.
11. The input-output method of an operating system according to claim 6 or 7, characterized in that step S30 specifically comprises:
Step S301: after receiving the I/O access request, the dedicated I/O domain processes it and returns the result to the run-time support environment of the application;
Step S302: the run-time support environment of the application returns the result to the application.
12. The input-output method of an operating system according to claim 6, characterized in that step S50 specifically comprises:
Step S501: the remote application hands the I/O access request, through a system call, to the run-time support environment of the remote application for processing;
Step S502: after processing the I/O access request, the run-time support environment of the remote application sends it to the local proxy domain.
13. The input-output method of an operating system according to claim 12, characterized in that, before step S501, the method further comprises:
Step S501': the system migrates the local application to the remote physical node, establishes an I/O access channel between the run-time support environment of the remote application and the local proxy domain, establishes an I/O access channel between the local proxy domain and the dedicated I/O domain, and links these two I/O access channels at the local proxy domain;
wherein, in step S502, the run-time support environment of the remote application sends the I/O access request to the local proxy domain through the I/O access channel between the remote application and the local proxy domain.
14. The input-output method of an operating system according to claim 6, characterized in that step S80 specifically comprises:
Step S801: the local proxy domain returns the result of the I/O access request to the run-time support environment of the remote application;
Step S802: the run-time support environment of the remote application returns the result to the remote application.
CN2009100906104A 2009-08-31 2009-08-31 Input-output system facing to multi-core platform and networking operation system and method thereof Active CN101639814B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2009100906104A CN101639814B (en) 2009-08-31 2009-08-31 Input-output system facing to multi-core platform and networking operation system and method thereof

Publications (2)

Publication Number Publication Date
CN101639814A CN101639814A (en) 2010-02-03
CN101639814B true CN101639814B (en) 2011-11-16

Family

ID=41614800

Country Status (1)

Country Link
CN (1) CN101639814B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101901207B (en) * 2010-07-23 2012-03-28 中国科学院计算技术研究所 Operating system of heterogeneous shared storage multiprocessor system and working method thereof
CN102566996B (en) * 2010-12-20 2015-04-01 中兴通讯股份有限公司 Method and system for realizing multi-task management input and output resources
CN103034295B (en) * 2012-12-26 2015-08-12 无锡江南计算技术研究所 The reconfigurable micro server that I/O capability strengthens
CN109426545B (en) * 2017-08-31 2023-02-03 阿里巴巴集团控股有限公司 Data communication method and device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080028076A1 (en) * 2006-07-26 2008-01-31 Diwaker Gupta Systems and methods for controlling resource usage by a driver domain on behalf of a virtual machine
CN101430670A (en) * 2008-12-16 2009-05-13 中国科学院计算技术研究所 I/O equipment reconstruction method and system in virtualization surroundings

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Keir Fraser et al..Safe Hardware Access with the Xen Virtual Machine Monitor.《Proceedings of 1st Workshop on Operating System and Architectural Support for the on demand IT InfraStructure》.2004, *

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: HUAWEI TECHNOLOGY CO., LTD.

Free format text: FORMER OWNER: INSTITUTE OF COMPUTING TECHNOLOGY, CHINESE ACADEMY OF SCIENCES

Effective date: 20130530

C41 Transfer of patent application or patent right or utility model
COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 100080 HAIDIAN, BEIJING TO: 518129 SHENZHEN, GUANGDONG PROVINCE

TR01 Transfer of patent right

Effective date of registration: 20130530

Address after: 518129 Bantian HUAWEI headquarters office building, Longgang District, Guangdong, Shenzhen

Patentee after: Huawei Technologies Co., Ltd.

Address before: 100080 Haidian District, Zhongguancun Academy of Sciences, South Road, No. 6, No.

Patentee before: Institute of Computing Technology, Chinese Academy of Sciences