CN1405698A - Zero-loss information network service system and method - Google Patents


Info

Publication number
CN1405698A
Authority
CN
China
Prior art keywords
server
requirement
acquisition
allocator
carry out
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 01142177
Other languages
Chinese (zh)
Other versions
CN1182477C (en)
Inventor
杨竹星
罗孟彦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Accton Technology Corp
Original Assignee
Accton Technology Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Accton Technology Corp
Priority to CNB011421770A
Publication of CN1405698A
Application granted
Publication of CN1182477C
Anticipated expiration
Expired - Fee Related

Landscapes

  • Computer And Data Communications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The system comprises a server cluster and a dispatch device. The server cluster comprises multiple servers connected to a network, each of which can provide World Wide Web services to users connected to the Internet. The dispatch device includes a routing mechanism. The dispatcher identifies the intent and type of each request sent by a user and dispatches the request to one of the servers in the cluster; the routing mechanism performs this discrimination of intent and type. When the assigned server fails, any request that is in progress but not yet finished is transferred to another server. Service is therefore neither interrupted nor lost because of a server failure.

Description

Zero-loss information network service system and method
Technical field of the invention
The present invention relates to an information network service (web service) system and method, and in particular to a fault-tolerant web service system and method. More specifically, the invention relates to a fault-tolerant web service system and method that can withstand server failure and overload and thereby provide zero-loss service.
Background of the invention
The World Wide Web (WWW), built on the Internet, has become an important (and in many cases indispensable) infrastructure for commercial services of every kind. For both clients and service providers, the service reliability of a web server is therefore a key factor. As web services become more diverse and, because of their commercial applications, more complex, providing a highly reliable server system that never loses a user request becomes an increasingly challenging problem.
Typically, the reliability of a website is affected mainly by two problems: server failure and server overload. Server failure is a reliability problem that may be caused by various hardware or software faults. Server overload usually slows the response to user requests and may even make service unavailable, so that user requests are lost. For any commercial website, fast response and high reliability are essential requirements. In today's highly competitive business environment, an interruption of service translates directly into lost revenue and damaged goodwill.
There are several feasible ways to build a high-performance, reliable web server system. Among them, server replication, or server clustering, is currently regarded by both academia and industry as the most practical architecture, because a well-designed server cluster has several advantages. The first is high performance: such a system pools the capacity of many computers to handle requests from thousands or tens of thousands of clients at once. The second is scalability: the overall capacity of this loosely coupled system is easily increased by adding new servers. The third is high availability: even if one or two servers in the system fail, the remaining servers continue to provide service, so the external service is not interrupted.
Several server clustering schemes have been proposed in recent years. Examples include: the client-side approach discussed by C. Yoshikawa et al. in "Using smart clients to build scalable services", 1997 USENIX Annual Technical Conference, January 6 to 10, 1997; the DNS aliasing approach discussed by R. McGrath et al. in "NCSA's world wide web server: design and performance", IEEE Computer, November 1995; the TCP connection routing scheme discussed by Dias et al. in "A scalable and highly available web server", Proceedings of COMPCON '96, February 1996, by C.S. Yang et al. in "Design and implementation of an environment for building scalable and highly available web server", Proceedings of the 1998 International Symposium on Internet Technology, April 29 to May 1, 1998, and by G.D.H. Hunt et al. in "Network dispatcher: a connection router for scalable Internet services", Proceedings of the 7th International World Wide Web Conference, April 1998; and the HTTP redirection approach discussed by D. Anderson et al. in "Towards a scalable distributed WWW server on networked workstations", Journal of Parallel and Distributed Computing, Vol. 42, 1997, pp. 91-100.
These schemes do achieve the high performance, scalability and high availability of a clustered server system, but their reliability is still insufficient for current commercial use. When a server in the system fails because of a fault or overload, subsequent user requests can be shared among the other servers, but the requests that were still unfinished on the failed server are lost and cannot continue to be served. The method proposed by the present invention adds fault resilience of service to a clustered server system, so that these unfinished requests can be handed over to other servers in the system and continue to execute. The system thereby achieves zero loss of service, greatly improving the reliability of a web server system so that it can meet the demands of various commercial websites.
To date, most research on web server clusters has concentrated on how to distribute service requests among the server nodes. The clustered server systems in such schemes or products can only use their replicated configuration to provide higher availability; they do not address fault resilience of service. Ingham et al. evaluated several existing schemes for building highly dependable web servers ("Constructing dependable Web services", IEEE Internet Computing, Vol. 4, January/February 2000). They pointed out the importance of providing transactional integrity: when a server in the system fails, the requests that had not finished executing on that server must continue to be served and must not be interrupted by the failure. According to their survey, no existing system provides such fault resilience or satisfies the need for transactional integrity. Y.M. Wang et al. used an applet-based approach to address service availability on the client side ("HAWA: a client-side approach to high-availability web access", Proceedings of the Sixth International World Wide Web Conference, April 1997). This scheme, however, cannot handle fault tolerance on the server side, and it does not address transaction-based services. In addition, Singhai et al. ("System support for scalable and fault-tolerant Internet services", Proceedings of the 28th Annual International Symposium on Fault-Tolerant Computing, June 1998) and Chawathe et al. ("System support for scalable and fault-tolerant Internet services", Proceedings of Middleware '98, September 1998) proposed architectures for building highly available Internet services. Such systems do provide higher availability, but they cannot provide fault resilience of service and therefore lack high reliability. None of these schemes or systems explains how, when a server fails, the unfinished requests can be transferred to other servers, or how stable service can still be provided while the transfer takes place.
Another problem with existing clustered server systems is their lack of adaptability to sudden bursts of load. Besides the rapidly growing population of web users, another notable characteristic of the World Wide Web is the high variability of its traffic. A website that normally carries moderate traffic may at any time, because of some special event, receive a sudden influx of users that triggers a significant load burst, and such a burst may last for hours or even days. For example, the release of a new version of off-the-shelf software or of a product, or a site being named "best website of the month", can cause traffic to surge. As a result, some nodes in the server cluster receive far more requests than they can handle and become overloaded, so that the system responds slowly and may eventually stop working. For websites that provide e-commerce services, the requests that pour in during a burst may be nothing more than ordinary page-browsing requests, yet this excess load can crowd out requests for more important services (for example, purchase transactions, or services for higher-paying customers), so that the website cannot serve these important customers. This is particularly troublesome. The server should remain available for such critical services even under extremely heavy load. In other words, such requests should enjoy higher priority than other, less important requests.
Summary of the invention
Accordingly, the main object of the invention is to provide a zero-loss web service system and method in which, when a server fails, the web service requests of remote clients that are in progress survive and continue to be served.
Another object of the invention is to provide a zero-loss web service system and method in which, when a server becomes overloaded, the web service requests of remote clients that are in progress survive and continue to be served.
A further object of the invention is to provide a zero-loss web service system and method that can adapt to bursts of load.
Yet another object of the invention is to provide a zero-loss web service system and method that can still provide stable service while requests are being transferred.
To achieve these and other objects, the invention integrates content-aware routing, the pre-establishment and reuse of TCP connections, seamless relaying of packets between two TCP connections, and a fault-tolerant recovery mechanism. Using an allocator that includes a routing mechanism, a client's request is transferred to another server when a server suffers a service failure, so that fault resilience is provided effectively in a zero-loss system that serves client requests over a network.
Brief description of the drawings
The foregoing objects and other features and advantages of the invention will be more readily understood from the following detailed description of preferred, non-limiting embodiments, taken together with the accompanying drawings, in which:
Fig. 1 is a schematic diagram of a web service that uses the zero-loss web service system and method of the invention;
Fig. 2 is a timing diagram showing the communication protocol operations for handling a request for an interactive process-related service according to an embodiment of the invention;
Fig. 3 is a block diagram showing the structure of an assignment device with multiple individual allocators, which avoids a single point of failure in the web service system of the invention; and
Fig. 4 is a flowchart showing the request-handling method according to an embodiment of the invention when a server fails.
Detailed description of the invention
To provide zero-loss web service, the invention discloses a mechanism that, when a server fails or becomes overloaded, smoothly transfers any web service request in progress to another working node, where it is recovered and processing continues. The goal is a web service that loses no requests at all. In other words, a system built according to a preferred embodiment of the invention guarantees that no request sent by a client of the web service is lost, even when a server fails or is overloaded.
In a preferred embodiment of the clustered server of the invention, a request-routing mechanism is needed to dispatch incoming requests to the servers so as to obtain the best response characteristics. To support zero-loss web service effectively, this request-routing mechanism must have at least two capabilities: discrimination of incoming requests and migration of requests. The system administrator can also adjust the priority of particular classes of service, either to guarantee fault-tolerance support or to give them a higher level of service performance.
Fig. 1 schematically shows a World Wide Web service that uses the zero-loss web service system 100 of the invention. In this system, a user (client) 110 accesses a particular website 150 to request a service. Website 150 has a server cluster 140 made up of multiple servers 141, 142, ... and 149. Cluster 140 is connected through an allocator 130 to a network 120 (one example of such a network is the Internet).
Note that in the schematic diagram of Fig. 1 the allocator 130 includes an assignment device 131 and a network switch device 132. Those skilled in the art will appreciate that allocator 130 may also be a standalone special-purpose microcomputer, or a microcontroller-based dispatcher device, which can operate independently without the cooperation of a network switch device such as device 132 shown in the figure. In Fig. 1, however, the assignment device 131 may be a dedicated PC-based dispatching computer whose main function is the dispatching operation described in this specification.
Allocator 130 is responsible for dispatching each request sent by client 110 to the most suitable server in cluster 140 (what "suitable" means is explained later). As described below, the allocator assigns every request from every client to an appropriate server for processing. The response produced by the assigned server in cluster 140, however, need not be processed by allocator 130; it can be returned over network 120 directly to the client 110 that sent the request. To do this, allocator 130 must implement a routing mechanism, indicated schematically in the figure by reference numeral 135.
To achieve zero-loss web service when a server fails or is overloaded, the routing mechanism 135 implemented by allocator 130 must provide two important capabilities for the services offered by the website's cluster 140: state logging and recovery. That is, certain intermediate states of a user's request must be recorded by the logging mechanism. When a server fails, the recovery mechanism retrieves the state of the requests that were being processed on the failed (or overloaded) node so that another server can continue processing them. This works smoothly only if the newly assigned working node is given a valid set of intermediate states.
If every incoming request had to be logged, however, the work would become excessive and impractical. On today's websites, incoming requests fall roughly into three types: static web pages, dynamic content generated by programs such as CGI (Common Gateway Interface) scripts, and transaction-related services. In fact, according to the invention, recovering interrupted requests does not require logging every request. The routing mechanism 135 of web server cluster 140 must therefore be aware of request content, so that it can single out the requests with critical state that need to be recoverable. How an interrupted web service request can be recovered and then continue processing on another node is a separate and complicated problem. In particular, such a recovery mechanism must be transparent to the user; in other words, the user must be entirely unaware that it exists. In addition, the transfer of a request must be as smooth as possible. The invention proposes an efficient method that meets these requirements.
Discrimination of requests
A basic embodiment of a request-routing mechanism suitable for the system of the invention can be built around the allocator. In Fig. 1, the allocator node that runs the routing mechanism, for example node 131, establishes a number of persistent TCP connections to the back-end server nodes in advance. After analyzing a request sent by a client, it dispatches the request to the assigned server node over one of these pre-established TCP connections. Note that the following paragraphs use the Internet as an example for convenience, but those skilled in the art will understand that any network 120, including corporate intranets, personal networks, home networks and similar networks, is equally applicable. When a client tries to access a particular service or web page, the client's browser must first create a TCP (Transmission Control Protocol) connection, so the client sends a TCP connection-establishment request, specifically a packet carrying a TCP SYN. When the allocator sees this TCP connection request, it responds and completes a TCP connection with the client. Once the connection is established, the client sends its actual service request, that is, packets carrying an HTTP (Hypertext Transfer Protocol) request. The HTTP request contains the request method, the URL (Uniform Resource Locator, which identifies the requested content) and other HTTP client header information (for example, host, cookies, and so on).
At this moment, allocator can be checked the content of HTTP title, so that determine the type of this requirement and how to pass on this requirement.When allocator has been selected server of the most suitable this requirement of processing, then can connect among the inventory by the available TCP of destination server, select connecting in advance of a free time.Allocator then just will be stored with a kind of internal information structure that is called " mapping table " relevant for the relevant information (for example, the state of TCP) of chosen connection, and with user's connection and be connected interosculate (binding) that sets up in advance.After the combination that TCP connects is determined, IP (Internet protocol) and TCP title that allocator just changes each packets of information get on so that this packets of information is transferred to the server that is assigned, allocator can be handled in the same way and passes on follow-up packets of information by the content in the corresponding tables subsequently, with user's connection with set up in advance be connected between, have no to pass on gap these packets of information, so that server can receive and identify these packets of information with complete transparent way.
On top of this basic routing mechanism 135, a loadable module 136 is inserted into the kernel of each back-end server, between the network interface driver and the TCP/IP stack. The loadable module 136 serves two purposes. First, allocator 130 passes the binding information to loadable module 136, which then rewrites outgoing packets so that they travel directly to client 110 without passing through allocator 130. Because the amount of data sent from the chosen server to client 110 is usually far greater than the amount sent from the client to the server, this greatly reduces the processing load on allocator 130. The second purpose of this design is to avoid the single-point-of-failure problem, which is described in detail in later paragraphs.
Given this request-routing mechanism, allocator 130 needs a certain amount of intelligence to discriminate among incoming requests before making a routing decision. To this end, an internal data structure, the URL table, is built to maintain content-related information, including the size, type and priority of the content and the nodes assigned to process it. When dispatching an incoming request to one of the back-end servers, allocator 130 consults the URL table. In a preferred embodiment, the URL table models the hierarchical structure of the content stored on the website. This rests on the fact that web content is usually organized hierarchically in a directory tree, from which it follows that files in the same directory usually share the same properties. For example, files under the /CGI-bin/ directory are normally CGI scripts that generate dynamic content.
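A minimal sketch of such a URL table follows, assuming a tree of nested dictionaries keyed by path component, in which a record stored on a directory stands in for everything beneath it unless a more specific record exists. The class name `UrlTable`, the record fields and the example paths other than /CGI-bin/ are hypothetical.

```python
class UrlTable:
    """Hierarchical URL table: one tree level per directory level (a sketch)."""

    def __init__(self):
        self.root = {"children": {}, "record": None}

    @staticmethod
    def _parts(path):
        # "/CGI-bin/order.cgi" -> ["CGI-bin", "order.cgi"]
        return [p for p in path.split("/") if p]

    def add(self, path, record):
        """Attach a record (type, priority, nodes, ...) to a file or directory."""
        node = self.root
        for part in self._parts(path):
            node = node["children"].setdefault(
                part, {"children": {}, "record": None})
        node["record"] = record

    def lookup(self, path):
        """Return the record of the deepest matching file or directory, or None."""
        node, best = self.root, self.root["record"]
        for part in self._parts(path):
            node = node["children"].get(part)
            if node is None:
                break  # nothing more specific is stored; keep the directory record
            if node["record"] is not None:
                best = node["record"]
        return best


table = UrlTable()
table.add("/CGI-bin/", {"type": "dynamic", "priority": 1, "nodes": ["s1", "s2"]})
table.add("/images/", {"type": "static", "priority": 0, "nodes": ["s3"]})
table.add("/CGI-bin/pay.cgi", {"type": "transaction", "priority": 2, "nodes": ["s1"]})
```

With this layout, a lookup for a file that has no record of its own stops at its directory, so the table needs one entry per directory rather than one per file.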
Therefore, according to a preferred embodiment of the invention, the URL table can be implemented as a multi-level hash tree in which each level corresponds to a level of the content tree and each node represents a file or a directory. In principle, every item (file or directory) in the website's content has a corresponding record in the URL table. To reduce search time and scope, however, the URL table of the invention supports a "wildcard" mechanism that designates a whole group of items sharing the same properties. For example, if all items under the subdirectory "/html/" are stored on the same nodes and have the same content type, only "/html/" needs to appear in the URL table. If the allocator looks up the URL table for information about the item "/html/misc.html", it needs to search only one level and obtains the information from the node "/html/" in the table. The URL table is normally generated, maintained and managed automatically by a management system that analyzes the content hierarchy. If necessary, the administrator of the network system can also configure the contents of the URL table.
Migration of requests
Once routing mechanism 135 can discriminate among requests, a preferred embodiment of the invention can migrate a request when a server fails while the request is being processed. According to the system and method of a preferred embodiment, when a failure or overload occurs while a server is processing requests, those requests are smoothly migrated and recovered. To achieve this, the system of the invention also needs a state-detection mechanism that quickly identifies server overload or failure.
In general, web requests can be classified into three types: requests for static content, requests for dynamic content, and requests for interactive process-related services. Each type has its own method of migration and recovery. The allocator can identify the type of each request, for example by consulting the URL table, and when a server fails or is overloaded it migrates and recovers each type of request using the corresponding scheme.
Requests for static content
A considerable proportion of all web requests are for static content such as HTML files, images and audio/video clips. The mechanism of the invention can migrate such a request to another node in server cluster 140. To do so, allocator 130 first selects a new server (for example, according to some load-balancing mechanism) and then selects an idle pre-established connection to that target server. The allocator then rebinds the client's connection to the connection on the newly selected server. After the new binding is fixed, allocator 130 issues a range request to the selected server node over the new connection. Range requests are defined in version 1.1 of the HTTP protocol and allow a client to request only part of a resource. Using this feature, a request whose file download was interrupted by a failure can resume on another node from the point at which it was interrupted.
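The server selection and rebinding steps can be sketched as follows. The text only says "some load-balancing mechanism", so choosing the least-loaded server is an assumption made here, and the function name `migrate`, the dictionary layouts and the server and connection labels are all hypothetical.

```python
def migrate(mapping_table, idle_pool, load, client, failed_server):
    """Rebind a client's connection to an idle pre-established connection
    on another server. Returns the new server, or None if none is usable.

    mapping_table: client -> (server, connection) bindings
    idle_pool:     server -> list of idle pre-established connections
    load:          server -> current load figure (assumed to be available)
    """
    candidates = [s for s, conns in idle_pool.items()
                  if s != failed_server and conns]
    if not candidates:
        return None
    # Assumption: least-loaded selection stands in for the load-balancing step.
    target = min(candidates, key=lambda s: load[s])
    connection = idle_pool[target].pop()
    mapping_table[client] = (target, connection)  # the new binding
    return target


mapping_table = {"client-110": ("s1", "s1-conn-0")}
idle_pool = {"s1": ["s1-conn-1"], "s2": ["s2-conn-0", "s2-conn-1"], "s3": []}
load = {"s1": 5, "s2": 2, "s3": 0}
new_server = migrate(mapping_table, idle_pool, load, "client-110", "s1")
```

In this example s3 has the lowest load but no idle pre-established connection, so the request moves to s2 and one of its idle connections leaves the pool.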
From the TCP information recorded in the mapping table (that is, the ACK number and sequence number), allocator 130 can infer how many bytes client 110 has successfully received. Allocator 130 can then construct a range request by including a Range header that specifies the required byte range (usually starting from the ACK number most recently reported by the client).
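As a rough illustration of this calculation, the sketch below derives a Range header from the client's last ACK. It assumes that the first data byte of the response carries the initial sequence number plus one (the SYN consumes one number) and that the length of the HTTP response header is known, so that it can be subtracted from the acknowledged byte count; the text does not spell out either detail. The function names are hypothetical.

```python
MOD = 2 ** 32  # TCP sequence and ACK numbers are 32-bit and wrap around


def bytes_acked(initial_seq, last_ack):
    """Bytes of the server-to-client stream that the client has acknowledged."""
    return (last_ack - initial_seq - 1) % MOD


def range_header(initial_seq, last_ack, response_header_len):
    """Build the Range header for resuming an interrupted static download."""
    received = bytes_acked(initial_seq, last_ack)
    # Assumption: the ACK count includes the HTTP response header, which is
    # not part of the resource body addressed by a byte range.
    body_received = max(received - response_header_len, 0)
    return f"Range: bytes={body_received}-"


# 200 header bytes and 5000 body bytes were acknowledged before the failure.
header = range_header(initial_seq=1000, last_ack=6201, response_header_len=200)
```

The open-ended form `bytes=N-` asks the new server for everything from offset N to the end of the resource, which matches resuming a download at the point of interruption.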
By combining these two techniques, the reuse of pre-established connections and the seamless relaying of packets between two TCP connections, a failed request can be recovered smoothly on another node in the system. Note also that the response to a range request has a distinct HTTP header compared with an ordinary response (for example, it carries status code 206). This difference must also be rewritten by the loadable module 136 of the back-end server in server cluster 140.
Requests for dynamic content
Some Web requests are for dynamic content, whose response is generated on demand, mostly from parameters (arguments) supplied by the client (for example, by CGI scripts or ASP). A failed request for dynamic content can be transferred and recovered simply by resending the request with the same parameters to another server node. In some situations, however, simply relaying a dynamic request with the same parameters causes problems. The main problem is that some dynamic requests are not idempotent. In other words, two successive requests for dynamic content with the same parameters do not necessarily return identical results. The most common example is a dynamic web page built from a database: two successive identical requests for the same page may return different results because the database has been updated in between. This means that the range-request technique used for static content cannot be used to "stitch" together the results of two dynamic requests. To recover a dynamic request on another node, the client would have to discard the response data it has already received and send the same request again. That approach, however, is not transparent to the user and is incompatible with the browsers in common use today.
The present invention therefore solves this problem with the method described below. The allocator 130 must store the response to a dynamic request before forwarding it ("store and forward"). In other words, the allocator 130 does not transmit the response to the client 110 until it has received the complete result. Therefore, if a server node fails in the middle of handling a dynamic request, the allocator 130 can drop the connection and send the same request again to another node. Only after the allocator 130 has received the complete result does it begin transmitting it to the client 110.
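The store-and-forward behaviour just described can be illustrated with a small simulation. This is only a sketch: `NodeFailure`, `fetch_dynamic` and the two demo nodes are hypothetical names, and real back ends would be reached over network connections rather than called as functions.

```python
from typing import Callable, Sequence


class NodeFailure(Exception):
    """Raised when a (simulated) back-end node fails mid-response."""


# A back end takes (url, params) and returns the complete response body.
Backend = Callable[[str, dict], bytes]


def fetch_dynamic(backends: Sequence[Backend], url: str, params: dict) -> bytes:
    """Return a complete response, retrying on the next node after a failure.

    Nothing reaches the client until one node has produced the whole
    response, so a partial result is never visible to the client.
    """
    for backend in backends:
        try:
            return backend(url, params)  # buffered in full before returning
        except NodeFailure:
            continue  # drop this connection and resend to another node
    raise NodeFailure("all back-end nodes failed")


def failing_node(url: str, params: dict) -> bytes:
    """Demo node that always fails while generating the page."""
    raise NodeFailure("node crashed while generating the page")


def healthy_node(url: str, params: dict) -> bytes:
    """Demo node that returns a complete page."""
    return f"<html>{url} {sorted(params.items())}</html>".encode()
```

Calling `fetch_dynamic([failing_node, healthy_node], "/cgi-bin/q", {"id": "7"})` returns the page from the healthy node; the failure of the first node stays hidden from the caller.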
Although storing the response before forwarding it solves the idempotency problem, the approach has two possible drawbacks. First, the allocator 130 may become a single point of failure. This problem is addressed in another embodiment of the system and method of the present invention, described later. Second, even though the server response is stored in the allocator 130 only temporarily, doing so can significantly degrade the performance of the allocator 130, which in turn reduces the throughput of the whole server system. Statistics indicate, however, that dynamic web pages are usually small, so the performance impact is not too severe.
To eliminate the concern about performance degradation altogether, the allocator is designed to function also as a reverse proxy cache. That is, the allocator 130 can store dynamic web pages in a cache so that subsequent requests for the same dynamic page are served directly from the cache, rather than repeatedly starting a program to generate the same page again.
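A reverse proxy cache of this kind might look like the sketch below. The class name, the cache key (URL plus sorted parameters) and the absence of any expiry policy are assumptions made for illustration; the source does not describe the cache-management algorithm, and such caching is only safe for requests that are idempotent.

```python
from typing import Callable, Dict, Tuple

CacheKey = Tuple[str, Tuple[Tuple[str, str], ...]]


class DynamicPageCache:
    """Caches generated pages keyed by URL and request parameters."""

    def __init__(self, generate: Callable[[str, Dict[str, str]], bytes]) -> None:
        self._generate = generate          # runs the back-end program
        self._pages: Dict[CacheKey, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str, params: Dict[str, str]) -> bytes:
        # Sort the parameters so that their order does not affect the key.
        key = (url, tuple(sorted(params.items())))
        if key in self._pages:
            self.hits += 1
            return self._pages[key]        # served without a back-end call
        self.misses += 1
        page = self._generate(url, params)
        self._pages[key] = page
        return page
```

Two requests for the same URL with the same parameters, in any order, trigger only one call to the generating program.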
In one experiment, this caching algorithm was implemented to manage the cached dynamic web pages. The test results show that a system adopting the present invention not only recovers from failures but also gains a marked improvement in overall performance from this design. This is because requests for dynamic content often slow a web server down. With this design, idempotent dynamic requests are served directly from the cache, which reduces the load on the back-end web servers.
Requests for interactive-process-related services
Any single Web access process may involve several interactive operations by the user. Here the user is not merely browsing several unrelated static or dynamically generated web pages; rather, under the control of, for example, a CGI script, the user is guided through a process that shares some state on a server. Such state may include, for example, the contents of an electronic "shopping cart" (a shopping list on a shopping website) or a list of results from a search.
Services related to such interactive processes are usually based on a so-called "three-tier" architecture, consisting of a front-end client (for example, a browser), a middle-tier web server and a back-end database server. Basically, the front-end client provides the user interface through which the user sends requests to the web server in order to access services. The web server, in turn, runs the application programs and carries out the business logic: it processes client requests, commits them to the database server, stores the resulting state and returns the results to the client. As those of ordinary skill know, the database server manages information and transactions at the back end.
If a web server fails in the middle of a transaction, the end user usually cannot obtain any information about whether the request was successfully committed or about any fault that occurred. The user may simply wait at the client until a timeout occurs. Some users may resend the request, but because its state has been lost, these requests receive no response. Other users may resend all their requests in an attempt to complete the process, which risks completing the transaction more than once so that the user is charged repeatedly. Such situations anger users and, because of the service interruption, seriously damage the credibility of the website.
Recovering an interactive process on another node is more complicated. It requires application-specific details such as when the process begins, its internal state, its intermediate parameters and when it ends. To make the process itself fault tolerant, a mechanism is also needed to replicate the intermediate processing states.
First, the interactive processes that are to receive fault-tolerant or higher-performance service must be defined in advance by the site administrator. For example, the administrator can define the action "the user adds a first item to the shopping cart on a particular web page" as the signal that a process begins, and the action "the user presses the checkout button" as the signal that the process ends. The administrator can easily make such configuration definitions through the GUI of a management system.
This configuration information can be stored in the URL table. As described above, the allocator 130 consults the URL table in order to dispatch each incoming request to one of the web servers in the server cluster 140. When the allocator 130 finds (according to its "judgement") that a request contains a "begin" action, it marks the client and from then on directs all subsequent requests from that client to one of the twin servers, until a request containing an "end" action is found among the subsequent requests.
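The marking behaviour described above can be sketched as a small router. The class and parameter names are hypothetical, the "begin" and "end" actions are represented simply as URLs, and the round-robin choice of twin server is an assumption; the source leaves the selection policy open.

```python
from typing import Callable, Dict, Iterable, List


class SessionRouter:
    """Pins a client to a twin server between a begin and an end action."""

    def __init__(self, begin_urls: Iterable[str], end_urls: Iterable[str],
                 twin_servers: List[str],
                 pick_server: Callable[[str], str]) -> None:
        self._begin = set(begin_urls)
        self._end = set(end_urls)
        self._twins = list(twin_servers)
        self._pick_server = pick_server      # ordinary dispatch policy
        self._marked: Dict[str, str] = {}    # client -> twin server
        self._next_twin = 0

    def route(self, client: str, url: str) -> str:
        if client in self._marked:
            target = self._marked[client]
            if url in self._end:
                # The end request itself still goes to the twin server.
                del self._marked[client]
            return target
        if url in self._begin:
            target = self._twins[self._next_twin % len(self._twins)]
            self._next_twin += 1
            self._marked[client] = target
            return target
        return self._pick_server(url)
```

With `/cart/add` configured as the begin action and `/checkout` as the end action, every request from a marked client, including unrelated page views, goes to the same twin server until checkout completes.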
The twin server mentioned above is a logical combination of servers comprising a master server (primary server) and a backup server. The master server is a regular web server dedicated to providing transaction services. The backup server, in turn, is responsible for backing up the master server. Those skilled in the art will appreciate that the master server and the backup server paired into a twin-server group can be selected by the system from the server cluster according to various conditions.
The backup server holds two IP addresses: its own address and the address of the master server. The master server's IP address is configured as an alias on the network interface so that the local protocol stack accepts incoming packets addressed to it. The backup server does not, however, advertise this alias address via ARP (Address Resolution Protocol), because doing so would cause an IP address conflict. In addition, one backup node can provide backup service for several master servers at the same time. The timing diagram of Fig. 2 shows schematically the details of the communication protocol executed by the twin server.
As shown in the timing diagram of Fig. 2, when a user, shown as the client 210 in Fig. 2, sends a request that the routing mechanism (136 in Fig. 1) classifies as related to an interactive process, namely the client request 211, the allocator 130 directs the request to the master server 220 of the twin-server group for processing. Because the backup server 240 uses the master server's address as an alias, it receives every packet sent to the master server. An HTTP daemon in the backup server 240 (not shown) therefore also receives the client request 211, treats it as its backup client request 212, "silently" logs it as the logged request 213, and thereby stays synchronized with the master server 220. In this way the backup server 240 maintains the same process state information as the master server 220. Unless the master server 220 has failed, however, the HTTP daemon in the backup server 240 does not send out any packets or results. This ensures that the client 210 receives only one result.
After the backup server 240 has successfully logged the request, shown as the logged request 213 in the figure, it sends a "start" message 214 for the client request 211 to the master server 220. The master server 220 does not commit the client request 211 to the database server 250 until it receives this start message 214. If the master server 220 has waited a considerable time for the start message 214, it actively sends a message to the backup server 240 asking it to log the request, and then waits again for the start message 214. If the master server 220 still cannot obtain the start message 214, it suspects that the backup server 240 has failed, and a new backup server is then created.
Once the master server 220 has received the start message 214, the client request 211 triggers a conventional protocol with the database server 250, the two-phase commit 215, which ensures that the request remains consistent in the transactional sense. When the database server 250 returns the database server response 216, the master server 220 sends a message, the log result 217, to the backup server 240 so that the forthcoming result is logged before the request is committed to the database server 250. After receiving the result-logged Ack (acknowledgement) message 218 from the backup server 240, the master server 220 sends the commit request 219 to formally commit the client's request to the database server 250.
When the database server 250 has completed the transaction in response to the commit request 219, it replies to the master server 220 with a database server Ack message 221. Because of the address alias, the backup server 240 also receives this database server Ack message 221. The backup server 240 then sends a backup server Ack message 222 to inform the master server 220 that it has logged the result of the transaction. After receiving both Ack messages 221 and 222 (from the database server 250 and the backup server 240 respectively), the master server 220 sends the result web page 223 to the client 210. When both the master server 220 and the backup server 240 have received a client Ack message 224 from the client 210, both conclude that the protocol has completed. If the whole process has ended at that point, the master server 220 sends a discard-log instruction 225 to the backup server 240 so that it discards the logged data. When the master server 220 fails or is overloaded, the backup server 240 activates the alias address and takes over all the work of the master server 220; the HTTP daemon then uses the replicated state to start sending data in place of the master server 220.
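The ordering of the messages in Fig. 2 can be made concrete with a heavily simplified, single-process simulation. It is a sketch under several assumptions: all class and method names are hypothetical, the master calls the backup directly instead of the backup overhearing packets through the alias address, and the acknowledgements 221, 222 and 224 are collapsed into return values. The reference numerals in the comments are those used in the description.

```python
class Database:
    """Stand-in for the database server 250."""

    def __init__(self) -> None:
        self.committed = []

    def prepare(self, request: str) -> str:
        # First phase of the two-phase commit (215); returns response 216.
        return f"result of {request}"

    def commit(self, request: str) -> str:
        # Second phase (commit request 219); returns the database Ack (221).
        self.committed.append(request)
        return "db-ack"


class Backup:
    """Stand-in for the backup server 240."""

    def __init__(self) -> None:
        self.log = {}

    def log_request(self, req_id: int, request: str) -> str:
        self.log[req_id] = {"request": request, "result": None}  # 213
        return "start"                                           # 214

    def log_result(self, req_id: int, result: str) -> str:
        self.log[req_id]["result"] = result                      # 217
        return "result-ack"                                      # 218

    def discard(self, req_id: int) -> None:
        self.log.pop(req_id, None)                               # 225

    def take_over(self, req_id: int) -> str:
        # After a master failure, answer the client from the logged state.
        return self.log[req_id]["result"]


class Master:
    """Stand-in for the master server 220."""

    def __init__(self, backup: Backup, db: Database) -> None:
        self.backup = backup
        self.db = db

    def handle(self, req_id: int, request: str, crash_before_reply: bool = False):
        if self.backup.log_request(req_id, request) != "start":
            raise RuntimeError("backup did not confirm the logged request")
        result = self.db.prepare(request)
        if self.backup.log_result(req_id, result) != "result-ack":
            raise RuntimeError("backup did not log the result")
        self.db.commit(request)
        if crash_before_reply:
            return None            # simulated failure: the backup must reply
        self.backup.discard(req_id)
        return result              # result web page 223
```

In this sketch a master that crashes after committing but before replying leaves the result in the backup's log, so the backup can answer the client without the transaction being committed a second time.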
It should be noted that the allocator 130 does not need to know whether the master server 220 has failed. The allocator 130 always relays packets to the "representative" address (that is, the alias IP address). As long as the backup server can take over the master server's work and activate the alias address, the web hosting service continues without interruption even when the master server fails. This design relieves the allocator of that burden and improves the overall performance of the system.
Apache is a web server software package widely used on the Internet today, and its source code is open. In one embodiment of the present invention, Apache is modified to implement the protocol described above. Apache was chosen for extension because of its openness, reliability, efficiency and popularity. Apache handles incoming requests in a process-per-request model: it creates a pool of processes in advance, and each process calls the accept() system call to accept new connections. In general, Apache handles each request through the following steps: (1) accept the request, (2) parse its parameters for later use, (3) translate the URL, (4) check access authorization, (5) determine the MIME type of the requested file, (6) process the request, (7) send the response back to the client, and (8) log the request. In implementing the present invention, decision logic is inserted into step (6). If the process belongs to the master server, it sends a log message to the backup server and waits for its reply; until it receives the backup server's reply, the process does not commit the request to the database server. The log message consists of the source IP address, the client port number and the process ID. If the process belongs to the backup server, it does process the request but waits for the log message from the master server. This means that the backup server maintains the same set of processing states as the master server but skips step (7).
As described above, the allocator 130 shown in Fig. 1 may suffer from the problem known as a "single point of failure". In other words, a failure of the allocator brings down the entire server system. In addition, as shown in Fig. 1, the allocator 130, because of its centralized design and software-based implementation, may also become a potential bottleneck for system performance or scalability. Even so, a performance evaluation of an implementation based on the foregoing embodiment shows that it still scales quite well in a medium-sized server farm.
To further improve the scalability and fault tolerance of the system of Fig. 1, a modified system of the present invention, shown in Fig. 3, has a dispatching device 330 that can be applied in the web server system of the present invention as another embodiment. The dispatching device 330 has several allocators 331, 332, ..., 339 that cooperate as a whole to dispatch incoming requests. It should be noted that each of the allocators 331, 332, ..., 339 may also be a single dedicated dispatching device based on a microcomputer or a microcontroller, which can perform its function without the cooperation of network switching equipment such as that shown as 360 in the figure. In the system shown in Fig. 3, however, each of the allocators 331, 332, ..., 339 may be a dedicated PC-based dispatching computer connected to the network 370. In such a structure, a DNS-like approach can be used to map different clients to different allocators. A set of daemons providing fault tolerance was implemented on the allocator nodes 331, 332, ..., 339, which in Fig. 3 are logically organized in a ring, and was also implemented in the configuration used for an experimental evaluation. These daemons are built on the SwiFT Toolkit. Each allocator node runs a daemon that monitors the state of its logical neighbour and provides backup for it.
During this experimental evaluation, all allocators take part in load sharing under normal operation. No allocator is kept in an idle hot-standby state waiting for the primary allocator to fail. The operation of an allocator depends on two important kinds of state: the URL table and the connection binding information. The URL table is state that can be regenerated after a failure. The connection binding information, by contrast, must be replicated by the backup node, since otherwise it cannot be regenerated by another backup node when the primary node fails. The primary allocator is therefore programmed to keep a log of the latest changes to the connection binding information. The logged state changes are periodically copied to its backup node so that the replicated table there is updated. If the primary node really does fail, the backup node uses the replicated state to take over its work. After the takeover, however, some state for newly established connections may be unavailable, either because the replicated state table had not yet been updated or because a periodic update message was lost. The kernel modules in the back-end servers, which also maintain connection binding information, take care of this situation. If a server node has not received any packet from the allocator for a considerable time, it broadcasts a message to query for the existence of the backup allocator and then registers its binding information with it.
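The replication of connection binding information can be illustrated with the sketch below. All names are hypothetical, connections are represented by plain strings, and the periodic timer and the broadcast by back-end nodes are reduced to explicit method calls; it is not the SwiFT-based implementation mentioned above.

```python
from typing import Dict, List, Tuple


class DispatcherNode:
    """One allocator node holding bindings and a replica of its neighbour's."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.bindings: Dict[str, str] = {}           # client conn -> server conn
        self.change_log: List[Tuple[str, str]] = []  # changes not yet replicated
        self.replica: Dict[str, str] = {}            # copy of a neighbour's bindings

    def bind(self, client_conn: str, server_conn: str) -> None:
        """Record a new binding and log the change for later replication."""
        self.bindings[client_conn] = server_conn
        self.change_log.append((client_conn, server_conn))

    def replicate_to(self, backup: "DispatcherNode") -> None:
        """Periodic update: copy the logged changes to the backup node."""
        for client_conn, server_conn in self.change_log:
            backup.replica[client_conn] = server_conn
        self.change_log.clear()

    def take_over(self) -> None:
        """Adopt the replicated bindings after the neighbour has failed."""
        self.bindings.update(self.replica)
        self.replica.clear()
```

A binding created after the last periodic update is missing when the backup takes over; in the scheme described above, the back-end node holding that connection would then register it again with the backup allocator, which corresponds to a further `bind` call in this sketch.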
The flowchart of Fig. 4 shows the processing flow of the request handling method according to one embodiment of the present invention when a server fails. Basically, after request processing begins in step 400, the method performed by the system of the present invention checks in step 410 whether a server failure has occurred. If no failure or overload has occurred, the flow ends at end step 450 and requests are processed by normal operation. If a server failure does occur while a request is being processed, the system and method of the present invention act to guarantee zero-loss Web service. As the flowchart and the foregoing description explain, the present invention makes determinations in decision steps 420, 430 and 440 and, for each of the three types of request, carries out request transfer and recovery in the corresponding steps 422, 432 and 442. A request that has been transferred and recovered continues its normal processing on the recovering server in step 450.
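The three-way decision in the flowchart can be summarized in a short sketch. The classification rules here (file suffixes and a flag saying whether the client is inside a marked interactive process) are illustrative assumptions only; in the described system the classification comes from the routing mechanism and the URL table.

```python
from enum import Enum


class RequestType(Enum):
    STATIC = "static"
    DYNAMIC = "dynamic"
    SESSION = "session"


# Hypothetical suffixes that mark dynamically generated content.
DYNAMIC_SUFFIXES = (".cgi", ".asp")


def classify(path: str, client_in_session: bool) -> RequestType:
    """Classify a request into one of the three types handled in Fig. 4."""
    if client_in_session:
        return RequestType.SESSION
    if path.split("?", 1)[0].endswith(DYNAMIC_SUFFIXES):
        return RequestType.DYNAMIC
    return RequestType.STATIC


RECOVERY = {
    RequestType.STATIC: "resume on another node with an HTTP/1.1 range request",
    RequestType.DYNAMIC: "discard the partial result and resend the request to another node",
    RequestType.SESSION: "let the backup server of the twin server take over with its logged state",
}


def recovery_action(path: str, client_in_session: bool) -> str:
    """Return the recovery strategy for a request whose server has failed."""
    return RECOVERY[classify(path, client_in_session)]
```

Each branch corresponds to one of the recovery mechanisms described earlier for static content, dynamic content and interactive processes.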
The foregoing description has therefore concentrated on the design and practical implementation of the request transfer mechanism needed to achieve fault-tolerant Web service. The results of repeated simulations and tests show and confirm that the mechanisms in the zero-loss information network service system and method of the present invention do provide a powerful solution that supports fault-tolerant Web service.
For example, the most notable result of one long-running test was that the overall error rate of the experiment was zero. In other words, although some nodes in the server system were deliberately made to fail during the test in order to emulate server failures, no request was lost. In this simulation, when a server node failed, the requests being processed on the failed node were all successfully transferred to other available nodes in the system. When three nodes failed at the same time, the allocator detected that the system was overloaded and successfully brought in two idle nodes to share the load. Within only a few seconds the whole system stabilized and returned to normal. Tests of this kind confirm that the disclosed Web service system and method do provide zero-loss fault tolerance and can relieve service overload.
Although the foregoing is a complete description of specific embodiments of the present invention, various modifications, alternative constructions and equivalents may still be used. The foregoing description should therefore not be taken as limiting the present invention, whose scope is defined by the appended claims.

Claims (19)

1. A system for providing zero-loss information network service, capable of serving clients over a communication network in response to client access requests having predetermined addressing parameters, the system comprising:
a plurality of servers that communicate with one another; and
an allocator including a routing mechanism, the allocator being capable of communicating with the communication network and with the servers, wherein the allocator dispatches an access request to one of the plurality of servers according to the predetermined addressing parameter and, when one of the plurality of servers fails, the routing mechanism transfers the access request to another of the plurality of servers.
2. The system according to claim 1, wherein the allocator further includes a network switch that connects the allocator to the plurality of servers over network connections.
3. The system according to claim 1, wherein the allocator is a dedicated PC-based dispatching computer.
4. The system according to claim 1, wherein the allocator is a dedicated microcontroller-based allocator.
5. The system according to claim 1, wherein the allocator further includes a plurality of separate allocators that communicate with one another in an allocator network, the separate allocators cooperating as a whole to dispatch client access requests, each request being dispatched to one of the plurality of servers.
6. The system according to claim 5, wherein the allocator is a dedicated PC-based dispatching computer.
7. The system according to claim 5, wherein the allocator is a dedicated microcontroller-based allocator.
8. The system according to claim 1, wherein the communication network is the Internet.
9. The system according to claim 1, wherein the communication network is a corporate intranet.
10. A system for providing zero-loss information network service, capable of serving clients over a communication network in response to client access requests having predetermined addressing parameters, the system comprising:
a server cluster including a plurality of servers that communicate with one another, the plurality of servers being combined into a plurality of twin-server groups, each twin server including a master server and a backup server; and
an allocator including a routing mechanism, the allocator being capable of communicating with the communication network and with the server cluster, wherein the allocator, according to the predetermined addressing parameter, dispatches an access request requiring an interactive-process-related service to the master server of a committed one of the twin-server groups and, when the master server fails, the routing mechanism transfers the access request to the backup server of that twin-server group.
11. The system according to claim 10, wherein, when the backup server of the committed twin server fails, the allocator designates a server in the server cluster to replace the backup server.
12. The system according to claim 10, wherein the backup server of the committed twin server logs the access request dispatched to the master server of the committed twin server and synchronizes with that master server, so as to carry out the transfer of the access request.
13. A method for providing zero-loss information network service in a system including a plurality of servers, for serving clients over a communication network in response to client access requests having predetermined addressing parameters, the method comprising the steps of:
a) identifying an access request and classifying it as one of a static information request, a dynamic information request or an interactive-process-related service request;
b) dispatching the access request to one of the plurality of servers according to the predetermined addressing parameter; and
c) when one of the plurality of servers fails, transferring the access request requiring the interactive-process-related service to another of the plurality of servers.
14. A method for providing zero-loss information network service in a system including a plurality of servers, for serving clients over a communication network in response to client access requests having predetermined addressing parameters, the method comprising the steps of:
a) identifying an access request and classifying it as one of a static information request, a dynamic information request or an interactive-process-related service request;
b) forming, among the plurality of servers, a committed twin server including a master server and a backup server;
c) processing, by the master server of the committed twin server and according to the predetermined addressing parameter, the access request requiring the interactive-process-related service; and
d) when the master server fails, transferring the access request requiring the interactive-process-related service to the backup server of the committed twin server.
15. The method according to claim 14, wherein step c) of processing the access request requiring the interactive-process-related service further includes:
c1) the backup server logging the access request dispatched to the master server of the committed twin server, and synchronizing with that master server; and
c2) the backup server logging the data produced by a database server of the system as the result of the access request requiring the interactive-process-related service.
16. The method according to claim 15, wherein step d) of transferring the access request requiring the interactive-process-related service to the backup server of the committed twin server further includes:
d1) transmitting the logged data to the client; and
d2) discarding the logged data.
17. The method according to claim 14, wherein step c) of processing the access request requiring the interactive-process-related service further includes:
c1) the backup server logging the access request dispatched to the master server of the committed twin server, and synchronizing with that master server;
c2) the master server committing to a database server of the system using a two-phase commit protocol;
c3) the database server producing, according to the two-phase commit protocol, data as the result of the access request requiring the interactive-process-related service; and
c4) the backup server logging the data produced by the database server as the result of the access request requiring the interactive-process-related service.
18. The method according to claim 17, wherein step d) of transferring the access request requiring the interactive-process-related service to the backup server of the committed twin server further includes:
d1) transmitting the logged data to the client; and
d2) discarding the logged data.
19. The method according to claim 14, wherein step c), in which the acquisition request requesting the transaction-based service is processed, further comprises:
C1) the backup server logs the acquisition request dispatched to the master server of the twin server, and synchronizes with the master server within the twin server;
C2) the master server commits to a database server of the system using a two-phase commit protocol;
C3) according to the two-phase commit protocol, the database server generates data as the result of the acquisition request requesting the transaction-based service; and
C4) the backup server logs the data that the database server generates as the result of the acquisition request requesting the transaction-based service; and
wherein step d), in which the acquisition request requesting the transaction-based service is transferred to the backup server of the twin server, further comprises:
D1) transferring the logged data to the client; and
D2) discarding the logged data.
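The logging, two-phase commit, and takeover steps recited in the claims above can be illustrated with a minimal Python sketch. Everything named here (`Database`, `BackupServer`, `handle_request`, the `primary_fails` flag, the string result format) is hypothetical and chosen only for illustration; the claims do not prescribe an implementation. The sketch also assumes that the master passes the database result to the backup for logging, and that the backup discards its log entries once the client has been answered in the failure-free case, neither of which is stated in the claims.

```python
from dataclasses import dataclass, field


class Database:
    """Toy database server taking part in a two-phase commit (hypothetical)."""

    def __init__(self):
        self.committed = {}
        self._pending = {}

    def prepare(self, request_id, payload):
        # Phase 1: stage the work and vote to commit.
        self._pending[request_id] = f"result-of-{payload}"
        return True

    def commit(self, request_id):
        # Phase 2: make the staged result durable and return it.
        result = self._pending.pop(request_id)
        self.committed[request_id] = result
        return result


@dataclass
class BackupServer:
    """Backup half of the server pair; it only keeps logs (hypothetical)."""

    request_log: dict = field(default_factory=dict)
    result_log: dict = field(default_factory=dict)

    def log_request(self, request_id, payload):
        # Step C1: record the request dispatched to the master server.
        self.request_log[request_id] = payload

    def log_result(self, request_id, result):
        # Step C4: record the data generated by the database server.
        self.result_log[request_id] = result

    def relay_and_discard(self, request_id):
        # Steps D1 and D2: hand the logged data back, then drop the entries.
        result = self.result_log.pop(request_id)
        self.request_log.pop(request_id, None)
        return result


def handle_request(request_id, payload, db, backup, primary_fails=False):
    """Process one request; optionally simulate a master failure before the reply."""
    backup.log_request(request_id, payload)            # C1
    if not db.prepare(request_id, payload):            # C2, phase 1
        raise RuntimeError("database voted to abort")
    result = db.commit(request_id)                     # C2 phase 2, C3
    backup.log_result(request_id, result)              # C4
    if primary_fails:
        # The master is gone, so the backup answers the client from its log.
        return backup.relay_and_discard(request_id)    # D1, D2
    # Assumption: after a normal reply the backup's log entries are dropped.
    backup.relay_and_discard(request_id)
    return result


db = Database()
backup = BackupServer()
reply = handle_request("r1", "order", db, backup, primary_fails=True)
print(reply)
```

In this sketch the client receives the same reply whether or not the master fails after the commit, because the backup already holds the committed result. A real system would additionally need failure detection and connection takeover, which are outside the scope of this example.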
CNB011421770A 2001-09-14 2001-09-14 Zero-loss information network service system and method Expired - Fee Related CN1182477C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB011421770A CN1182477C (en) 2001-09-14 2001-09-14 Zero-loss information network service system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB011421770A CN1182477C (en) 2001-09-14 2001-09-14 Zero-loss information network service system and method

Publications (2)

Publication Number Publication Date
CN1405698A true CN1405698A (en) 2003-03-26
CN1182477C CN1182477C (en) 2004-12-29

Family

ID=4676680

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB011421770A Expired - Fee Related CN1182477C (en) 2001-09-14 2001-09-14 Zero-loss information network service system and method

Country Status (1)

Country Link
CN (1) CN1182477C (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102325154A (en) * 2011-07-13 2012-01-18 百度在线网络技术(北京)有限公司 Network system with disaster-tolerant backup function and method for realizing disaster-tolerant backup function
CN102325154B (en) * 2011-07-13 2014-11-05 百度在线网络技术(北京)有限公司 Network system with disaster-tolerant backup function and method for realizing disaster-tolerant backup function
CN104683254A (en) * 2013-11-29 2015-06-03 英业达科技有限公司 Route control method and route control device

Also Published As

Publication number Publication date
CN1182477C (en) 2004-12-29

Similar Documents

Publication Publication Date Title
KR100243637B1 (en) Computer server system
US11175913B2 (en) Elastic application framework for deploying software
US20020199014A1 (en) Configurable and high-speed content-aware routing method
US6219692B1 (en) Method and system for efficiently disbursing requests among a tiered hierarchy of service providers
US5867706A (en) Method of load balancing across the processors of a server
US7426546B2 (en) Method for selecting an edge server computer
EP1116112B1 (en) Load balancing in a network environment
EP1315349B1 (en) A method for integrating with load balancers in a client and server system
US20020152307A1 (en) Methods, systems and computer program products for distribution of requests based on application layer information
US20050204020A1 (en) Shared internet storage resource, user interface system, and method
US20020169889A1 (en) Zero-loss web service system and method
US20030204622A1 (en) Dynamic invocation of web services
CN107835437B (en) Scheduling method and device based on multiple cache servers
CN101296176B (en) Data processing method and apparatus based on cluster
CN1592303A (en) Methods and systems for application instance level workload distribution affinities
CN102577237A (en) Method for scheduling web hosting service, method for processing application access, apparatus and system thereof
KR100656222B1 (en) Dynamic addressing in transient networks
CN1647482B (en) System and method for network communication management
US7580989B2 (en) System and method for managing access points to distributed services
CN100580665C (en) Method for supporting index server to file sharing applications and index server
CN106982247A (en) Web-based distributed picture storage system
CN1182477C (en) Zero-loss information network service system and method
CN115516842A (en) Orchestration broker service
JP2001067325A (en) Method and system for managing distributed object
CN115766736A (en) Independent operation system and method for multiple front-end developers

Legal Events

Date Code Title Description
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C06 Publication
PB01 Publication
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 2004-12-29

Termination date: 2015-09-14

EXPY Termination of patent right or utility model