CN107465735B - Distributed messaging system - Google Patents

Distributed messaging system

Info

Publication number
CN107465735B
Authority
CN
China
Prior art keywords
copy
theme
data
subject
cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710637643.0A
Other languages
Chinese (zh)
Other versions
CN107465735A (en
Inventor
胡悦
吴文龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Duomai Electronic Commerce Co ltd
Original Assignee
Hangzhou Duomai Electronic Commerce Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Duomai Electronic Commerce Co ltd filed Critical Hangzhou Duomai Electronic Commerce Co ltd
Priority to CN201710637643.0A priority Critical patent/CN107465735B/en
Publication of CN107465735A publication Critical patent/CN107465735A/en
Application granted granted Critical
Publication of CN107465735B publication Critical patent/CN107465735B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/10 Active monitoring, e.g. heartbeat, ping or trace-route
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Cardiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the field of message processing, and in particular to a distributed message system. The system includes a production cluster for producing topic message data, a service cluster for storing the topic message data, a consumption cluster for consuming the topic message data, and a management cluster for managing the consumption cluster and the service cluster. The service cluster comprises a topic partition for storing the topic message data; the topic partition comprises a master copy and a slave copy distributed on different service ends of the service cluster, and the slave copy is a redundant backup of the master copy. A production end in the production cluster accesses the master copy of the topic partition to store the topic message data it generates, and a consumption end in the consumption cluster accesses the master copy of the topic partition to consume the message data of the topic partition. During consumption, the consumption end takes data only from the master copy of the topic partition, so under normal conditions the message consumption information does not need to be synchronized to the slave copy, which reduces the burden on the service end.

Description

Distributed messaging system
Technical Field
The invention relates to the field of message processing, in particular to a distributed message system.
Background
There are many message systems currently used for message (e.g., log) processing, and distributed message systems are more popular.
A distributed messaging system framework is shown in Fig. 1. The system comprises a Producer (PD), an Agent (CS), a Consumer (CS) and a third-party management cluster, and each role may have multiple instances. The Producer sends messages to the Agent, the Agent stores them persistently, and the Consumer obtains messages from the Agent for processing. The third-party management cluster stores state information of the Producers, Consumers and Agents.
The distributed messaging system manages messages by topic (Topic), and messages are also stored by topic. The messages of each Topic may be stored in one or more storage partitions (Partitions) of the Agents. The third-party management cluster stores the list of all Consumers, which Topics are stored on each Agent, and how many Partitions each Topic has. During message processing, several Consumers generally cooperate to process the messages of a given topic, so the third-party management cluster also stores which storage partitions are allocated to each Consumer of that topic. Each Agent belonging to the same storage partition holds several message service queues (storage partitions). For example, in the distributed messaging system shown in Fig. 2, message service queues A and B are redundant copies of each other in order to keep the data safe, and a Consumer accesses either message service queue A or B to consume messages. As a result, not only the messages themselves but also the message consumption information must be strictly synchronized between message queues A and B, which creates a serious performance bottleneck for the server.
Disclosure of Invention
To solve the above technical problem, the invention provides a distributed message system comprising a production cluster for producing topic message data, a service cluster for storing the topic message data, a consumption cluster for consuming the topic message data, and a management cluster for managing the consumption cluster and the service cluster. The system is characterized in that: the service cluster comprises a topic partition for storing the topic message data, the topic partition comprises a master copy and a slave copy distributed on different service ends of the service cluster, and the slave copy is a redundant backup of the master copy; a production end in the production cluster accesses the master copy of the topic partition to store the topic message data it generates, and a consumption end in the consumption cluster accesses the master copy of the topic partition to consume the message data of the topic partition. During consumption, a consumption end in a given consumption cluster takes data only from the master copy of the topic partition, so under normal conditions the message consumption information does not need to be synchronized to the slave copy, which reduces the burden on the service end.
Preferably, the production end in the production cluster polls the master copy of each topic partition so that the topic message data it generates is distributed across the topic partitions; the master copy receives and stores topic message data from the production end and updates its slave copies so that they stay synchronized with the master copy.
Preferably, the topic partition includes a set of kept synchronized copies made up of slave copies; when the number of slave copies in the set of kept synchronized copies is less than a preset minimum number of synchronized copies, the topic partition does not receive topic message data from the production cluster.
Preferably, after receiving the topic message data sent by the production end, the master copy sends message submission success information to the production end.
Preferably, the master copy synchronizes to the slave copies after receiving the topic message data sent by the production end, and sends message submission success information to the production end after receiving the synchronization confirmation of the first slave copy.
Preferably, the master copy synchronizes to the slave copies after receiving the topic message data sent by the production end, and sends message submission success information to the production end after receiving synchronization confirmations from a preset number of slave copies, that number being no smaller than the minimum number of synchronized copies of the set of kept synchronized copies.
Preferably, after receiving the topic message data sent by the production end, the topic partition maps the data into the memory of the service end, and the master copy provides the offset address of the topic message data in memory in response to a consumption request from a consumption end for that data.
Preferably, the topic partition includes a data segment skip list (skip table) comprising a plurality of data layers; each lower data layer has more nodes than the layer above it and contains all nodes of the layer above it, and the bottom data layer contains all nodes of the skip list; each node holds segment data, which contains the topic message data.
Preferably, each node includes a pointer to the same node in the next data layer down and a pointer to the next node in its own data layer.
Preferably, the segment data includes an index file that records a one-to-one mapping between the logical version number of each piece of topic message data and its offset address in memory; the logical version number reflects the order in which the topic message data arrived at the service cluster.
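A lookup in a layered structure of this kind can be sketched roughly as follows. This is only an illustrative sketch: the class names `SegmentNode` and `SegmentSkipList`, the promotion probability of 0.5, the maximum height, and the choice to key each node by the first logical version number of its segment are assumptions made for the example and are not taken from the patent.

```python
import random


class SegmentNode:
    """One skip-list node holding a data segment (hypothetical layout)."""

    def __init__(self, base_version, index, height):
        self.base_version = base_version  # first logical version number in the segment
        self.index = index                # logical version number -> offset in memory
        self.forward = [None] * height    # forward[i] is the next node on layer i


class SegmentSkipList:
    MAX_HEIGHT = 8  # assumed upper bound on the number of layers

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self.head = SegmentNode(-1, {}, self.MAX_HEIGHT)  # sentinel, holds no data

    def _random_height(self):
        # Each node is promoted to the next layer up with probability 0.5.
        height = 1
        while height < self.MAX_HEIGHT and self._rng.random() < 0.5:
            height += 1
        return height

    def add_segment(self, base_version, index):
        # Find, on every layer, the last node whose key is below the new key.
        update = [self.head] * self.MAX_HEIGHT
        node = self.head
        for layer in range(self.MAX_HEIGHT - 1, -1, -1):
            while (node.forward[layer] is not None
                   and node.forward[layer].base_version < base_version):
                node = node.forward[layer]
            update[layer] = node
        new_node = SegmentNode(base_version, index, self._random_height())
        for layer in range(len(new_node.forward)):
            new_node.forward[layer] = update[layer].forward[layer]
            update[layer].forward[layer] = new_node

    def find_offset(self, version):
        # Search from the top layer downwards, moving right while the next
        # segment still starts at or before the requested version.
        node = self.head
        for layer in range(self.MAX_HEIGHT - 1, -1, -1):
            while (node.forward[layer] is not None
                   and node.forward[layer].base_version <= version):
                node = node.forward[layer]
        if node is self.head:
            return None
        return node.index.get(version)


# Four segments of 50 messages each; offsets are made-up multiples of 16.
skip_list = SegmentSkipList()
for base in (0, 50, 100, 150):
    skip_list.add_segment(base, {v: v * 16 for v in range(base, base + 50)})
offset_117 = skip_list.find_offset(117)
```

In this sketch the upper layers only narrow down which segment holds the requested logical version number; the per-segment index then yields the offset.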
Drawings
Fig. 1 is a prior art messaging system framework.
Fig. 2 is a prior art message system consumption pattern.
Fig. 3 is the distributed messaging system framework of the present invention.
Fig. 4 is the distributed message system consumption pattern of the present invention.
Fig. 5 is a structure diagram of the data segment skip list (skip table) of the present invention.
Detailed Description
The following specific examples are given by way of illustration only and not by way of limitation. Those skilled in the art may, after reading this disclosure, make changes and modifications to the examples without inventive effort, and such changes remain protected within the scope of the claims.
Example one
Fig. 3 shows the distributed message system framework of the present invention, which includes a production cluster for producing topic message data, a service cluster for storing the topic message data, a consumption cluster for consuming the topic message data, and a management cluster for managing the consumption cluster and the service cluster. The service cluster comprises a topic partition for storing topic message data; the topic partition comprises a master copy (Leader) and slave copies (Followers) distributed on different service ends of the service cluster, and each slave copy is a redundant backup of the master copy. The service cluster consists of one or more Agent servers (service ends), and each service end is responsible for serving requests or holding redundant data for several topic partitions. The service end exposes message submission, message consumption and partition addressing functions. Among the service ends (Agents) holding the same topic partition, one holds the master copy and the rest hold slave copies. The Leaders of the topic partitions are spread across the Agents of the service cluster, and so are the Followers. A production end (Producer) in the production cluster accesses the master copy of the topic partition to store the topic message data it generates, and a consumption end (Consumer) in the consumption cluster accesses the master copy of the topic partition to consume the message data of the topic partition. The production end pushes messages to the service end so that they reach the service end and are stored as quickly as possible; the consumption end pulls messages from the service end, so it can consume at a rate that matches its own capacity, which reduces the likelihood of network congestion.
The management cluster determines whether the service ends and consumption ends are alive by detecting their heartbeat signals, and addresses service ends by topic and topic partition. It maintains the topic partition information of each topic, including topic partition addressing and the master and slave copy information of each topic partition. It also maintains information about each service end, including the number of master copies and slave copies of topic partitions that the service end holds. When the master or slave copy information of a topic partition changes, the management cluster notifies the corresponding master or slave copies.
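Heartbeat-based liveness detection of this kind might look roughly like the sketch below. The class name `HeartbeatMonitor`, the node identifiers and the 3-second timeout are hypothetical; the patent does not state how the management cluster decides that a heartbeat has been missed.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat time of each node (hypothetical helper)."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds  # assumed liveness threshold
        self.last_seen = {}                     # node id -> time of last heartbeat

    def record_heartbeat(self, node_id, now):
        self.last_seen[node_id] = now

    def alive_nodes(self, now):
        # A node counts as alive if its last heartbeat is recent enough.
        return sorted(node for node, seen in self.last_seen.items()
                      if now - seen <= self.timeout_seconds)

    def offline_nodes(self, now):
        return sorted(node for node, seen in self.last_seen.items()
                      if now - seen > self.timeout_seconds)


monitor = HeartbeatMonitor(timeout_seconds=3.0)
monitor.record_heartbeat("agent-1", now=0.0)
monitor.record_heartbeat("consumer-1", now=0.0)
monitor.record_heartbeat("agent-1", now=4.0)   # only agent-1 keeps reporting
alive_at_5 = monitor.alive_nodes(now=5.0)
offline_at_5 = monitor.offline_nodes(now=5.0)
```

Times are passed in explicitly so that the sketch stays deterministic; a running system would read a clock instead.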
Logical structure of the topic message file
Topic message data is persisted on the hard disk of the service end in the form of topic message files. A topic can logically be regarded as a queue, and every message must specify its topic, that is, the queue into which the message is to be placed. So that the throughput of the message system can scale horizontally, a topic is physically divided into one or more topic partitions. Each topic partition corresponds physically to a topic message folder, which stores all messages and index files of that topic partition. Specifically, the structure of a topic message file is as follows:
1. Message data is classified by topic (Topic), and messages of the same class belong to the same Topic. Topic message data refers to message data belonging to the same topic. Topics are defined as needed when the system is used.
2. A topic is divided into several topic partitions. The number of partitions must be specified when the topic is created, and more partitions can be added afterwards.
3. Each topic partition is divided into several data segments (Segments) according to the actual data volume of the topic's message data. The data segments are approximately equal in size, and each data segment corresponds one-to-one with an index file. The segment size is set when the topic is created and can be changed afterwards.
4. Within each topic partition, each piece of topic message data is appended to the partition and assigned a logical version number in the order in which it arrives at the service end; logical version numbers increase consecutively. Each logical version number uniquely identifies its topic message data and corresponds to the physical offset of that data on the service end. Because topic message data is written to disk sequentially, writes are very efficient.
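The structure described in the list above can be illustrated with a small in-memory sketch. The class name `TopicPartitionLog`, the byte-based segment limit and the use of `bytearray` objects in place of on-disk segment and index files are all assumptions made to keep the example self-contained.

```python
class TopicPartitionLog:
    """Append-only partition log split into segments, one index per segment."""

    def __init__(self, segment_size=64):
        self.segment_size = segment_size  # assumed per-segment limit in bytes
        self.segments = [bytearray()]     # stand-ins for segment files
        self.indexes = [{}]               # per segment: version -> (offset, length)
        self.next_version = 0             # next logical version number to assign

    def append(self, payload: bytes) -> int:
        # Roll over to a new segment when the current one would overflow.
        current = self.segments[-1]
        if current and len(current) + len(payload) > self.segment_size:
            self.segments.append(bytearray())
            self.indexes.append({})
            current = self.segments[-1]
        version = self.next_version
        self.indexes[-1][version] = (len(current), len(payload))
        current.extend(payload)           # sequential append only
        self.next_version += 1
        return version

    def read(self, version: int) -> bytes:
        # Linear scan over segment indexes, which is enough for a sketch.
        for segment, index in zip(self.segments, self.indexes):
            if version in index:
                offset, length = index[version]
                return bytes(segment[offset:offset + length])
        raise KeyError(version)


log = TopicPartitionLog(segment_size=10)
versions = [log.append(message) for message in (b"hello", b"world", b"again")]
```

With a 10-byte segment limit, the first two messages share one segment and the third starts a new one, while the logical version numbers keep increasing across segments.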
Creation of a topic
Topic partition data and the target service instances are specified when a topic is created, and the topic is created by sending commands to these service instances. The specific process is as follows:
1. The topic creation tool obtains the Agent information of the service cluster from a third-party management cluster (e.g., ZooKeeper). The Agent information mainly comprises the number of topic partition Leaders and Followers held by each Agent.
2. Take creating a topic with 4 partitions and 3 copies as an example. Based on load, the topic creation tool selects the 3 least loaded service ends (Agents) to hold the 3 copies of the first partition of the new topic, and then notifies those Agents to create the topic partition.
3. After the three Agents have successfully created their partition copies, the master copy (Leader) election process begins (for the election process, see the disaster recovery strategy section).
4. The topic creation tool checks, through the management cluster, whether the first topic partition of the topic has been created. If the copies were created successfully and a master copy has been elected, the first topic partition has been created successfully. Steps 2-4 are then repeated to create the remaining three topic partitions.
If a selected Agent goes offline while the Topic partition is being created, there are two cases. If none of the selected Agents is alive (i.e., all selected Agents are down), steps 2-4 are executed again to create a new partition. If at least one selected Agent is still alive, the tool waits for the Topic to be created successfully by checking the status in the management cluster.
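The load-based selection in step 2 could be sketched as below. The function names, the definition of load as the sum of Leader and Follower counts, and the tie-break by Agent name are assumptions; the patent only says that the least loaded Agents are chosen.

```python
def pick_replica_agents(agent_load, replica_count):
    """Return the replica_count least loaded Agent ids.

    agent_load maps an Agent id to [leader_count, follower_count].
    Load is assumed here to be the sum of both counts.
    """
    if replica_count > len(agent_load):
        raise ValueError("not enough Agents for the requested number of copies")
    ranked = sorted(agent_load, key=lambda agent: (sum(agent_load[agent]), agent))
    return ranked[:replica_count]


def plan_topic(agent_load, partition_count, replica_count):
    """Plan copy placement partition by partition, updating load as we go."""
    load = {agent: list(counts) for agent, counts in agent_load.items()}
    plan = []
    for _ in range(partition_count):
        chosen = pick_replica_agents(load, replica_count)
        for agent in chosen:
            load[agent][1] += 1  # counted as a Follower until a Leader is elected
        plan.append(chosen)
    return plan


# Hypothetical cluster state: Agent id -> (Leaders, Followers).
agents = {"a1": (0, 0), "a2": (2, 1), "a3": (0, 1), "a4": (1, 0)}
placement = plan_topic(agents, partition_count=4, replica_count=3)
```

Updating the load after each partition keeps later partitions from piling onto the same Agents, which is one plausible reading of repeating steps 2-4 per partition.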
Message redundancy backup
The master copy (Leader) of a topic partition acts as the receiving and consumption center for messages and serves external requests directly. The slave copies (Followers) fetch messages from the master copy (Leader) and update their local message sets. The master copy receives and stores topic message data from the production end (Producer) and updates the slave copies so that they stay synchronized with the master copy. Each master or slave copy periodically reports to the management cluster its minimum logical version number (minimum valid logical version number) and the maximum logical version number it has received (maximum valid logical version number).
The master copy of each topic partition maintains a set of slave copies (the set of kept synchronized copies). The slave copies in this set hold more up-to-date message data than the other slave copies. When the number of slave copies in the set of kept synchronized copies falls below the minimum number of synchronized copies, the master copy stops accepting messages (note that only this topic partition of the topic stops accepting messages; the service ends of the service cluster as a whole do not stop accepting messages).
The service end (Agent) offers the production end (Producer) three guarantee levels for sending messages:
A. At the UNSAFE level, the master copy sends message submission success information to the production end immediately after receiving the topic message data sent by the production end.
B. At the FAST level, the master copy synchronizes to all slave copies after receiving the topic message data, and sends message submission success information to the production end once it has received the synchronization confirmation of the first slave copy.
C. At the SAFE level, the master copy synchronizes to all slave copies after receiving the topic message data, and sends message submission success information once it has received synchronization confirmations from a number of slave copies greater than or equal to the minimum number of synchronized copies. All slave copies that reply with a synchronization confirmation are members of the set of kept synchronized copies.
At the SAFE level, a piece of topic message data is considered committed only after all slave copies in the set of kept synchronized copies have copied it. This prevents data loss in the case where the master copy goes offline after data has been written to it but not yet to the slave copies, which would leave that data unavailable to consumption ends. Data safety and message system throughput are thus well balanced. Slave copies can copy data from the master copy in batches, which greatly improves replication performance and greatly narrows the gap between the data in the slave copies and the master copy.
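The three guarantee levels differ mainly in how many slave copy confirmations the master copy waits for before reporting success. A minimal sketch of that rule follows; the enum `Guarantee` and the function names are hypothetical, and the write gate reflects the minimum number of synchronized copies described earlier in this section.

```python
from enum import Enum


class Guarantee(Enum):
    UNSAFE = "unsafe"
    FAST = "fast"
    SAFE = "safe"


def required_confirmations(level, min_sync_copies):
    """Number of slave copy confirmations needed before reporting success."""
    if level is Guarantee.UNSAFE:
        return 0                # reply as soon as the master copy has the data
    if level is Guarantee.FAST:
        return 1                # reply after the first slave copy confirms
    return min_sync_copies      # SAFE: at least the minimum number of synchronized copies


def partition_accepts_messages(sync_set_size, min_sync_copies):
    """The topic partition refuses messages once the synchronized set is too small."""
    return sync_set_size >= min_sync_copies


def can_report_success(level, confirmations, min_sync_copies):
    return confirmations >= required_confirmations(level, min_sync_copies)


# Example with an assumed minimum of 2 synchronized copies.
fast_ok = can_report_success(Guarantee.FAST, confirmations=1, min_sync_copies=2)
safe_ok = can_report_success(Guarantee.SAFE, confirmations=1, min_sync_copies=2)
```

Here a single confirmation is enough at the FAST level but not at the SAFE level.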
Disaster recovery strategy
To keep a topic partition available when its master copy goes offline, the following disaster recovery strategy is adopted:
1. Each service end in the service cluster maintains a routing table that records which service ends are currently available for each topic partition of each topic. The routing table can also be used for service addressing, and the management cluster is responsible for updating it.
2. When the master copy goes offline, a new master copy is elected as follows: the slave copies in the set of kept synchronized copies report their maximum logical version numbers (maximum valid logical version numbers) to the management cluster; among all slave copies in the set, the one with the largest reported maximum logical version number is selected as the master copy, and the result is reported to the management cluster. If several slave copies share the largest logical version number, the one that reported to the management cluster first is selected as the master copy.
3. If a slave copy that is not in the set of kept synchronized copies goes offline, this has no effect on the service cluster as a whole.
4. If a slave copy in the set of kept synchronized copies goes offline, the master copy removes it from the set and reports this to the third-party management cluster. The topic partition stops serving when the number of slave copies in the set of kept synchronized copies is less than the minimum number of synchronized copies.
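The election rule in step 2 can be written down compactly. In this sketch the function name `elect_master` and the representation of reports as a list in arrival order are assumptions made for illustration.

```python
def elect_master(reports, sync_set):
    """Pick the new master copy from reported maximum logical version numbers.

    reports: list of (agent_id, max_valid_version) in the order the reports
             reached the management cluster (assumed representation).
    sync_set: ids of the slave copies in the set of kept synchronized copies.
    """
    best_agent, best_version = None, -1
    for agent, version in reports:
        if agent not in sync_set:
            continue  # only synchronized slave copies are candidates
        # Strict comparison: on a tie, the earlier report keeps the lead.
        if version > best_version:
            best_agent, best_version = agent, version
    return best_agent


reports = [("agent-b", 41), ("agent-c", 42), ("agent-d", 42), ("agent-e", 99)]
new_master = elect_master(reports, sync_set={"agent-b", "agent-c", "agent-d"})
```

`agent-e` reports the highest version but is ignored because it is not in the synchronized set, and `agent-c` wins the tie with `agent-d` because its report arrived first.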
Processing method of message system
After determining the topic, the production end tries to connect to the master copies of all topic partitions of that topic and then submits messages to them in turn by polling.
In this embodiment, the service end stores topic message data, such as logs and MySQL binlog, using hard disk files as the storage medium. Such topic message data is large in volume and is kept for a long time.
Hard disk I/O is generally slow. If data is written in small pieces, such as a few bytes at a time, or written randomly, the latency introduced by the hard disk is considerable and ultimately reduces the response speed and throughput of the application.
Therefore, in this embodiment, a batch of data is first buffered in memory and then written to the hard disk sequentially in bulk to improve throughput.
When messages are cached in memory, two parameters generally have to be specified: the size of the cache, and the maximum time interval between flushes to disk. The service end must specify both parameters when the topic is created, and both can be modified afterwards. The service end also keeps two values that represent the message level: the valid logical version number of the messages, which is the logical version number of the latest topic message data that has been persisted to the hard disk and is visible to the consumption end; and the cached logical version number of the messages, which is the logical version number of the latest topic message data received by the master copy of the topic partition and is not visible to the consumption end.
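The interplay of the two cache parameters and the two logical version numbers can be sketched as follows. The class name `BufferedPartitionWriter`, the injected `clock` callable and the list standing in for the hard disk file are hypothetical, and the flush interval is only checked when a message is appended, which is a simplification.

```python
class BufferedPartitionWriter:
    """Buffers messages in memory and flushes them in batches (sketch)."""

    def __init__(self, max_buffer_bytes, max_flush_interval, clock):
        self.max_buffer_bytes = max_buffer_bytes      # cache size parameter
        self.max_flush_interval = max_flush_interval  # maximum time between flushes
        self.clock = clock                            # callable returning the current time
        self.buffer = []
        self.buffered_bytes = 0
        self.disk = []               # stand-in for the sequentially written file
        self.cached_version = -1     # latest version received, invisible to consumers
        self.valid_version = -1      # latest version persisted, visible to consumers
        self.last_flush = clock()

    def append(self, payload):
        self.cached_version += 1
        self.buffer.append(payload)
        self.buffered_bytes += len(payload)
        self._maybe_flush()
        return self.cached_version

    def _maybe_flush(self):
        too_big = self.buffered_bytes >= self.max_buffer_bytes
        too_old = self.clock() - self.last_flush >= self.max_flush_interval
        if self.buffer and (too_big or too_old):
            self.flush()

    def flush(self):
        self.disk.extend(self.buffer)   # one sequential batch write
        self.buffer.clear()
        self.buffered_bytes = 0
        self.valid_version = self.cached_version
        self.last_flush = self.clock()


now = [0.0]  # fake clock so the example is deterministic
writer = BufferedPartitionWriter(max_buffer_bytes=8, max_flush_interval=5.0,
                                 clock=lambda: now[0])
writer.append(b"abc")
after_first = (writer.cached_version, writer.valid_version)
writer.append(b"defgh")              # buffer reaches 8 bytes, so it is flushed
after_second = (writer.cached_version, writer.valid_version)
now[0] = 6.0
writer.append(b"x")                  # flush interval exceeded, so it is flushed
after_third = (writer.cached_version, writer.valid_version)
```

After the first append the cached version is ahead of the valid version, so a consumption end would not yet see that message; each flush brings the valid version up to the cached version.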
Fig. 4 shows the consumption model of a topic named A1, which has two topic partitions, P1 and P2. P1 and P2 belong to the same topic, but their message processing is unrelated and completely separate. Each topic partition has a master copy and slave copies that hold redundant data to keep the data safe. Each topic partition maintains a routing table internally, which records the valid logical version numbers of the messages in the master and slave copies of the topic partition and the consumption group information of the master copy. Each topic partition allows at most one consumption end of the same consumption cluster to consume its topic message data at any one time.
The valid logical version numbers of the messages of the master and slave copies of each topic partition are reported to the management cluster periodically (e.g., every 10 seconds) for later monitoring. During consumption, a consumption end in a given consumption cluster consumes topic message data only from the master copy of the topic partition, and the master copy records the consumption level of the consumption cluster (i.e., the logical version number of the topic message data already consumed) so that consumption ends in the same consumption cluster do not consume the same topic message data twice. Because topic message data is consumed only from the master copy, the consumption level does not normally need to be synchronized to other service ends; the master copy simply maintains it locally. The consumption end polls each topic partition so that data is consumed with a balanced load. The consumption level maintained by the master copy is reported to the management cluster periodically, both for monitoring and so that repeated consumption is avoided as far as possible after a new master copy of the topic partition is elected. A consumption end can consume in one of two modes:
A. In fast consumption mode (FAST), the master copy updates the consumption level of the consumption cluster to which the consumption end belongs immediately after sending the topic message data to the consumption end.
B. In safe mode (SAFE), the master copy updates the consumption level of the consumption cluster to which the consumption end belongs only after it has sent the topic message data and received the consumption end's acknowledgement of receipt. This mode requires a response timeout: once the timeout has elapsed, the master copy updates the consumption level of the consumption cluster even if no acknowledgement has been received from the consumption end.
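The difference between the two consumption modes comes down to when the master copy advances the consumption level. A rough sketch follows; the class name `ConsumptionLevel`, the method names and the 5-second timeout are assumptions for illustration.

```python
class ConsumptionLevel:
    """Consumption level of one consumption cluster on one master copy (sketch)."""

    def __init__(self, mode, ack_timeout=5.0):
        self.mode = mode                # "FAST" or "SAFE"
        self.ack_timeout = ack_timeout  # assumed response timeout in seconds
        self.level = -1                 # logical version number consumed so far
        self.pending = None             # (version, sent_at) awaiting acknowledgement

    def on_sent(self, version, now):
        if self.mode == "FAST":
            self.level = version        # updated immediately after sending
        else:
            self.pending = (version, now)

    def on_ack(self, version):
        if self.pending is not None and self.pending[0] == version:
            self.level = version        # updated once receipt is acknowledged
            self.pending = None

    def on_tick(self, now):
        # In SAFE mode the level is still advanced once the timeout expires.
        if self.pending is not None and now - self.pending[1] >= self.ack_timeout:
            self.level = self.pending[0]
            self.pending = None


fast = ConsumptionLevel("FAST")
fast.on_sent(10, now=0.0)

safe = ConsumptionLevel("SAFE", ack_timeout=5.0)
safe.on_sent(10, now=0.0)
level_before_ack = safe.level
safe.on_ack(10)
level_after_ack = safe.level
safe.on_sent(11, now=1.0)
safe.on_tick(now=7.0)                   # no acknowledgement arrived in time
level_after_timeout = safe.level
```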
When the topic partitions of a topic served by a service end change (for example, when topic partitions are added), the service end notifies the production ends connected to it of the change. The production end dynamically adds the new topic partitions, puts them into effect, and checks whether offline topic partitions have come back online. The production end can thus detect both the expansion of topic partitions and the master copy of a topic partition going offline. When the master copy of a topic partition goes offline, the production end temporarily submits topic message data to the master copies of the other topic partitions that are still online. Once the master copy of the offline topic partition comes back online or a new master copy is elected, the production end resumes submitting messages to that topic partition. When a topic partition is unavailable (i.e., the number of slave copies in its set of kept synchronized copies is less than the minimum number of synchronized copies), the production end submits messages to the remaining available topic partitions.
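On the production end, this behaviour amounts to round-robin submission that skips partitions currently marked unavailable. The sketch below assumes a hypothetical `RoundRobinProducer` class that is told about partition changes through `mark_offline` and `mark_online`; how those notifications are delivered is outside the sketch.

```python
class RoundRobinProducer:
    """Chooses the next topic partition in turn, skipping unavailable ones."""

    def __init__(self, partitions):
        self.partitions = list(partitions)
        self.available = set(self.partitions)
        self._next = 0

    def mark_offline(self, partition):
        self.available.discard(partition)

    def mark_online(self, partition):
        # Also covers expansion: an unknown partition is added to the rotation.
        if partition not in self.partitions:
            self.partitions.append(partition)
        self.available.add(partition)

    def choose_partition(self):
        for _ in range(len(self.partitions)):
            candidate = self.partitions[self._next % len(self.partitions)]
            self._next += 1
            if candidate in self.available:
                return candidate
        raise RuntimeError("no topic partition is currently available")


producer = RoundRobinProducer(["P1", "P2", "P3"])
first_round = [producer.choose_partition() for _ in range(3)]
producer.mark_offline("P2")             # master copy of P2 went offline
second_round = [producer.choose_partition() for _ in range(3)]
producer.mark_online("P4")              # the topic was expanded with a new partition
after_expansion = producer.choose_partition()
```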
When a consumption end joins or goes offline, the consumption cluster needs to redistribute the partitions so that the topic continues to be consumed normally. Specifically, the service end records the consumption level (i.e., the logical version number of the topic message data already consumed) and the load level (i.e., the number of topic partitions being consumed) of the consumption ends in each consumption cluster of the topic. When a consumption end comes online, it randomly selects the master copy of one topic partition of the topic and connects to it. The master copy that receives the connection then decides according to the load level of the current consumption cluster:
If the topic partition managed by the master copy is not being consumed by any other consumption end of the same consumption cluster, the master copy accepts the connection request directly. It then determines from the consumption cluster load whether the consumption end needs to connect to further topic partitions and, if so, informs the consumption end that additional partitions need to be connected.
If the topic partition managed by the master copy is already being consumed by another consumption end of the same consumption cluster, the master copy acts according to how many topic partitions that consumption end is consuming: if that consumption end consumes only this topic partition, the master copy returns a suitable topic partition to the connecting consumption end (a suitable topic partition is one for which the number of topic partitions consumed by the consumption cluster stays within the allowed range) and informs it of the connection information (such as the connection address) of that partition's master copy; if that consumption end consumes several topic partitions, the master copy tells it to disconnect and accepts the connection request of the connecting consumption end.
When a consumption end goes offline, the most lightly loaded consumption end (i.e., the one connected to the fewest topic partitions) is selected according to the load level of the current consumption group and informed that it needs to connect to the additional partitions.
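Handing the partitions of a departed consumption end to the most lightly loaded remaining one can be sketched as below. The function name `reassign_partitions`, the dictionary representation and the tie-break by consumer name are assumptions made for the example.

```python
def reassign_partitions(assignments, departed):
    """Give each partition of the departed consumption end to the lightest one.

    assignments maps a consumption end id to the list of topic partitions it
    consumes (assumed representation). Returns a new mapping.
    """
    remaining = {consumer: list(partitions)
                 for consumer, partitions in assignments.items()
                 if consumer != departed}
    if not remaining:
        raise ValueError("no consumption end left in the consumption cluster")
    for partition in assignments.get(departed, []):
        # Lightest load first; ties are broken by name to stay deterministic.
        lightest = min(remaining, key=lambda c: (len(remaining[c]), c))
        remaining[lightest].append(partition)
    return remaining


before = {"c1": ["P1", "P2"], "c2": ["P3"], "c3": ["P4", "P5"]}
after = reassign_partitions(before, departed="c3")
```

The load is re-evaluated for every orphaned partition, so several partitions from one departed consumption end end up spread across the remaining ones.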
When the master copy of a topic partition goes offline, the consumption ends connected to it randomly select the master copy of a topic partition and enter a polling wait (the polling interval can be set to 1 s), waiting for the master copy of the topic partition to come back online or for a new idle topic partition to come online. If all remaining topic partitions are already matched one-to-one with consumption ends, the consumption end is told to keep polling.
Example two
For simplicity, the parts that are the same as in the first embodiment are not repeated here; only the differences between the second embodiment and the first embodiment are described.
In this embodiment, theme message data can be stored at the server in two modes: a file mode, which is the same as in the first embodiment, and a memory mode. The file mode is suitable for message data with a long expiration time and a large volume, such as logs, the MySQL binlog, and simple application activity information.
The memory mode is suitable for pure, fast message delivery. Such messages usually have a short life cycle (perhaps only a few minutes) and must reach the consuming end quickly, with a small transmission delay. This type of theme message data is poorly suited to file storage. First, file storage is designed for messages with relatively long life cycles and cannot eliminate expired data in time. Second, under file storage only landed theme message data (that is, data persisted to the hard disk with a valid logical version number) can be consumed; because there is a delay between receiving the theme message data and landing it, the data cannot be consumed in time.
The memory mode must be specified when a theme is created, and the mode of a theme cannot be changed once specified. A default expiration time may be specified when a memory-mode theme is created. If no expiration time is specified when a message is produced, the message is eliminated according to the default expiration time.
In the memory mode, message data is stored in memory: after receiving the theme message data generated by the production end, the theme partition maps it into memory for storage. The memory usage of a theme partition must be specified (or left at the default) when the theme is created. When a theme partition is allocated, the memory usage of the current server must be known in order to avoid overflow.
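The expiration rule can be illustrated with a short sketch. The classes `Message` and `MemoryTheme`, the use of seconds as the unit, and the `evict` helper are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Message:
    payload: bytes
    produced_at: float            # timestamp in seconds
    ttl: Optional[float] = None   # None means "use the theme default"


@dataclass
class MemoryTheme:
    name: str
    default_ttl: float            # default expiration time set at creation

    def is_expired(self, msg: Message, now: float) -> bool:
        # Fall back to the theme default when the producer gave no expiration.
        ttl = self.default_ttl if msg.ttl is None else msg.ttl
        return now - msg.produced_at >= ttl

    def evict(self, messages: List[Message], now: float) -> List[Message]:
        """Return only the messages that are still alive at time `now`."""
        return [m for m in messages if not self.is_expired(m, now)]
```

For example, with a default of 60 s, a message produced without an expiration time is eliminated after one minute, while a message produced with an explicit 300 s expiration survives for five minutes.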
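A minimal sketch of the memory check at allocation time, assuming each server tracks a fixed memory capacity and the amount reserved per theme partition. The class `ServiceEnd`, the 64 MiB default and the boolean return value are hypothetical choices, not details from the patent.

```python
from typing import Dict, Optional


class ServiceEnd:
    """Tracks memory reserved by memory-mode theme partitions on one server."""

    def __init__(self, capacity_bytes: int,
                 default_partition_bytes: int = 64 * 1024 * 1024):
        self.capacity_bytes = capacity_bytes
        self.default_partition_bytes = default_partition_bytes
        self.reserved: Dict[str, int] = {}

    def free_bytes(self) -> int:
        return self.capacity_bytes - sum(self.reserved.values())

    def allocate(self, partition: str, size_bytes: Optional[int] = None) -> bool:
        """Reserve memory for a partition; refuse if it would overflow."""
        size = self.default_partition_bytes if size_bytes is None else size_bytes
        if size > self.free_bytes():
            return False          # caller should try another server
        self.reserved[partition] = size
        return True
```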
Each theme partition is stored in memory in the form of a data paragraph skip table (see fig. 5). The data paragraph skip table includes a plurality of data layers, each data layer including data nodes. Each lower data layer has more nodes than the data layer above it and includes all nodes of that upper layer, and the bottom data layer includes all nodes of the data paragraph skip table. Each node stores one paragraph of data (Segment) of the theme message data of the theme partition, and includes pointer data to its own counterpart in the next data layer and pointer data to the next node in its own data layer. The number of nodes skipped between adjacent nodes of a layer is random. When the theme message data of the theme partition is indexed by logical version number, the search starts at the top layer of the data paragraph skip table and proceeds downwards layer by layer, so that most nodes are skipped rather than compared one by one, which greatly improves lookup efficiency. For example, to find the theme message data with logical version number 117 in fig. 5:
1) compare with the first node, 21, of the top level (level 3): 117 is greater than 21, so the search moves on to the next node;
2) compare with the second node, 37, of the top level: 117 is greater than 37, and 37 is the maximum value of the linked list at this level, so the search continues from 37 in the level below (level 2);
3) compare with the node 71 that follows 37 in the second level: 117 is greater than 71, and 71 is the maximum value of the linked list at this level, so the search continues from 71 in the level below (level 1, the third level from the top);
4) compare with the node 85 that follows 71 in the third level: 117 is greater than 85, so the search moves on to the next node;
5) compare with the node 117 that follows 85 in the third level: 117 equals 117, so the node is found.
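The lookup steps above can be reproduced with a small skip table sketch. Fig. 5 is not reproduced here, so the node contents of each level are assumed from the walk-through (level 3: 21, 37; level 2: 21, 37, 71; level 1: 21, 37, 71, 85, 117); the names `Node`, `build_skip_table` and `find` are hypothetical, and each node is keyed by a single logical version number for simplicity.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    version: float                      # logical version number (head uses -inf)
    segment: Optional[str] = None       # stands in for the paragraph data
    next: Optional["Node"] = None       # next node in the same data layer
    down: Optional["Node"] = None       # same node in the next (lower) layer


def build_skip_table(levels: List[List[int]]) -> Node:
    """Build from sorted version lists, bottom layer first; return the top head."""
    below: Dict[int, Node] = {}
    head: Optional[Node] = None
    for versions in levels:
        head = Node(float("-inf"), down=head)   # sentinel head of this layer
        cur, current = head, {}
        for v in versions:
            node = Node(v, segment=f"segment-{v}", down=below.get(v))
            cur.next = node
            cur = node
            current[v] = node
        below = current
    return head


def find(head: Node, version: int) -> Tuple[Optional[Node], List[float]]:
    """Search from the top layer downwards; also return the nodes visited."""
    node, path = head, []
    while node is not None:
        # Move right while the next node does not overshoot the target.
        while node.next is not None and node.next.version <= version:
            node = node.next
            path.append(node.version)
        if node.version == version:
            return node, path
        node = node.down                # drop to the next data layer
    return None, path


head = build_skip_table([[21, 37, 71, 85, 117], [21, 37, 71], [21, 37]])
node, path = find(head, 117)
print(node.segment, path)
```

Under these assumptions the visited nodes are 21, 37, 71, 85 and 117, matching steps 1) to 5).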
The specific embodiments described herein are merely illustrative of the spirit of the invention. Various modifications or additions may be made to the described embodiments or alternatives may be employed by those skilled in the art without departing from the spirit or ambit of the invention as defined in the appended claims.

Claims (5)

1. A distributed messaging system comprising a production cluster for producing theme message data, a service cluster for storing said theme message data, a consumption cluster for consuming said theme message data, and a management cluster for managing said consumption cluster and said service cluster; characterized in that: the service cluster comprises a theme partition for storing the theme message data, wherein the theme partition comprises a primary copy and a secondary copy distributed in different service ends of the service cluster, and the secondary copy is a redundant backup of the primary copy; a production end in the production cluster accesses the primary copy of the theme partition to store the theme message data generated by the production end, and a consumption end in the consumption cluster accesses the primary copy of the theme partition to consume the message data of the theme partition; the production end in the production cluster polls the primary copy of each theme partition in turn so as to store the generated theme message data across the theme partitions in a distributed manner; the primary copy receives and stores the theme message data from the production end, and updates its secondary copies so that the secondary copies are synchronized with the primary copy; the theme partition includes an in-sync copy set made up of secondary copies, the secondary copies within the in-sync copy set having more up-to-date message data than the other secondary copies; when the number of secondary copies in the in-sync copy set is smaller than a preset minimum number of in-sync copies, the theme partition does not receive theme message data from the production cluster;
at the no-guarantee level, the primary copy sends message submission success information to the production end after receiving the theme message data sent by the production end;
at the fast level, the primary copy synchronizes the theme message data to the secondary copies after receiving it from the production end, and sends message submission success information to the production end after receiving synchronization confirmation from the first secondary copy;
at the safe level, the primary copy synchronizes the theme message data to the secondary copies after receiving it from the production end, and sends message submission success information to the production end after receiving synchronization confirmation from a number of secondary copies of the in-sync copy set that is larger than a preset value;
when the primary copy goes offline, each secondary copy in the in-sync copy set reports its maximum logical version number to the management cluster, and the management cluster selects, according to the maximum logical version numbers of all secondary copies in the in-sync copy set, the secondary copy with the largest maximum logical version number as the primary copy.
2. A distributed messaging system according to claim 1, wherein: after receiving the theme message data sent by the production end, the theme partition maps the theme message data into the memory of the service end, and the primary copy provides the offset address of the theme message data in the memory so as to respond to a consumption request of a consumption end for the theme message data.
3. A distributed messaging system according to claim 2, wherein: the theme partition includes a data paragraph skip table, the data paragraph skip table including a plurality of data layers; each lower data layer has more nodes than the data layer above it and includes all nodes of that upper layer, and the bottom data layer includes all nodes of the data paragraph skip table; each node includes paragraph data that includes the theme message data.
4. A distributed messaging system according to claim 3, wherein: each node includes pointer data to its own counterpart in the next data layer and pointer data to the next node in the data layer where the node is located.
5. A distributed messaging system according to claim 4, wherein: the paragraph data comprises an index file, and the index file records a one-to-one mapping between the logical version number of the theme message data and the offset address of the theme message data in the memory; the logical version number reflects the order in which the theme message data reached the service cluster.
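As an informal reading aid for claim 1 (not part of the claims), the acknowledgement levels, the write gate on the in-sync copy set, and the selection of a new primary copy can be sketched as below. The level names `"none"`, `"fast"` and `"safe"`, the function names, and the dictionary of reported version numbers are assumptions made for illustration.

```python
from typing import Dict


def acks_required(level: str, preset: int) -> int:
    """Secondary-copy confirmations needed before reporting success."""
    if level == "none":      # no-guarantee level: reply right after receipt
        return 0
    if level == "fast":      # fast level: first secondary copy confirms
        return 1
    if level == "safe":      # safe level: more than `preset` copies confirm
        return preset + 1
    raise ValueError(f"unknown level: {level}")


def accepts_writes(in_sync_count: int, min_in_sync: int) -> bool:
    """The partition stops receiving data when too few copies are in sync."""
    return in_sync_count >= min_in_sync


def elect_primary(reported: Dict[str, int]) -> str:
    """Pick the in-sync secondary copy with the largest logical version number."""
    return max(reported, key=reported.get)
```

For instance, `elect_primary({"s1": 117, "s2": 120, "s3": 119})` selects `"s2"`, the copy that reported the largest maximum logical version number.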
CN201710637643.0A 2017-07-31 2017-07-31 Distributed messaging system Active CN107465735B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710637643.0A CN107465735B (en) 2017-07-31 2017-07-31 Distributed messaging system

Publications (2)

Publication Number Publication Date
CN107465735A CN107465735A (en) 2017-12-12
CN107465735B true CN107465735B (en) 2020-08-14

Family

ID=60547046

Country Status (1)

Country Link
CN (1) CN107465735B (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant