CN111966736B - High-throughput low-delay large-capacity Flume channel and transmission method thereof
Info
- Publication number
- CN111966736B (grant publication) · CN202010728788.3A (application)
- Authority
- CN
- China
- Prior art keywords
- storage interval
- memory
- cluster
- storage
- event packet
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/252—Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5011—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
- G06F9/5016—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/5038—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/542—Event management; Broadcasting; Multicasting; Notifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/5021—Priority
Abstract
The invention discloses a high-throughput low-delay large-capacity Flume channel, which comprises a memory connected with a detection module; the memory is connected with a cluster through a data pulling module; the cluster and the memory are connected with a data source through a transfer port; the transfer port is provided with an arbitration module, and the arbitration module is connected with the detection module. A transmission method for the high-throughput low-delay large-capacity Flume channel comprises the steps of: configuring a first storage interval and a second storage interval in the memory; configuring the remaining delay time of event packets and sorting the event packets according to the remaining delay time; the data source sends an event packet to the transfer port, and the transfer port sends the event packet to the first storage interval, the cluster or the second storage interval; the cluster sends the event packet to the first storage interval or the second storage interval through the data pulling module; the first storage interval sends the event packet to the second storage interval; and the first storage interval and the second storage interval send event packets to the Sink.
Description
Technical Field
The invention relates to the technical field of Flume data transmission, and in particular to a high-throughput low-delay large-capacity Flume channel and a transmission method thereof.
Background
Flume is an excellent data collection tool. The Flume data-acquisition framework comprises a data source (Source), a channel and a Sink: the Source acquires data from sources such as log files, network ports and Kafka clusters, packages the data into Events and writes them into the channel. After the data is successfully written into the channel, the Sink actively pulls the data from the channel and writes it into big data components such as HDFS, HBase, Hive and ES. The channel is a passive store responsible for temporarily buffering the data.
The channels in Flume include the File channel, the Memory channel and the Kafka channel, each of which buffers data in a different way. The Memory channel caches data in memory and has an extremely high throughput rate, but its space is constrained by RAM and the JVM (Java virtual machine), so it cannot buffer a large amount of data in a short time. The File channel caches data in local disk files; to guarantee the reliability of the storage process, every batch write involves acquiring read and write locks on the files, which causes lock contention, and because the data must be persisted to local disk, the overall throughput of the File channel is low while the local storage space is limited and hard to expand. The Kafka channel caches data in a Kafka message queue; Kafka has high throughput and low delay and stores data across a cluster whose capacity can be continuously expanded, but because storage involves network communication among the cluster nodes, its throughput is slightly lower than that of the Memory channel, although far higher than that of the File channel. One channel can receive data from multiple data sources, so in many scenarios very large data traffic can occur. When the data stream collected by Flume requires high throughput, low delay and a large, expandable cache space, the existing Flume channels cannot meet the requirement.
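To make the Source-channel-Sink contract in the preceding paragraphs concrete, the following minimal sketch (not part of the patent) drives a stock Memory channel through the public Apache Flume 1.x API; the capacity values are arbitrary examples and the flume-ng-core dependency is assumed to be on the classpath.

```java
import java.nio.charset.StandardCharsets;

import org.apache.flume.Channel;
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.Transaction;
import org.apache.flume.channel.MemoryChannel;
import org.apache.flume.conf.Configurables;
import org.apache.flume.event.EventBuilder;

public class ChannelPutTakeDemo {
    public static void main(String[] args) {
        // Configure a stock Memory channel; its capacity is bounded by RAM / the JVM heap.
        Channel channel = new MemoryChannel();
        Context ctx = new Context();
        ctx.put("capacity", "10000");            // max events held in memory
        ctx.put("transactionCapacity", "1000");  // max events per transaction
        Configurables.configure(channel, ctx);
        channel.start();

        // Source side: package data as an Event and write it into the channel.
        Transaction putTx = channel.getTransaction();
        putTx.begin();
        channel.put(EventBuilder.withBody("log line", StandardCharsets.UTF_8));
        putTx.commit();
        putTx.close();

        // Sink side: actively pull the Event back out of the channel.
        Transaction takeTx = channel.getTransaction();
        takeTx.begin();
        Event e = channel.take();
        System.out.println(new String(e.getBody(), StandardCharsets.UTF_8));
        takeTx.commit();
        takeTx.close();

        channel.stop();
    }
}
```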
Disclosure of Invention
The invention provides a Memory-Kafka Channel for Flume, and aims to give Flume high throughput, low delay and a larger, expandable cache space.
To achieve the above objects, referring to FIG. 1, the present invention provides a high throughput low latency large capacity Flume channel, comprising
The memory is connected with the detection module;
the memory is connected with the cluster through a data pulling module;
the cluster and the memory are connected with a data source through a transfer port;
the transfer port is provided with an arbitration module, and the arbitration module is connected with the detection module.
Preferably, the detection module polls and detects the memory status, the arbitration module obtains the memory status from the detection module, and according to the memory status the arbitration module controls the transfer port to connect the memory with the data source or to connect the memory with the cluster.
Preferably, the cluster is a kafka cluster, and the cluster is deployed on a storage device.
The invention provides a transmission method of a high-throughput low-delay large-capacity Flume channel, which is applied to the high-throughput low-delay large-capacity Flume channel and comprises the following steps
S100, configuring a first storage interval and a second storage interval in a memory, wherein the first storage interval is correspondingly configured with a first data interface, and the second storage interval is correspondingly provided with a second data interface;
S200, configuring the remaining delay time of the event packets, sorting the event packets in the memory and the cluster according to the remaining delay time, and processing the event packets by the first data interface, the second data interface and the data pulling module according to the sorting;
S300, a data source sends an event packet to the transfer port, and the transfer port sends the event packet to the first storage interval, the cluster or the second storage interval;
S400, the cluster sends the event packet to the first storage interval or the second storage interval through the data pulling module;
S500, the first storage interval sends the event packet to the second storage interval;
S600, the first storage interval sends the event packet to the Sink through the first data interface, and the second storage interval sends the event packet to the Sink through the second data interface.
Preferably, S300 comprises the steps of:
S301, configuring a second threshold related to the remaining delay time;
S302, when the remaining delay time of the event packet at the transfer port is smaller than the second threshold, the arbitration module controls the transfer port to preferentially send the event packet to the second storage interval;
S303, when the remaining delay time of the event packet at the transfer port is greater than or equal to the second threshold, a first threshold related to memory occupancy is configured in the arbitration module, the arbitration module obtains the occupancy rate of the first storage interval measured by the detection module, and the arbitration module compares the first threshold with the occupancy rate of the first storage interval;
S304, when the occupancy rate of the first storage interval is smaller than the first threshold, the arbitration module controls the transfer port to send the event packet to the first storage interval;
S305, when the occupancy rate of the first storage interval is greater than or equal to the first threshold, the arbitration module controls the transfer port to send the event packet to the cluster.
Preferably, S400 comprises the steps of:
S401, configuring a third threshold related to the remaining delay time;
S402, when the remaining delay time of the event packet at the cluster is smaller than the third threshold, the data pulling module preferentially sends the event packet to the second storage interval;
S403, when the remaining delay time of the event packet at the cluster is greater than or equal to the third threshold, the data pulling module sends the event packet to the first storage interval.
Preferably, S500 comprises the steps of:
S501, configuring a fourth threshold related to the remaining delay time;
S502, when the remaining delay time of the event packet in the first storage interval is smaller than the fourth threshold, the first storage interval preferentially sends the event packet to the second storage interval;
S503, when the remaining delay time of the event packet in the first storage interval is greater than or equal to the fourth threshold, the event packet is queued in the first storage interval.
Preferably, the detection module detects the occupancy rate of the second storage interval, and a fifth threshold related to memory occupancy is configured; when the occupancy rate of the second storage interval is smaller than the fifth threshold, the sending of event packets destined for the second storage interval is executed; when the occupancy rate of the second storage interval is greater than or equal to the fifth threshold, the event packets in the first storage interval and the cluster that are destined for the second storage interval are queued in place, and an event packet at the transfer port that is destined for the second storage interval is handled according to S304 or S305.
Preferably, the event packet is configured with a maximum delay time and with a timing module; the timing module starts timing when the event packet is first sent by the data source, and the remaining delay time is obtained by subtracting the time measured by the timing module from the maximum delay time.
The high-throughput low-delay large-capacity Flume channel and the transmission method thereof have the following beneficial effects:
the high-throughput low-delay large-capacity Flume channel is provided with the memory and the cluster, the Sink can quickly fetch data from the memory and has the advantage of high throughput, and the cluster expands the storage space and has the advantage of large capacity; the transmission method of the high-throughput low-delay large-capacity Flume channel orderly processes the event packets by configuring the residual delay time as the index of the event packet processing sequence; and a special second storage interval is opened in the memory for processing the event packet with the temporary residual delay time, so that a channel is opened for the event packet which is temporary but cannot be processed in time by the first storage interval, and the timely transmission of the temporary event packet is effectively ensured. Thereby ensuring a relatively low latency in the delivery of event packets.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and that those skilled in the art can obtain other drawings from the structures shown in these drawings without creative effort.
FIG. 1 is a diagram of the architecture of a high throughput low latency large capacity Flume channel in an embodiment of the present invention;
FIG. 2 is a diagram illustrating a transmission of event packets from a data source in accordance with an embodiment of the present invention;
FIG. 3 is a schematic diagram of the transmission of a near-expiry event packet from a data source according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of another transmission of an event packet from a data source in an embodiment of the present invention;
FIG. 5 is a schematic diagram of the transmission of a near-expiry event packet in the cluster according to an embodiment of the present invention;
FIG. 6 is a diagram illustrating the transmission of a near-expiry event packet within the first storage interval according to an embodiment of the present invention;
FIG. 7 is a flow chart of controlling data transmission of a data source according to remaining delay time in an embodiment of the present invention;
FIG. 8 is a flow chart illustrating controlling data source data transmission according to a first storage interval status according to an embodiment of the present invention;
FIG. 9 is a flow chart illustrating the control of the transmission of a near-expiry event packet in the first storage interval according to the remaining delay time in an embodiment of the present invention;
FIG. 10 is a flow chart of controlling the transmission of a near-expiry event packet within the cluster according to the remaining delay time in an embodiment of the present invention;
FIG. 11 is a flowchart illustrating the process of determining whether to transmit a near-expiry event packet according to an embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Referring to FIG. 1, the present invention provides a high throughput low latency large capacity Flume channel comprising
The channel comprises a memory connected with a detection module, wherein a first storage interval with larger capacity and a second storage interval with smaller capacity are configured in the memory; the first storage interval is configured with a first data interface and the second storage interval is configured with a second data interface. The detection module detects the state of the memory through instruction polling; specifically, the instruction polls and detects the occupancy rate of the first storage interval according to the address and usage of the first storage interval, and polls and detects the occupancy rate of the second storage interval according to the address and usage of the second storage interval.
The memory is connected with the cluster through a data pulling module; the cluster is a kafka cluster, and the kafka cluster is deployed on three servers.
The cluster and the memory are connected with a data source through a transfer port;
The transfer port is provided with an arbitration module, the arbitration module is connected with the detection module and acquires the memory state from it, and according to the memory state the arbitration module controls the transfer port to connect the first storage interval with the data source, or to connect the second storage interval with the data source, or to connect the data source with the cluster.
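As a rough illustration only, the polling behaviour of the detection module could be modeled as below; DetectionModule is a hypothetical name, and the assumption that each storage interval is a bounded in-memory queue is ours, not the patent's.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical detection module: periodically polls the fill level of both storage intervals. */
public class DetectionModule {
    private final BlockingQueue<byte[]> firstInterval;
    private final BlockingQueue<byte[]> secondInterval;
    private volatile double firstOccupancy;   // 0.0 .. 1.0, read by the arbitration module
    private volatile double secondOccupancy;  // 0.0 .. 1.0, read by the arbitration module
    private final ScheduledExecutorService poller = Executors.newSingleThreadScheduledExecutor();

    public DetectionModule(int firstCapacity, int secondCapacity) {
        this.firstInterval = new ArrayBlockingQueue<>(firstCapacity);
        this.secondInterval = new ArrayBlockingQueue<>(secondCapacity);
    }

    /** Start polling every pollMillis milliseconds; the cached values stand in for the "memory state". */
    public void start(long pollMillis) {
        poller.scheduleAtFixedRate(() -> {
            firstOccupancy = occupancy(firstInterval);
            secondOccupancy = occupancy(secondInterval);
        }, 0, pollMillis, TimeUnit.MILLISECONDS);
    }

    private static double occupancy(BlockingQueue<byte[]> q) {
        int used = q.size();
        int capacity = used + q.remainingCapacity();
        return capacity == 0 ? 0.0 : (double) used / capacity;
    }

    public double firstOccupancy()  { return firstOccupancy; }
    public double secondOccupancy() { return secondOccupancy; }

    public void stop() { poller.shutdownNow(); }
}
```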
The invention provides a transmission method of a high-throughput low-delay large-capacity Flume channel, which is applied to the high-throughput low-delay large-capacity Flume channel and mainly comprises the following steps:
s100, configuring a first storage interval and a second storage interval in a memory, where the first storage interval is correspondingly configured with a first data interface, the first data interface extracts an event packet from the first storage interval, the second storage interval is correspondingly configured with a second data interface, and the second data interface extracts an event packet from the second storage interval, where the data interface in fig. 1 includes a first data interface represented by a number and a second data interface represented by a letter.
S200, configuring the remaining delay time of the event packet: in the specific implementation process, the maximum delay time of the event packet is configured in the event packet, a timing module is configured in the event packet, the timing module starts timing when the event packet is first sent by the data source, and the remaining delay time is obtained by subtracting the time measured by the timing module from the maximum delay time. The event packets in the memory and the cluster are sorted according to the remaining delay time: the lower the remaining delay time of an event packet, the higher its processing priority. Specifically, the first data interface first delivers the highest-priority event packet in the first storage interval to the Sink, the second data interface first delivers the highest-priority event packet in the second storage interval to the Sink, and the data pulling module first pulls the highest-priority event packet in the cluster to the memory.
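A sketch of the remaining-delay bookkeeping and priority ordering described in S200, under the assumption that an event packet records the moment the data source sent it; EventPacket and DelayOrderedQueue are illustrative names, not classes from the patent. Ordering by the fixed absolute deadline is equivalent to ordering by remaining delay but keeps the comparator stable.

```java
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;

/** Hypothetical event packet carrying its delay budget (not a Flume API class). */
class EventPacket {
    final byte[] body;
    final long maxDelayMillis;   // configured maximum delay time of the packet
    final long sentAtMillis;     // timing starts when the data source sends the packet

    EventPacket(byte[] body, long maxDelayMillis) {
        this.body = body;
        this.maxDelayMillis = maxDelayMillis;
        this.sentAtMillis = System.currentTimeMillis();
    }

    /** Remaining delay time = maximum delay time minus the time already measured. */
    long remainingDelayMillis() {
        return maxDelayMillis - (System.currentTimeMillis() - sentAtMillis);
    }

    /** Absolute deadline; a fixed value, so it can safely drive a priority queue. */
    long deadlineMillis() {
        return sentAtMillis + maxDelayMillis;
    }
}

/** Queue that always hands out the packet with the least remaining delay first. */
class DelayOrderedQueue {
    private final PriorityBlockingQueue<EventPacket> queue = new PriorityBlockingQueue<>(
            1024, Comparator.comparingLong(EventPacket::deadlineMillis));

    void put(EventPacket p) { queue.put(p); }

    EventPacket takeMostUrgent() throws InterruptedException { return queue.take(); }
}
```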
S300, the data source sends an event packet to the transfer port, and the transfer port sends the event packet to the first storage interval, the cluster or the second storage interval; in the specific implementation process, referring to FIG. 7 and FIG. 8, S300 includes the following steps:
S301, configuring a second threshold related to the remaining delay time;
S302, when the event packet is at the transfer port, its remaining delay time is compared with the second threshold; if the remaining delay time is smaller than the second threshold, the event packet is near expiry and needs to be processed as soon as possible. The arbitration module controls the transfer port to preferentially connect the data source with the second storage interval, the data source sends the event packet to the transfer port, and the transfer port preferentially sends the event packet to the second storage interval. With reference to FIG. 11, in the specific implementation process, the detection module detects the occupancy rate of the second storage interval, the arbitration module obtains that occupancy rate, and a fifth threshold related to memory occupancy is configured in the arbitration module in advance. If the occupancy rate of the second storage interval is smaller than the fifth threshold, the transfer port is connected to the second storage interval and sends the event packet there; if the occupancy rate of the second storage interval is greater than or equal to the fifth threshold, the arbitration module compares the occupancy rate of the first storage interval with the first threshold: when the occupancy rate of the first storage interval is smaller than the first threshold, the transfer port is connected to the first storage interval and sends the event packet there, and when the occupancy rate of the first storage interval is greater than or equal to the first threshold, the transfer port is connected to the cluster and sends the event packet to the cluster.
S303, a first threshold related to memory occupancy is preconfigured in the arbitration module through a configuration file; when the remaining delay time of an event packet at the transfer port is greater than or equal to the second threshold, the arbitration module obtains the occupancy rate of the first storage interval measured by the detection module and compares it with the first threshold;
S304, if the occupancy rate of the first storage interval is smaller than the first threshold, the arbitration module controls the transfer port to connect the data source with the first storage interval, the data source sends the event packet to the transfer port, and the transfer port sends the event packet to the first storage interval for storage;
S305, if the occupancy rate of the first storage interval is greater than or equal to the first threshold, the arbitration module controls the transfer port to connect the data source with the cluster, the data source sends the event packet to the transfer port, and the transfer port sends the event packet to the cluster for storage.
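The routing decision of S301-S305, together with the fifth-threshold fallback described in S302, can be condensed into a single rule; the sketch below is only one possible reading, with hypothetical class names and threshold values supplied by the caller.

```java
/** Possible destinations the transfer port can route an event packet to. */
enum Destination { FIRST_INTERVAL, SECOND_INTERVAL, CLUSTER }

/** Hypothetical arbitration rule covering S302-S305; thresholds are supplied by the caller. */
class ArbitrationModule {
    private final long secondThresholdMillis; // remaining-delay cutoff for near-expiry packets (S302)
    private final double firstThreshold;      // occupancy cutoff of the first storage interval (S304/S305)
    private final double fifthThreshold;      // occupancy cutoff of the second storage interval

    ArbitrationModule(long secondThresholdMillis, double firstThreshold, double fifthThreshold) {
        this.secondThresholdMillis = secondThresholdMillis;
        this.firstThreshold = firstThreshold;
        this.fifthThreshold = fifthThreshold;
    }

    Destination route(long remainingDelayMillis, double firstOccupancy, double secondOccupancy) {
        if (remainingDelayMillis < secondThresholdMillis && secondOccupancy < fifthThreshold) {
            // S302: near-expiry packet and the second interval still has room -> fast path
            return Destination.SECOND_INTERVAL;
        }
        // S303-S305: otherwise route by the occupancy of the first storage interval
        return firstOccupancy < firstThreshold ? Destination.FIRST_INTERVAL : Destination.CLUSTER;
    }
}
```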
S400, the cluster sends the event packet to the first storage interval or the second storage interval through the data pulling module; in the specific implementation process, the data pulling module is provided with a plurality of groups of data pulling threads, part of the data pulling threads are connected to the first storage interval and the other part are connected to the second storage interval, and the number of event packets the data pulling module processes in parallel is equal to the sum of the numbers of event packets the first data interface and the second data interface process in parallel. Specifically, referring to FIG. 10, S400 includes the following steps:
S401, configuring a third threshold related to the remaining delay time; the third threshold is set in consideration of the network transmission delay from the cluster to the memory, and the third threshold is greater than or equal to the second threshold.
S402, if the remaining delay time of an event packet at the cluster is smaller than the third threshold, the data pulling thread connected to the second storage interval preferentially sends the event packet to the second storage interval. With reference to FIG. 11, in the specific implementation process, the detection module detects the occupancy rate of the second storage interval, the arbitration module obtains that occupancy rate and sends it to the cluster through the transfer port, and a fifth threshold related to memory occupancy is configured in the cluster in advance. If the occupancy rate of the second storage interval is smaller than the fifth threshold, the data pulling thread connected to the second storage interval sends the event packet to the second storage interval; if the occupancy rate is greater than or equal to the fifth threshold, the event packets in the cluster destined for the second storage interval are queued in the cluster until they are sent to the first storage interval in priority order by the data pulling thread connected to the first storage interval, or until the second storage interval has free space, at which point the data pulling thread connected to the second storage interval sends them to the second storage interval.
S403, if the remaining delay time of the event packet at the cluster is greater than or equal to the third threshold, the data pulling thread connected to the first storage interval sends the event packet to the first storage interval.
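One data-pulling thread of S401-S403 might look like the sketch below, assuming the cluster is read with the standard Kafka consumer client, that each record carries its absolute deadline in a 'deadline' header (a convention invented here for illustration, not stated in the patent), and that the storage intervals are in-memory queues; DataPullThread is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.BlockingQueue;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.header.Header;

/** Hypothetical data-pulling thread: drains the cluster and routes by the third threshold. */
public class DataPullThread implements Runnable {
    private final KafkaConsumer<String, byte[]> consumer;
    private final BlockingQueue<byte[]> firstInterval;
    private final BlockingQueue<byte[]> secondInterval;
    private final long thirdThresholdMillis;

    public DataPullThread(Properties kafkaProps, String topic,
                          BlockingQueue<byte[]> firstInterval,
                          BlockingQueue<byte[]> secondInterval,
                          long thirdThresholdMillis) {
        this.consumer = new KafkaConsumer<>(kafkaProps);      // deserializers supplied by the caller
        this.consumer.subscribe(Collections.singletonList(topic));
        this.firstInterval = firstInterval;
        this.secondInterval = secondInterval;
        this.thirdThresholdMillis = thirdThresholdMillis;
    }

    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                for (ConsumerRecord<String, byte[]> record : consumer.poll(Duration.ofMillis(200))) {
                    // Assumed convention: the packet's absolute deadline rides in a record header.
                    Header h = record.headers().lastHeader("deadline");
                    long deadline = h == null ? Long.MAX_VALUE
                            : Long.parseLong(new String(h.value(), StandardCharsets.UTF_8));
                    long remaining = deadline - System.currentTimeMillis();
                    if (remaining < thirdThresholdMillis) {
                        secondInterval.put(record.value());   // S402: near expiry, fast path
                    } else {
                        firstInterval.put(record.value());    // S403: normal path
                    }
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();               // stop pulling when interrupted
        } finally {
            consumer.close();
        }
    }
}
```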
S500, the first storage interval sends the event packet to the second storage interval; specifically, referring to FIG. 9, S500 includes the following steps:
S501, configuring a fourth threshold related to the remaining delay time;
S502, if the remaining delay time of an event packet in the first storage interval is smaller than the fourth threshold but the first data interface is processing event packets at full load, the first storage interval preferentially sends the event packet to the second storage interval. Specifically, referring to FIG. 11, the memory obtains the occupancy rate of the second storage interval, and a fifth threshold related to memory occupancy is configured in advance. If the occupancy rate of the second storage interval is smaller than the fifth threshold, the first storage interval sends the event packet whose remaining delay time is smaller than the fourth threshold to the second storage interval; if the occupancy rate is greater than or equal to the fifth threshold, the event packets in the first storage interval that should be sent to the second storage interval are queued in the first storage interval until they are sent to the Sink through the first data interface in priority order, or until the second storage interval has free space, at which point they are sent from the first storage interval to the second storage interval.
S503, if the remaining delay time of the event packet in the first storage interval is greater than or equal to the fourth threshold, the event packet is queued in the first storage interval.
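S501-S503 can be sketched as follows, reusing the hypothetical EventPacket class from the earlier sketch and again treating both intervals as bounded queues; the handler would be invoked whenever the first data interface reports that it is running at full load.

```java
import java.util.concurrent.BlockingQueue;

/**
 * Hypothetical handler for S501-S503: move a near-expiry packet from the first storage
 * interval to the second one only while the second interval is below the fifth threshold;
 * otherwise the packet keeps queueing in place.
 */
public class FirstIntervalOverflowHandler {
    private final BlockingQueue<EventPacket> firstInterval;   // EventPacket from the earlier sketch
    private final BlockingQueue<EventPacket> secondInterval;
    private final long fourthThresholdMillis;
    private final double fifthThreshold;

    public FirstIntervalOverflowHandler(BlockingQueue<EventPacket> firstInterval,
                                        BlockingQueue<EventPacket> secondInterval,
                                        long fourthThresholdMillis, double fifthThreshold) {
        this.firstInterval = firstInterval;
        this.secondInterval = secondInterval;
        this.fourthThresholdMillis = fourthThresholdMillis;
        this.fifthThreshold = fifthThreshold;
    }

    /** Called while the first data interface is running at full load. */
    public boolean tryOffloadHead() {
        EventPacket head = firstInterval.peek();
        if (head == null || head.remainingDelayMillis() >= fourthThresholdMillis) {
            return false;                                     // S503: not near expiry, keep queueing
        }
        int used = secondInterval.size();
        int capacity = used + secondInterval.remainingCapacity();
        double secondOccupancy = capacity == 0 ? 1.0 : (double) used / capacity;
        if (secondOccupancy >= fifthThreshold) {
            return false;                                     // second interval too full, queue in place
        }
        EventPacket packet = firstInterval.poll();            // S502: offload the near-expiry packet
        return packet != null && secondInterval.offer(packet);
    }
}
```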
S600, the first storage interval sends the event packet to the Sink through the first data interface, and the second storage interval sends the event packet to the Sink through the second data interface.
In the high-throughput low-delay large-capacity Flume channel and its transmission method provided by the invention, the first storage interval is the main temporary storage location of event packets. The first threshold is an index of the occupancy rate of the first storage interval: an occupancy above the first threshold indicates that the first storage interval is heavily used and therefore that the data volume is large, while an occupancy below the first threshold indicates that the first storage interval is lightly used and the data volume is small. The second threshold is one measure of the remaining delay time of event packets; it is set far smaller than the typical remaining delay time of event packets at the transfer port, so that most event packets are filtered towards the first storage interval and the cluster, keeping the second storage interval relatively idle and guaranteeing spare capacity for processing high-priority event packets at any time. The third threshold is another measure of the remaining delay time; it is mainly used to identify event packets in the cluster that are near expiry, that is, packets that would exceed their delay time after travelling from the cluster to the first storage interval and then from the first storage interval to the Sink while the first storage interval is full and cannot process them; by the judgment on the third threshold, such event packets are transmitted directly through the second storage interval. The fourth threshold is a further measure of the remaining delay time; it is mainly used to identify event packets in the first storage interval that are near expiry and cannot be transmitted because the first data interface is occupied; by the judgment on the fourth threshold, such event packets are transmitted directly through the second storage interval. The fifth threshold is a measure of the occupancy rate of the second storage interval: in general, the event packets handled by the second storage interval have been screened layer by layer and their traffic is relatively low, but to avoid the second storage interval becoming fully loaded at a traffic peak, the fifth threshold is introduced, and a handling rule is set for event packets destined for the second storage interval when its occupancy exceeds the fifth threshold. When the data traffic is low, the capacity of the first storage interval is sufficient, so, as shown in FIG. 2, the data source (Source) transmits the event packet to the first storage interval and the event packet is transmitted from the first storage interval to the Sink; when the remaining delay time of an event packet at the transfer port is smaller than the second threshold, as shown in FIG. 3, the data source transmits the event packet to the second storage interval and the event packet is transmitted from the second storage interval to the Sink.
When the data traffic is high, the capacity of the first storage interval is insufficient and the Sink often cannot drain the first storage interval fast enough to free capacity; in that case, referring to FIG. 4, the data source transmits the event packet to the cluster, and the data pulling module then extracts the event packet from the cluster into the first storage interval. If an event packet in the cluster is near expiry, that is, its remaining delay time is smaller than the third threshold, then, referring to FIG. 5, the data pulling module extracts the event packet from the cluster into the second storage interval. If an event packet in the first storage interval is near expiry, that is, its remaining delay time is smaller than the fourth threshold, and the first data interface is fully loaded and cannot process it, then, referring to FIG. 6, the event packet is transferred from the first storage interval to the second storage interval and delivered to the Sink through the second storage interval.
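For orientation only, the five thresholds discussed above can be grouped into one configuration object; every numeric value below is a placeholder chosen for the example, not a figure taken from the patent.

```java
/**
 * Illustrative grouping of the five thresholds described above; all values are placeholders.
 */
public final class ChannelThresholds {
    /** Occupancy of the first storage interval above which new packets are diverted to the cluster (S304/S305). */
    public final double firstThreshold = 0.80;
    /** Remaining-delay cutoff at the transfer port below which a packet counts as near expiry (S302). */
    public final long secondThresholdMillis = 500;
    /** Remaining-delay cutoff at the cluster; kept >= the second threshold to absorb cluster-to-memory network delay (S402). */
    public final long thirdThresholdMillis = 800;
    /** Remaining-delay cutoff inside the first storage interval when the first data interface is saturated (S502). */
    public final long fourthThresholdMillis = 300;
    /** Occupancy of the second storage interval above which near-expiry packets queue in place. */
    public final double fifthThreshold = 0.90;
}
```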
The high-throughput low-delay large-capacity Flume channel is provided with the memory and the cluster: the Sink can quickly fetch data from the memory, which gives the advantage of high throughput, and the cluster expands the storage space, which gives the advantage of large capacity. The transmission method of the high-throughput low-delay large-capacity Flume channel processes event packets in an orderly way by using the configured remaining delay time as the index of the event packet processing order, and a dedicated second storage interval is opened up in the memory for handling near-expiry event packets, opening a path for event packets that are near expiry but cannot be processed in time by the first storage interval and effectively ensuring their timely transmission. A relatively low delay in the delivery of event packets is thereby guaranteed.
The foregoing is an exemplary embodiment of the present disclosure, but it should be noted that various changes and modifications could be made herein without departing from the scope of the present disclosure as defined by the appended claims. The functions, steps and/or actions of the method claims in accordance with the disclosed embodiments described herein need not be performed in any particular order. Furthermore, although elements of the disclosed embodiments of the invention may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated.
It should be understood that, as used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly supports the exception. It should also be understood that "and/or" as used herein is meant to include any and all possible combinations of one or more of the associated listed items.
The numbers of the embodiments disclosed in the embodiments of the present invention are merely for description, and do not represent the merits of the embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, and the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
Those of ordinary skill in the art will understand that the discussion of any embodiment above is meant to be exemplary only and is not intended to suggest that the scope of the disclosure of the embodiments of the invention, including the claims, is limited to these examples; within the idea of the embodiments of the invention, technical features of the above embodiment or of different embodiments may also be combined, and many other variations of the different aspects of the embodiments exist which are not provided in detail for the sake of brevity. Therefore, any omissions, modifications, substitutions, improvements and the like that may be made without departing from the spirit and principles of the embodiments of the present invention are intended to be included within the scope of the embodiments of the present invention.
Claims (6)
1. A high throughput low latency large capacity Flume channel is characterized by comprising
The system comprises a memory, a first data interface and a second data interface, wherein a first storage interval with larger capacity and a second storage interval with smaller capacity are configured in the memory, the first storage interval is configured with the first data interface, and the second storage interval is configured with the second data interface;
the memory is connected with a detection module; the detection module detects the memory state through instruction polling, wherein the memory state comprises the occupancy rates of a first storage interval and a second storage interval;
the memory is connected with the cluster through a data pulling module;
the cluster and the memory are connected with a data source through a transfer port;
the transfer port is provided with an arbitration module, and the arbitration module is connected with the detection module; the arbitration module acquires the memory state from the detection module and, according to the memory state, controls the transfer port to connect the first storage interval with the data source, or to connect the second storage interval with the data source, or to connect the data source with the cluster.
2. The high-throughput low-latency large-capacity Flume channel according to claim 1, wherein the detection module polls and detects the memory status, the arbitration module obtains the memory status from the detection module, and according to the memory status the arbitration module controls the transfer port to connect the memory with the data source or to connect the memory with the cluster.
3. The high-throughput low-latency large-capacity Flume channel according to claim 1, wherein the cluster is a kafka cluster, and the cluster is deployed in a storage device.
4. A transmission method of a high-throughput low-latency large-capacity Flume channel, which is applied to the high-throughput low-latency large-capacity Flume channel as claimed in claim 1, is characterized by comprising the following steps
S100, configuring a first storage interval and a second storage interval in a memory, wherein the first storage interval is correspondingly configured with a first data interface, and the second storage interval is correspondingly provided with a second data interface;
S200, configuring the remaining delay time of the event packets, sorting the event packets in the memory and the cluster according to the remaining delay time, and processing the event packets by the first data interface, the second data interface and the data pulling module according to the sorting;
S300, a data source sends an event packet to the transfer port, and the transfer port sends the event packet to the first storage interval, the cluster or the second storage interval;
S400, the cluster sends the event packet to the first storage interval or the second storage interval through the data pulling module;
S500, the first storage interval sends the event packet to the second storage interval;
S600, the first storage interval sends the event packet to the Sink through the first data interface, and the second storage interval sends the event packet to the Sink through the second data interface.
5. The transmission method of high-throughput low-latency large-capacity Flume channel according to claim 4, wherein S300 comprises the following steps:
S301, configuring a second threshold related to the remaining delay time;
S302, when the remaining delay time of the event packet at the transfer port is smaller than the second threshold, the arbitration module controls the transfer port to preferentially send the event packet to the second storage interval;
S303, when the remaining delay time of the event packet at the transfer port is greater than or equal to the second threshold, a first threshold related to memory occupancy being configured in the arbitration module, the arbitration module obtains the occupancy rate of the first storage interval measured by the detection module and compares the first threshold with the occupancy rate of the first storage interval;
S304, when the occupancy rate of the first storage interval is smaller than the first threshold, the arbitration module controls the transfer port to send the event packet to the first storage interval;
S305, when the occupancy rate of the first storage interval is greater than or equal to the first threshold, the arbitration module controls the transfer port to send the event packet to the cluster.
6. The transmission method of high-throughput low-latency large-capacity Flume channel according to claim 5, wherein S400 comprises the following steps:
S401, configuring a third threshold related to the remaining delay time;
S402, when the remaining delay time of the event packet at the cluster is smaller than the third threshold, the data pulling module preferentially sends the event packet to the second storage interval;
S403, when the remaining delay time of the event packet at the cluster is greater than or equal to the third threshold, the data pulling module sends the event packet to the first storage interval.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010728788.3A CN111966736B (en) | 2020-07-27 | 2020-07-27 | High-throughput low-delay large-capacity Flume channel and transmission method thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010728788.3A CN111966736B (en) | 2020-07-27 | 2020-07-27 | High-throughput low-delay large-capacity Flume channel and transmission method thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111966736A CN111966736A (en) | 2020-11-20 |
CN111966736B true CN111966736B (en) | 2022-12-09 |
Family
ID=73362996
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010728788.3A Active CN111966736B (en) | 2020-07-27 | 2020-07-27 | High-throughput low-delay large-capacity Flume channel and transmission method thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111966736B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116366519A (en) * | 2021-12-28 | 2023-06-30 | 北京灵汐科技有限公司 | Route transmission method, route control method, event processing method and device |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108073349A (en) * | 2016-11-08 | 2018-05-25 | 北京国双科技有限公司 | The transmission method and device of data |
US20180329600A1 (en) * | 2016-03-02 | 2018-11-15 | Tencent Technology (Shenzhen) Company Limited | Data processing method and apparatus |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180329600A1 (en) * | 2016-03-02 | 2018-11-15 | Tencent Technology (Shenzhen) Company Limited | Data processing method and apparatus |
CN108073349A (en) * | 2016-11-08 | 2018-05-25 | 北京国双科技有限公司 | The transmission method and device of data |
Non-Patent Citations (2)
Title |
---|
Buqing Shu et al., "Dynamic Load Balancing and Channel Strategy for Apache Flume Collecting Real-Time Data Stream", 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), 2017-12-31, full text * |
Yu Jinliang et al., "Flume-based automatic MySQL data collection system" (基于Flume的MySQL数据自动收集系统), Computer Technology and Development (计算机技术与发展), 2016-12-31, full text * |
Also Published As
Publication number | Publication date |
---|---|
CN111966736A (en) | 2020-11-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9444740B2 (en) | Router, method for controlling router, and program | |
US9426099B2 (en) | Router, method for controlling router, and program | |
US7227841B2 (en) | Packet input thresholding for resource distribution in a network switch | |
CA2329542C (en) | System and method for scheduling message transmission and processing in a digital data network | |
US8135004B2 (en) | Multi-plane cell switch fabric system | |
US8601181B2 (en) | System and method for read data buffering wherein an arbitration policy determines whether internal or external buffers are given preference | |
US7729258B2 (en) | Switching device | |
US7406041B2 (en) | System and method for late-dropping packets in a network switch | |
US20120072635A1 (en) | Relay device | |
US7613849B2 (en) | Integrated circuit and method for transaction abortion | |
US20230164078A1 (en) | Congestion Control Method and Apparatus | |
JP2002512460A (en) | System and method for regulating message flow in a digital data network | |
CN107770090B (en) | Method and apparatus for controlling registers in a pipeline | |
CN114521253B (en) | Dual-layer deterministic interprocess communication scheduler for input-output deterministic in solid-state drives | |
CN114257559B (en) | Data message forwarding method and device | |
JP2007524917A (en) | System and method for selectively influencing data flow to and from a memory device | |
CN111966736B (en) | High-throughput low-delay large-capacity Flume channel and transmission method thereof | |
CN114531488A (en) | High-efficiency cache management system facing Ethernet exchanger | |
CN106911740A (en) | A kind of method and apparatus of cache management | |
CN107483405B (en) | scheduling method and scheduling system for supporting variable length cells | |
EP3487132B1 (en) | Packet processing method and router | |
CN114363269A (en) | Message transmission method, system, equipment and medium | |
CN114401235B (en) | Method, system, medium, equipment and application for processing heavy load in queue management | |
CN115955441A (en) | Management scheduling method and device based on TSN queue | |
CN112559400B (en) | Multi-stage scheduling device, method, network chip and computer readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||