CN111177194B - Streaming data caching method and device - Google Patents

Streaming data caching method and device

Info

Publication number: CN111177194B (application number CN201911302664.2A)
Authority: CN (China)
Prior art keywords: flow, data, data stream, preset, configuration
Legal status: Active (an assumption, not a legal conclusion; Google has not performed a legal analysis)
Other languages: Chinese (zh)
Other versions: CN111177194A
Inventors: 王绪亮, 聂铁铮, 黄菊, 闫铭森, 李迪, 刘畅
Original assignee: 东北大学 (Northeastern University)
Application filed by 东北大学; application granted; currently active.

Classifications

    • G06F16/24568 — Data stream processing; Continuous queries
    • G06F16/24552 — Database cache management

Both classes fall under G06F (electric digital data processing) › G06F16/00 (information retrieval; database structures) › G06F16/2455 (query execution).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a streaming data caching method and device in the field of data processing, aimed at solving the problem of cached-data loss at peak times of data stream output. The method mainly comprises the following steps: collecting a historical data stream sequence received by a cacheable device according to a preset sampling frequency and a preset observation duration; predicting the current predicted data stream at the current sampling time from the historical data stream sequence; looking up the single-cache packet size corresponding to the current predicted data stream in the flow configuration mapping relation; and caching the current actual data stream according to the lookup result. The invention is mainly applied to data caching.

Description

Streaming data caching method and device
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a method and an apparatus for buffering streaming data.
Background
With the continuous development of information technology, the data generated by human activity is expanding faster than a geometric progression, forming data sets that conventional software tools cannot capture, manage, and process within a reasonable time. These massive, fast-growing, and diversified information assets, which require new processing modes to provide stronger decision-making, insight-discovery, and process-optimization capabilities, are known as big data. Stream processing is a major big-data processing mode: its data source is real-time streaming data, and its real-time requirements are high.
Streaming data processing systems typically rely on message middleware to act as a data buffer. The back-pressure problem, a major problem in streaming data processing, arises when data arrives faster than it can be processed but must not be lost; a data buffer is then needed to hold data that has been received but not yet processed. Such buffering mechanisms are often implemented with message middleware or message queuing tools. In many application scenarios, however, the upstream data source outputs unstable, non-uniform, bursty streaming data, and the message middleware cannot adapt to changes in data traffic on its own; using message middleware therefore does not completely eliminate the risk of data loss. At peak times of stream output, the rate at which stream data is generated may exceed the throughput of the message middleware, causing data loss or degrading the performance of the caching middleware.
Disclosure of Invention
In view of this, the present invention provides a streaming data caching method and apparatus, mainly aimed at solving the prior-art problem of data loss at peak times of data stream output.
According to one aspect of the present invention, there is provided a streaming data buffering method, including:
collecting a historical data stream sequence received by the cacheable device according to a preset sampling frequency and a preset observation time length;
predicting a current predicted data stream at a current sampling time according to the historical data stream sequence;
searching the size of a single cache data packet corresponding to the current predicted data stream according to the flow configuration mapping relation;
and according to the searching result, caching the current actual data stream.
According to another aspect of the present invention, there is provided a streaming data buffering apparatus, including:
the acquisition module is used for acquiring the historical data stream sequence received by the cacheable device according to the preset sampling frequency and the preset observation time length;
the prediction module is used for predicting a current predicted data stream at the current sampling moment according to the historical data stream sequence;
the searching module is used for searching the size of the single cache data packet corresponding to the current predicted data stream according to the flow configuration mapping relation;
and the caching module is used for caching the current actual data stream according to the searching result.
According to still another aspect of the present invention, there is provided a computer storage medium having at least one executable instruction stored therein, the executable instruction causing a processor to perform operations corresponding to the streaming data caching method described above.
According to still another aspect of the present invention, there is provided a computer apparatus including: the device comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete communication with each other through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction enables the processor to execute the operation corresponding to the streaming data caching method.
The technical solution provided by the embodiments of the present invention has at least the following advantages:
the invention provides a method and a device for caching streaming data, which are characterized in that firstly, according to preset sampling frequency and preset observation time length, a historical data stream sequence received by cacheable equipment is collected, then, according to the historical data stream sequence, a current predicted data stream at the current sampling time is predicted, then, according to a flow configuration mapping relation, the size of a single cache data packet corresponding to the current predicted data stream is searched, and finally, according to a searching result, the current actual data stream is cached. Compared with the prior art, the embodiment of the invention predicts the data flow of the cacheable device, adaptively adjusts the size of the single cache data packet in a prediction mode by using the optimal cache configuration corresponding to the predicted data flow, predicts and processes the data flow burst in advance, and improves the back pressure capability of the caching process under the condition of high-flux data flow, thereby avoiding the loss of cache data at the peak moment of data flow output.
The foregoing is only an overview of the technical solution of the present invention. To make the technical means of the invention clear enough to be implemented according to this specification, and to make the above and other objects, features, and advantages more readily apparent, the detailed description of the invention follows.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
fig. 1 shows a flowchart of a method for buffering streaming data according to an embodiment of the present invention;
FIG. 2 is a flowchart of another method for buffering streaming data according to an embodiment of the present invention;
fig. 3 shows a block diagram of a buffering device for streaming data according to an embodiment of the present invention;
FIG. 4 is a block diagram illustrating another streaming data buffering device according to an embodiment of the present invention;
fig. 5 shows a schematic structural diagram of a computer device according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The embodiment of the invention provides a streaming data caching method, as shown in fig. 1, which comprises the following steps:
101. and acquiring the historical data stream sequence received by the cacheable device according to the preset sampling frequency and the preset observation time length.
Within the preset observation period, the flow rate of the data stream received by the cacheable device is sampled at each interval given by the preset sampling frequency, and the flow rate value is recorded. All flow rate values recorded during the observation period together constitute the historical data stream sequence.
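The collection step above can be sketched as a fixed-interval sampling loop. This is a minimal illustration, not the patented implementation; `read_flow_rate` is a hypothetical callback standing in for whatever the cacheable device exposes:

```python
import collections
import time

def collect_history(read_flow_rate, sampling_interval_s, observation_s):
    """Sample the flow rate of a cacheable device at a fixed interval over
    one observation window and return the recorded sequence (the
    'historical data stream sequence')."""
    n_samples = round(observation_s / sampling_interval_s)
    history = collections.deque(maxlen=n_samples)
    for _ in range(n_samples):
        history.append(read_flow_rate())  # e.g. records/s observed right now
        time.sleep(sampling_interval_s)
    return list(history)
```

The `deque` with `maxlen` also makes it easy to later overwrite old samples when the sequence is re-collected.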
102. And predicting the current predicted data stream at the current sampling time according to the historical data stream sequence.
The historical data stream sequence is treated as the recurring pattern of the data stream received by the cacheable device. A certain moment is taken as the starting time, with the first flow rate value in the historical sequence serving as the prediction for that starting time. From the current time, the starting time, the preset sampling frequency, and the preset observation duration, the relative position of the current moment within the observation window is calculated, and the flow rate value at that relative position is taken as the current predicted data stream. The current predicted data stream is thus the predicted flow rate value at the current sampling time.
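The relative-position prediction above amounts to indexing the recorded sequence cyclically. A minimal sketch, assuming the history is treated as one period of a repeating pattern (function and parameter names are illustrative):

```python
def predict_flow_rate(history, start_time, current_time, sampling_interval_s):
    """Predict the flow rate at current_time by mapping the elapsed time
    since start_time to a relative position inside the observation window,
    wrapping around when the window repeats."""
    elapsed = current_time - start_time
    index = int(elapsed / sampling_interval_s) % len(history)
    return history[index]
```

For example, with a 3-sample history and a 1-second sampling interval, the prediction at t=4 reuses the sample recorded at relative position 1.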
103. And searching the size of the single cache data packet corresponding to the current predicted data stream according to the flow configuration mapping relation.
The flow configuration mapping relation is the correspondence between the flow rate of the data stream and the single-cache packet size. The size of the packet cached in one operation greatly influences the average throughput rate and the average sending delay of the caching process, and largely determines the caching speed. The flow configuration mapping relation may be preset, or it may be obtained by monitoring the actual operation of the cacheable device; the embodiments of the present invention do not limit this. The single-cache packet size corresponding to the current predicted data stream is looked up in the flow configuration mapping relation.
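One plausible representation of the mapping relation, not specified by the patent, is a sorted table of (flow-rate bucket, packet size) pairs with a binary-search lookup:

```python
import bisect

def lookup_packet_size(flow_config, predicted_rate):
    """flow_config: list of (max_flow_rate, packet_size_bytes) pairs,
    sorted by max_flow_rate. Return the single-cache packet size for the
    smallest rate bucket covering the predicted flow rate, falling back
    to the largest bucket when the prediction exceeds every tested rate."""
    rates = [r for r, _ in flow_config]
    i = bisect.bisect_left(rates, predicted_rate)
    if i == len(flow_config):
        i -= 1  # above the largest tested rate: reuse the last entry
    return flow_config[i][1]
```

Usage: with buckets for 100, 1000, and 10000 records/s, a predicted rate of 500 records/s selects the packet size configured for the 1000 records/s bucket.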
104. And according to the search result, caching the current actual data flow.
The actual data stream is cached using the single-cache packet size corresponding to the current predicted data stream.
In this way, the method predicts the data flow of the cacheable device and adaptively adjusts the single-cache packet size to the optimal cache configuration for the predicted flow, handling data flow bursts in advance, improving back-pressure capability under high-throughput streams, and avoiding the loss of cached data at peak times of data stream output.
The embodiment of the invention provides another streaming data caching method, as shown in fig. 2, which comprises the following steps:
201. and acquiring the historical data stream sequence received by the cacheable device according to the preset sampling frequency and the preset observation time length.
Within the preset observation period, the flow rate of the data stream received by the cacheable device is sampled at each interval given by the preset sampling frequency, and the flow rate value is recorded. All flow rate values recorded during the observation period together constitute the historical data stream sequence.
202. And predicting the current predicted data stream at the current sampling time according to the historical data stream sequence.
Predicting the current predicted data stream specifically includes: calculating, from the preset sampling frequency and the preset observation duration, the sampling position within the observation window that corresponds to the current sampling time; and looking up, in the historical data stream sequence, the current predicted data stream corresponding to that sampling position.
203. And testing the flow configuration mapping relation between the data flow speed of the cacheable device and the size of the single cache data packet.
The test process specifically comprises the following steps. A speed test sequence and a cache configuration test sequence are set in the cacheable device: the speed test sequence contains several data flow rates of different values, arranged from small to large, and the cache configuration test sequence contains several configured cache packet sizes of different values, also arranged from small to large. Each data flow rate in the speed test sequence is cached with each configured cache packet size in the cache configuration test sequence, and the average throughput rate and average sending delay of each caching run are measured and recorded. Using a preset fitting algorithm, a traffic throughput rate relation is fitted between each data flow rate and its average throughput rate, and a traffic sending delay relation is fitted between each data flow rate and its average sending delay. With these two relations as constraints, a non-dominated sorting genetic algorithm (NSGA) is used to compute the non-dominated configuration interval of each data flow rate in the speed test sequence, that is, the range of configured cache packet sizes for which that flow rate satisfies the constraints during caching. Finally, a regression model fitting algorithm is applied to the midpoints of the non-dominated configuration intervals to compute the configured cache packet size for each data flow rate, yielding the flow configuration mapping relation between data flow rate and single-cache packet size.
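The grid-test and interval-midpoint idea in the steps above can be sketched as follows. This is a deliberately simplified stand-in: a brute-force feasibility sweep replaces the NSGA search and the regression fitting, `measure` is a hypothetical callback driving the real device, and all thresholds are illustrative:

```python
def build_flow_config(measure, rate_seq, packet_size_seq,
                      max_delay_ms, min_throughput_ratio=0.9):
    """For each test flow rate, try every candidate packet size, keep the
    sizes whose measured (throughput, delay) satisfy the constraints, and
    map the rate to the middle of that feasible interval.

    measure(rate, size) -> (avg_throughput, avg_send_delay_ms)."""
    config = []
    for rate in rate_seq:
        feasible = []
        for size in packet_size_seq:  # assumed sorted small to large
            throughput, delay = measure(rate, size)
            if delay <= max_delay_ms and throughput >= min_throughput_ratio * rate:
                feasible.append(size)
        if feasible:
            # midpoint of the feasible (non-dominated) interval
            config.append((rate, feasible[len(feasible) // 2]))
    return config
```

The returned list of (rate, packet size) pairs plays the role of the flow configuration mapping relation used by the lookup step.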
204. And searching the size of the single cache data packet corresponding to the current predicted data stream according to the flow configuration mapping relation.
The flow configuration mapping relation is the correspondence between the flow rate of the data stream and the single-cache packet size. The size of the packet cached in one operation greatly influences the average throughput rate and the average sending delay of the caching process, and largely determines the caching speed. The flow configuration mapping relation may be preset, or it may be obtained by monitoring the actual operation of the cacheable device; the embodiments of the present invention do not limit this. The single-cache packet size corresponding to the current predicted data stream is looked up in the flow configuration mapping relation.
205. And according to the search result, caching the current actual data flow.
The actual data stream is cached using the single-cache packet size corresponding to the current predicted data stream. Because the flow through the actual cacheable device changes continuously, the historical data stream sequence must be updated after collection so that the predicted data stream stays close to the actual one. The method therefore further comprises: monitoring the actual current data stream at the current moment; calculating the flow difference between the current predicted data stream and the actual current data stream; if the flow difference is greater than a first preset threshold, adjusting the preset sampling frequency and the preset observation duration according to a first preset learning rate and re-collecting the historical data stream sequence; and if the flow difference is smaller than a second preset threshold, adjusting the preset sampling frequency and the preset observation duration according to a second preset learning rate and re-collecting the historical data stream sequence.
Both the first and the second preset learning rate specify adjustments to the preset sampling frequency and the preset observation duration. For example, the first preset learning rate may increase both the preset sampling frequency and the preset observation duration by 50%, while the second preset learning rate may decrease both by 20%. The queue storing the historical data stream sequence starts empty and is filled with values as the sequence is collected; when the sequence is re-collected, the old values are evicted from the queue, so that the current predicted data stream is predicted from the newly collected historical sequence.
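The threshold-driven adjustment above can be sketched as a small helper. Note the text leaves open whether the learning rate scales the frequency or the interval; this sketch scales both window parameters by the same factor, with the 50%/20% figures taken from the example and the thresholds left application-specific:

```python
def adjust_window(sampling_interval_s, observation_s, flow_diff,
                  high_threshold, low_threshold,
                  grow_rate=0.5, shrink_rate=0.2):
    """Rescale the sampling interval and observation window based on how
    far the predicted flow rate was from the observed one; the caller is
    expected to re-collect the history whenever the factor changes."""
    if flow_diff > high_threshold:
        factor = 1 + grow_rate    # prediction too far off: widen the window
    elif flow_diff < low_threshold:
        factor = 1 - shrink_rate  # prediction accurate: react faster
    else:
        factor = 1.0              # within tolerance: keep parameters as-is
    return sampling_interval_s * factor, observation_s * factor
```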
As in the first embodiment, this method predicts the data flow of the cacheable device and adaptively adjusts the single-cache packet size to the optimal cache configuration for the predicted flow, handling data flow bursts in advance, improving back-pressure capability under high-throughput streams, and avoiding the loss of cached data at peak times of data stream output.
Further, as an implementation of the method shown in fig. 1, an embodiment of the present invention provides a streaming data caching apparatus, as shown in fig. 3, where the apparatus includes:
the acquisition module 31 is configured to acquire a historical data stream sequence received by the cacheable device according to a preset sampling frequency and a preset observation duration;
a prediction module 32, configured to predict a current predicted data stream at a current sampling time according to the historical data stream sequence;
the searching module 33 is configured to search the size of the single cache data packet corresponding to the current predicted data stream according to the flow configuration mapping relationship;
and the caching module 34 is configured to cache the current actual data stream according to the search result.
The streaming data caching device collects the historical data stream sequence received by the cacheable equipment, predicts the current predicted data stream at the current sampling time, looks up the corresponding single-cache packet size in the flow configuration mapping relation, and caches the current actual data stream according to the lookup result, with the same advantages as the method described above.
Further, as an implementation of the method shown in fig. 2, another streaming data caching apparatus is provided in an embodiment of the present invention, as shown in fig. 4, where the apparatus includes:
the acquisition module 41 is configured to acquire a historical data stream sequence received by the cacheable device according to a preset sampling frequency and a preset observation duration;
a prediction module 42, configured to predict a current predicted data stream at a current sampling time according to the historical data stream sequence;
the searching module 43 is configured to search the size of the single cache packet corresponding to the current predicted data stream according to the flow configuration mapping relationship;
and the caching module 44 is configured to cache the current actual data stream according to the search result.
Further, the prediction module 42 includes:
a calculating unit 421, configured to calculate, according to the preset sampling frequency and the preset observation duration, a sampling position of the current sampling time corresponding to the preset observation duration;
and a searching unit 422, configured to look up, in the historical data stream sequence, the current predicted data stream corresponding to the sampling position.
Further, the device further comprises:
a monitoring module 45, configured to monitor an actual current data stream at the current moment after buffering the current actual data stream according to the search result;
a calculating module 46, configured to calculate the flow difference between the current predicted data stream and the actual current data stream;
a reset module 47, configured to adjust the preset sampling frequency and the preset observation period according to a first preset learning rate and re-acquire the historical data stream sequence if the flow difference is greater than a first preset threshold;
the resetting module 47 is further configured to adjust the preset sampling frequency and the preset observation period according to a second preset learning rate and re-acquire the historical data stream sequence if the flow difference is smaller than a second preset threshold.
Further, the apparatus further comprises:
and the test module 48, configured to test the flow configuration mapping relation between the data flow rate of the cacheable device and the single-cache packet size, before the single-cache packet size corresponding to the current predicted data stream is looked up in the flow configuration mapping relation.
Further, the test module 48 includes:
a setting unit 481, configured to set a speed test sequence and a cache configuration test sequence in the cacheable device, where the speed test sequence includes several data flow rates of different values arranged from small to large, and the cache configuration test sequence includes several configured cache packet sizes of different values arranged from small to large;
a recording unit 482, configured to cache each data flow rate in the speed test sequence with each configured cache packet size in the cache configuration test sequence, and to measure and record the average throughput rate and average sending delay of each caching run;
a fitting unit 483, configured to fit, using a preset fitting algorithm, the traffic throughput rate relation between each data flow rate and its average throughput rate, and the traffic sending delay relation between each data flow rate and its average sending delay;
a calculating unit 484, configured to compute, with the traffic throughput rate relation and the traffic sending delay relation as constraints, the non-dominated configuration interval of each data flow rate in the speed test sequence using a non-dominated sorting genetic algorithm (NSGA), where the non-dominated configuration interval is the range of configured cache packet sizes for which the data flow rate satisfies the constraints during caching;
and an obtaining unit 485, configured to compute, by a regression model fitting algorithm applied to the midpoints of the non-dominated configuration intervals, the optimal single-cache packet size corresponding to each data flow rate, yielding the flow configuration mapping relation between data flow rate and single-cache packet size.
As with the device of fig. 3, this device predicts the data flow of the cacheable equipment and adaptively adjusts the single-cache packet size to the optimal cache configuration for the predicted flow, handling data flow bursts in advance, improving back-pressure capability under high-throughput streams, and avoiding the loss of cached data at peak times of data stream output.
According to one embodiment of the present invention, there is provided a computer storage medium storing at least one executable instruction for performing the method for buffering streaming data in any of the above method embodiments.
Fig. 5 is a schematic structural diagram of a computer device according to an embodiment of the present invention, and the specific embodiment of the present invention is not limited to the specific implementation of the computer device.
As shown in fig. 5, the computer device may include: a processor 502, a communication interface (Communications Interface) 504, a memory 506, and a communication bus 508.
Wherein: processor 502, communication interface 504, and memory 506 communicate with each other via communication bus 508.
A communication interface 504 for communicating with network elements of other devices, such as clients or other servers.
The processor 502 is configured to execute the program 510, and may specifically perform relevant steps in the above-described streaming data buffering method embodiment.
In particular, program 510 may include program code including computer-operating instructions.
The processor 502 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The one or more processors included in the computer device may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs together with one or more ASICs.
A memory 506, for storing the program 510. The memory 506 may comprise high-speed RAM, and may also include non-volatile memory, such as at least one disk memory.
The program 510 may be specifically operable to cause the processor 502 to:
collecting a historical data stream sequence received by the cacheable device according to a preset sampling frequency and a preset observation time length;
predicting a current predicted data stream at a current sampling time according to the historical data stream sequence;
searching the size of a single cache data packet corresponding to the current predicted data stream according to the flow configuration mapping relation;
and according to the searching result, caching the current actual data stream.
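A minimal Python sketch of this caching loop (the function names, the moving-average predictor, and the example mapping table are illustrative assumptions — the disclosure does not fix a particular prediction model):

```python
from collections import deque

def build_history(samples, observation_len, sampling_freq):
    """Keep only the most recent observation window of flow samples."""
    window = int(observation_len * sampling_freq)  # samples per window
    return deque(samples, maxlen=window)

def predict_flow(history):
    """Toy predictor: the mean of the observation window stands in for
    the patent's (unspecified) prediction model."""
    return sum(history) / len(history)

def lookup_packet_size(flow_config_map, predicted_flow):
    """Look up the cache packet size configured for the tested flow rate
    closest to the prediction (the flow configuration mapping relation)."""
    rate = min(flow_config_map, key=lambda r: abs(r - predicted_flow))
    return flow_config_map[rate]

# Hypothetical mapping: data stream flow rate (MB/s) -> packet size (KB)
flow_config_map = {10: 64, 50: 256, 100: 512, 200: 1024}

history = build_history([12, 15, 11, 14, 13], observation_len=5, sampling_freq=1)
packet_size = lookup_packet_size(flow_config_map, predict_flow(history))
```

The actual caching step would then write the incoming stream in packets of `packet_size`, re-running the prediction and lookup at each sampling moment.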
It will be appreciated by those skilled in the art that the modules or steps of the invention described above may be implemented on a general-purpose computing device. They may be concentrated on a single computing device or distributed across a network of computing devices, and they may alternatively be implemented in program code executable by computing devices, so that they may be stored in a storage device and executed by computing devices; in some cases, the steps shown or described may be performed in a different order than shown or described. Alternatively, they may be separately fabricated as individual integrated circuit modules, or multiple modules or steps among them may be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (6)

1. A streaming data caching method, characterized by comprising the following steps:
collecting a historical data stream sequence received by the cacheable device according to a preset sampling frequency and a preset observation time length;
predicting a current predicted data stream at a current sampling time according to the historical data stream sequence;
searching the size of a single cache data packet corresponding to the current predicted data stream according to the flow configuration mapping relation;
caching the current actual data stream according to the search result; wherein, before searching for the single cache data packet size corresponding to the current predicted data stream according to the flow configuration mapping relation, the method further comprises:
testing the flow configuration mapping relation between the data stream flow rate of the cacheable device and the single cache data packet size;
wherein testing the flow configuration mapping relation between the data stream flow rate of the cacheable device and the single cache data packet size comprises the following steps:
setting a speed test sequence and a cache configuration test sequence in the cacheable device, wherein the speed test sequence comprises a plurality of data stream flow rates with different values arranged in ascending order, and the cache configuration test sequence comprises a plurality of configured cache data packet sizes with different values arranged in ascending order;
caching each data stream flow rate in the speed test sequence at each configured cache data packet size in the cache configuration test sequence, and testing and recording the average throughput rate and the average sending delay of the caching process;
fitting, according to the average throughput rate and the average sending delay of the caching process and by a preset fitting algorithm, a flow throughput rate relation corresponding to the average throughput rate of each data stream flow rate, and a flow sending delay relation corresponding to the average sending delay of each data stream flow rate;
calculating, with the flow throughput rate relation and the flow sending delay relation as constraint conditions, a non-dominated configuration interval of each data stream flow rate in the speed test sequence by a non-dominated sorting genetic algorithm (NSGA), wherein the non-dominated configuration interval refers to the configurable range of configured cache data packet sizes that enables the data stream flow rate to meet the constraint conditions during caching;
and calculating, according to the configuration midpoint of the non-dominated configuration interval, the configured cache data packet size corresponding to each data stream flow rate by a regression model fitting algorithm, and obtaining the flow configuration mapping relation between the data stream flow rate and the single cache data packet size.
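The mapping construction in this claim (grid test, curve fitting, non-dominated interval, regression) can be sketched as follows. The Pareto filter below replaces a full NSGA run with a direct non-dominated check, the regression is a plain least-squares line, and all measurements are hypothetical:

```python
def non_dominated_sizes(results):
    """results maps packet size -> (avg throughput, avg send delay) for one
    flow rate; keep sizes not dominated by another size that is at least as
    good on both objectives and strictly better on one (Pareto filter)."""
    keep = []
    for s, (tp, dl) in results.items():
        dominated = any(
            tp2 >= tp and dl2 <= dl and (tp2 > tp or dl2 < dl)
            for s2, (tp2, dl2) in results.items() if s2 != s
        )
        if not dominated:
            keep.append(s)
    return sorted(keep)

def fit_linear(xs, ys):
    """Least-squares line y = a*x + b, standing in for the claim's
    regression model fitting algorithm."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# Hypothetical test grid: flow rate (MB/s) -> {packet size (KB): (tput, delay)}
measured = {
    10:  {64: (9.5, 2.0), 128: (9.8, 2.5), 256: (9.2, 3.0)},
    100: {128: (80, 4.0), 256: (95, 4.5), 512: (96, 6.0)},
}
midpoints = {}
for rate, results in measured.items():
    nd = non_dominated_sizes(results)   # non-dominated configuration interval
    midpoints[rate] = nd[len(nd) // 2]  # configuration midpoint

a, b = fit_linear(list(midpoints), list(midpoints.values()))  # rate -> size
```

Fitting a continuous rate-to-size model over the interval midpoints lets the lookup step interpolate for flow rates that were never tested directly.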
2. The method of claim 1, wherein predicting the current predicted data stream for the current sample time based on the historical data stream sequence comprises:
calculating, according to the preset sampling frequency and the preset observation duration, the sampling position of the current sampling moment within the preset observation duration;
and searching the historical data stream sequence for the current predicted data stream corresponding to the sampling position.
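This claim reads as a periodic lookup: the current sampling moment is mapped to a position inside the observation window, and the historical value at that position serves as the prediction. A sketch under the assumption that the flow repeats with the window period (the modulo formula is an assumption, not stated in the claim):

```python
def sampling_position(current_time, sampling_freq, observation_len):
    """Map the current sampling moment to a position inside the
    observation window (assumes window-periodic flow behaviour)."""
    samples_per_window = int(observation_len * sampling_freq)
    return int(current_time * sampling_freq) % samples_per_window

history = [10, 40, 80, 60, 20]  # one observation window of flow samples
pos = sampling_position(current_time=7, sampling_freq=1, observation_len=5)
predicted = history[pos]  # historical value reused as the current prediction
```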
3. The method of claim 1, wherein after buffering the current actual data stream based on the search result, the method further comprises:
monitoring the actual current data stream at the current moment;
calculating the flow difference between the current predicted data flow and the actual current data flow;
if the flow difference is larger than a first preset threshold value, adjusting the preset sampling frequency and the preset observation duration according to a first preset learning rate, and re-acquiring the historical data flow sequence;
and if the flow difference is smaller than a second preset threshold value, adjusting the preset sampling frequency and the preset observation duration according to a second preset learning rate, and re-acquiring the historical data flow sequence.
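The feedback step of this claim can be sketched as below. The direction of each update (faster sampling over a shorter window when the prediction error is large, and the opposite when it is small) is an illustrative assumption; the claim only states that both parameters are adjusted by the corresponding learning rate:

```python
def adjust(sampling_freq, observation_len, flow_diff,
           hi_thresh, lo_thresh, lr_hi, lr_lo):
    """Re-tune the sampling parameters when the prediction error leaves
    the [lo_thresh, hi_thresh] band; inside the band, keep them as-is."""
    if flow_diff > hi_thresh:
        sampling_freq *= 1 + lr_hi    # sample faster ...
        observation_len *= 1 - lr_hi  # ... over a shorter window
    elif flow_diff < lo_thresh:
        sampling_freq *= 1 - lr_lo
        observation_len *= 1 + lr_lo
    return sampling_freq, observation_len

# Large error: re-sample more aggressively before rebuilding the history
freq, window = adjust(10.0, 5.0, flow_diff=8.0,
                      hi_thresh=5.0, lo_thresh=1.0, lr_hi=0.2, lr_lo=0.05)
```

After either adjustment, the historical data stream sequence is re-collected with the new frequency and window before the next prediction.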
4. A streaming data caching apparatus, comprising:
the acquisition module is used for acquiring a historical data stream sequence received by the cacheable device according to the preset sampling frequency and the preset observation time length;
the prediction module is used for predicting a current predicted data stream at the current sampling moment according to the historical data stream sequence;
the searching module is used for searching the size of the single cache data packet corresponding to the current predicted data stream according to the flow configuration mapping relation;
the caching module is used for caching the current actual data stream according to the searching result;
the testing module is used for testing the flow configuration mapping relation between the data stream flow rate of the cacheable device and the single cache data packet size, before searching for the single cache data packet size corresponding to the current predicted data stream according to the flow configuration mapping relation; the testing module comprises:
a setting unit, configured to set a speed test sequence and a cache configuration test sequence in the cacheable device, wherein the speed test sequence comprises a plurality of data stream flow rates with different values arranged in ascending order, and the cache configuration test sequence comprises a plurality of configured cache data packet sizes with different values arranged in ascending order;
a recording unit, configured to cache each data stream flow rate in the speed test sequence at each configured cache data packet size in the cache configuration test sequence, and to test and record the average throughput rate and the average sending delay of the caching process;
a fitting unit, configured to fit, according to the average throughput rate and the average sending delay of the caching process and by a preset fitting algorithm, a flow throughput rate relation corresponding to the average throughput rate of each data stream flow rate, and a flow sending delay relation corresponding to the average sending delay of each data stream flow rate;
a calculating unit, configured to calculate, with the flow throughput rate relation and the flow sending delay relation as constraint conditions, a non-dominated configuration interval of each data stream flow rate in the speed test sequence by a non-dominated sorting genetic algorithm (NSGA), wherein the non-dominated configuration interval refers to the configurable range of configured cache data packet sizes that enables the data stream flow rate to meet the constraint conditions during caching;
and an acquisition unit, configured to calculate, according to the configuration midpoint of the non-dominated configuration interval, the optimal single cache data packet size corresponding to each data stream flow rate by a regression model fitting algorithm, and to acquire the flow configuration mapping relation between the data stream flow rate and the single cache data packet size.
5. The apparatus of claim 4, wherein the prediction module comprises:
the calculating unit is used for calculating the sampling position of the current sampling moment corresponding to the preset observation duration according to the preset sampling frequency and the preset observation duration;
and the searching unit is used for searching the historical data stream sequence for the current predicted data stream corresponding to the sampling position.
6. The apparatus of claim 4, wherein the apparatus further comprises:
the monitoring module is used for monitoring the actual current data flow at the current moment after caching the current actual data flow according to the searching result;
the calculating module is used for calculating the flow difference between the current predicted data flow and the actual current data flow;
the resetting module is used for adjusting the preset sampling frequency and the preset observing duration according to a first preset learning rate and re-collecting the historical data stream sequence if the flow difference is larger than a first preset threshold value;
and the reset module is further configured to adjust the preset sampling frequency and the preset observation duration according to a second preset learning rate if the flow difference is smaller than a second preset threshold, and re-acquire the historical data stream sequence.
CN201911302664.2A 2019-12-17 2019-12-17 Streaming data caching method and device Active CN111177194B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911302664.2A CN111177194B (en) 2019-12-17 2019-12-17 Streaming data caching method and device

Publications (2)

Publication Number Publication Date
CN111177194A CN111177194A (en) 2020-05-19
CN111177194B true CN111177194B (en) 2024-03-15

Family

ID=70650185


Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106685515A (en) * 2017-01-05 2017-05-17 清华大学 Allocation method and device for satellite resources in space information network
CN107783721A (en) * 2016-08-25 2018-03-09 华为技术有限公司 The processing method and physical machine of a kind of data

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9251052B2 (en) * 2012-01-12 2016-02-02 Intelligent Intellectual Property Holdings 2 Llc Systems and methods for profiling a non-volatile cache having a logical-to-physical translation layer
US10742704B2 (en) * 2017-07-05 2020-08-11 Cinova Media Method and apparatus for an adaptive video-aware streaming architecture with cloud-based prediction and elastic rate control


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
张一丹; 籍风磊. Research on service-oriented bandwidth prediction and allocation methods. 通讯世界 (Telecom World), 2016, (01): 30-31. *
李丽娜; 魏晓辉; 李翔; 王兴旺. Burst-load-aware elastic resource allocation in stream data processing. 计算机学报 (Chinese Journal of Computers), 2017, (10): 21-36. *
蔡凌; 王兴伟; 汪晋宽; 黄敏. Adaptive ICN caching strategy based on concept drift learning. 软件学报 (Journal of Software), 2019, (12): 191-207. *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant