CN111309612A - Distributed file system based data current limiting test method and system

Info

Publication number: CN111309612A
Application number: CN202010094784.4A
Authority: CN
Inventors: 张东东
Original assignee: Suzhou Inspur Intelligent Technology Co Ltd
Current assignee: Suzhou Inspur Intelligent Technology Co Ltd
Priority date: 2020-02-16
Filing date: 2020-02-16
Publication date: 2020-06-19

Abstract

The invention provides a data current-limiting test method and system based on a distributed file system, used to test cluster stability before and after data current limiting. The test comprises checking the current read-write and task scale of the cluster, recording the time and bandwidth occupancy of large-scale concurrent data reads and writes, and recording the time and bandwidth occupancy of large-scale concurrent computing and streaming tasks. The data recorded before and after data current limiting are then compared to evaluate whether data current limiting on a large-scale HDFS cluster in a production environment delivers its intended benefit. The effect of a data current-limiting strategy is thereby evaluated accurately, an evaluation result is provided for further development of HDFS data current-limiting technology, and one more dimension is added to the stability of big data clusters.

Description

Distributed file system based data current limiting test method and system

Technical Field

The invention relates to the technical field of server clusters, in particular to a data current-limiting test method and system based on a distributed file system.

Background

With the development of Hadoop community technology, HDFS has continued to add storage policies for data at different temperatures, adopted SSM for more intelligent storage management, steadily improved its high availability, and introduced federation to handle the larger data volumes of large-scale clusters. As data volumes grow, storage efficiency and reliability have kept improving, but the number of large-scale clusters is now increasing exponentially and the underlying data nodes are showing signs of strain. Data is continuously stored in the clusters and tasks are continuously run. SSM has improved storage efficiency at the software layer, and engines with higher computing power such as Spark and Flink have improved computing capability, but continuous data flows and tasks occupy a large amount of network bandwidth. In a large-scale cluster, data reads and writes are very frequent, the volume of transferred data is large, and computing and streaming tasks are numerous. The network bandwidth of a single machine is necessarily limited, and if some tasks on a machine use up that bandwidth, network transmission for normal tasks is affected. If the bandwidth stays saturated for a long time, machine IO alarms may also be triggered. This is the purpose of current limiting. The cause need not be a malicious program or service: an inadvertent operation or a small error in a program can trigger large-scale data transfers that quickly saturate the network bandwidth.

In current large-scale HDFS clusters, a few large running tasks can instantly saturate the network bandwidth of a machine room, causing jitter in some online services and affecting the operation of other services. To address this problem, the Hadoop community has proposed a limiting scheme on the DataNode side, but the related functions have not yet been fully completed and released. Large-scale cluster optimization technology, as represented by the Hadoop ecosystem, can add one more dimension to the stability of current HDFS clusters and guard against problems before they occur. Operations related to data current limiting are restricted at the DataNode so as to ensure cluster stability. As big data continues to grow, data current limiting will mature with the completion of its functions, the updating of Hadoop community patches and the release of subsequent versions. For such an intelligent and complex tuning scheme, how to evaluate whether data current limiting delivers the corresponding benefits, whether data operations and tasks in the cluster are intelligently limited and managed, and how effective a data current-limiting strategy is, are important problems to be solved by those skilled in the art.

Disclosure of Invention

The invention aims to provide a data current-limiting test method and system based on a distributed file system, so as to solve the problem that the prior art lacks a means of evaluating data current-limiting strategies, to evaluate a data current-limiting strategy accurately, and to improve the stability of big data clusters.

In order to achieve the technical purpose, the invention provides a data current-limiting test method based on a distributed file system, which comprises the following operations:

executing cluster stability tests before and after data current limiting respectively, wherein each cluster stability test comprises checking the current read-write and task scale of the cluster, recording the time and bandwidth occupancy of large-scale concurrent data reads and writes, and recording the time and bandwidth occupancy of large-scale concurrent computing and streaming tasks;

and comparing the read-write data, the calculation data and the stream data before and after data current limiting, and evaluating whether the current data current limiting strategy meets the requirements.

Preferably, the recording of the large-scale concurrent execution data read-write time and the bandwidth occupation specifically includes:

before data current limiting: executing concurrent read-write tasks on files of random sizes, recording the concurrent read-write time T1-0, and recording the cluster bandwidth occupancy BW1-0 during task execution;

after data current limiting: executing concurrent read-write tasks on the same number of files of random sizes, recording the concurrent read-write time T1-1, and recording the cluster bandwidth occupancy BW1-1 during task execution.
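The timing step above can be sketched in Python as follows. This is only an illustration under stated assumptions: the local file system stands in for HDFS, the function names `write_then_read` and `timed_concurrent_read_write` are hypothetical, and the file sizes are tiny stand-ins rather than the sizes used in the test.

```python
import os
import random
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def write_then_read(directory: str, index: int, size_bytes: int) -> int:
    """Write a file of size_bytes, read it back, and return the bytes read."""
    path = os.path.join(directory, f"file_{index}.bin")
    with open(path, "wb") as handle:
        handle.write(os.urandom(size_bytes))
    with open(path, "rb") as handle:
        return len(handle.read())


def timed_concurrent_read_write(sizes, max_workers=8):
    """Run one write/read task per size concurrently; return (seconds, bytes)."""
    with tempfile.TemporaryDirectory() as directory:
        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            futures = [
                pool.submit(write_then_read, directory, i, size)
                for i, size in enumerate(sizes)
            ]
            total_bytes = sum(future.result() for future in futures)
        elapsed = time.perf_counter() - start
    return elapsed, total_bytes


# Tiny stand-in sizes (assumption); a real run would target the cluster.
rng = random.Random(0)
sizes = [rng.randint(1_000, 10_000) for _ in range(20)]
elapsed_seconds, bytes_moved = timed_concurrent_read_write(sizes)
print(f"T = {elapsed_seconds:.4f} s for {bytes_moved} bytes")
```

Running the same function once with current limiting disabled and once with it enabled would yield the two times that the text calls T1-0 and T1-1.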

Preferably, the recording of the large-scale concurrent execution calculation, the stream task time and the bandwidth occupation specifically includes:

before data current limiting: executing WordCount tasks on files of random sizes, recording the time T2-0 taken by the WordCount tasks, and recording the cluster bandwidth occupancy BW2-0 during task execution; executing deduplication tasks on Hive tables of random sizes, storing the resulting tables into HDFS through Kafka, recording the time T3-0 taken by the Hive deduplication tasks, and recording the cluster bandwidth occupancy BW3-0 during task execution;

after data current limiting: executing WordCount tasks on the same number of files of random sizes, recording the time T2-1 taken by the WordCount tasks, and recording the cluster bandwidth occupancy BW2-1 during task execution; executing the same number of deduplication tasks on Hive tables of random sizes, storing the resulting tables into HDFS through Kafka, recording the time T3-1 taken by the Hive deduplication tasks, and recording the cluster bandwidth occupancy BW3-1 during task execution.

Preferably, when the read-write data satisfy the following condition:

T1-1 > T1-0 and BW1-1 < BW1-0

data current limiting has taken effect: the data read-write tasks are limited after current limiting, so the tasks take longer to execute but the bandwidth occupancy is lower;

when the computing data satisfy the following condition:

T2-1 > T2-0 and BW2-1 < BW2-0

data current limiting has taken effect: the computing tasks are limited after current limiting, so the tasks take longer to execute but the bandwidth occupancy is lower;

when the streaming data satisfy the following condition:

T3-1 > T3-0 and BW3-1 < BW3-0

data current limiting has taken effect: the streaming tasks are limited after current limiting, so the tasks take longer to execute but the bandwidth occupancy is lower.
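The three comparisons above share one rule: the time after limiting must be longer and the bandwidth occupancy after limiting must be lower. A minimal Python sketch of that rule is given here; the `Measurement` type, the function name and all numbers are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    elapsed_seconds: float      # T in the text
    bandwidth_occupancy: float  # BW in the text, as a fraction of capacity


def limiting_effective(before: Measurement, after: Measurement) -> bool:
    """Return True when T(after) > T(before) and BW(after) < BW(before)."""
    return (after.elapsed_seconds > before.elapsed_seconds
            and after.bandwidth_occupancy < before.bandwidth_occupancy)


# Hypothetical (before, after) pairs, for illustration only.
results = {
    "read-write": (Measurement(1200.0, 0.95), Measurement(1500.0, 0.60)),
    "compute": (Measurement(900.0, 0.80), Measurement(1100.0, 0.55)),
    "stream": (Measurement(700.0, 0.70), Measurement(650.0, 0.72)),
}
verdicts = {name: limiting_effective(before, after)
            for name, (before, after) in results.items()}
for name, effective in verdicts.items():
    print(f"{name}: {'effective' if effective else 'not effective'}")
```

With these made-up numbers the read-write and compute classes satisfy the condition, while the stream class does not because its time dropped and its occupancy rose.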

Preferably, the method further comprises:

saving all records of the test process to the log Xianliu_test.

The invention also provides a data current-limiting test system based on the distributed file system, which comprises:

a stability test module, used for executing cluster stability tests before and after data current limiting respectively, each test comprising checking the current read-write and task scale of the cluster, recording the time and bandwidth occupancy of large-scale concurrent data reads and writes, and recording the time and bandwidth occupancy of large-scale concurrent computing and streaming tasks;

and the data comparison module is used for performing data comparison on the read-write data, the calculation data and the stream data before and after data current limiting and evaluating whether the current data current limiting strategy meets the requirements or not.

Preferably, the system further comprises:

a log storage module, used for storing all records of the test process in the log Xianliu_test.

The invention also provides a data current-limiting test device based on the distributed file system, which comprises:

a memory for storing a computer program;

and the processor is used for executing the computer program to realize the distributed file system data current limiting test method.

The invention also provides a readable storage medium for storing a computer program, wherein the computer program is executed by a processor to implement the distributed file system data current limiting test method.

The effects described in this summary are only those of the embodiments, not all the effects of the invention. One of the above technical solutions has the following advantages or beneficial effects:

Compared with the prior art, the invention tests cluster stability before and after data current limiting. The test comprises checking the current read-write and task scale of the cluster, recording the time and bandwidth occupancy of large-scale concurrent data reads and writes, and recording the time and bandwidth occupancy of large-scale concurrent computing and streaming tasks. The data recorded before and after data current limiting are compared to evaluate whether data current limiting on a large-scale HDFS cluster in a production environment delivers its intended benefit. The effect of a data current-limiting strategy is thereby evaluated accurately, an evaluation result is provided for further development of HDFS data current-limiting technology, and one more dimension is added to the stability of big data clusters.

Drawings

Fig. 1 is a flowchart of a distributed file system based data current limiting test method provided in an embodiment of the present invention;

Fig. 2 is a block diagram of a distributed file system based data current limiting test system according to an embodiment of the present invention.

Detailed Description

In order to clearly explain the technical features of the present invention, the following detailed description of the present invention is provided with reference to the accompanying drawings. The following disclosure provides many different embodiments, or examples, for implementing different features of the invention. To simplify the disclosure of the present invention, the components and arrangements of specific examples are described below. Furthermore, the present invention may repeat reference numerals and/or letters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed. It should be noted that the components illustrated in the figures are not necessarily drawn to scale. Descriptions of well-known components and processing techniques and procedures are omitted so as to not unnecessarily limit the invention.

The following describes a data current limiting test method and system based on a distributed file system in detail with reference to the accompanying drawings.

As shown in Fig. 1, the present invention discloses a data current limiting test method based on a distributed file system, wherein the method comprises the following operations:

The embodiment of the invention simulates read-write, computing, streaming and other operations of a big data cluster before and after the data current-limiting strategy is changed, and records the time of each task, the bandwidth occupancy during the task, and the degree of online service jitter. By comparing the task execution time, the cluster bandwidth occupancy during the task and the degree of online service jitter before and after the strategy is enabled, it judges whether the data current-limiting strategy delivers its benefit. The log of the test process is retained so that the execution records can be analyzed retrospectively.

Performing the following operations prior to data throttling:

Check the data node current-limiting strategy: detect whether the data current-limiting strategy is enabled on the thousands or tens of thousands of DataNode nodes of the large-scale cluster, and if it is enabled, disable data current limiting.

A cluster stability test is performed.

Check the current read-write and task scale of the cluster: check the current data read-write scale of the cluster through Edit.log, and check the current scale of computing, streaming and other tasks through YARN, to ensure that the cluster's current indicators are within the normal range.
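One possible way to check the task scale through YARN is to read the cluster metrics exposed by the ResourceManager REST endpoint `/ws/v1/cluster/metrics`. The text does not say which interface is used, so this is an assumption, and the thresholds `max_pending` and `max_memory_fraction` as well as the sample numbers are hypothetical. The sketch parses a sample payload instead of contacting a cluster.

```python
import json

# Sample payload shaped like the ResourceManager cluster metrics response;
# the numbers are made up for illustration.
SAMPLE_RESPONSE = """
{"clusterMetrics": {"appsPending": 12, "appsRunning": 340,
                    "allocatedMB": 600000, "totalMB": 1000000,
                    "activeNodes": 2000}}
"""


def task_scale_is_normal(payload: str, max_pending: int = 100,
                         max_memory_fraction: float = 0.9) -> bool:
    """Check pending applications and memory use against assumed limits."""
    metrics = json.loads(payload)["clusterMetrics"]
    memory_fraction = metrics["allocatedMB"] / metrics["totalMB"]
    return (metrics["appsPending"] <= max_pending
            and memory_fraction <= max_memory_fraction)


print(task_scale_is_normal(SAMPLE_RESPONSE))
```

What counts as a normal range depends on the cluster, so the limits would need to be chosen from its own history.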

Record the time and bandwidth occupancy of large-scale concurrent data reads and writes: execute 100,000 concurrent read-write tasks on files of random sizes from 100 MB to 10 GB, record the time T1-0 taken by these 100,000 concurrent read-write tasks, and record the cluster bandwidth occupancy BW1-0 during task execution.

Record the time and bandwidth occupancy of large-scale concurrent computing and streaming tasks: execute 100,000 WordCount tasks on files of random sizes from 100 MB to 10 GB, record the time T2-0 taken by the WordCount tasks, and record the cluster bandwidth occupancy BW2-0 during task execution; execute 100,000 deduplication tasks on Hive tables of random sizes from 100 MB to 10 GB, store the resulting tables into HDFS (Hadoop Distributed File System) through Kafka, record the time T3-0 taken by the Hive deduplication tasks, and record the cluster bandwidth occupancy BW3-0 during task execution.
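The workload used in these steps, 100,000 items with random sizes between 100 MB and 10 GB, can be planned in advance. The sketch below is an illustration only: `make_workload` is a hypothetical helper, binary units are assumed for MB and GB, and reusing the same random seed for both runs is a design choice that goes beyond the text, which only requires the same number of items.

```python
import random

MB = 1024 ** 2
GB = 1024 ** 3


def make_workload(task_count: int, seed: int,
                  min_bytes: int = 100 * MB, max_bytes: int = 10 * GB):
    """Return task_count random sizes in bytes within [min_bytes, max_bytes]."""
    rng = random.Random(seed)
    return [rng.randint(min_bytes, max_bytes) for _ in range(task_count)]


# Same seed for both runs, so both runs see an identical size distribution.
before_plan = make_workload(100_000, seed=42)
after_plan = make_workload(100_000, seed=42)
print(len(before_plan), before_plan == after_plan)
```

Fixing the seed removes one source of variation between the two runs, so that differences in time and bandwidth are more likely to come from the limiting strategy than from the sizes drawn.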

Check the cluster network bandwidth and the online services, including whether the cluster network bandwidth is stable and whether the online services show jitter or instability.
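The text does not define how bandwidth occupancy is computed. One common reading, assumed here, is transferred bits divided by the link capacity over the sampling interval, with the spread of that ratio used as a rough indicator of stability. The function name, the counters and the 10 Gbit/s link are hypothetical.

```python
from statistics import mean, pstdev


def occupancy_series(byte_counters, interval_seconds, link_bits_per_second):
    """Convert cumulative byte counters into per-interval occupancy fractions."""
    series = []
    for previous, current in zip(byte_counters, byte_counters[1:]):
        bits = (current - previous) * 8
        series.append(bits / (interval_seconds * link_bits_per_second))
    return series


# Hypothetical cumulative byte counters, sampled once per second.
counters = [0, 500_000_000, 1_125_000_000, 1_625_000_000, 2_250_000_000]
series = occupancy_series(counters, interval_seconds=1.0,
                          link_bits_per_second=10_000_000_000)
print("occupancy per interval:", series)
print("mean occupancy:", mean(series))
print("spread (population standard deviation):", pstdev(series))
```

The mean would serve as the BW value recorded for a run, while a large spread would suggest that the bandwidth was not stable during the run.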

Performing the following operations after data throttling:

Check the data node current-limiting strategy: detect whether the data current-limiting strategy is enabled on the thousands or tens of thousands of DataNode nodes of the large-scale cluster, and if it is not enabled, enable data current limiting.

A cluster stability test is performed.

Record the time and bandwidth occupancy of large-scale concurrent data reads and writes: execute 100,000 concurrent read-write tasks on files of random sizes from 100 MB to 10 GB, record the time T1-1 taken by these 100,000 concurrent read-write tasks, and record the cluster bandwidth occupancy BW1-1 during task execution.

Record the time and bandwidth occupancy of large-scale concurrent computing and streaming tasks: execute 100,000 WordCount tasks on files of random sizes from 100 MB to 10 GB, record the time T2-1 taken by the WordCount tasks, and record the cluster bandwidth occupancy BW2-1 during task execution; execute 100,000 deduplication tasks on Hive tables of random sizes from 100 MB to 10 GB, store the resulting tables into HDFS (Hadoop Distributed File System) through Kafka, record the time T3-1 taken by the Hive deduplication tasks, and record the cluster bandwidth occupancy BW3-1 during task execution.

Compare the data before and after data current limiting, including comparison of the read-write data, the computing data and the streaming data.

When the read-write data satisfy the following condition:

T1-1 > T1-0 and BW1-1 < BW1-0

data current limiting has taken effect: the data read-write tasks are limited after current limiting, so the tasks take longer to execute but the bandwidth occupancy is lower.

When the computing data satisfy the following condition:

T2-1 > T2-0 and BW2-1 < BW2-0

data current limiting has taken effect: the computing tasks are limited after current limiting, so the tasks take longer to execute but the bandwidth occupancy is lower.

When the streaming data satisfy the following condition:

T3-1 > T3-0 and BW3-1 < BW3-0

data current limiting has taken effect: the streaming tasks are limited after current limiting, so the tasks take longer to execute but the bandwidth occupancy is lower.

All records of the test process are saved to the log Xianliu_test.
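Saving the test records can be sketched with Python's standard `logging` module. The `.log` extension, the temporary directory, the logger name and the record layout are all assumptions, and the values are made up.

```python
import logging
import os
import tempfile

# The ".log" extension and the temporary directory are assumptions.
log_path = os.path.join(tempfile.mkdtemp(), "Xianliu_test.log")

logger = logging.getLogger("xianliu_test")
logger.setLevel(logging.INFO)
handler = logging.FileHandler(log_path, encoding="utf-8")
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
logger.addHandler(handler)

# Hypothetical values, for illustration only.
logger.info("phase=before task=read-write T1-0=%.1f BW1-0=%.2f", 1200.0, 0.95)
logger.info("phase=after task=read-write T1-1=%.1f BW1-1=%.2f", 1500.0, 0.60)
handler.flush()

with open(log_path, encoding="utf-8") as log_file:
    lines = log_file.read().splitlines()
print(len(lines), "records written")
```

Writing one line per measurement with explicit `key=value` fields keeps the log easy to search when the execution records are analyzed later.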

The embodiment of the invention tests cluster stability before and after data current limiting. The test comprises checking the current read-write and task scale of the cluster, recording the time and bandwidth occupancy of large-scale concurrent data reads and writes, and recording the time and bandwidth occupancy of large-scale concurrent computing and streaming tasks. The data recorded before and after data current limiting are compared to evaluate whether data current limiting on a large-scale HDFS cluster in a production environment delivers its intended benefit. The effect of the data current-limiting strategy is thereby evaluated accurately, an evaluation result is provided for further development of HDFS data current-limiting technology, and one more dimension is added to the stability of big data clusters.

As shown in Fig. 2, an embodiment of the present invention further discloses a data current limiting test system based on a distributed file system, where the system includes:

Performing the following operations prior to data throttling:

A cluster stability test is performed.

Performing the following operations after data throttling:

A cluster stability test is performed.

When the read-write data satisfy the following condition:

T1-1 > T1-0 and BW1-1 < BW1-0

When the computing data satisfy the following condition:

T2-1 > T2-0 and BW2-1 < BW2-0

When the streaming data satisfy the following condition:

T3-1 > T3-0 and BW3-1 < BW3-0

The system also comprises a log saving module, which is used for saving all records of the test process to the log Xianliu_test.

The embodiment of the invention also discloses a data current-limiting test device based on the distributed file system, which comprises:

a memory for storing a computer program;

The embodiment of the invention also discloses a readable storage medium for storing a computer program, wherein the computer program is executed by a processor to realize the distributed file system data current limiting test method.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims

1. A data current limiting test method based on a distributed file system is characterized by comprising the following operations:

2. The distributed file system data flow-limiting test method according to claim 1, wherein the recording of the large-scale concurrent execution data read-write time and the bandwidth occupation is specifically:

3. The distributed file system data flow-limiting-based testing method of claim 1, wherein the recording of large-scale concurrent execution calculation, streaming task time, and bandwidth occupancy specifically comprises:

4. The distributed file system data flow limit-based testing method of claim 3, wherein when the following conditions exist in the read-write data:

T1-1 > T1-0 and BW1-1 < BW1-0

when the following conditions exist for computing class data:

T2-1 > T2-0 and BW2-1 < BW2-0

when the following conditions exist in the stream class data:

T3-1 > T3-0 and BW3-1 < BW3-0

5. The distributed file system data flow-limiting-based testing method of claim 1, wherein the method further comprises:

saving all records of the test process to the log Xianliu_test.

6. A distributed file system based data current limiting test system, the system comprising:

7. The distributed file system data flow-restriction based test system of claim 6, wherein the system further comprises:

8. A data current limiting test device based on a distributed file system is characterized by comprising:

a memory for storing a computer program;

a processor for executing the computer program to implement the distributed file system data flow limitation testing method according to any of claims 1 to 5.

9. A readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the distributed file system data throttling testing method according to any one of claims 1 to 5.