CN115061898B - Adaptive speed limiting method, device, equipment and medium based on Hadoop analysis platform

Info

Publication number
CN115061898B
Authority
CN
China
Prior art keywords
hadoop
analysis platform
data information
time interval
configuration
Prior art date
Legal status
Active
Application number
CN202210984130.8A
Other languages
Chinese (zh)
Other versions
CN115061898A (en)
Inventor
王玉叶
杨梦龙
Current Assignee
DBAPPSecurity Co Ltd
Original Assignee
DBAPPSecurity Co Ltd
Priority date
Filing date
Publication date
Application filed by DBAPPSecurity Co Ltd
Priority to CN202210984130.8A
Publication of CN115061898A
Publication of CN115061898B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3668 Software testing
    • G06F11/3672 Test management
    • G06F11/3684 Test management for test design, e.g. generating new test cases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3668 Software testing
    • G06F11/3672 Test management
    • G06F11/3688 Test management for test execution, e.g. scheduling of test suites
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/445 Program loading or initiating
    • G06F9/44505 Configuring for program initiating, e.g. using registry, configuration files

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Debugging And Monitoring (AREA)
  • Test And Diagnosis Of Digital Computers (AREA)

Abstract

The application discloses an adaptive speed limiting method, device, equipment and medium based on a Hadoop analysis platform, relating to the technical field of data analysis. The method comprises the following steps: performing a performance test on the Hadoop analysis platform and standardizing the test result into a configuration format, so that the configuration is built into a database of the Hadoop analysis platform; periodically acquiring actual parameters of the Hadoop analysis platform at a first preset time interval and matching the actual parameters against the configuration format to obtain a matched effective configuration; periodically detecting data information running on the Hadoop analysis platform at a second preset time interval, and judging whether the data information exceeds the effective configuration; and, if the data information exceeds the effective configuration, setting a speed limiter through a first preset interface and setting speed limiter parameters according to the data information, so that the speed limiter parameters match the configuration format for adaptive speed limiting. With this technical scheme, adaptive speed limiting can be performed on a Hadoop analysis platform, and the applicability is strong.

Description

Adaptive speed limiting method, device, equipment and medium based on Hadoop analysis platform
Technical Field
The invention relates to the technical field of data analysis, in particular to a self-adaptive speed limiting method, a self-adaptive speed limiting device, self-adaptive speed limiting equipment and a self-adaptive speed limiting medium based on a Hadoop analysis platform.
Background
Hadoop is a distributed system infrastructure developed by the Apache Software Foundation. Users can develop distributed programs without knowing the underlying details of the distributed system, and can make full use of the power of a cluster for high-speed computation and storage. At present, the Hadoop distributed system infrastructure is widely applied in big data analysis platforms and network security detection platforms. With the trend toward digital, networked, and intelligent systems and the explosive growth of data, rising data volumes greatly affect the performance and stability of such analysis platforms. A performance abnormality can cause serious problems such as data loss, abnormal operation of the computing engine, and hung cluster services, which in turn lead to serious consequences such as loss of business data, exploitation by network attacks, and property loss.
Therefore, how to prevent performance abnormalities of the analysis platform caused by excessive data volume in the application environment of a Hadoop analysis platform is a problem to be solved.
Disclosure of Invention
In view of the above, the present invention provides a method, an apparatus, a device, and a medium for adaptive speed limiting based on a Hadoop analysis platform, so as to solve the problem of performance abnormality of the analysis platform caused by an excessive data amount in an application environment of the Hadoop analysis platform. The specific scheme is as follows:
in a first aspect, the application discloses a self-adaptive speed limiting method based on a Hadoop analysis platform, which comprises the following steps:
carrying out performance test on a Hadoop analysis platform, and standardizing a test result of the performance test into a configuration format so as to arrange the configuration format in a database of the Hadoop analysis platform;
acquiring actual parameters of the Hadoop analysis platform at regular time based on a first preset time interval, and matching the actual parameters with the configuration format to obtain matched effective configuration;
detecting data information running on the Hadoop analysis platform at regular time based on a second preset time interval, and judging whether the data information exceeds the effective configuration;
and if the data information exceeds the effective configuration, setting a speed limiter through a first preset interface, and setting speed limiter parameters according to the data information so as to match the speed limiter parameters with the configuration format for self-adaptive speed limiting.
Optionally, the obtaining the actual parameters of the Hadoop analysis platform at regular time based on the first preset time interval includes:
acquiring the number of current Yarn cluster nodes through a Yarn client interface at regular time based on a first preset time interval;
acquiring the number of current Kafka cluster nodes through a Kafka client interface based on the first preset time interval;
and obtaining the current physical machine memory through a memory checking command based on the first preset time interval.
Optionally, the periodically detecting data information running on the Hadoop analysis platform based on a second preset time interval includes:
and detecting the data rate and the data size running on the Hadoop analysis platform at regular time based on a second preset time interval.
Optionally, the periodically detecting data information running on the Hadoop analysis platform based on a second preset time interval includes:
detecting, periodically at a second preset time interval, data information running on the Hadoop analysis platform through a second preset interface;
and regularly sampling the log file based on a second preset time interval to obtain the data size on the Hadoop analysis platform.
Optionally, the determining whether the data information exceeds the effective configuration includes:
and if the data information does not exceed the effective configuration, not limiting the speed of the Hadoop analysis platform so that the Hadoop analysis platform can operate according to the data information.
Optionally, if the data information exceeds the effective configuration, setting a speed limiter through a first preset interface, and setting a speed limiter parameter according to the data information, including:
if the data information exceeds the effective configuration, setting a speed limiter through a Kafka Consumer interface;
and configuring the parameter size of the speed limiter as the product of the data rate and the data size to obtain the speed limiter parameter.
Optionally, the adaptive speed-limiting method based on the Hadoop analysis platform further includes:
when the Hadoop analysis platform is expanded during operation and/or cluster nodes in the Hadoop analysis platform are optimized, the step of regularly acquiring the actual parameters of the Hadoop analysis platform based on the first preset time interval is executed to perform self-adaptive speed limiting.
In a second aspect, the application discloses an adaptive speed limiting device based on a Hadoop analysis platform, including:
the performance testing module is used for carrying out performance testing on the Hadoop analysis platform;
the test result configuration module is used for standardizing the test result of the performance test into a configuration format so as to arrange the configuration format in a database of the Hadoop analysis platform;
the matching validation module is used for regularly acquiring actual parameters of the Hadoop analysis platform based on a first preset time interval and matching the actual parameters with the configuration format to obtain the matched validation configuration;
the judgment detection module is used for regularly detecting data information running on the Hadoop analysis platform based on a second preset time interval and judging whether the data information exceeds the effective configuration;
and the self-adaptive speed limiting module is used for setting a speed limiter through a first preset interface if the data information exceeds the effective configuration, and setting speed limiter parameters according to the data information so as to match the speed limiter parameters with the configuration format for self-adaptive speed limiting.
In a third aspect, the present application discloses an electronic device comprising a processor and a memory; wherein the memory is used for storing a computer program which is loaded and executed by the processor to realize the adaptive speed limiting method based on the Hadoop analysis platform.
In a fourth aspect, the present application discloses a computer readable storage medium for storing a computer program; wherein the computer program when executed by a processor implements the Hadoop analysis platform based adaptive speed limiting method as described above.
In the application, a performance test is performed on a Hadoop analysis platform, the test result of the performance test is standardized into a configuration format, and the configuration is built into a database of the Hadoop analysis platform; actual parameters of the Hadoop analysis platform are acquired periodically at a first preset time interval and matched against the configuration format to obtain a matched effective configuration; data information running on the Hadoop analysis platform is detected periodically at a second preset time interval, and whether the data information exceeds the effective configuration is judged; and if the data information exceeds the effective configuration, a speed limiter is set through a first preset interface, and speed limiter parameters are set according to the data information so that they match the configuration format for adaptive speed limiting. Thus, first, the method matches performance parameters to the actual platform configuration, so it is widely applicable and suits various platforms based on the Hadoop architecture. Second, the test result of the performance test is standardized into a configuration format and built into the database of the Hadoop analysis platform, so the parameters are configuration-driven, manual configuration is avoided, and the process is simpler, more controllable, and clearer. Third, data information running on the analysis platform is detected dynamically at a preset time interval and adaptive speed limiting is performed according to the detection result, which gives strong applicability: the stability of each cluster and service is ensured while the delay of data consumption and the delay of analysis and storage are kept as short as possible, and the negative effects of a surge in data volume are avoided, that is, performance abnormalities of the analysis platform caused by excessive data volume are prevented.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
FIG. 1 is a flow chart of an adaptive speed limiting method based on a Hadoop analysis platform disclosed in the present application;
FIG. 2 is a schematic diagram of an adaptive speed limiting method disclosed in the present application;
FIG. 3 is a flow chart of a specific adaptive speed limiting method based on a Hadoop analysis platform disclosed in the present application;
FIG. 4 is a schematic structural diagram of an adaptive speed limiting device based on a Hadoop analysis platform disclosed in the present application;
FIG. 5 is a block diagram of an electronic device disclosed in the present application.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As noted in the Background, the Hadoop distributed system infrastructure is widely applied in big data analysis platforms and network security detection platforms, and rising data volumes greatly affect the performance and stability of such platforms. A performance abnormality can cause data loss, abnormal operation of the computing engine, and hung cluster services, and in turn loss of business data, exploitation by network attacks, and property loss.
The application therefore provides an adaptive speed limiting scheme based on a Hadoop analysis platform, which can prevent performance abnormalities of the analysis platform caused by excessive data volume in the application environment of the Hadoop analysis platform.
The embodiment of the invention discloses an adaptive speed limiting method based on a Hadoop analysis platform which, as shown in FIG. 1, comprises the following steps:
Step S11: perform a performance test on the Hadoop analysis platform, and standardize the test result of the performance test into a configuration format, so that the configuration is built into a database of the Hadoop analysis platform.
In the embodiment of the application, a performance test is performed on the Hadoop analysis platform as actually deployed. The performance test dimensions may include hardware parameters, the number of Yarn/Kafka cluster nodes, the data rate, the data size, and the like, which are not specifically limited herein. The maximum or optimal performance result of the Hadoop analysis platform is obtained from the performance test, the result is standardized into a configuration format, and the configuration is built into the platform database.
It should be noted that, in the embodiment of the present application, the relevant parameters of the Hadoop analysis platform may also be standardized into the configuration format according to user requirements. The parameters thereby become configuration-driven and manual configuration is avoided, so that the process is simpler, more controllable, and clearer.
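The text does not specify what the standardized configuration format looks like. As a minimal sketch, assuming a flat record keyed by the test dimensions listed above, a normalized test result could be serialized to JSON before being stored in the platform database. The class name PerfProfile, every field name, and the sample values are hypothetical and chosen only for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PerfProfile:
    """One normalized performance-test result (hypothetical schema)."""
    yarn_nodes: int            # Yarn cluster size the test was run on
    kafka_nodes: int           # Kafka cluster size the test was run on
    memory_gib: int            # physical machine memory during the test
    max_records_per_sec: int   # highest data rate the platform sustained
    max_record_bytes: int      # largest average record size it sustained


def normalize(raw: dict) -> PerfProfile:
    """Coerce a raw test report into the fixed configuration format."""
    return PerfProfile(
        yarn_nodes=int(raw["yarn_nodes"]),
        kafka_nodes=int(raw["kafka_nodes"]),
        memory_gib=int(raw["memory_gib"]),
        max_records_per_sec=int(raw["max_records_per_sec"]),
        max_record_bytes=int(raw["max_record_bytes"]),
    )


# Invented sample report; real values would come from the performance test.
profile = normalize({
    "yarn_nodes": "3", "kafka_nodes": "3", "memory_gib": "64",
    "max_records_per_sec": "20000", "max_record_bytes": "2048",
})
config_row = json.dumps(asdict(profile), sort_keys=True)  # value to store
```

Keeping the record flat makes the later matching step a simple field-by-field comparison.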
Step S12: periodically acquire actual parameters of the Hadoop analysis platform at a first preset time interval, and match the actual parameters against the configuration format to obtain the matched effective configuration.
In the embodiment of the application, after the performance test result of the Hadoop analysis platform has been configured, the actual parameters of the Hadoop analysis platform are obtained. Specifically, at the first preset time interval, the current number of Yarn cluster nodes is obtained through a Yarn client interface, the current number of Kafka cluster nodes is obtained through a Kafka client interface, and the current physical machine memory is obtained through a memory checking command.
In a first specific embodiment, Yarn (Yet Another Resource Negotiator) is the Hadoop resource manager, a general-purpose resource management system that provides unified resource management and scheduling for upper-layer applications and benefits the cluster in terms of utilization, unified resource management, data sharing, and the like. The current number of Yarn cluster nodes is obtained periodically through the Yarn client interface.
In a second specific embodiment, Kafka is an open source stream processing platform developed by the Apache Software Foundation and written in Scala and Java. Kafka is a high-throughput distributed publish-subscribe messaging system that can handle the activity stream data of a consumer-facing website. The current number of Kafka cluster nodes is obtained periodically through the Kafka client interface.
In a third specific embodiment, the current physical machine memory may be obtained periodically through a memory checking command, for example free or cat /proc/meminfo, so as to obtain the actual parameter information.
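On Linux, /proc/meminfo reports total physical memory on a line beginning with MemTotal, with the value given in kB. The following is a minimal sketch of extracting that value from the command output; the function name and the sample text are illustrative and are not taken from the patent.

```python
def parse_mem_total_kib(meminfo_text: str) -> int:
    """Return the MemTotal value, in KiB, from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # The line looks like: "MemTotal:       16384256 kB"
            return int(line.split()[1])
    raise ValueError("MemTotal line not found")


# Invented sample; on a Linux host the text could instead be read with
# open("/proc/meminfo").read().
SAMPLE = "MemTotal:       16384256 kB\nMemFree:         1234567 kB\n"
mem_total_kib = parse_mem_total_kib(SAMPLE)
```

Parsing a captured string rather than reading the file inside the function keeps the helper testable on machines without /proc.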
In the embodiment of the application, after the actual parameters of the Hadoop analysis platform are obtained, they are matched against the built-in configuration, and the configuration selected by the matching result takes effect.
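The matching rule itself is not spelled out in the text. One plausible reading, sketched below as an assumption, is to pick the highest-capacity built-in profile whose tested resources the running platform meets or exceeds. The field names, the profile values, and the function match_profile are hypothetical.

```python
from typing import Dict, List, Optional

Params = Dict[str, int]

# Resource dimensions compared during matching (hypothetical names).
RESOURCE_KEYS = ("yarn_nodes", "kafka_nodes", "memory_gib")

# Built-in profiles derived from performance tests (values invented).
PROFILES: List[Params] = [
    {"yarn_nodes": 3, "kafka_nodes": 3, "memory_gib": 64,
     "max_records_per_sec": 20000, "max_record_bytes": 2048},
    {"yarn_nodes": 5, "kafka_nodes": 3, "memory_gib": 128,
     "max_records_per_sec": 40000, "max_record_bytes": 2048},
]


def match_profile(actual: Params, profiles: List[Params]) -> Optional[Params]:
    """Return the best profile the actual platform satisfies, or None."""
    fits = [p for p in profiles
            if all(actual[k] >= p[k] for k in RESOURCE_KEYS)]
    if not fits:
        return None
    # Assumed tie-break: prefer the profile with the highest tested rate.
    return max(fits, key=lambda p: p["max_records_per_sec"])


# A 4-node Yarn cluster with 100 GiB only satisfies the smaller profile.
effective = match_profile(
    {"yarn_nodes": 4, "kafka_nodes": 3, "memory_gib": 100}, PROFILES)
```

Returning None when nothing fits leaves the caller free to decide on a fallback, which the text does not address.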
Step S13: periodically detect data information running on the Hadoop analysis platform at a second preset time interval, and judge whether the data information exceeds the effective configuration.
In the embodiment of the application, after the actual parameters have been matched against the built-in configuration and the matched configuration has taken effect, the data rate and the data size on the Hadoop analysis platform as actually deployed are detected dynamically. That is, the data rate and the data size running on the Hadoop analysis platform are detected periodically at the second preset time interval.
It is understood that Apache Flink is an open source stream processing framework developed by the Apache Software Foundation, whose core is a distributed streaming dataflow engine written in Java and Scala. Flink executes arbitrary dataflow programs in a data-parallel and pipelined manner; its pipelined runtime system can execute both batch and stream processing programs, and the runtime also natively supports iterative algorithms. Therefore, in a specific embodiment, the real-time data flow rate can be detected periodically through a Flink interface and the log size can be obtained by sampling, so that the data information running on the Hadoop analysis platform is detected dynamically.
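The detection step can be sketched as follows, under two assumptions that the text leaves open: the data size is estimated as the average byte length of sampled log lines, and the data information "exceeds" the effective configuration when either the rate or the size is above its configured maximum. The function names and field names are hypothetical, and the data rate is passed in as a plain number rather than read from a real Flink interface.

```python
from typing import Dict, Sequence


def average_record_bytes(sampled_lines: Sequence[str]) -> float:
    """Estimate the data size from a sample of log lines (UTF-8 bytes)."""
    if not sampled_lines:
        return 0.0
    total = sum(len(line.encode("utf-8")) for line in sampled_lines)
    return total / len(sampled_lines)


def exceeds(records_per_sec: float, record_bytes: float,
            effective: Dict[str, int]) -> bool:
    """Assumed rule: over the limit if either dimension is over its maximum."""
    return (records_per_sec > effective["max_records_per_sec"]
            or record_bytes > effective["max_record_bytes"])


# Invented effective configuration and observations.
effective = {"max_records_per_sec": 20000, "max_record_bytes": 2048}
size = average_record_bytes(["ab", "abcd"])        # two short sample lines
over_limit = exceeds(25000, size, effective)       # rate above the maximum
within_limit = exceeds(15000, size, effective)     # both below the maximum
```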
Step S14: if the data information exceeds the effective configuration, set a speed limiter through a first preset interface, and set speed limiter parameters according to the data information, so that the speed limiter parameters match the configuration format for adaptive speed limiting.
In the embodiment of the application, whether the detection result exceeds the configuration is judged, that is, the detected data information is compared with the effective configuration selected by the matching result. If the detected data information exceeds the effective configuration, speed limiting is required, and the speed limit is then matched against the configuration format so that the platform limits speed in accordance with the matched effective configuration.
Specifically, a speed limiter is set through the first preset interface, for example a Kafka Consumer interface, and a speed limiter parameter is then set according to the data information. The speed limiter parameter is a flow size: the parameter of the speed limiter is configured as the product of the data rate and the data size. The speed limiter parameter is stored after the configuration is completed, and the platform performs adaptive speed limiting according to the matched configuration.
It can be understood that, if the data information does not exceed the effective configuration, the current data flow is not limited; that is, no speed limit is applied to the Hadoop analysis platform, and the platform operates according to the data information.
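The limiter parameter described above, a flow size equal to data rate times data size, can be illustrated with a generic token bucket. This is a sketch only: the text sets the limiter through a Kafka Consumer interface, whereas the class below is a standalone illustration that does not call any Kafka API, and it assumes the rate and size fed into the product are the values from the effective configuration. The names limiter_bytes_per_sec and TokenBucket are hypothetical.

```python
import time


def limiter_bytes_per_sec(records_per_sec: float, record_bytes: float) -> float:
    """Flow size = data rate x data size, as described in the text."""
    return records_per_sec * record_bytes


class TokenBucket:
    """Allows at most `bytes_per_sec` of traffic, with a one-second burst."""

    def __init__(self, bytes_per_sec: float, clock=time.monotonic):
        self.rate = float(bytes_per_sec)
        self.capacity = float(bytes_per_sec)   # burst budget: one second
        self.tokens = self.capacity
        self.clock = clock
        self.last = clock()

    def try_consume(self, nbytes: int) -> bool:
        """Return True and spend tokens if nbytes fits the current budget."""
        now = self.clock()
        refill = (now - self.last) * self.rate
        self.tokens = min(self.capacity, self.tokens + refill)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


# Deterministic demo with a fake clock: 10 records/s x 100 bytes = 1000 B/s.
fake_now = [0.0]
bucket = TokenBucket(limiter_bytes_per_sec(10, 100), clock=lambda: fake_now[0])
first = bucket.try_consume(800)    # fits within the initial 1000-byte budget
second = bucket.try_consume(300)   # rejected: only 200 bytes remain
fake_now[0] = 0.5                  # half a second later, 500 bytes refilled
third = bucket.try_consume(300)    # fits again
```

In a consumer loop, a rejected try_consume would translate into pausing or delaying the next fetch; how that is wired into the consumer is outside what the text describes.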
In the application, a performance test is performed on a Hadoop analysis platform, the test result of the performance test is standardized into a configuration format, and the configuration is built into a database of the Hadoop analysis platform; actual parameters of the Hadoop analysis platform are acquired periodically at a first preset time interval and matched against the configuration format to obtain a matched effective configuration; data information running on the Hadoop analysis platform is detected periodically at a second preset time interval, and whether the data information exceeds the effective configuration is judged; and if the data information exceeds the effective configuration, a speed limiter is set through a first preset interface, and speed limiter parameters are set according to the data information so that they match the configuration format for adaptive speed limiting. First, the method is widely applicable and suits various platforms based on the Hadoop architecture. Second, the test result of the performance test is standardized into a configuration format and built into the database of the Hadoop analysis platform, so the parameters are configuration-driven, manual configuration is avoided, and the process is simpler, more controllable, and clearer. Third, data information running on the analysis platform is detected dynamically at a preset time interval and adaptive speed limiting is performed according to the detection result, which makes the method reliable and strongly applicable: the stability of each cluster and service is ensured while the delay of data consumption and the delay of analysis and storage are kept as short as possible, and the negative effects of a surge in data volume are avoided, that is, performance abnormalities of the analysis platform caused by excessive data volume are prevented.
Illustratively, FIG. 2 is a schematic flow diagram of the overall scheme. First, a performance test is performed on the analysis platform as actually deployed to obtain its maximum or optimal performance result. The performance test result is then configured: it is standardized into a configuration format and built into the platform database. Next, the analysis platform obtains its actual parameters: the current number of Yarn cluster nodes is obtained periodically through the Yarn client interface, the current number of Kafka cluster nodes is obtained periodically through the Kafka client interface, and the physical machine memory is obtained periodically through commands such as free or cat /proc/meminfo. The actual parameters of the platform are then matched against the built-in configuration, and the matched configuration takes effect. After that, the data rate and the data size are detected dynamically, and whether the detection result exceeds the effective configuration is judged. If it does, the speed is limited according to the effective configuration; if it does not, data is consumed, analyzed, and stored at the actual rate without speed limiting. The adaptive speed limit is thus based on actual performance test results, and has the advantages of good reliability and strong applicability.
The embodiment of the application discloses a specific adaptive speed limiting method based on a Hadoop analysis platform which, as shown in FIG. 3, comprises the following steps:
Step S21: perform a performance test on the Hadoop analysis platform, and standardize the test result of the performance test into a configuration format, so that the configuration is built into a database of the Hadoop analysis platform.
Step S22: periodically acquire actual parameters of the Hadoop analysis platform at a first preset time interval, and match the actual parameters against the configuration format to obtain the matched effective configuration.
Step S23: periodically detect data information running on the Hadoop analysis platform at a second preset time interval, and judge whether the data information exceeds the effective configuration.
Step S24: if the data information exceeds the effective configuration, set a speed limiter through a first preset interface, and set speed limiter parameters according to the data information, so that the speed limiter parameters match the configuration format for adaptive speed limiting.
For more specific processing procedures of the above steps S21, S22, S23, and S24, reference may be made to corresponding contents disclosed in the foregoing embodiments, and details are not repeated here.
Step S25: when the Hadoop analysis platform is expanded during operation and/or cluster nodes in the Hadoop analysis platform are optimized, execute again the step of periodically acquiring the actual parameters of the Hadoop analysis platform at the first preset time interval, so as to perform adaptive speed limiting.
In a specific embodiment, when the Hadoop analysis platform is expanded during operation, that is, physical machines are added to the deployment or memory is added, neither a platform upgrade nor a configuration update is needed. The step of periodically acquiring the actual parameters of the Hadoop analysis platform at the first preset time interval is executed, and adaptive speed limiting continues according to the steps above; details are not repeated here.
In another specific embodiment, when the cluster nodes of the Hadoop analysis platform are optimized during operation, for example after the number of cluster nodes is increased or decreased, neither a platform upgrade nor a configuration update is needed. The step of periodically acquiring the actual parameters of the Hadoop analysis platform at the first preset time interval is executed, and adaptive speed limiting continues according to the steps above; details are not repeated here.
Therefore, the application addresses the problems of inconsistent platform hardware parameters and the difficulty of maintaining multiple versions. When the platform is expanded or its cluster nodes are optimized, no program upgrade or configuration update is needed; adaptive speed limiting is achieved by matching the built-in parameters again so that the newly matched configuration takes effect.
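The re-matching behaviour can be sketched as a small holder object whose refresh method is meant to be called once per first preset time interval. Everything here is an illustrative assumption: the class EffectiveConfig, the probe callback standing in for the Yarn, Kafka, and memory queries, the field names, the profile values, and the rule of choosing the highest-capacity profile that the platform satisfies.

```python
from typing import Callable, Dict, List, Optional

Params = Dict[str, int]
RESOURCE_KEYS = ("yarn_nodes", "kafka_nodes", "memory_gib")


def match_profile(actual: Params, profiles: List[Params]) -> Optional[Params]:
    """Assumed rule: best profile whose resources the platform meets."""
    fits = [p for p in profiles
            if all(actual[k] >= p[k] for k in RESOURCE_KEYS)]
    return max(fits, key=lambda p: p["max_records_per_sec"]) if fits else None


class EffectiveConfig:
    """Holds the configuration currently in effect and re-matches on demand."""

    def __init__(self, profiles: List[Params], probe: Callable[[], Params]):
        self.profiles = profiles
        self.probe = probe              # returns the current actual parameters
        self.current: Optional[Params] = None

    def refresh(self) -> bool:
        """Re-probe and re-match; return True if the effective config changed."""
        matched = match_profile(self.probe(), self.profiles)
        changed = matched != self.current
        self.current = matched
        return changed


# Invented built-in profiles and a simulated cluster.
PROFILES = [
    {"yarn_nodes": 3, "kafka_nodes": 3, "memory_gib": 64,
     "max_records_per_sec": 20000},
    {"yarn_nodes": 5, "kafka_nodes": 3, "memory_gib": 128,
     "max_records_per_sec": 40000},
]
cluster = {"yarn_nodes": 3, "kafka_nodes": 3, "memory_gib": 64}
config = EffectiveConfig(PROFILES, probe=lambda: dict(cluster))

config.refresh()                                # initial match: small profile
cluster.update(yarn_nodes=5, memory_gib=128)    # simulated capacity expansion
config.refresh()                                # next interval: large profile
```

Because the profiles are built in and only the probe result changes, the expansion is picked up on the next refresh without touching the program or its configuration, which is the property the paragraph above describes.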
In the present application, a performance test is performed on the Hadoop analysis platform, and the test result of the performance test is standardized into a configuration format that is built into a database of the Hadoop analysis platform; actual parameters of the Hadoop analysis platform are acquired periodically at a first preset time interval and matched against the configuration format to obtain a matched effective configuration; data information running on the Hadoop analysis platform is detected periodically at a second preset time interval, and it is judged whether the data information exceeds the effective configuration; and if the data information exceeds the effective configuration, a speed limiter is set through a first preset interface, and speed limiter parameters are set according to the data information, so that the speed limiter parameters match the configuration format and adaptive speed limiting is performed.
This brings three advantages. First, performance parameters are matched on the basis of the actual platform configuration, so the method applies broadly to platforms built on the Hadoop architecture. Second, because the test result of the performance test is standardized into a configuration format and built into a database of the Hadoop analysis platform, the parameters are configured without manual work, and the process is simpler, more controllable, and clearer. Third, the data information running on the analysis platform is detected dynamically at a preset time interval and the speed is limited adaptively according to the detection result. The method is therefore reliable and widely applicable: it keeps each cluster and service stable while shortening the delay of data consumption and the delay of analysis and storage as far as possible, and it avoids the negative effects of a surge in data volume, that is, it prevents an excessive data volume from causing abnormal performance of the analysis platform.
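The judging and limiting steps can be sketched in Python as follows. The claims state that the limiter parameter is the product of the data rate and the data size; which rate and size enter that product is not spelled out here, so this sketch assumes each detected value is capped at the tested ceiling from the effective configuration. All class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EffectiveConfig:
    max_rate: int  # tested ceiling, records per second
    max_size: int  # tested ceiling, average record size in bytes


@dataclass(frozen=True)
class DataInfo:
    rate: int  # detected records per second
    size: int  # detected average record size in bytes


def exceeds(info: DataInfo, cfg: EffectiveConfig) -> bool:
    """Judge whether the detected data information exceeds the effective configuration."""
    return info.rate > cfg.max_rate or info.size > cfg.max_size


def limiter_parameter(info: DataInfo, cfg: EffectiveConfig) -> Optional[int]:
    """Return a byte-per-second limit, or None when no limiting is needed."""
    if not exceeds(info, cfg):
        return None
    # Assumption: cap each detected value at its tested ceiling, then take
    # the product of rate and size as the limiter parameter.
    return min(info.rate, cfg.max_rate) * min(info.size, cfg.max_size)


cfg = EffectiveConfig(max_rate=20_000, max_size=2_048)
calm = limiter_parameter(DataInfo(rate=10_000, size=1_000), cfg)
surge = limiter_parameter(DataInfo(rate=35_000, size=1_500), cfg)
print(calm, surge)
```

Under this assumption the limit is always below the detected throughput whenever the data information exceeds the configuration, so setting the limiter actually slows consumption.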
Correspondingly, an embodiment of the present application further discloses an adaptive speed limiting device based on a Hadoop analysis platform. As shown in Fig. 4, the device includes:
the performance testing module 11, configured to perform a performance test on the Hadoop analysis platform;
the test result configuration module 12, configured to standardize the test result of the performance test into a configuration format, so that the configuration format is built into a database of the Hadoop analysis platform;
the matching validation module 13, configured to periodically acquire actual parameters of the Hadoop analysis platform at a first preset time interval, and match the actual parameters against the configuration format to obtain a matched effective configuration;
the judgment detection module 14, configured to periodically detect data information running on the Hadoop analysis platform at a second preset time interval, and judge whether the data information exceeds the effective configuration;
and the adaptive speed limiting module 15, configured to set a speed limiter through a first preset interface if the data information exceeds the effective configuration, and set speed limiter parameters according to the data information, so that the speed limiter parameters match the configuration format for adaptive speed limiting.
For the more specific working process of each module, refer to the corresponding content disclosed in the foregoing embodiments; details are not repeated here.
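How the matching, detection, and limiting modules might cooperate on two independent intervals can be sketched in Python. This is an illustration under stated assumptions, not the disclosed implementation: the class name, the injected callables, and the explicit `now` argument (used instead of a real timer so the sketch is deterministic) are all hypothetical, the limiter value is simplified to a single throughput ceiling, and the performance test and configuration modules are assumed to have run beforehand.

```python
from typing import Callable, List, Optional


class AdaptiveLimiterLoop:
    """Runs the matching step and the detection step on two separate intervals."""

    def __init__(self, match: Callable[[], float], detect: Callable[[], float],
                 set_limiter: Callable[[Optional[float]], None],
                 match_interval: float, detect_interval: float) -> None:
        self._match = match              # role of the matching validation module
        self._detect = detect            # role of the judgment detection module
        self._set_limiter = set_limiter  # role of the adaptive speed limiting module
        self._match_interval = match_interval
        self._detect_interval = detect_interval
        self._next_match = 0.0
        self._next_detect = 0.0
        self._ceiling: Optional[float] = None

    def tick(self, now: float) -> None:
        """Run whichever steps are due at time `now` (seconds)."""
        if now >= self._next_match:
            self._ceiling = self._match()
            self._next_match = now + self._match_interval
        if now >= self._next_detect and self._ceiling is not None:
            observed = self._detect()
            # Limit only when the observed load exceeds the effective ceiling;
            # None means "no limit".
            self._set_limiter(self._ceiling if observed > self._ceiling else None)
            self._next_detect = now + self._detect_interval


ceilings = iter([100.0, 200.0])         # the platform is expanded before the second match
observed = iter([80.0, 150.0, 150.0])   # detected load at each detection
applied: List[Optional[float]] = []

loop = AdaptiveLimiterLoop(lambda: next(ceilings), lambda: next(observed),
                           applied.append, match_interval=60, detect_interval=30)
for now in (0, 30, 60):
    loop.tick(now)
print(applied)
```

In this run the limiter is unset at time 0, set to the ceiling at time 30 when the load exceeds it, and lifted again at time 60 once re-matching picks up the larger ceiling.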
Thus, according to the scheme of this embodiment, a performance test is performed on the Hadoop analysis platform, and the test result of the performance test is standardized into a configuration format that is built into a database of the Hadoop analysis platform; actual parameters of the Hadoop analysis platform are acquired periodically at a first preset time interval and matched against the configuration format to obtain a matched effective configuration; data information running on the Hadoop analysis platform is detected periodically at a second preset time interval, and it is judged whether the data information exceeds the effective configuration; and if the data information exceeds the effective configuration, a speed limiter is set through a first preset interface, and speed limiter parameters are set according to the data information, so that the speed limiter parameters match the configuration format for adaptive speed limiting.
The device therefore shares the advantages of the method. First, performance parameters are matched on the basis of the actual platform configuration, so the scheme applies broadly to platforms built on the Hadoop architecture. Second, because the test result of the performance test is standardized into a configuration format and built into a database of the Hadoop analysis platform, the parameters are configured without manual work, and the process is simpler, more controllable, and clearer. Third, the data information running on the analysis platform is detected dynamically at a preset time interval and the speed is limited adaptively according to the detection result, so the scheme is widely applicable: it keeps each cluster and service stable while shortening the delay of data consumption and the delay of analysis and storage as far as possible, and it avoids the negative effects of a rapid increase in data volume, that is, it prevents an excessive data volume from causing abnormal performance of the analysis platform.
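The text does not say how the speed limiter itself works internally. As one common way to enforce a byte-per-second limit, the following Python sketch uses a token bucket; this is a swapped-in, generic technique chosen for illustration, not the limiter of the present application, and the class and method names are hypothetical. Time is passed in explicitly so the behavior is deterministic.

```python
class TokenBucket:
    """Generic token-bucket limiter: tokens are bytes, refilled at `rate` per second."""

    def __init__(self, rate_bytes_per_s: float, capacity_bytes: float) -> None:
        self.rate = rate_bytes_per_s
        self.capacity = capacity_bytes
        self._tokens = capacity_bytes  # start with a full bucket
        self._last = 0.0               # time of the last refill, in seconds

    def set_rate(self, rate_bytes_per_s: float) -> None:
        """Adjust the limit, for example after a new effective configuration is matched."""
        self.rate = rate_bytes_per_s

    def try_consume(self, nbytes: float, now: float) -> bool:
        """Return True and spend tokens if `nbytes` may be consumed at time `now`."""
        elapsed = max(0.0, now - self._last)
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now
        if nbytes <= self._tokens:
            self._tokens -= nbytes
            return True
        return False


bucket = TokenBucket(rate_bytes_per_s=1000.0, capacity_bytes=1000.0)
first = bucket.try_consume(800, now=0.0)   # bucket is full, so this passes
second = bucket.try_consume(800, now=0.1)  # only about 300 tokens available
third = bucket.try_consume(800, now=1.0)   # bucket has refilled by now
print(first, second, third)
```

A consumer that is refused would wait and retry, which is what slows consumption down to the configured rate; `set_rate` is where an adaptively chosen limiter parameter would be applied.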
Further, an embodiment of the present application discloses an electronic device. Fig. 5 is a block diagram of the electronic device 20 according to an exemplary embodiment, and nothing in the figure should be construed as limiting the scope of the application.
Fig. 5 is a schematic structural diagram of the electronic device 20 according to an embodiment of the present application. The electronic device 20 may specifically include: at least one processor 21, at least one memory 22, a power supply 23, a communication interface 24, an input/output interface 25, and a communication bus 26. The memory 22 is used for storing a computer program, and the computer program is loaded and executed by the processor 21 to implement the relevant steps of the adaptive speed limiting method based on the Hadoop analysis platform disclosed in any of the foregoing embodiments. In addition, the electronic device 20 in this embodiment may be a computer.
In this embodiment, the power supply 23 is configured to provide a working voltage for each hardware device on the electronic device 20; the communication interface 24 can create a data transmission channel between the electronic device 20 and an external device, and a communication protocol followed by the communication interface is any communication protocol applicable to the technical solution of the present application, and is not specifically limited herein; the input/output interface 25 is configured to obtain external input data or output data to the outside, and a specific interface type thereof may be selected according to specific application requirements, which is not specifically limited herein.
In addition, the memory 22, as a carrier for storing resources, may be a read-only memory, a random access memory, a magnetic disk, an optical disk, or the like. The resources stored on it may include an operating system 221, a computer program 222, data 223, and the like, and the data 223 may include various kinds of data. The storage may be transient or persistent.
The operating system 221 is used for managing and controlling each hardware device on the electronic device 20 and the computer program 222, and may be Windows Server, NetWare, Unix, Linux, or the like. In addition to the computer program that performs the adaptive speed limiting method based on the Hadoop analysis platform executed by the electronic device 20 as disclosed in any of the foregoing embodiments, the computer program 222 may further include computer programs that perform other specific tasks.
Further, an embodiment of the present application discloses a computer-readable storage medium for storing a computer program. The computer-readable storage medium here includes random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a magnetic disk, an optical disk, or any other form of storage medium known in the art. When executed by a processor, the computer program implements the adaptive speed limiting method based on the Hadoop analysis platform described above. For the specific steps of the method, refer to the corresponding content disclosed in the foregoing embodiments; details are not repeated here.
The embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar, the embodiments may be referred to one another. Because the device disclosed in an embodiment corresponds to the method disclosed in an embodiment, its description is brief, and the relevant points can be found in the description of the method.
The steps of the adaptive speed limiting method or algorithm based on the Hadoop analysis platform described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Finally, it should also be noted that relational terms such as "first" and "second" are used herein solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of another identical element in the process, method, article, or apparatus that comprises the element.
The adaptive speed limiting method, device, equipment, and medium based on the Hadoop analysis platform provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the invention, and the description of the embodiments is intended only to help in understanding the method and its core idea. A person skilled in the art may, following the idea of the present invention, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (8)

1. An adaptive speed limiting method based on a Hadoop analysis platform, characterized by comprising the following steps:
performing a performance test on a Hadoop analysis platform, and standardizing a test result of the performance test into a configuration format, so that the configuration format is built into a database of the Hadoop analysis platform;
periodically acquiring actual parameters of the Hadoop analysis platform at a first preset time interval, and matching the actual parameters against the configuration format to obtain a matched effective configuration;
periodically detecting data information running on the Hadoop analysis platform at a second preset time interval, and judging whether the data information exceeds the effective configuration;
if the data information exceeds the effective configuration, setting a speed limiter through a first preset interface, and setting speed limiter parameters according to the data information, so that the speed limiter parameters match the configuration format for adaptive speed limiting;
wherein the periodically acquiring actual parameters of the Hadoop analysis platform at the first preset time interval comprises: periodically acquiring the current number of Yarn cluster nodes through a Yarn client interface at the first preset time interval; periodically acquiring the current number of Kafka cluster nodes through a Kafka client interface at the first preset time interval; and periodically obtaining the current physical machine memory through a memory viewing command at the first preset time interval;
and wherein the periodically detecting data information running on the Hadoop analysis platform at the second preset time interval comprises: periodically detecting the data rate and the data size of the data running on the Hadoop analysis platform at the second preset time interval.
2. The adaptive speed limiting method based on a Hadoop analysis platform according to claim 1, wherein the periodically detecting data information running on the Hadoop analysis platform at a second preset time interval comprises:
periodically detecting the data rate on the Hadoop analysis platform through a second preset interface at the second preset time interval;
and periodically sampling log files at the second preset time interval to obtain the data size on the Hadoop analysis platform.
3. The adaptive speed limiting method based on a Hadoop analysis platform according to claim 1, wherein the judging whether the data information exceeds the effective configuration comprises:
if the data information does not exceed the effective configuration, not limiting the speed of the Hadoop analysis platform, so that the Hadoop analysis platform operates according to the data information.
4. The adaptive speed limiting method based on a Hadoop analysis platform according to claim 1, wherein the setting a speed limiter through a first preset interface if the data information exceeds the effective configuration, and setting speed limiter parameters according to the data information, comprises:
if the data information exceeds the effective configuration, setting a speed limiter through a Kafka Consumer interface;
and configuring the parameter size of the speed limiter as the product of the data rate and the data size to obtain the speed limiter parameters.
5. The adaptive speed limiting method based on a Hadoop analysis platform according to any one of claims 1 to 4, characterized by further comprising:
when the Hadoop analysis platform is expanded during operation and/or cluster nodes in the Hadoop analysis platform are optimized, executing again the step of periodically acquiring the actual parameters of the Hadoop analysis platform at the first preset time interval, so as to perform adaptive speed limiting.
6. An adaptive speed limiting device based on a Hadoop analysis platform, characterized by comprising:
a performance testing module, configured to perform a performance test on the Hadoop analysis platform;
a test result configuration module, configured to standardize a test result of the performance test into a configuration format, so that the configuration format is built into a database of the Hadoop analysis platform;
a matching validation module, configured to periodically acquire actual parameters of the Hadoop analysis platform at a first preset time interval, and match the actual parameters against the configuration format to obtain a matched effective configuration;
a judgment detection module, configured to periodically detect data information running on the Hadoop analysis platform at a second preset time interval, and judge whether the data information exceeds the effective configuration;
an adaptive speed limiting module, configured to set a speed limiter through a first preset interface if the data information exceeds the effective configuration, and set speed limiter parameters according to the data information, so that the speed limiter parameters match the configuration format for adaptive speed limiting;
wherein the matching validation module is configured to periodically acquire the current number of Yarn cluster nodes through a Yarn client interface at the first preset time interval; periodically acquire the current number of Kafka cluster nodes through a Kafka client interface at the first preset time interval; and periodically obtain the current physical machine memory through a memory viewing command at the first preset time interval;
and the judgment detection module is configured to periodically detect the data rate and the data size of the data running on the Hadoop analysis platform at the second preset time interval.
7. An electronic device, comprising a processor and a memory; wherein the memory is used for storing a computer program which is loaded and executed by the processor to realize the Hadoop analysis platform-based adaptive speed limiting method according to any one of claims 1 to 5.
8. A computer-readable storage medium for storing a computer program; wherein the computer program, when executed by a processor, implements the Hadoop analysis platform-based adaptive speed limiting method according to any one of claims 1 to 5.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210984130.8A CN115061898B (en) 2022-08-17 2022-08-17 Adaptive speed limiting method, device, equipment and medium based on Hadoop analysis platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210984130.8A CN115061898B (en) 2022-08-17 2022-08-17 Adaptive speed limiting method, device, equipment and medium based on Hadoop analysis platform

Publications (2)

Publication Number Publication Date
CN115061898A CN115061898A (en) 2022-09-16
CN115061898B true CN115061898B (en) 2022-11-08

Family

ID=83208398

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210984130.8A Active CN115061898B (en) 2022-08-17 2022-08-17 Adaptive speed limiting method, device, equipment and medium based on Hadoop analysis platform

Country Status (1)

Country Link
CN (1) CN115061898B (en)


Also Published As

Publication number Publication date
CN115061898A (en) 2022-09-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant